Wednesday, 6 February 2008

Blogger.com has changed their feed syndication

It seems that Blogger has changed the type of syndication feed it serves at some point during the last month (Jan – Feb 2008). I discovered this when my Atom feed XSLT transformation broke after I published my last post.
I originally wrote the XSLT to transform the previous feed format for the blog updates on my homepage. It assumed the following hierarchy in the Atom namespace (http://www.w3.org/2005/Atom):

<feed>...<entry>...</entry></feed>

Whereas the latest feed has changed to use both Atom and openSearch namespaces and the following structure:

<rss>...<channel>...<item>...</item></channel></rss>

The root node suggests the feed is RSS 2.0, yet it also declares the Atom namespace, which is peculiar. Notice the openSearch namespace too:

<rss xmlns:atom='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' version='2.0'>
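
A quick way to confirm which flavour a feed is serving is to look at its root element. The following is only a rough sketch using Python's standard library rather than XSLT; the function name `detect_feed_format` and the two sample strings are made up for illustration, and it only tells apart the two shapes discussed in this post.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

def detect_feed_format(xml_text):
    """Classify a feed by its root element: 'atom', 'rss <version>' or 'unknown'."""
    root = ET.fromstring(xml_text)
    if root.tag == "{%s}feed" % ATOM_NS:
        return "atom"
    if root.tag == "rss":
        return "rss " + root.get("version", "?")
    return "unknown"

# Hypothetical, heavily trimmed samples of the two structures.
old_style = "<feed xmlns='http://www.w3.org/2005/Atom'><entry/></feed>"
new_style = (
    "<rss xmlns:atom='http://www.w3.org/2005/Atom' "
    "xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' version='2.0'>"
    "<channel><item/></channel></rss>"
)

print(detect_feed_format(old_style))  # atom
print(detect_feed_format(new_style))  # rss 2.0
```

A check like this could run before the transformation, so that a change of format shows up as a clear message instead of an empty homepage panel.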

Here's my updated XSLT to convert the new Blogger.com format.



<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/">

<xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
<xsl:template match="channel">
<div id="FeedSnippets">
<xsl:apply-templates select="item" />
</div>
</xsl:template>


<xsl:template match="item" name="item">
<xsl:if test="position()&lt;6">
<h4>
<xsl:value-of select="title"/>
</h4>
<p>
<xsl:choose>
<xsl:when test="string-length(substring-before(atom:summary,'. ')) > 0">
<xsl:value-of select="substring-before(atom:summary,'. ')" />...<br />
</xsl:when>
<xsl:when test="string-length(substring-before(atom:summary,'.')) > 0">
<xsl:value-of select="substring-before(atom:summary,'.')" />...<br />
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="substring(atom:summary,1,200)" />...<br />
</xsl:otherwise>
</xsl:choose>
<strong>Read full post: </strong>
<a href="{link}">
<xsl:value-of select="title"/>
</a>
</p>
<hr />
</xsl:if>
</xsl:template>
</xsl:stylesheet>


Sunday, 30 September 2007

Adsense Allowed Sites Flags Up Google Cache Views As Unauthorised

When I read about the new Google Adsense feature "Allowed Sites" a couple of weeks ago, I thought I'd set it up on my account just to make sure no sites were displaying my Adsense code on their own sites, which could end up getting my account banned or flagged as suspicious due to factors outside my control.
Let's face it, if they're displaying my Adsense code, they've probably scraped or copied my site content without my consent, so who knows what else they may be up to!

Anyway I logged into Adsense recently and decided to check out the Allowed Sites page, and this is what I read...

There are unauthorized sites that have displayed ads using your AdSense publisher ID within the last week. Please click here to view them.

So I did click here, but all I got were some IP addresses:


Site URL
72.14.253.104
64.233.183.104
72.14.235.104
209.85.129.104
66.102.9.104
216.239.59.104
209.85.135.104
64.233.169.104
64.233.167.104

A little intrigued as to what these IP addresses were, I decided to investigate further by running a trace route command to glean some more information.

C:\Documents and Settings\Nik>tracert 64.233.183.104

The trace route results resolved all of the IP addresses to Google. I'm guessing that these are in my list because of people viewing my sites in Google's cached pages, so panic over!
It would be good if Google could filter its own IP addresses out of the list though, so I don't have to check each IP individually.
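
Checking each address by hand gets tedious, so the lookup could be scripted. Below is a minimal sketch in Python, assuming that a reverse DNS name ending in google.com is good enough evidence of ownership (reverse DNS can be misleading, so treat the result as a hint only). The function names and the `suffixes` default are my own assumptions, not anything provided by Adsense.

```python
import socket

def reverse_hostname(ip, lookup=socket.gethostbyaddr):
    """Return the reverse-DNS hostname for ip, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = lookup(ip)
    except OSError:  # covers socket.herror and socket.gaierror
        return None
    return hostname.lower()

def belongs_to(hostname, suffixes=("google.com",)):
    """True if hostname equals one of the suffixes or is a subdomain of one."""
    if hostname is None:
        return False
    return any(hostname == s or hostname.endswith("." + s) for s in suffixes)

def unexplained_ips(ips, lookup=socket.gethostbyaddr):
    """Return the addresses whose reverse DNS does not point at the trusted suffixes."""
    return [ip for ip in ips if not belongs_to(reverse_hostname(ip, lookup))]
```

Calling `unexplained_ips([...])` with the addresses from the Allowed Sites report would leave only the ones that still need a manual look. The `lookup` parameter exists so the logic can be tried without touching the network.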


Saturday, 15 September 2007

Blogger console quirk publishes German link text

I just finished publishing a post to my blog and noticed this strange Blogger quirk, which seems to have confused the localisation of the page and published some link text in German.

Has anyone else noticed this?



Google add feature to stem stolen Adsense publisher code

Google have added an "Allowed Sites" feature in the Adsense console to stem a problem that has been talked about for a while.
Lots of publishers have had their site content stolen and re-purposed in an almost identical fashion on another domain, specifically so that the criminal can earn money from advertising without spending the time and effort writing content themselves.
In some cases the HTML contained the victim's Adsense code, which when uploaded to a "junk" domain with other duplicate content, essentially associated the original publisher with a bad site in Google's eyes.
To protect Google's Adsense publishers from being associated with this crime and having their Adsense accounts potentially banned, Google has developed the "Allowed Sites" feature which allows the Adsense publisher to tell Google which domains it publishes to.

What this won't do is stop people stealing your content and code, nor will it stop people hacking into your web server and changing the Adsense account ID in the Adsense JavaScript to the criminal's Adsense ID, but it is definitely a step in the right direction.


Saturday, 2 June 2007

Migrating from ASP to ASP.NET 2.0

I've pretty much finished migrating my personal website from classic ASP to ASP.NET 2.0. In the end I decided to keep certain pages on ASP (Active Server Pages) technology (more on that in a moment), but the majority of the pages have been migrated.

Minimise Page Rank Loss

While I wanted to bring my site up to date, I also didn't want to lose too much Google Page Rank in the process, or to break people's bookmarks and RSS blog subscriptions. The reason the pages have to change URLs is that ASP.NET pages use the extension .aspx, compared to ASP's three-letter .asp extension. So my portfolio.asp page, for example, has become portfolio.aspx.

Analysing what can be Migrated

My blog area uses Google's Blogger as a CMS, so this area hasn't had to change. Before switching to Blogger I built my own blog engine, and that has remained as is.

The most popular part of my site is my Cisco CCNA section. Apart from the new menu page, the other pages have half-decent Page Rank and a few pages also have DMOZ entries, so those have had to remain ASP too.

Using 301 Permanent Redirects

All the other pages, however, have been migrated. When you now visit those old pages (from SERPs or old links) you'll get an HTTP 301 redirect to the new ASP.NET pages. Because I'm on a shared server with no access to IIS (Internet Information Services), I essentially had to hard-code ASP 301 redirects on all the ASP pages that have moved, redirecting users to the new versions.
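
The rule behind those hard-coded redirects can be written down in a few lines. This is only an illustrative sketch in Python, not the ASP code on the site: the `KEEP_AS_ASP` prefixes are hypothetical stand-ins for the sections that stayed on classic ASP, and `redirect_for` is a made-up helper name.

```python
# Hypothetical path prefixes for the sections that stay on classic ASP.
KEEP_AS_ASP = ("/ccna/", "/oldblog/")

def redirect_for(path):
    """Return (status, headers) for a page that has moved, or None if it stays put."""
    lower = path.lower()
    if not lower.endswith(".asp") or lower.startswith(KEEP_AS_ASP):
        return None
    # Swap the .asp extension for .aspx and answer with a permanent redirect.
    new_path = path[:-len(".asp")] + ".aspx"
    return "301 Moved Permanently", [("Location", new_path)]

print(redirect_for("/portfolio.asp"))
print(redirect_for("/ccna/index.asp"))
```

The important part is the status: a 301 tells search engines the move is permanent, whereas a default redirect (302) only signals a temporary one.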

Update Robots.txt

The next step in the process was to include those old ASP pages in my robots.txt file and log in to the Google Webmaster console to expedite the removal of those old pages from the Google index using the URL Removal tool. If you haven't already accessed Webmaster Tools, I highly recommend you log in and verify your site.
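
One thing worth checking when listing old pages in robots.txt is that Disallow rules are prefix matches in the original convention, so a rule for /portfolio.asp also covers /portfolio.aspx. The sketch below uses Python's standard `urllib.robotparser` to show this; the rules and the www.example.com host are hypothetical, and `is_blocked` is a helper name of my own.

```python
from urllib.robotparser import RobotFileParser

def is_blocked(robots_lines, url, agent="*"):
    """True if the given robots.txt lines forbid agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return not parser.can_fetch(agent, url)

# Hypothetical rules blocking one of the old ASP pages.
rules = ["User-agent: *", "Disallow: /portfolio.asp"]

print(is_blocked(rules, "http://www.example.com/portfolio.asp"))   # True
print(is_blocked(rules, "http://www.example.com/portfolio.aspx"))  # True, same prefix
print(is_blocked(rules, "http://www.example.com/tools.aspx"))      # False
```

Google documents a `$` suffix for anchoring a rule to the end of a URL, which would avoid blocking the new .aspx page, but Python's parser does not implement that extension, so it is not shown here.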

Spidering as a Final Check

Next, I made sure all navigation menus and links to the old pages under my control were pointing to the new versions. This meant updating my Blogger template and republishing, updating my old ASP navigation include files, and crawling my site using Xenu's Link Sleuth to check for any I had missed.
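
For a single page, the same kind of check can be approximated with a few lines of script. This is a rough sketch and not a replacement for a real crawler: it only scans the markup it is given, and pages deliberately left on ASP will show up too and need ignoring by hand. The class and function names are made up for illustration.

```python
from html.parser import HTMLParser

class AspLinkFinder(HTMLParser):
    """Collect href values of <a> tags that still point at .asp pages."""

    def __init__(self):
        super().__init__()
        self.stale = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Ignore any query string or fragment when testing the extension.
                path = value.split("?", 1)[0].split("#", 1)[0]
                if path.lower().endswith(".asp"):
                    self.stale.append(value)

def find_asp_links(html):
    """Return every link in html whose target ends in .asp."""
    finder = AspLinkFinder()
    finder.feed(html)
    return finder.stale

sample = '<a href="portfolio.asp">Old</a> <a href="portfolio.aspx">New</a>'
print(find_asp_links(sample))  # ['portfolio.asp']
```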

Conclusion

Moving my content over to ASP.NET has been fairly straightforward due to the small number of pages. My Tools and Portfolio pages display data stored in XML files, so it was just a case of using XmlDataSource controls to pull the information onto the pages. My homepage picks up the latest entries in my Blogger Atom feed using XSLT, and my contact form uses basic ASP.NET form and validation controls.

Increased Functionality

While migrating my content I thought I'd use the caching feature built into ASP.NET to display my latest ma.gnolia bookmarks on my site, so I ended up creating a Bookmarks page, which fetches my ma.gnolia lite RSS bookmarks XML file either from ma.gnolia.com or from my cache. The cache doesn't hold my data for as long as I stipulate, but I'm assuming this is because I'm on a shared server and the cache is dropping it to free resources.
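
The fetch-or-use-cache pattern is simple enough to sketch outside ASP.NET. The Python below is an illustration of the idea only, not the ASP.NET Cache API: `FeedCache`, its `evict` method and the injectable `clock` are all assumptions of mine. The point it shows is that a page must be ready to refetch whenever the entry is missing, whether it expired or was dropped early by the host.

```python
import time

class FeedCache:
    """Keep one fetched document for ttl_seconds, refetching when it is stale or missing."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # callable that downloads the feed
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._expires_at = None      # None means nothing is cached

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value

    def evict(self):
        """Simulate the host dropping the entry early to free resources."""
        self._expires_at = None
```

With this shape, an early eviction costs one extra download rather than an error, which is consistent with the behaviour described above on a shared server.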


Tuesday, 29 May 2007

Insert a Blogger Atom Feed into an ASP.NET web page

I've been busy recently migrating my homepage (and several others) from classic ASP to ASP.NET. My homepage displays the latest 5 posts with a summary and a link to the full blog post.
I eventually found a tutorial explaining how to achieve this using XSLT, after discovering that the XmlDataSource XPath support doesn't handle namespaces!
I've tinkered with the XSLT that Arnaud Weil posted in his blog to achieve the following objectives:
  1. Limit the number of posts returned by the transformation.
  2. Show a summary of the post.
  3. Show a summary that tries hard not to cut words in half when generating a snippet.
  4. Produce valid XHTML.
Here's the source of my XSLT...

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!--<xsl:output method="html"/>-->
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
<xsl:template match="/atom:feed">
<div id="FeedSnippets">
<xsl:apply-templates select="atom:entry" />
</div>
</xsl:template>


<xsl:template match="atom:entry" name="feed">
<xsl:if test="position()&lt;6">
<h4><xsl:value-of select="atom:title"/></h4>
<p>
<xsl:choose>
<xsl:when test="string-length(substring-before(atom:summary,'. ')) &gt; 0">
<xsl:value-of select="substring-before(atom:summary,'. ')" />...<br />
</xsl:when>
<xsl:when test="string-length(substring-before(atom:summary,'.')) &gt; 0">
<xsl:value-of select="substring-before(atom:summary,'.')" />...<br />
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="substring(atom:summary,1,200)" />...<br />
</xsl:otherwise>
</xsl:choose>
<strong>Read full post: </strong><a href="{atom:link[@rel='alternate']/@href}"><xsl:value-of select="atom:title"/></a></p>
<hr />
</xsl:if>
</xsl:template>
</xsl:stylesheet>
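
The snippet rule in the three xsl:when / xsl:otherwise branches is easier to see outside XSLT. Here is a rough Python equivalent, written only to illustrate the logic: take the text before the first ". ", else the text before the first ".", else fall back to roughly the first 200 characters. The function name `make_snippet` is made up.

```python
def make_snippet(summary, max_chars=200):
    """Return the first sentence of summary, or its first max_chars characters."""
    for separator in (". ", "."):
        head, found, _rest = summary.partition(separator)
        # Mirror the string-length(...) > 0 test: the separator must exist
        # and there must be some text in front of it.
        if found and head:
            return head + "..."
    return summary[:max_chars] + "..."

print(make_snippet("First sentence. Second sentence."))  # First sentence...
print(make_snippet("No full stop here"))                 # No full stop here...
```

Cutting at a sentence boundary is what keeps words whole; only the fallback branch can still cut a word in half.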


Wednesday, 11 April 2007

To Feedburn or not to Feedburn?

I've decided to try out Feedburner. We use RSS to syndicate content at work and have to use server log file analysis to track the feeds. Web-beacon-based web analytics packages are good for websites, but you can't add JavaScript to feeds, which are pure XML. We've tried using .NET to log the hits on the feeds to a database, but after a short while of testing we could see the database growing quickly in front of our eyes, not to mention the logging consuming our precious CPU cycles.
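
For comparison, the log file route boils down to something like the sketch below. It assumes the server writes W3C extended logs with a `#Fields:` header that includes `cs-uri-stem` and `c-ip` (IIS can be configured to do this); the function name and the /atom.xml path in the sample are hypothetical. Distinct IP addresses are only a crude stand-in for subscribers, which is part of the appeal of a dedicated service.

```python
from collections import Counter

def count_feed_hits(log_lines, feed_paths):
    """Count requests and distinct client IPs per feed path in a W3C extended log."""
    fields = []
    hits = Counter()
    clients = {path: set() for path in feed_paths}
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns used by the rows that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        path = row.get("cs-uri-stem")
        if path in clients:
            hits[path] += 1
            clients[path].add(row.get("c-ip"))
    return {path: (hits[path], len(clients[path])) for path in feed_paths}

# Hypothetical log excerpt.
sample_log = [
    "#Fields: date time c-ip cs-method cs-uri-stem sc-status",
    "2007-04-11 10:00:00 192.0.2.1 GET /atom.xml 200",
    "2007-04-11 10:05:00 192.0.2.1 GET /atom.xml 304",
    "2007-04-11 10:06:00 192.0.2.2 GET /atom.xml 200",
    "2007-04-11 10:07:00 192.0.2.3 GET /index.aspx 200",
]
print(count_feed_hits(sample_log, ["/atom.xml"]))  # {'/atom.xml': (3, 2)}
```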

Feedburner not only takes away the hassle of analysing web feed statistics and subscribers, but adds a lot of other functionality too.

My main initial issues with Feedburner were the following:

* What if Feedburner went bankrupt? All the sites syndicating my feed would be using the Feedburner URL (unless I pay for the Pro service). How would I be able to change this back to my own URL or to another Feedburner-type URL? (Hopefully the "saying goodbye to Feedburner" process would still work if they went bankrupt.) [UPDATED: On June 1st 2007 Google purchased Feedburner, therefore making bankruptcy much less likely :-) ]

* I can't redirect any current traffic from my old Blogger Atom feed on my shared Windows hosting as I don't have access to IIS through my control panel. The file is an .xml file, and I can't use .htaccess for obvious reasons. I would need to use an ISAPI rewrite tool I suppose, which I probably wouldn't be able to get installed in a hosted environment.

* If I later want to upgrade to the Pro service, I would surely have to keep my Feedburner URL, even though that package would let me use a URL hosted on my own site, just so that all my subscribers keep using the same feed URL. (I guess I could use the "saying goodbye to Feedburner" process above?)

Despite these issues, I've decided that the pros of knowing my subscribers etc. outweigh the cons, and I'm now syndicating through Feedburner! Read my Feedburner research.

I am wondering however, how Feedburner manage to host so many blogs. I assume they have some serious kit to handle the many requests they get. I would be interested to know what the Feedburner IT infrastructure looks like.
