I've filed an issue based on this.

http://opensource.atlassian.com/projects/roller/browse/ROL-1145

You can watch it if you're interested.  In general, it's best to file issues like this directly if you can.

--a.

p.s. A short term workaround might be to add a rel="nofollow" in the macro that generates the links; this may work for savvy crawlers.
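
A rough sketch of what that link output could look like, written in plain Java rather than in the macro itself. The class and method names here (CalendarLinks, nofollowLink, escape) are made up for illustration and are not part of Roller. Note that rel="nofollow" is only a hint, so a crawler may ignore it.

```java
// Hypothetical helper, not Roller code: builds a calendar paging link
// that carries rel="nofollow" as a hint to crawlers.
final class CalendarLinks {

    // Returns an anchor element whose href and label are HTML-escaped.
    static String nofollowLink(String href, String label) {
        return "<a href=\"" + escape(href) + "\" rel=\"nofollow\">"
                + escape(label) + "</a>";
    }

    // Minimal HTML escaping; '&' must be replaced first so that the
    // entities produced by the later replacements are not escaped again.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }
}
```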


----- Original Message -----
From: "Trygve Lie" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Monday, May 22, 2006 3:55 AM
Subject: Spider trap in Roller's calendar


Hi

The calendar in Roller can cause a small problem for search engine spiders, since it's possible to page backwards through the dates with the calendar. It's actually possible to page back to the year zero...
Ex: http://rollerweblogger.org/page/roller/000104

A spider hitting such a "trap" will just continue to page backwards until it "gets tired".

There are two dangerous problems with this:
- This can put unnecessary load on the server running Roller (e.g. Yahoo's spider makes big slurps and does not actually consider whether the server can handle it).
- At some point the spider will "get tired", because such paging generates a lot of similar pages (when there is no content, all the pages look alike), and the spider will then mark the site as "possible spam" because of all the similar pages.

I would like to suggest adding a small check that limits backward paging in the calendar to the month in which the blog's first post was made. Paging beyond that month is of no real interest.
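
A minimal sketch of such a check, assuming the date of the blog's first post is available as a LocalDate. The class and method names (CalendarPaging, clampToFirstPost, hasPreviousMonth) are hypothetical and do not come from Roller's code.

```java
import java.time.LocalDate;
import java.time.YearMonth;

// Hypothetical helper, not Roller code: keeps calendar paging from
// going further back than the month of the blog's first post.
final class CalendarPaging {

    // Returns the month to display: the requested month, or the month
    // of the first post if the request lies before it.
    static YearMonth clampToFirstPost(YearMonth requested, LocalDate firstPostDate) {
        YearMonth earliest = YearMonth.from(firstPostDate);
        return requested.isBefore(earliest) ? earliest : requested;
    }

    // True only if there is an earlier month worth linking to, so the
    // "previous month" link can be left out at the first post's month.
    static boolean hasPreviousMonth(YearMonth shown, LocalDate firstPostDate) {
        return shown.isAfter(YearMonth.from(firstPostDate));
    }
}
```

With a check like this, a request for a month before the first post would show the month of the first post instead, and the calendar would stop offering a backward link there, so a spider has nothing further to follow.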

Kind regards
Trygve Lie
