On Wed, 14 Jan 2009, Shane Beers wrote:

> We had an issue with our local Google instance crawling our DSpace
> installation and causing serious load problems. I re-wrote the robots.txt
> to disallow everything besides the item pages themselves - no browse
> pages or search pages and whatnot. Here is a copy of ours:

We've had to do that for years; without it, DSpace just crumbles under the 
load. I've got a small Perl script that generates a flat HTML file with 
links to all our item pages, and we put a link to that in the footer.
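The idea is simple enough to sketch. The original is a small Perl script; here is a hypothetical Python equivalent (BASE_URL and the handle list are placeholders - in practice the handles would come from the DSpace database):

```python
# Hypothetical sketch: given a list of DSpace item handles, emit one flat
# HTML page linking every item page, so crawlers can reach all items
# without walking the browse/search UI.

BASE_URL = "https://repository.example.edu/handle"  # placeholder

def build_sitemap_html(handles):
    """Return a minimal HTML page with one link per item handle."""
    links = "\n".join(
        f'<li><a href="{BASE_URL}/{h}">{h}</a></li>' for h in handles
    )
    return (
        "<!DOCTYPE html>\n"
        "<html><head><title>All items</title></head>\n"
        f"<body><ul>\n{links}\n</ul></body></html>\n"
    )

if __name__ == "__main__":
    # In practice, fetch the handle list from the DSpace database instead.
    print(build_sitemap_html(["123456789/1", "123456789/2"]))
```

Regenerate the file on a cron job and link it from the site footer, and crawlers will pick it up on their own.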

That way we can block all the browse pages, but not item pages or 
bitstreams, and still get indexed.
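A minimal sketch of such a robots.txt (the paths are assumptions based on the classic DSpace JSPUI URL layout - adjust to your own installation, and note that `Allow` is a non-standard extension, though the major crawlers honour it):

```
User-agent: *
Disallow: /browse
Disallow: /browse-title
Disallow: /browse-author
Disallow: /browse-date
Disallow: /simple-search
Allow: /handle/
Allow: /bitstream/
```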

DSpace 1.x has major scalability issues, alas, no matter how much hardware 
you throw at it.


Best,

--
Tom De Mulder <td...@cam.ac.uk> - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
-> 14/01/2009 : The Moon is Waning Gibbous (83% of Full)

_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
