I've been playing with generating sitemaps with "/bin/dspace 
generate-sitemaps", which works fine, but it includes all of our restricted 
content. If I'm reading the headers correctly, the restricted URLs return a 
404 error (that seems odd to me…). Googlebot help says "Generally, 404s don't 
harm your site's performance in search," so I suppose it's okay, but I'm 
wondering whether there is a way to output a sitemap that only includes 
non-restricted content. Or am I overthinking this? The bot would presumably 
run into those pages anyway from links on the collection pages.
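
One workaround I've been sketching (not a built-in DSpace option as far as I 
know) is to post-process the generated sitemap and drop anything an anonymous 
request can't fetch. Something like the script below, assuming the restricted 
URLs really do answer with a non-200 status to anonymous requests and that 
the sitemap file has been gunzipped first; the filenames are just 
placeholders:

    #!/usr/bin/env python3
    """Drop entries from a sitemap that anonymous users can't fetch."""
    import urllib.error
    import urllib.request
    import xml.etree.ElementTree as ET

    NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

    def is_public(url, timeout=10):
        """True if an anonymous HEAD request for the URL returns 200."""
        req = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                return resp.status == 200
        except urllib.error.URLError:
            # 404/403 (or a network failure) -> treat as restricted
            return False

    def filter_sitemap(infile, outfile):
        # Keep the default sitemap namespace on the output file
        ET.register_namespace("", NS["sm"])
        tree = ET.parse(infile)
        root = tree.getroot()
        for url_el in list(root.findall("sm:url", NS)):
            loc = url_el.findtext("sm:loc", namespaces=NS)
            if loc is None or not is_public(loc):
                root.remove(url_el)
        tree.write(outfile, encoding="utf-8", xml_declaration=True)

    if __name__ == "__main__":
        filter_sitemap("sitemap0.xml", "sitemap0-public.xml")

It has to hit every URL once, which would be slow on a large repository, so 
it's more of a stopgap than a real fix.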
I had also been trying to restrict bot visits to certain collections via 
robots.txt, but it looks like handles aren't hierarchical, so that didn't 
really work out.
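
For example (the collection and item handles below are made up, using the 
default 123456789 prefix):

    User-agent: *
    # This only blocks the collection's own page. An item in that
    # collection might live at /handle/123456789/345, which doesn't
    # share the /10 path prefix, so this rule never matches it.
    Disallow: /handle/123456789/10

Since every item gets its own flat handle, there doesn't seem to be any path 
prefix that means "everything in collection X".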


Any thoughts on setting up appropriate indexing via sitemaps and/or 
robots.txt would be appreciated.

