Hi Sue,

sue gregson wrote:
> This is proving much more of a headache than expected. I sorted a half
> way solution - ie only the browse by pages will appear in standard
> google searches. I am hoping to get rid of that when the migration to
> 1.4 happens.
>
What do you want to appear in google searches? You can stop the browse
pages appearing using a robots.txt file (which is independent of DSpace
and should work just the same for 1.3.2 or 1.4).

> That was ok but then just when I think I've got it sorted the content
> appears on google again or rather html versions of the documents are on
> google.scholars.
>
> Can someone enlighten me as to how to stop google scholars doing stuff
> that google doesn't. It may well be an issue for others if dspace is
> having authentication ignored/bypassed on crawls from google.scholars.
> Of course I could be being a bit thick and missing something that is
> blindingly obvious to the rest of the world.

Some ideas for reasons:

1) There's something wrong with your authentication. Are you just using
   password auth, or is there some stackable auth with IP restrictions
   that might be letting GScholar through?

2) If GScholar's just getting the metadata, then perhaps it's using your
   OAI feed (and the metadata isn't properly protected)?

3) If the authentication was put in place after the content was
   available, perhaps GScholar is using copies from its cache?

Best regards,

jim

_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
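
P.S. On the robots.txt suggestion: a minimal sketch of what such a file might look like, with a quick check using Python's standard urllib.robotparser. The browse paths and the /handle/1234/5 item URL are assumptions for illustration only; substitute whatever paths your own install serves, including any context prefix. Bear in mind that robots.txt is only honoured by well-behaved crawlers and is not a substitute for authentication.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. The browse paths below are assumptions;
# adjust them to match the URLs your own site actually exposes.
ROBOTS_TXT = """\
User-agent: *
Disallow: /browse-title
Disallow: /browse-author
Disallow: /browse-date
Disallow: /browse-subject
"""


def is_allowed(path, robots_txt=ROBOTS_TXT, user_agent="Googlebot"):
    """Return True if a crawler obeying robots_txt may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Example paths only; the item URL is made up for the demonstration.
    for path in ("/browse-title", "/browse-author?starts_with=A", "/handle/1234/5"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

The file itself needs to be served from the root of the web server (i.e. /robots.txt), which is why it works the same regardless of the DSpace version behind it.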