Hi Sue,

sue gregson wrote:
> This is proving much more of a headache than expected. I sorted a half 
> way solution - ie only the browse by pages will appear in standard 
> google searches. I am hoping to get rid of that when the migration to 
> 1.4 happens.
>   

What do you want to appear in Google searches? You can stop the browse 
pages from appearing by using a robots.txt file (which is independent of 
DSpace and should work just the same for 1.3.2 or 1.4).
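
As a rough sketch of what such a robots.txt might contain, and a quick way 
to check it before deploying, something like the following could be used. 
The browse paths (/browse-title and so on) are my assumption about how the 
browse pages are addressed on your install; adjust them to whatever your 
URLs actually look like, including any webapp prefix such as /dspace. The 
is_allowed helper is purely illustrative and uses Python's standard 
urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser

# Candidate robots.txt contents. The browse paths are assumptions;
# replace them with the browse URLs your DSpace instance really serves.
ROBOTS_TXT = """\
User-agent: *
Disallow: /browse-title
Disallow: /browse-author
Disallow: /browse-date
Disallow: /browse-subject
"""


def is_allowed(path, agent="Googlebot", robots_txt=ROBOTS_TXT):
    """Return True if a well-behaved crawler may fetch `path` (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/browse-title", "/browse-author?starts_with=A", "/handle/123456789/1"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Bear in mind that robots.txt only asks well-behaved crawlers to stay away; 
it is not access control, so it will not help if restricted content is 
actually being served without authentication.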

> That was ok but then just when I think I've got it sorted the content 
> appears on google again or rather html versions of the documents are on 
> google.scholars.
>
> Can someone enlighten me as to how to stop google scholars doing stuff 
> that google doesn't. It may well be an issue for others if dspace is 
> having authentication ignored/bypassed on crawls from google.scholars. 
> Of course I could be being a bit thick and missing something that is 
> blindingly obvious to the rest of the world.

Some possible reasons:

1) There's something wrong with your authentication. Are you just using 
password auth, or is there some stackable auth with IP restrictions that 
might be letting GScholar through?

2) If GScholar's just getting the metadata, then perhaps it's using your 
OAI feed (and the metadata isn't properly protected)?

3) If the authentication was put in place after the content was 
available, perhaps GScholar is using copies from its cache?

Best regards,
jim

_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
