Hi Sean,

GoogleBot and the rest of the bots do account for a large amount of traffic.
I would estimate that about 75% of our traffic is serving bots. But I would
also estimate that a good number of users come in through Google search
results.

Blocking Google altogether will probably have a significantly negative
impact on your ranking in search engine results pages. New content won't be
discovered, and your existing content will appear to age according to
whatever signals the search engines are using for ranking.

GoogleBot can discover content through your sitemap/htmlmap, but there is no
metadata in the sitemap, just a series of links to item/collection handles.
GoogleBot will then have to crawl the item pages anyway to get the data.
According to what I've read, and been told on the phone, GoogleBot will have
the best success crawling your site if it can crawl it incrementally by
date.
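To make that concrete, a DSpace-generated sitemap entry looks roughly like
this (the handle and hostname below are made-up examples, not taken from any
real repository):

```xml
<!-- Hypothetical excerpt from a DSpace sitemap.xml: each entry is just a
     pointer to an item handle, with no descriptive item metadata -->
<url>
  <loc>http://repository.example.ac.za/handle/12345/678</loc>
  <lastmod>2011-09-01</lastmod>
</url>
```

Since only the URL and last-modified date are there, GoogleBot still has to
fetch each item page itself to index titles, authors, abstracts, and
bitstreams.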

For a more in-depth look, here's a copy of a presentation from Robert
Tansley (Google), "De-misting DSpace and Search Engines".
https://atmire.com/labs17/handle/123456789/11796

Lastly, if you're concerned about site load, you can go into Webmaster Tools
(Google) and tell GoogleBot to crawl your site less aggressively.
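As a middle ground between 'Disallow: /' and allowing everything, a
robots.txt along these lines lets bots reach item pages via the sitemap
while keeping them out of the crawl-heavy dynamic pages (the paths and
hostname here are illustrative; adjust them to your own DSpace URL layout):

```
# Sketch of a selective robots.txt -- example paths, not a definitive list
User-agent: *
Disallow: /browse
Disallow: /search

Sitemap: http://repository.example.ac.za/sitemap
```

Note that GoogleBot ignores any Crawl-delay directive in robots.txt; for
Google specifically, the crawl rate is set in Webmaster Tools as described
above.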

Peter Dietz



On Fri, Sep 9, 2011 at 8:46 AM, Sean Carte <sean.ca...@gmail.com> wrote:

> Two weeks ago I disabled google crawler completely by adding
> 'Disallow: /' to my robots.txt file. This has resulted in a huge
> decrease in the volume of traffic as shown by the attached graph.
>
> Previously I had my robots.txt file configured to Disallow everything
> else, including /browse as I do have sitemaps in use.
>
> What does googlebot do once it's received the sitemap? Does it then
> download everything?
>
> Have I got something very wrongly configured here, or do I just accept
> that googlebot is our site's most prolific viewer?
>
> Sean
> --
> Sean Carte
> esAL Library Systems Manager
> +27 72 898 8775
> +27 31 373 2490
> fax: 0866741254
> http://esal.dut.ac.za/
>
>
> ------------------------------------------------------------------------------
> Why Cloud-Based Security and Archiving Make Sense
> Osterman Research conducted this study that outlines how and why cloud
> computing security and archiving is rapidly being adopted across the IT
> space for its ease of implementation, lower cost, and increased
> reliability. Learn more. http://www.accelacomm.com/jaw/sfnl/114/51425301/
> _______________________________________________
> DSpace-tech mailing list
> DSpace-tech@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dspace-tech
>
>
