Google's crawler obeys /robots.txt; that's where I'd start.

On Mon, Jan 17, 2011 at 1:20 AM, William L. Thomson Jr.
<[email protected]> wrote:
> Really not sure what is up with Google, or what in the wiki the
> crawler finds so interesting. Not sure if I need to see about
> disabling history or other things that increase the size of the
> wiki, since at this time it's rather small. But Google will hit the
> wiki for hours and hours, and I think it might even span days. I
> haven't monitored it for that long so I'm not sure, but it's enough
> that I have noticed it.
>
> Normally I would not care, but the only load being placed on the
> wiki for the most part is coming from Google, resulting in a
> consistent CPU usage of ~20-40%. Not sure if I need to just block
> Google IPs or do something drastic like that.
>
> It's really annoying. I can stop the web server and all connections
> from Google drop. As soon as I start it back up, Google requests
> start coming in right away. It just makes no sense why Google should
> spend so much time crawling a site with such little content, short
> of the history and change logs.
>
> Unless others have a problem with it, I am about to block Google
> from crawling/indexing the wiki. Or if anyone else has any
> suggestions, or experience with Google causing unnecessary loads on
> wikis.
>
> --
> William L. Thomson Jr.
> Systems Administrator
> Jacksonville Linux Users Group
>
> ---------------------------------------------------------------------
> Archive http://marc.info/?l=jaxlug-list&r=1&w=2
> RSS Feed http://www.mail-archive.com/[email protected]/maillist.xml
> Unsubscribe [email protected]
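A quick way to sanity-check a candidate robots.txt before deploying it is Python's stdlib urllib.robotparser. The URL layout below (/wiki/ for page views, /index.php with query parameters for history and diff actions) is an assumption about a MediaWiki-style install and the example.org host is a placeholder; adjust the paths to match the actual wiki.

```python
# Sketch: validate a candidate robots.txt against the wiki's URL
# patterns before deploying it. Paths assume a MediaWiki-style layout
# (an assumption -- adjust to the actual install).
import urllib.robotparser

CANDIDATE = """\
User-agent: *
Disallow: /index.php
Disallow: /wiki/Special:
Allow: /wiki/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(CANDIDATE.splitlines())

# Plain article views stay crawlable...
print(rp.can_fetch("Googlebot", "http://example.org/wiki/Main_Page"))
# ...but history/diff requests and Special: pages do not, which is
# where most of the crawl load on a wiki tends to come from.
print(rp.can_fetch(
    "Googlebot",
    "http://example.org/index.php?title=Main_Page&action=history"))
print(rp.can_fetch(
    "Googlebot", "http://example.org/wiki/Special:RecentChanges"))
```

One caveat: Googlebot ignores the nonstandard Crawl-delay directive, so if the goal is to slow the crawl rather than block pages outright, the crawl rate has to be adjusted in Google's Webmaster Tools instead.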

