> Hi,
>
> I'm looking at my website logs (using the webalizer package) and most
> of the hits seem to be some sort of robot (Google tops the bill at 24%
> of the hits, 9.7 Gbytes downloaded this month so far... but then we
> see them as the chief external referring website as well).
>
> How can I easily filter out all the 'bots and get an estimate of how
> many *real* people are using the website? It used to be relatively
> easy when there were fewer search engines, but now there seem to be
> dozens.
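For a rough number you can tally the raw access log yourself and skip
anything whose user agent smells like a crawler. A quick sketch, assuming
Apache's combined log format (the marker substrings are guesses; crib the
real ones from webalizer's own user agent table):

    #!/usr/bin/env python
    # Rough head-count from an Apache combined-format access log:
    # tally requests whose user agent does NOT look like a crawler.
    import sys

    # Substrings commonly seen in crawler agent strings (assumed list).
    BOT_MARKERS = ("googlebot", "slurp", "msnbot", "spider", "crawl", "bot")

    human_hits, bot_hits, human_hosts = 0, 0, set()
    for line in open(sys.argv[1]):
        # Combined format ends with: "referrer" "user-agent"
        parts = line.rsplit('"', 2)
        if len(parts) < 3:
            continue                  # not a combined-format line; skip
        agent = parts[1].lower()
        if any(m in agent for m in BOT_MARKERS):
            bot_hits += 1
        else:
            human_hits += 1
            human_hosts.add(line.split(None, 1)[0])   # client host/IP

    print("bot hits:             %d" % bot_hits)
    print("human hits:           %d" % human_hits)
    print("distinct human hosts: %d" % len(human_hosts))

Distinct non-bot hosts is about as close to "real people" as a plain log
gets; proxies and NAT will still lump some visitors together.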
robots.txt is your friend here. It gives you a lot of control over the
bots that adhere to the standard. Here's a good link to peruse for info
on robots.txt (a minimal example is sketched at the end of this mail):

http://www.robotstxt.org/wc/robots.html

Of course, this only affects those bots that recognise robots.txt; some
are broken, and some deliberately ignore the file.

--
James Purser
Producer/Presenter - Linux Australia Update
http://k-sit.com - My Blog
http://la-pod.k-sit.com - Linux Australia Update Podcast, Blog and Forums
Skype: purserj1977
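And the promised sketch of a minimal robots.txt (the paths and the bot
name are placeholders; substitute your own):

    # Keep compliant crawlers out of areas that don't need indexing
    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /stats/

    # Or shut one particular crawler out of the whole site
    # ("SomeHungryBot" is a made-up token; use the agent's real name)
    User-agent: SomeHungryBot
    Disallow: /

The file has to sit at the top of the document root, e.g.
http://example.com/robots.txt, or the bots won't find it.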