>>Once I started reviewing rankings, I found that my development
site was ranking higher for the content than my client's main site!
Not good!

Exactly.
But robot control is not trivial.
First, there are genuine, friendly robots, like Google. Second,
there are bad bots: ones harvesting mail addresses, ones trying to
post spam on your sites, Chinese bots checking whether your site
should be banned because it talks about human rights, etc.

Good bots are easy to recognize: they include a web address in the
user agent, e.g.:
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
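
As a rough illustration of that check, here is a minimal Python sketch that pulls the contact URL out of a user agent string. The function name bot_url and the regular expression are my own assumptions, not part of robotCop, and a user agent can be spoofed, so this only tells you what the agent claims to be.

```python
import re

# Matches the "+http://..." style contact URL that many crawlers embed
# in their user agent. The pattern is an illustrative assumption.
BOT_URL_RE = re.compile(r"\+?(https?://[^\s;)]+)", re.IGNORECASE)


def bot_url(user_agent):
    """Return the URL embedded in the user agent, or None if there is none."""
    match = BOT_URL_RE.search(user_agent)
    return match.group(1) if match else None


googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
msie = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"

print(bot_url(googlebot))
print(bot_url(msie))
```

An agent that returns a URL here is declaring itself a robot; one that returns None is either a real browser or a bot pretending to be one.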

Bad bots are trickier to detect because they don't want to look like
robots, so they mimic standard browsers like MSIE, Mozilla, etc.

I've designed my own bad bot detector ("robotCop"), and it takes several
factors into account, like whether the agent:
- reads the robots.txt file,
- respects the instructions in the robots.txt file,
- falls into a click trap (a link not visible to a human visitor),
- average time spent between pages,
- reads images,
- reads JavaScript files,
- executes JavaScript,
- supports cookies,
- is listed in blacklists... etc.

Based on these factors, agents are assigned one of three access levels:
- full access (supposedly human browsers)
- text only (supposedly good robots)
- banned (supposedly bad or unwanted bots)
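
To make the idea concrete, here is a minimal Python sketch of how such factors could be combined into the three access levels. This is not robotCop's actual logic: the AgentSignals fields, the classify function, and the thresholds (at least 3 browser-like signals, at least 2 seconds between pages) are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class AgentSignals:
    """Observed behaviour of one agent (hypothetical field names)."""
    read_robots_txt: bool = False
    obeyed_robots_txt: bool = True
    hit_click_trap: bool = False
    avg_seconds_between_pages: float = 30.0
    loads_images: bool = False
    loads_js_files: bool = False
    executes_js: bool = False
    supports_cookies: bool = False
    blacklisted: bool = False


def classify(s):
    """Return "full", "text_only" or "banned" for the given signals."""
    # Hard failures: any one of these is enough to ban the agent.
    if s.blacklisted or s.hit_click_trap or not s.obeyed_robots_txt:
        return "banned"
    # An agent that fetches robots.txt and obeys it behaves like a good robot.
    if s.read_robots_txt:
        return "text_only"
    # Otherwise count browser-like behaviour (assumed threshold: 3 of 4).
    browser_score = sum([s.loads_images, s.loads_js_files,
                         s.executes_js, s.supports_cookies])
    if browser_score >= 3 and s.avg_seconds_between_pages >= 2.0:
        return "full"
    # Claims to be a browser but does not behave like one.
    return "banned"


human = AgentSignals(loads_images=True, loads_js_files=True,
                     executes_js=True, supports_cookies=True)
print(classify(human))
print(classify(AgentSignals(read_robots_txt=True)))
print(classify(AgentSignals(hit_click_trap=True)))
```

In practice the signals would be accumulated per session or per IP over several requests before a decision is made, since most of them cannot be known on the first hit.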
 
