>>Does that make sense?

Yeah, and thanks for taking your time on the beach ;-)
I think you have a good scheme.

My concern is a bit different, since I have no guest book, no blog, and no 
comments on my sites.
I'm more concerned about bots reading thousands of pages looking for 
email addresses to harvest,
Chinese bots checking whether we are talking about human rights and whether 
our sites should be banned from their continent,
bots checking for use of copyrighted images and sending illegal bills to 
my customers... you name it.
So I have no text and no form fields to analyze, just the way the agent 
behaves on the site.

So I developed "RobotCop", which evaluates agents based on their 
activity.
A first evaluation determines whether the agent "looks like" a 
browser or a robot, based on its activity.
A second step then evaluates whether that activity is suspicious, for a 
browser or for a robot.
Depending on the results, the agent is recorded and marked as "B" 
(browser), "R" (robot) or "X" (banned).
A B agent can see the site in its entirety.
An R agent can only see text; images are not displayed.
An X agent just receives a 404 error header.
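
To make the three marks concrete, here is a minimal sketch in Python (not the actual ColdFusion code of RobotCop). The names `Mark` and `response_for` and the layout of the returned dictionary are hypothetical, chosen only for illustration.

```python
from enum import Enum


class Mark(Enum):
    BROWSER = "B"
    ROBOT = "R"
    BANNED = "X"


def response_for(mark: Mark) -> dict:
    """Return a hypothetical serving policy for an agent with this mark."""
    if mark is Mark.BANNED:
        # Banned agents only get a 404 status and no content.
        return {"status": 404, "send_text": False, "send_images": False}
    if mark is Mark.ROBOT:
        # Robots get the text of the page, but images are withheld.
        return {"status": 200, "send_text": True, "send_images": False}
    # Browsers see the whole site.
    return {"status": 200, "send_text": True, "send_images": True}
```

A request handler would look up the recorded mark for the agent and apply the returned policy before rendering the page.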

For this, I have several tests, traps, etc., such as:
- ClickTrap: a link to a page, hidden in a display:none div so that a 
human cannot click it;
- reads robots.txt or not;
- respects robots.txt directives;
- submits a fake form that has no submit button;
- average time between requests;
- reads JavaScript;
- executes JavaScript;
- reads style sheets;
- reads images;
- supports cookies;
- robot pretending to be a browser;
- etc.
(I'm about to add "executes style sheets": I know browsers that 
deactivate JavaScript and cookies and load no images, but not applying 
styles must be rare.)
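
As a rough illustration of how a few of these signals could feed the first "browser or robot" evaluation, here is a sketch in Python. Everything in it is an assumption: the `Activity` fields, the `looks_like_robot` rules, and the one-second `min_gap` threshold are made up for the example and are not RobotCop's real logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Activity:
    """Hypothetical record of what one agent did on the site."""
    page_request_times: List[float]   # timestamps of page requests, in seconds
    hit_click_trap: bool = False      # followed the link hidden with display:none
    read_robots_txt: bool = False
    loaded_stylesheets: bool = False
    loaded_images: bool = False
    executed_javascript: bool = False
    accepts_cookies: bool = False


def average_gap(times: List[float]) -> Optional[float]:
    """Average time between consecutive requests, or None if fewer than two."""
    if len(times) < 2:
        return None
    ordered = sorted(times)
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)


def looks_like_robot(a: Activity, min_gap: float = 1.0) -> bool:
    """First-step guess: True if the activity looks like a robot's."""
    if a.hit_click_trap or a.read_robots_txt:
        # A human cannot click the hidden link, and browsers do not
        # normally request robots.txt on their own.
        return True
    gap = average_gap(a.page_request_times)
    if gap is not None and gap < min_gap:
        # Pages requested faster than the assumed threshold.
        return True
    # An agent that loads no resources at all and refuses cookies
    # does not behave like a typical browser.
    return not (a.loaded_stylesheets or a.loaded_images
                or a.executed_javascript or a.accepts_cookies)
```

The second step (deciding whether the activity is suspicious) would need its own rules and is not sketched here.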
I also have a white list for well-known good bots and a black list for 
bad bots,
based on an IP range or a regExp matching the user agent.
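
A list lookup of this kind could be sketched as follows, again in Python. The entries are invented for the example (the IP range is a documentation-only range and the bot names are made up), and `ListEntry` and `lookup` are hypothetical names.

```python
import ipaddress
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ListEntry:
    verdict: str                      # "good" (white list) or "bad" (black list)
    network: Optional[str] = None     # IP range in CIDR notation
    ua_pattern: Optional[str] = None  # regular expression for the user agent


# Made-up entries; white-list entries come first so they win on a tie.
ENTRIES = [
    ListEntry("good", ua_pattern=r"ExampleGoodBot"),
    ListEntry("bad", network="203.0.113.0/24"),
    ListEntry("bad", ua_pattern=r"(?i)email\s*harvester"),
]


def lookup(ip: str, user_agent: str) -> Optional[str]:
    """Return "good", "bad", or None when the agent is on neither list."""
    addr = ipaddress.ip_address(ip)
    for entry in ENTRIES:
        if entry.network and addr in ipaddress.ip_network(entry.network):
            return entry.verdict
        if entry.ua_pattern and re.search(entry.ua_pattern, user_agent):
            return entry.verdict
    return None
```

Agents for which `lookup` returns None would fall through to the behavior-based evaluation. Note that a user agent string is easy to forge, so a white-list match on the user agent alone is a weak guarantee.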

Thanks for sharing your approach.

-- 
_______________________________________
REUSE CODE! Use custom tags;
See http://www.contentbox.com/claude/customtags/tagstore.cfm
(Please send any spam to this address: [EMAIL PROTECTED])
Thanks.


Archive: http://www.houseoffusion.com/groups/SQL/message.cfm/messageid:3024