Re: How to recognize robots
Real nice, Claude. I spent a fair amount of time playing with that htmldog script yesterday and I have it to the point where it's decent-looking in Firefox, but not quite perfect. Had to play with it to get

If you can make that thing into a commercial product, come up with developer pricing so I can package it into my own CMS :-)

--
--mattRobertson--
Janitor, MSB Web Systems
mysecretbase.com

~|
Find out how CFTicket can increase your company's customer support efficiency by 100%
http://www.houseoffusion.com/banners/view.cfm?bannerid=49

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:222875
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
Subscription: http://www.houseoffusion.com/lists.cfm/link=s:4
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Donations Support: http://www.houseoffusion.com/tiny.cfm/54
RE: How to recognize robots
Sorry, couldn't resist...

-----Original Message-----
From: Matt Robertson [mailto:[EMAIL PROTECTED]
Sent: 31 October 2005 22:28
To: CF-Talk
Subject: Re: How to recognize robots

On 10/31/05, Snake [EMAIL PROTECTED] wrote:
> This might help http://www.the-robotman.com/

Help to tell me how I can get a life-size Robby the Robot? (which by the way I wouldn't mind having but it's SOT :-) )

Dave, thanks for the css link. I might pick that book up. I have some very basic drop-down menu needs and this might let me just build my own.
Re: How to recognize robots
> One bot loaded this page 15 times before it left!

I don't want to rain on your parade, but I'd be very surprised if an email harvesting bot would be intelligent enough to parse that JavaScript and suck up the generated output. Remember that the JavaScript has to be processed client side.

I might be wrong though. I know people do sometimes use JS document.write to protect email addresses from harvesters, and it could be that some of the harvesters are keeping up in the arms race.

Ian
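The document.write protection mentioned above can be sketched roughly as follows. This is only an illustration: the function names and the character-code encoding are assumptions, not anything posted in this thread, and a harvester that actually executes JavaScript would still see the address.

```javascript
// Hypothetical helper: store the address as character codes so it never
// appears as plain text in the page source.
function obfuscateEmail(address) {
  return Array.from(address, (ch) => ch.charCodeAt(0));
}

// Reverse step, as the browser would perform it.
function revealEmail(codes) {
  return String.fromCharCode(...codes);
}

// Build the inline script a server-side template could emit in place of a
// plain mailto link. Only a client that runs the script sees the address.
function mailtoScript(address) {
  const codes = JSON.stringify(obfuscateEmail(address));
  return (
    "<script>" +
    "var a = String.fromCharCode.apply(null, " + codes + ");" +
    "document.write('<a href=\"mailto:' + a + '\">' + a + '</a>');" +
    "</" + "script>"
  );
}
```

A page would emit the result of mailtoScript("someone@example.com") where the link should appear; users with JavaScript disabled get no link at all, which is the usual trade-off of this approach.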
Re: How to recognize robots
> I think his point was that good bots such as google will obey his do not crawl command.

Ok, but my point is WHY give a do not crawl command to good bots such as Google? Don't you want your site to be indexed?

> He is trying to annoy the scumbags who crawl websites to steal email address so they can spam people,

I have already implemented some protection against this: the address coded in the mailto: is encrypted, and an onClick function decrypts it when a human clicks on it.

> these jerks ignore the robots file and love to follow do not follow links.

I also use the revisit-after meta tag to reduce the number of times pages are visited when I know they won't be modified often (from 1 to 60 days).

What I intend to do also, especially for those which do not obey the meta tag (I have already started), is:
- update a table with all the different user_agents encountered,
- set a keepOut flag by hand for the ones I do not want,
- CFABORT all pages in Application.cfm for all the undesirable ones.

--
REUSE CODE! Use custom tags;
See http://www.contentbox.com/claude/customtags/tagstore.cfm
(Please send any spam to this address: [EMAIL PROTECTED])
Thanks.
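The three steps listed above (record every user agent, flag the unwanted ones by hand, abort their requests) could look something like this as standalone logic. It is a sketch under assumptions: the AgentRegistry class is hypothetical, an in-memory Map stands in for the database table, and the real version would live in Application.cfm with CFABORT as the final step.

```javascript
// Hypothetical in-memory stand-in for the user_agents table.
class AgentRegistry {
  constructor() {
    this.agents = new Map(); // user agent string -> { hits, keepOut }
  }

  // Step 1: record every agent seen; new agents start with keepOut = false.
  record(userAgent) {
    const key = userAgent || "(empty)";
    const row = this.agents.get(key) || { hits: 0, keepOut: false };
    row.hits += 1;
    this.agents.set(key, row);
    return row;
  }

  // Step 2: set by hand after reviewing the table.
  flag(userAgent, keepOut = true) {
    const row = this.agents.get(userAgent);
    if (row) row.keepOut = keepOut;
  }

  // Step 3: true when the request should be aborted.
  shouldAbort(userAgent) {
    return this.record(userAgent).keepOut;
  }
}
```

Since the user agent header is supplied by the client, this only stops robots that keep sending the same string; one that imitates a browser passes straight through.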
Re: How to recognize robots
> The only bots that ever actually hit it are spam harvesters.

Ok, I see.
Re: How to recognize robots
> Ok, but my point is WHY give a do not crawl command to good bots such as Google? Don't you want your site to be indexed?

It's only this one page I don't want indexed. The site this is on has historically been in the top 5 or so on Google and Yahoo for the last two or three years.

> I already have implemented some protection about this: the address coded in the mailto: is encrypted,

Yes, I have measures set up to protect real email addresses on most of my sites.

From a Wpoison description:

    It is important to note that when Wpoison is generating its randomized bogus e-mail addresses (and also its randomized pseudo-hyper-links) it uses an algorithm which makes the total number of different bogus e-mail addresses and pseudo-hyper-links essentially unlimited. In effect, Wpoison is capable of generating an infinite number of different bogus E-mail addresses!

    So the basic idea behind Wpoison is to trap unwary and badly engineered address harvesting web crawlers, and to fool them into adding enormous quantities of completely bogus e-mail addresses to the E-mail address data bases of the spammers, thus polluting those data bases so badly that they become essentially useless, thereby putting the spammers who are using them out of business, or at least shutting them down for a time and causing them some major headaches while they try to clean up the messes in their now-heavily-polluted e-mail address data bases.

Here's what it looks like if you load it in a browser (warning, it will take a few seconds to load):
http://www.columbiacityjazz.com/myEmailList.cfm

But the main point here is, I hate spammers and I don't mind devoting a little time to give them as much trouble as I can. Besides, if I'm not allowed to have at least a little bit of evil fun, I'd go crazy.

---
Les Mizzell
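For readers who want a feel for the idea, here is a minimal sketch of a page full of randomized bogus addresses. It is not Wpoison's actual algorithm, which is not shown in this thread; the function names are made up, and the use of the reserved .invalid top-level domain is an assumption chosen so that no real mailbox can ever be hit.

```javascript
const LETTERS = "abcdefghijklmnopqrstuvwxyz";

// Random lowercase word with a length between minLen and maxLen inclusive.
// rng is any function returning a number in [0, 1), such as Math.random.
function randomWord(rng, minLen, maxLen) {
  const len = minLen + Math.floor(rng() * (maxLen - minLen + 1));
  let word = "";
  for (let i = 0; i < len; i++) {
    word += LETTERS[Math.floor(rng() * LETTERS.length)];
  }
  return word;
}

// One bogus address; .invalid is reserved and never resolves.
function bogusAddress(rng = Math.random) {
  return randomWord(rng, 4, 10) + "@" + randomWord(rng, 5, 12) + ".invalid";
}

// A block of mailto links, different on every request.
function bogusPage(count, rng = Math.random) {
  const links = [];
  for (let i = 0; i < count; i++) {
    const address = bogusAddress(rng);
    links.push('<a href="mailto:' + address + '">' + address + "</a>");
  }
  return links.join("<br>\n");
}
```

A harvester that filters on reserved domains would discard these, so a real poisoning page presumably uses more plausible-looking names; that trade-off is left open here.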
RE: How to recognize robots
> I don't want to rain on your parade, but I'd be very surprised if an email harvesting bot would be intelligent enough to parse that javascript and suck up the generated output. Remember that the Javascript has to be processed client side. I might be wrong though. I know people do sometimes use JS document.write to protect email addresses from harvesters, and it could be that some of them are keeping up in the arms race.

This is pretty much the case not just for email harvesters, but for search engines in general. Very few can evaluate JavaScript for indexing purposes. The only product of which I'm aware that does this is Thunderstone's Texis.

Dave Watts, CTO, Fig Leaf Software
http://www.figleaf.com/
RE: How to recognize robots
The email addresses don't look to be very valid. They are missing the place. But, I like the concept. ;^)

M!ke

-----Original Message-----
From: Les Mizzell [mailto:[EMAIL PROTECTED]
Sent: Monday, October 31, 2005 9:27 AM
To: CF-Talk
Subject: Re: How to recognize robots

> Ok, but my point is WHY give a do not crawl command to good bots such as Google? Don't you want your site to be indexed?

It's only this one page I don't want indexed. The site this is on has historically been in the top 5 or so on Google and Yahoo for the last two or three years.

> I already have implemented some protection about this: the address coded in the mailto: is encrypted,

Yes, I have measures set up to protect real email addresses on most of my sites.

From a Wpoison description:

    It is important to note that when Wpoison is generating its randomized bogus e-mail addresses (and also its randomized pseudo-hyper-links) it uses an algorithm which makes the total number of different bogus e-mail addresses and pseudo-hyper-links essentially unlimited. In effect, Wpoison is capable of generating an infinite number of different bogus E-mail addresses!

    So the basic idea behind Wpoison is to trap unwary and badly engineered address harvesting web crawlers, and to fool them into adding enormous quantities of completely bogus e-mail addresses to the E-mail address data bases of the spammers, thus polluting those data bases so badly that they become essentially useless, thereby putting the spammers who are using them out of business, or at least shutting them down for a time and causing them some major headaches while they try to clean up the messes in their now-heavily-polluted e-mail address data bases.

Here's what it looks like if you load it in a browser (warning, it will take a few seconds to load):
http://www.columbiacityjazz.com/myEmailList.cfm

But the main point here is, I hate spammers and I don't mind devoting a little time to give them as much trouble as I can. Besides, if I'm not allowed to have at least a little bit of evil fun, I'd go crazy.
Re: How to recognize robots
> So the basic idea behind Wpoison is to trap unwary and badly engineered address harvesting web crawlers, and to fool them into adding enormous quantities of completely bogus e-mail addresses to the E-mail address data bases of the spammers...

Ok, but if I were an address harvester, I think I would set some limit per site, no?
Re: How to recognize robots
> This is pretty much the case not just for email harvesters, but for search engines in general. Very few can evaluate JavaScript for indexing purposes.

Yes, and this is why the best dynamic menu systems nowadays use pure UL/LI lists for the menu items and links, and JS only for the layout, so that search engines do not hit a wall.
Re: How to recognize robots
So a rollover script using JS would be a wall for the search engines?

I've talked to a few people that manage PPC accounts as their sole business (biased) and they keep telling me SEO is dying in Google for generic terms, so there isn't a point in using SEO. Any truth to this?

I currently use http://inventory.overture.com/d/searchinventory/suggestion/ to test out keyword demand in the PPC engines and then apply the result to SEO on the site, targeting the words that have demand. But I can't get the top listings in Google unless I use much more targeted keywords/phrases in titles, links, etc. Any thoughts?

Matt

----- Original Message -----
From: Claude Schneegans [EMAIL PROTECTED]
To: CF-Talk cf-talk@houseoffusion.com
Sent: Monday, October 31, 2005 10:47 AM
Subject: SPAM-LOW: Re: How to recognize robots

> This is pretty much the case not just for email harvesters, but for search engines in general. Very few can evaluate JavaScript for indexing purposes.

Yes, and this is why the best dynamic menu systems nowadays use pure UL/LI lists for the menu items and links, and JS only for the layout, so that search engines do not hit a wall.
Re: How to recognize robots
> So a rollover script using JS would be a wall for the search engines?

Depends if it relies on JS to create and to open the links. Personally, I use a menu with only UL, LI and A href=... elements. Only the layout and the dynamics are handled by CSS styles and JS functions, so any robot can get through. The fun of this technique is also that you can just change the CSS file, and you get a standard list for the site map page.

> I've talked to few people that manage PPC accounts as their sole business (biased) and they keep telling me SEO is dying in Google for generic terms so there isn't a point in using SEO. Any truth to this?

Google recommends that you use no SEO; it will never help.
Re: How to recognize robots
On 10/31/05, Claude Schneegans [EMAIL PROTECTED] wrote:
> Yes, and this is why the best dynamic menu systems nowadays use pure UL/LI lists for the menu items and links, and JS only for the layout, so that search engines do not hit a wall.

Do you have a particular favorite in mind for this sort of menu? A drop-down that uses css, I mean. Opencube does this but the links are all javascript, such as

    <li><a href="javascript:launch_samples(6,'comet')">Infinite Menus</a></li>

--
--mattRobertson--
Janitor, MSB Web Systems
mysecretbase.com
RE: How to recognize robots
Try this, which is really a resource site for a book of the same name. And look for the link to CSS-Driven Drop-Down Menus:
http://more.ericmeyeroncss.com/

-----Original Message-----
From: Matt Robertson [mailto:[EMAIL PROTECTED]
Sent: Monday, October 31, 2005 4:18 PM
To: CF-Talk
Subject: Re: How to recognize robots

On 10/31/05, Claude Schneegans [EMAIL PROTECTED] wrote:
> Yes, and this is why the best dynamic menu systems nowadays use pure UL/LI lists for the menu items and links, and JS only for the layout, so that search engines do not hit a wall.

Do you have a particular favorite in mind for this sort of menu? A drop-down that uses css, I mean. Opencube does this but the links are all javascript, such as

    <li><a href="javascript:launch_samples(6,'comet')">Infinite Menus</a></li>
RE: How to recognize robots
This might help http://www.the-robotman.com/

Russ

-----Original Message-----
From: Dave Francis [mailto:[EMAIL PROTECTED]
Sent: 31 October 2005 21:59
To: CF-Talk
Subject: RE: How to recognize robots

Try this, which is really a resource site for a book of the same name. And look for the link to CSS-Driven Drop-Down Menus:
http://more.ericmeyeroncss.com/

-----Original Message-----
From: Matt Robertson [mailto:[EMAIL PROTECTED]
Sent: Monday, October 31, 2005 4:18 PM
To: CF-Talk
Subject: Re: How to recognize robots

On 10/31/05, Claude Schneegans [EMAIL PROTECTED] wrote:
> Yes, and this is why the best dynamic menu systems nowadays use pure UL/LI lists for the menu items and links, and JS only for the layout, so that search engines do not hit a wall.

Do you have a particular favorite in mind for this sort of menu? A drop-down that uses css, I mean. Opencube does this but the links are all javascript, such as

    <li><a href="javascript:launch_samples(6,'comet')">Infinite Menus</a></li>
Re: How to recognize robots
On 10/31/05, Snake [EMAIL PROTECTED] wrote:
> This might help http://www.the-robotman.com/

Help to tell me how I can get a life-size Robby the Robot? (which by the way I wouldn't mind having but it's SOT :-) )

Dave, thanks for the css link. I might pick that book up. I have some very basic drop-down menu needs and this might let me just build my own.

--
--mattRobertson--
Janitor, MSB Web Systems
mysecretbase.com
RE: How to recognize robots
http://www.alistapart.com/articles/dropdowns/

Also, google "accessible css dropdowns"; there are a ton. Everything in that book is out there.

-----Original Message-----
From: Matt Robertson [mailto:[EMAIL PROTECTED]
Sent: Monday, October 31, 2005 5:28 PM
To: CF-Talk
Subject: Re: How to recognize robots

On 10/31/05, Snake [EMAIL PROTECTED] wrote:
> This might help http://www.the-robotman.com/

Help to tell me how I can get a life-size Robby the Robot? (which by the way I wouldn't mind having but it's SOT :-) )

Dave, thanks for the css link. I might pick that book up. I have some very basic drop-down menu needs and this might let me just build my own.
Re: How to recognize robots
On 11/1/05, Brian Peddle [EMAIL PROTECTED] wrote:
> http://www.alistapart.com/articles/dropdowns/

An updated, even lighter-weight version of those menus is outlined here:
http://www.htmldog.com/articles/suckerfish/dropdowns/

They are the only dropdowns we use now.

--
Kay Smoljak
http://kay.zombiecoder.com/
Re: How to recognize robots
they are made of metal and are full of wires and gyros and stuff

(sorry...been killing me all day not to send that...carry on) ;-)

Bryan Stevenson B.Comm.
VP Director of E-Commerce Development
Electric Edge Systems Group Inc.
phone: 250.480.0642
fax: 250.480.1264
cell: 250.920.8830
e-mail: [EMAIL PROTECTED]
web: www.electricedgesystems.com
Re: How to recognize robots
> Do you have a particular favorite in mind for this sort of menu?

Sure: my own! ;-)
I'm using a first version here: www.fafo.on.ca

I may release a commercial version of it. The only problem now is that I need to make some tools to edit the CSS file and the menu items in the database, and also make it less dependent upon my CMS.

The whole menu is built with this simple code:

    <CFQUERY NAME="getMenu" DATASOURCE="#application.applicationName#">
    SELECT * FROM menus WHERE name = 'public'
    </CFQUERY>
    <CFOUTPUT><DIV class="#getMenu.class#"></CFOUTPUT>
    <CFMODULE TEMPLATE="CSI_menuBuild.cfm" PANEL="#getMenu.subPanel#">
    </DIV>

This gets the menu named public (there are others in the admin section); the CSI_menuBuild.cfm template builds the whole list recursively. The DIV is a container for the CSS classes and the JS object hierarchy.

> Opencube does this but the links are all javascript, such as...

Yeah, this is why I made my own. I also found some others which claim they are 100% CSS and no JavaScript. Fine, but they just do not work or are ugly ;-) I hate all 100% extremists, whatever it is they claim to be 100% of ;-) JavaScript is not evil; it's just in the links that it causes a problem.

See the source: completely transparent to robots. It is tested for MSIE and Mozilla. I don't know about others, but I haven't gotten any complaints yet.
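The source of CSI_menuBuild.cfm is not shown in this thread, so the following is only a guess at the general shape of a recursive list builder, written as a standalone function. The item fields (label, href, children) and the function names are assumptions made for the illustration. The point it demonstrates is that every link is an ordinary href inside nested UL/LI elements, so a robot can follow it without running any script.

```javascript
// Escape text for safe use in HTML content and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Recursively turn a tree of { label, href, children } items into nested lists.
// The same code handles any number of levels.
function buildMenu(items, depth = 0) {
  if (!items || items.length === 0) return "";
  const pad = "  ".repeat(depth);
  let html = pad + "<ul>\n";
  for (const item of items) {
    html += pad + '  <li><a href="' + escapeHtml(item.href) + '">' +
      escapeHtml(item.label) + "</a>";
    const sub = buildMenu(item.children, depth + 2);
    html += sub ? "\n" + sub + pad + "  </li>\n" : "</li>\n";
  }
  return html + pad + "</ul>\n";
}
```

Drop-down behaviour would then be attached by CSS classes and script on the container, leaving the list itself usable as a plain site map.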
Re: How to recognize robots
> An updated, even lighter-weight version of those menus is outlined here:
> http://www.htmldog.com/articles/suckerfish/dropdowns/

I remember I had a look at this one. What I didn't like is that you need to define CSS stuff for each level in the menu, which can get quite complicated. I solved the problem and can have an infinite number of levels with the same CSS file.
Re: How to recognize robots
The good ones identify themselves. The bad ones can be tracked by logging every page request while setting a cfid/cftoken. If a series of page requests from the same agent/IP shows a different cfid/cftoken per request, then either it's a user that does not support cookies (even session ones) or it's a bot.

There are other ways as well, including hidden pages that real people would never visit and the like, but the one above is the method I use and it works rather well.

> Has anyone a trick to detect and identify robots and search engines visiting CF pages?
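The cfid/cftoken method described above boils down to grouping the request log by agent and IP and checking whether the session identifier ever repeats. A rough sketch of that check follows; the record shape (ip, userAgent, sessionId) and the minRequests threshold are assumptions, with sessionId standing in for the cfid/cftoken pair.

```javascript
// Return the "ip|userAgent" keys whose requests never reuse a session id.
function findLikelyBots(requests, minRequests = 3) {
  const clients = new Map();
  for (const { ip, userAgent, sessionId } of requests) {
    const key = ip + "|" + userAgent;
    const entry = clients.get(key) || { count: 0, sessions: new Set() };
    entry.count += 1;
    entry.sessions.add(sessionId);
    clients.set(key, entry);
  }
  const suspects = [];
  for (const [key, entry] of clients) {
    // A fresh id on every request means the cookie never came back.
    if (entry.count >= minRequests && entry.sessions.size === entry.count) {
      suspects.push(key);
    }
  }
  return suspects;
}
```

As noted above, a human with cookies turned off looks the same to this test, and several people behind one proxy with the same browser would be merged into a single key, so the result is a list of suspects rather than a verdict.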
Re: How to recognize robots
> The good ones identify themselves.

Ok, and how do they do that?

> If a series of page requests from the same agent/ip is showing a different cfid/cftoken per request, then either its a user that does not support cookies (even session ones) or its a bot.

Well, actually not really easy to implement ;-/

The reason I ask is that I just implemented a statistics facility on some of my customers' sites. This will count all hits, including those from robots, so I'd like to be able to discriminate visitors' hits from others.

Thanks.
RE: How to recognize robots
There are user agent lists out there, http://www.psychedelix.com/agents1.html for example. You could run a report later and check the logged hits against a table of user agents.

-----Original Message-----
From: Claude Schneegans [mailto:[EMAIL PROTECTED]
Sent: Sunday, October 30, 2005 5:15 PM
To: CF-Talk
Subject: Re: How to recognize robots

> The good ones identify themselves.

Ok, and how do they do that?

> If a series of page requests from the same agent/ip is showing a different cfid/cftoken per request, then either its a user that does not support cookies (even session ones) or its a bot.

Well, actually not really easy to implement ;-/

The reason I ask is that I just implemented a statistics facility on some of my customers' sites. This will count all hits, including those from robots, so I'd like to be able to discriminate visitors' hits from others.

Thanks.
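Checking hits against a table of user agents, as suggested above, might look like this for a statistics report. The short signature list is only illustrative and not taken from the linked page; a real table would be much longer, and the function names are assumptions.

```javascript
// Illustrative substrings only; a maintained list would replace this.
const ROBOT_SIGNATURES = ["googlebot", "slurp", "msnbot", "crawler", "spider"];

// Case-insensitive substring match against the signature table.
function isRobot(userAgent, signatures = ROBOT_SIGNATURES) {
  const ua = (userAgent || "").toLowerCase();
  return signatures.some((sig) => ua.includes(sig));
}

// Split a list of logged user agent strings into visitor and robot totals.
function splitHits(userAgents) {
  const totals = { visitors: 0, robots: 0 };
  for (const ua of userAgents) {
    if (isRobot(ua)) totals.robots += 1;
    else totals.visitors += 1;
  }
  return totals;
}
```

This only catches robots that identify themselves; anything sending an empty or browser-like string is counted as a visitor here, which is why the cookie-based check earlier in the thread is a useful complement.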
RE: How to recognize robots
> The reason I ask is that I just implemented a statistics facility on some of my customers' sites. It counts all hits, including those from robots, so I'd like to be able to distinguish visitors' hits from the others.

One way is to use an image in your page that is actually a cfm file. Use the cfm file to track stats (pass in query string info to specify which page is being viewed). You can use JS to load the image, which will block bots but will also miss the fraction of users with JS disabled.

I'm not sure what happens if you embed the image directly. Presumably Google will get counted because of its image search functionality, but perhaps others won't?

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:222675
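The JS-loaded image could look something like the sketch below. The tracker path `/stats/track.cfm` and the query parameter names `page` and `ref` are made up for the example; the cfm page behind the image would read them from the query string and record the hit.

```javascript
// Build the URL of the tracking "image" (really a .cfm page).
// The parameter names are assumptions for this sketch.
function buildTrackingUrl(trackerUrl, page, referrer) {
  return trackerUrl +
    "?page=" + encodeURIComponent(page) +
    "&ref=" + encodeURIComponent(referrer || "");
}

// In a browser, request the image so the .cfm page can log the hit.
// Clients that do not run JavaScript never make this request.
if (typeof document !== "undefined") {
  var beacon = new Image();
  beacon.src = buildTrackingUrl("/stats/track.cfm", location.pathname, document.referrer);
}
```

The trade-off is the one described above: anything that does not execute JavaScript goes uncounted, whether it is a robot or a person.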
Re: How to recognize robots
> One way is to use an image in your page that is actually a cfm file.

Hmm, this will also be blocked by some email readers, including my own, and I have some pages that are sent by email (newsletter).

--
REUSE CODE! Use custom tags; see http://www.contentbox.com/claude/customtags/tagstore.cfm
(Please send any spam to this address: [EMAIL PROTECTED]) Thanks.

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:222677
Re: How to recognize robots
>> The good ones identify themselves.
> OK, and how do they do that?

The HTTP_User_Agent. Google and Yahoo will show up with their name in the agent. Others will as well. On the other hand, when you see something that looks like IE but doesn't act like a human would, then it's time to think it may be a bot.

>> If a series of page requests from the same agent/IP is showing a different cfid/cftoken per request, then either it's a user that does not support cookies (even session ones) or it's a bot.
> Well, actually that's not really easy to implement ;-/

Are you running CF 7 Enterprise or Pro? If the former, I've got some logging code for you that will deal with it nicely. If the latter, I still have some code for you, but it's not as 'clean and tight'.

> The reason I ask is that I just implemented a statistics facility on some of my customers' sites. It counts all hits, including those from robots, so I'd like to be able to distinguish visitors' hits from the others.

I've been meaning to write an interface to my logs to allow people to download a list of bot agents/IPs. Now that the hardware for HoF has been moved over, I can try to get onto that. (Yeah, right.) :)

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:222678
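The cfid/cftoken heuristic can be run as a report over logged hits. The sketch below is one possible reading of it, not the logging code offered above: the record shape (`ip`, `userAgent`, `cfid`, `cftoken`) and the `minRequests` threshold are assumptions.

```javascript
// Return the "ip|agent" keys of clients that never reuse a session:
// every request from them carried a different cfid/cftoken pair.
// minRequests avoids flagging a client on a single hit.
function findLikelyBots(hits, minRequests) {
  var clients = {};
  hits.forEach(function (hit) {
    var key = hit.ip + "|" + hit.userAgent;
    if (!clients[key]) {
      clients[key] = { requests: 0, sessions: {} };
    }
    clients[key].requests += 1;
    clients[key].sessions[hit.cfid + ":" + hit.cftoken] = true;
  });
  return Object.keys(clients).filter(function (key) {
    var client = clients[key];
    var distinctSessions = Object.keys(client.sessions).length;
    return client.requests >= minRequests && distinctSessions === client.requests;
  });
}
```

As noted above, a real user with cookies turned off looks the same as a bot here, so the result is a list of suspects rather than a verdict.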
Re: How to recognize robots
> The HTTP_User_Agent.

Yeah, I should have guessed! ;-) I think I'll just add the user agent to my stats and work something out around it.

> Are you running CF 7 Enterprise or Pro?

This particular site is still on CF 5.

--
REUSE CODE! Use custom tags; see http://www.contentbox.com/claude/customtags/tagstore.cfm
(Please send any spam to this address: [EMAIL PROTECTED]) Thanks.

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:222679
Re: How to recognize robots
I have a page I've been using on a site just to annoy harvester bots. This page is listed in my robots file as a do-not-crawl page, which the harvester bots love to visit!

The page contains a JavaScript version of wpoison written by D.K.Merriman. To a bot, the page in question looks like a guest book with thousands of entries. It also generates random links that actually link back to the same page, so once a harvester bot hits the page, it will sit there forever slurping up bogus email addresses.

At the top of the same page, I've included the code below, which emails me so I can go and see where the little bastard came from:

<cfmail to="[EMAIL PROTECTED]" from="Some Spammer" subject="HARVESTER ALERT">
address = #cgi.remote_addr#<br>
host = #cgi.remote_host#<br>
referer = #cgi.http_referer#<br>
agent = #cgi.http_user_agent#<br>
page = #cgi.script_name#
</cfmail>

Damn good fun!

--
Les Mizzell

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:222692
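The trap works because well-behaved crawlers skip anything the robots file disallows, so a request for a disallowed page is itself a signal. A simplified JavaScript sketch of that check follows. It handles only plain `User-agent: *` groups with prefix `Disallow` rules, and the file contents and function names are assumptions for the example.

```javascript
// Collect the Disallow prefixes that apply to all crawlers.
// Simplified: only "User-agent: *" groups, no wildcards, no Allow rules.
function disallowedPrefixes(robotsTxt) {
  var prefixes = [];
  var appliesToAll = false;
  robotsTxt.split(/\r?\n/).forEach(function (line) {
    var clean = line.replace(/#.*$/, "").trim(); // drop comments
    var sep = clean.indexOf(":");
    if (sep === -1) {
      return;
    }
    var field = clean.slice(0, sep).trim().toLowerCase();
    var value = clean.slice(sep + 1).trim();
    if (field === "user-agent") {
      appliesToAll = (value === "*");
    } else if (field === "disallow" && appliesToAll && value !== "") {
      prefixes.push(value);
    }
  });
  return prefixes;
}

// True when the requested path is one that polite robots would have skipped.
function isTrapHit(path, robotsTxt) {
  return disallowedPrefixes(robotsTxt).some(function (prefix) {
    return path.indexOf(prefix) === 0;
  });
}
```

A person who follows a visible link to the page would also trip this check, so the trap link is best kept out of sight of ordinary visitors.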
Re: How to recognize robots
> I have a page I've been using on a site just to annoy harvester bots.

I think you have a strange idea of what the robots are doing. I certainly do not want to annoy Google's bot or the others ;-)

--
REUSE CODE! Use custom tags; see http://www.contentbox.com/claude/customtags/tagstore.cfm
(Please send any spam to this address: [EMAIL PROTECTED]) Thanks.

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:222694
Re: How to recognize robots
I think his point was that good bots such as Google will obey his do-not-crawl command. He is trying to annoy the scumbags who crawl websites to steal email addresses so they can spam people. These jerks ignore the robots file and love to follow do-not-follow links.

I for one applaud his efforts and would be very interested in seeing the code, if you don't mind, Les.

On 10/30/05, Claude Schneegans [EMAIL PROTECTED] wrote:
>> I have a page I've been using on a site just to annoy harvester bots.
>
> I think you have a strange idea of what the robots are doing. I certainly do not want to annoy Google's bot or the others ;-)

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:222695
Re: How to recognize robots
> I think you have a strange idea of what the robots are doing. I certainly do not want to annoy Google's bot or the others ;-)

No, the page I wrote about is specifically EXCLUDED for Googlebot and any other friendly bot. The only bots that ever actually hit it are spam harvesters.

--
Les Mizzell

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:222696
Re: How to recognize robots
> I for one applaud his efforts and would be very interested in seeing the code, if you don't mind, Les.

One bot loaded this page 15 times before it left! That's 75,000 email addresses it harvested.

So the spammer then uses this list to send spam, and if the network admin of the system being used to send the spam is an alert individual, the returns will quickly show them that something is going on, and they can find the source and plug it.

<p><strong>We'd very much like to thank the following people on our email list for subscribing!</strong></p>
<script type="text/javascript">
<!--
//
// a javascript version of wpoison - free to use as long as I get the credit :-)
//
// by D.K.Merriman
//
var r = Math.random();
// let's mix upper and lower case, and scatter some numbers around while we're at it
var Alphabet = "abc8efgChi1jklm0opqrs3uvwxyz2T57S9ZYXWVU3RQPONMLKJItHG6FED4BA";
// How many addresses to generate
var Addresses = 5000;
r = Math.random();
// length of the address 'name' (chosen once per page load)
var NLength = Math.round(r * 20);
r = Math.random();
// length of the address 'place' ([EMAIL PROTECTED])
var PLength = Math.round(r * 50);
// start with a blank name and place
var Name = "";
var Place = "";
// an array of traditional domains (modify to suit yourself)
var Domain = new Array(
  "com", "org", "mil", "edu", "net", "com", "org", "mil", "edu", "net", "uk", "su",
  "af", "al", "dz", "as", "ad", "ao", "ai", "aq", "ag", "ar", "am", "aw", "au",
  "at", "az", "bs", "bh", "bd", "bb", "by", "be", "bz", "bj", "bm", "bt", "bo",
  "ba", "bw", "bv", "br", "io", "bn", "bg", "bf", "bi", "kh", "cm", "ca", "cv",
  "ky", "cf", "td", "cl", "cn", "cx", "cc", "co", "km", "cg", "ck", "cr", "ci",
  "hr", "cu", "cy", "cz", "dk", "dj", "dm", "do", "tp", "ec", "eg", "sv", "gq",
  "er", "ee", "et", "fk", "fo", "fj", "fi", "fr", "fx", "gf", "pf", "tf", "ga",
  "gm", "ge", "de", "gh", "gi", "gr", "gl", "gd", "gp", "gu", "gt", "gn", "gw",
  "gy", "ht", "hm", "hn", "hk", "hu", "is", "in", "id", "ir", "iq", "ie", "il",
  "it", "jm", "jp", "jo", "kz", "ke", "ki", "kp", "kr", "kw", "kg", "la", "lv",
  "lb", "ls", "lr", "ly", "li", "lt", "lu", "mo", "mk", "mg", "mw", "my", "mv",
  "ml", "mt", "mh", "mq", "mr", "mu", "yt", "mx", "fm", "md", "mc", "mn", "ms",
  "ma", "mz", "mm", "na", "nr", "np", "nl", "an", "nc", "nz", "ni", "ne", "ng",
  "nu", "nf", "mp", "no", "om", "pk", "pw", "pa", "pg", "py", "pe", "ph", "pn",
  "pl", "pt", "pr", "qa", "re", "ro", "ru", "rw", "kn", "lc", "vc", "ws", "sm",
  "st", "sa", "sn", "sc", "sl", "sg", "sk", "si", "sb", "so", "za", "gs", "es",
  "lk", "sh", "pm", "sd", "sr", "sj", "sz", "se", "ch", "sy", "tw", "tj", "tz",
  "th", "tg", "tk", "to", "tt", "tn", "tr", "tm", "tc", "tv", "ug", "ua", "ae",
  "gb", "us", "um", "uy", "uz", "vu", "va", "ve", "vn", "vg", "vi", "wf", "eh",
  "ye", "yu", "zr", "zm", "zw");
// for each desired address...
for (var i = 1; i <= Addresses; i++) {
  // generate the random-length name
  for (var j = 1; j <= NLength; j++) {
    r = Math.random();
    // and fill it with garbage (floor keeps the index inside the alphabet)
    Name = Name + Alphabet.charAt(Math.floor(r * Alphabet.length));
  }
  // then generate a random-length place
  for (var k = 1; k <= PLength; k++) {
    r = Math.random();
    // full of garbage
    Place = Place + Alphabet.charAt(Math.floor(r * Alphabet.length));
  }
  r = Math.random();
  // then write the bogus address to the web page, picking from the whole domain list
  document.write("<p>" + Name + "@" + Place + "." + Domain[Math.floor(r * Domain.length)]);
  // and clear out the bogusness :-)
  Name = "";
  Place = "";
}
// -->
</script>

--
Les Mizzell

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:222697