> Are you suggesting a robot is checking that string against its UA??? I find
> that hard to believe, but assuming that is the case such a robot would be
> allowed unrestricted access to looksmart.com, including all their Pay Per
> Click (PPC) URLs. I'm thinking that many robots are reading looksmart.com,
> some with permission from robots.txt, and some without.
Yes, that would be the case. For some unknown reason, Looksmart's robots.txt
allows recognized robots/crawlers/spiders and other non-standard user-agents
unlimited access; all others are excluded. I'd guess the weird-looking "java"
user-agent originates from a Java application running on a platform/JVM
unable to set the user-agent property. The guys at Looksmart probably
detected it in their logfiles...

> I'm looking for reasons why advertisers can't reconcile clickthroughs with
> the figures provided by Looksmart.
>
> One suggestion is that if Looksmart aren't checking the User Agents of
> clients accessing the PPC URLs, that would be a reason why many more clicks
> were being seen by Looksmart than by advertisers. The robot either may not
> follow the redirect, or may follow without providing a referrer, or may be
> silently filtered by an advertiser's stats package because it is a robot not
> a human visitor.
>
> Another suggestion is that some robots will be masquerading as browsers, but
> still may not follow redirects or send a referrer allowing the clicks to be
> reconciled.

That would be true, unless some sort of server-side mechanism ensures that
these well-known (probably non-human) users are served different content
from "normal" users, i.e. pages without PPC URLs. You could check this theory
by creating a simple "robot" that sends one of the user-agents listed in the
robots.txt file, and comparing the server's output with the output given to
a normal browser (IE, Mozilla, Opera...).

> I'm trying to gather likelihood on the possibility of each scenario, so I'm
> looking for
>
> a) how many robots, given www.looksmart.com/robots.txt, would read those
> looksmart.com PPC URLs?
> b) how many of those robots would be recognisable as robots, i.e. use a
> unique User Agent?
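
To make the check suggested above a bit more concrete, here is a minimal Python sketch using only the standard library. It is only an illustration, not something I have run against looksmart.com. The function names, the sample robots.txt rules in the usage note, and the `ppc_marker` substring used to recognise a PPC link are all assumptions of mine; I don't know what Looksmart's PPC URLs actually look like, so you would have to supply a marker that fits.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen
from urllib.robotparser import RobotFileParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.add(value)


def extract_links(html):
    """Return the set of link targets found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def fetch(url, user_agent, timeout=10):
    """Fetch url while sending the given User-Agent header."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def allowed_by_robots(robots_txt, user_agent, url):
    """Check whether the given robots.txt text lets user_agent read url."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)


def compare_pages(robot_html, browser_html, ppc_marker):
    """Compare the PPC-looking links served to a robot and to a browser.

    ppc_marker is a hypothetical substring that identifies a PPC link;
    it is an assumption and must be adapted to the real URLs.
    """
    robot = {link for link in extract_links(robot_html) if ppc_marker in link}
    browser = {link for link in extract_links(browser_html) if ppc_marker in link}
    return {
        "robot_only": robot - browser,
        "browser_only": browser - robot,
        "both": robot & browser,
    }
```

To use it, call fetch() twice on the same page, once with a user-agent taken from the robots.txt file and once with a browser's user-agent string, and pass both results to compare_pages(). If "browser_only" comes back non-empty while "both" is empty, the robot was served pages without PPC links; if "both" is non-empty, recognisable robots can see (and follow) the same PPC URLs as human visitors. allowed_by_robots() answers the narrower question of whether a given user-agent is permitted at all under a given robots.txt.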
>
> Do we all agree that if a robot masquerades as a browser, ignores robots.txt
> and incurs clickthrough fees for advertisers, then the advertisers' beef (if
> any) should be with the robot owner rather than the PPC provider? But if a
> robot sends a recognisable UA, complies with robots.txt and advertisers
> still incur clickthrough fees, the advertisers' beef (if any) should be with
> the PPC provider?

eh...beef?

> Alan Perkins
> CTO, e-Brand Management Limited
> http://www.ebrandmanagement.com/

--------------------------------------------------------------
Rasmus T. Mohr            Direct  : +45 36 910 122
Application Developer     Mobile  : +45 28 731 827
Netpointers Intl. ApS     Phone   : +45 70 117 117
Vestergade 18 B           Fax     : +45 70 115 115
1456 Copenhagen K         Email   : mailto:[EMAIL PROTECTED]
Denmark                   Website : http://www.netpointers.com

"Remember that there are no bugs, only undocumented features."
--------------------------------------------------------------