I suggest it wouldn't be too much of a stretch for a spambot to do a bit of regex matching... I'll put it in PHP because that's what I know.

First you'd preprocess the page using a JavaScript processor. This takes some effort but is quite possible: perhaps steal some Firefox code to do it, which would conveniently send a request indistinguishable from a real Firefox one. Then you save the "rendered" code into the variable $code.

    $code = preg_replace('/<[^>]+alt=([\'"]|)(.*)\1[^>]*>/siU', '$2', $code);
    preg_match_all('/[a-z0-9_.+-]+(\b*(.+|@)\b*[a-z0-9_.-]+)+/iUs'...
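To make the idea concrete, here is a minimal, self-contained sketch of that kind of harvesting. It is not the pair of patterns quoted in this message; it takes a simpler route (pull out alt text, strip tags, decode entities, undo "[at]"/"[dot]", then match) and skips the JavaScript rendering step entirely. The function name harvest_addresses and the sample markup are made up for illustration, and the final pattern is deliberately loose rather than a faithful address grammar.

```php
<?php
// Hypothetical helper, for illustration only: shows how little work it takes
// to undo common address obfuscation once the page source is in hand.
function harvest_addresses(string $html): array
{
    // Replace any tag carrying a quoted alt attribute with its alt text,
    // padded with spaces so it does not fuse with neighbouring words.
    $text = preg_replace('/<[^>]+\balt=(["\'])(.*?)\1[^>]*>/is', ' $2 ', $html);

    // Drop the remaining markup, so bob<span>[at]</span>example.org
    // becomes bob[at]example.org.
    $text = strip_tags($text);

    // Decode entities such as &#64; back into "@".
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Undo bracketed "at" and "dot" substitutions: [at], (at), {dot}, ...
    $text = preg_replace('/\s*[\[({]\s*at\s*[\])}]\s*/i', '@', $text);
    $text = preg_replace('/\s*[\[({]\s*dot\s*[\])}]\s*/i', '.', $text);

    // Loose address match: local part, "@", then at least two domain labels.
    preg_match_all('/[a-z0-9_.+-]+@[a-z0-9-]+(?:\.[a-z0-9-]+)+/i', $text, $m);

    // De-duplicate while keeping first-seen order.
    return array_values(array_unique($m[0]));
}

// Made-up sample page using three different obfuscation styles.
$page = '<p>Mail bob<span>[at]</span>example.org or carol&#64;example.net</p>'
      . '<img src="x.png" alt="jane [at] example [dot] com">';

print_r(harvest_addresses($page));
```

On the sample page this should pick up all three addresses, which is rather the point: each extra obfuscation style costs the harvester about one more line.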
Those two patterns should match most email addresses obscured in most ways, I'd say. You might get SOME garbage, but I don't think that necessarily bothers spammers. The best protection seems to be Slashdot-esque translation, though this is by no means unbypassable, just not worth bypassing ;). You don't have to run faster than the bear, just faster than the other guy.

E&OE, by the way.

-michael

On 6/9/06, James Laugesen <[EMAIL PROTECTED]> wrote:
No worries Mike, keep at it, if it's a good solution for you, then it's a good solution =8-)

Personally I use captcha validated forms to send email internally. At least filtering on user agent is another level, and you might end up expanding on that.

Bots will just make HTTP requests and parse the output looking for anything matching an email address. One human can then scout discussions like this and websites looking for new implementations and usually just chuck in a new regular expression; matching [at] or <span>[at]</span> is not a big stretch.

J

On 09/06/06, Mike at Green-Beast.com <[EMAIL PROTECTED]> wrote:

> Lachlan Hunt <[EMAIL PROTECTED]> wrote:
> "[...] Why do you think it's safe to assume that bad bots won't send User-Agent headers claiming to be IE, Firefox or any other browser they like? Even if some bots do send different UA strings, this script relies on a false assumption and, thus, provides a false sense of security. [...]"

---

Hello Lachlan,

I'm not assuming anything (I try not to do that). I do know that 'bots can mask themselves; this has been on my mind. But I don't know how often this is done or what the risk levels are. It's not like I'm "distributing" this or billing it as a fool-proof method. It is an experiment; a test. I'm trying to make something useful. And I do have a disclaimer.

That entire site, mikecherim.com, is just for experiments. A sandbox if you will. It is for this reason I have posted here with the WSG: to test it in the field and to get feedback. To discover the problems and possible loopholes. I have that email address in one place on the web and that's on that page, so that is part of the test as well. My mailbox is waiting to see what happens.

The sad part is, even if it can be made fully capable of its assigned task and become a popular and accessible solution, new spam-bot builds would probably have a work-around built into their new versions within months.
Unfortunately, if people are allowed to communicate with us or post to our sites, we can only hope to slow them down or stay just slightly ahead of the bad guys.

Thanks for your feedback.

Sincerely,
Mike Cherim
http://green-beast.com/
http://accessites.org/
http://graybit.com/

******************************************************
The discussion list for http://webstandardsgroup.org/
See http://webstandardsgroup.org/mail/guidelines.cfm for some hints on posting to the list & getting help
******************************************************
--
http://mine.mjec.net/