On Fri, 19 Nov 2010, Daniel McDonald wrote:

> On 11/19/10 2:51 PM, "Bowie Bailey" <bowie_bai...@buc.com> wrote:
>
> > rawbody FR_3TAG_3TAG
> > m'<[abcefghijklmnoqstuvwxz]{3}></[abcefghijklmnoqstuvwxz]{3}>'i
> >
> > It looks for an html tag containing exactly three characters followed by
> > a closing tag which also contains exactly three characters.
>
> But no instances of d,p,r or y. I'm sure that's a really clever trick for
> something, I just don't have a clue as to what it might be....

It was an attempt to find obfuscated HTML junk that spammers were using
to break up spammy words such as "male medications", e.g.:

    via<sqz></sqz>gra

The idea was that almost all legit 3-character HTML tags such as '<div>'
contained at least one of those letters ([dpry]). So a purported tag
that had none of them was not legit and thus probably bogus spammer spoor.

With the evolution of HTML (xml, etc.) that's no longer a safe assumption,
so that rule probably FPs.

-- 
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
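
For anyone who wants to poke at the pattern outside SpamAssassin, here is a rough Python sketch of the same regex. The names NO_DPRY, FR_3TAG_3TAG (as a Python variable) and hits() are made up for this illustration, and Python's re engine is only standing in for the Perl matching that SpamAssassin actually does, so treat it as an approximation rather than the real rule.

```python
import re

# Same character class as the quoted rule: every letter except d, p, r, y.
NO_DPRY = "[abcefghijklmnoqstuvwxz]"

# Port of the rule's pattern; re.IGNORECASE mirrors the trailing 'i' flag.
# (Hypothetical Python stand-in, not how SpamAssassin evaluates rawbody rules.)
FR_3TAG_3TAG = re.compile(
    rf"<{NO_DPRY}{{3}}></{NO_DPRY}{{3}}>",
    re.IGNORECASE,
)


def hits(html: str) -> bool:
    """Return True if the text contains an empty 3-letter tag pair
    whose names use none of the letters d, p, r, y."""
    return FR_3TAG_3TAG.search(html) is not None


if __name__ == "__main__":
    samples = [
        "via<sqz></sqz>gra",   # the obfuscation trick described above
        "<div></div>",         # contains 'd', so the rule skips it
        "<pre></pre>",         # contains 'p' and 'r', so the rule skips it
        "<nav></nav>",         # HTML5 element name with none of d, p, r, y
    ]
    for sample in samples:
        print(f"{sample!r}: {hits(sample)}")
```

The last sample shows the false-positive concern: an empty pair of a perfectly valid newer tag is indistinguishable from the spammer filler as far as this pattern is concerned.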