On Tue, 17 Feb 2004, Charles Gregory wrote:

>
> (smug look) Gee, I'm getting good at this..... :-)
> You've already got 433 of those in your corpus? They only started
> a couple of days ago.... (shake head)
>
> - Charles
>
> On Tue, 17 Feb 2004, Robert Menschel wrote:
> > Hello Charles,
> > Monday, February 16, 2004, 8:38:10 AM, you wrote:
> >
> > CG> Seeing a new run of spam with:
> > CG> {a hrefstringhref=http://bogus.url href="http://real.url"}
> >
> > CG> I think they are hoping to fool a primitive scan for 'href=' but it
> > CG> just makes for a really unambiguous spamsign. I'm scoring it high.
> > CG> We'll probably see some variations on this soon, with other things in
> > CG> front of href.....
> >
> > CG> rawbody LOC_HTMLBADHREF /href[a-z]*href/i
> > CG> describe LOC_HTMLBADHREF href(string)href in link
> > CG> score LOC_HTMLBADHREF 2.5
> >
> > LOC_HTMLBADHREF -- 433s/0h of 100794 corpus (82099s/18695h) 02/16/04
> >
> > Bob Menschel

No, I've been seeing that junk for several weeks now.
I wrote a similar rule that is a little less discriminating but
seems to work for me.

rawbody L_FAKE_HREF /\w\whref=http:/i
describe L_FAKE_HREF Faked href to hide spammer URLs
score L_FAKE_HREF 1.7

--
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
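
For anyone who wants to compare the two rules in this thread before deploying either, here is a minimal Python sketch that runs both patterns against a few sample lines. It is only an approximation: SpamAssassin evaluates these as Perl regexes against the raw message body, and Python's `re` module may differ in edge cases. The helper name `rule_hits` and every sample line except the first (which is the one quoted above) are made up for illustration.

```python
import re

# Python approximations of the two SpamAssassin rawbody patterns from this thread.
# re.ASCII restricts \w to [A-Za-z0-9_]; that is an assumption about how the
# Perl pattern behaves on raw message text, not something the thread states.
RULES = {
    "LOC_HTMLBADHREF": re.compile(r"href[a-z]*href", re.IGNORECASE),
    "L_FAKE_HREF": re.compile(r"\w\whref=http:", re.IGNORECASE | re.ASCII),
}


def rule_hits(line):
    """Return the sorted names of the rules whose pattern matches the line."""
    return sorted(name for name, rx in RULES.items() if rx.search(line))


# The first sample is the one quoted in the thread; the rest are hypothetical.
SAMPLES = [
    '{a hrefstringhref=http://bogus.url href="http://real.url"}',
    '<a href="http://real.url">',
    '<a xxhref=http://bogus.url>',
    '<a hrefxhref="http://bogus.url">',
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(rule_hits(sample), sample)
```

In this sketch, LOC_HTMLBADHREF needs a second "href" after the first with only letters in between, while L_FAKE_HREF needs an unquoted "href=http:" directly preceded by two word characters. Each one catches a variant the other misses, so running both may be reasonable.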