(okay, second try, now that the list seems to be working again)

> s> rawbody L_Text_Padding_In_Html      /<(title>)?[ '-.,?!\w]{50,}>/
> s> describe L_Text_Padding_In_Html  Text padding within brackets or HTML
> s> title to fool bayesian filter
rawbody L_Text_Padding_In_Html      /<(title>)?[- '\.,\?\!\w]{100,}>/

> s> rawbody L_Very_Long_Title  /<title>[ '-.,?!\w]{80,}<\/title>/
> s> describe L_Very_Long_Title HTML title longer than 80 characters to fool
> s> bayesian filter
> I tested your rules against my corpus:
> L_Text_Padding_In_Html -- 985s/84h of 91714 corpus (74113s/17601h) 01/21/04
> Ham appears to be from web pages with valid comments that were sent as
> email, and from people who enclose comments or instructions in angle
> brackets, eg:
> >  <Google Search backup file date OR time OR access groupaix.htm>
> >  <snip personal reasons for separation, as immaterial>
> >  < Thankyou, [name removed]!!!  What do you think of my ramblings?>
   Thanks! I increased the threshold for both to {100,},
so hopefully that'll significantly limit false positives.
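
A rough way to sanity-check the change is to run the old and new patterns over the ham samples quoted above. This is only a sketch using Python's `re` module rather than SpamAssassin itself; the helper names `hits` and `in_class` are made up for illustration, and only the two samples that survive intact in this thread are used. It also shows a side effect of the old character class: `'-.` inside the brackets is a range (apostrophe through period), so it accepted `(`, `)`, `*` and `+` as well, whereas the rewritten class puts the hyphen first and takes it literally.

```python
import re

# Patterns copied from the rules quoted in this thread.
OLD_PADDING = re.compile(r"<(title>)?[ '-.,?!\w]{50,}>")
NEW_PADDING = re.compile(r"<(title>)?[- '\.,\?\!\w]{100,}>")

# Two of the ham samples reported against the {50,} version.
HAM_SAMPLES = [
    "<Google Search backup file date OR time OR access groupaix.htm>",
    "<snip personal reasons for separation, as immaterial>",
]


def hits(pattern, text):
    """Return True if the compiled pattern matches anywhere in text."""
    return pattern.search(text) is not None


def in_class(char_class, ch):
    """Return True if the single character ch is accepted by char_class."""
    return re.fullmatch(char_class, ch) is not None


if __name__ == "__main__":
    for sample in HAM_SAMPLES:
        # Columns: old rule hit, new rule hit, sample text.
        print(hits(OLD_PADDING, sample), hits(NEW_PADDING, sample), sample)
    # The old class accepts "(" through the '-. range; the new one does not.
    print(in_class(r"[ '-.,?!\w]", "("), in_class(r"[- '\.,\?\!\w]", "("))
```

Both samples are under 100 characters between the brackets, so the raised threshold alone is enough to stop them from matching; this says nothing about how the spam hit rate changes, which would need a real corpus run.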
 
> L_Very_Long_Title -- 100s/1h of 91714 corpus (74113s/17601h) 01/21/04
> Sole ham was a tech support response, and the "title" apparently included
> the entire contents of my original trouble report, or much of it:
> >  <title>Getting error 182 while trying to download magazine. Delivery
> >  Manager claims ...</title>
   Hm. At the moment I have that rule weighted at 1.0, since it hits
the majority of spam I get, but bayes and the other rules would probably
compensate easily for a false positive like that, and I don't think any
single rule should make or break a message as spam.

thanks,
sckot Vokes
-- 
"I wish I had a 2 liter of Pepsi in my box of replacement
 staples, so if they needed to quench their thirst, then
 they could ride the snake." -Kefka P


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
