http://bugzilla.spamassassin.org/show_bug.cgi?id=3154

------- Additional Comments From [EMAIL PROTECTED]  2004-03-11 01:16 -------
Hmmm... those generally sound like good changes, but the patch as a
whole seems to have a somewhat negative effect on rule results.  I
tested it on 5496 spam and 5497 ham messages; the per-rule changes in
hits are listed below, spam first and then ham.

It seems like the additional whitespace is breaking the tests that rely
on looking for text directly before or after tags, especially the
BACKHAIR* and HTML_OBFUSCATE_* rules.
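
To illustrate the kind of breakage described above, here is a minimal
sketch in Python. It is not the SpamAssassin HTML parser or any real
rule; the function name count_split_words and the pad_tags switch are
hypothetical, and the test is a simplified stand-in for "text directly
before and after a tag".

```python
import re

# Split HTML into text chunks and tags; the capturing group keeps the tags.
TOKEN = re.compile(r"(<[^>]*>)")

def count_split_words(html, pad_tags):
    """Count places where tags sit directly between two alphanumeric characters.

    pad_tags=True models a renderer that adds formatting whitespace
    for every tag (an assumption made for illustration only).
    """
    rendered = ""      # text accumulated so far, as a rule would see it
    tag_seen = False   # at least one tag since the last text chunk
    hits = 0
    for token in TOKEN.split(html):
        if not token:
            continue
        if token.startswith("<"):
            tag_seen = True
            if pad_tags:
                rendered += " "  # whitespace inserted by the renderer
        else:
            # The test looks at the character directly before the tag(s)
            # and the character directly after them.
            if tag_seen and rendered[-1:].isalnum() and token[0].isalnum():
                hits += 1
            rendered += token
            tag_seen = False
    return hits

sample = "Che<b></b>ap me<i></i>ds"
print(count_split_words(sample, pad_tags=False))
print(count_split_words(sample, pad_tags=True))
```

Without padding, the sketch sees both words as split by tags; with
padding, the text before each tag now ends in whitespace, so the same
test no longer fires.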

We might want to rethink *how* and *when* formatting text (like the
whitespace and such) is added, perhaps develop a better understanding of
how we expect text to be surrounded by tags, or both.

------- start of cut text --------------
21      HTML_40_50
21      LINES_OF_YELLING_2
12      HTML_10_20
12      HTML_70_80
2       HTML_IMAGE_ONLY_02
2       HTML_IMAGE_ONLY_08
2       LINES_OF_YELLING
2       __HTML_COMMENT_RATIO
1       DEAR_FRIEND
1       HTML_00_10
1       HTML_IMAGE_ONLY_04
1       HTML_IMAGE_RATIO_02
1       HTML_IMAGE_RATIO_10
1       HTML_IMAGE_RATIO_14
1       LINES_OF_YELLING_3
-1      HTML_50_60
-1      HTML_IMAGE_RATIO_06
-1      HTML_IMAGE_RATIO_12
-1      T_BACKHAIR2_2_2
-1      T_BACKHAIR2_2_3
-1      T_BACKHAIR2_2_4
-1      T_BACKHAIR2_4_7
-1      T_BACKHAIR2_5_2
-1      T_BACKHAIR2_7_6
-1      T_BACKHAIR_1_4
-1      T_BACKHAIR_1_5
-1      T_BACKHAIR_4_7
-1      T_BACKHAIR_6_5
-1      T_BACKHAIR_7_6
-1      T_DOMAIN_RATIO_001
-1      T_DOMAIN_RATIO_002
-1      T_DOMAIN_RATIO_004
-1      T_DOMAIN_RATIO_010
-1      T_DOMAIN_RATIO_012
-1      T_DOMAIN_RATIO_016
-1      T_DOMAIN_RATIO_018
-1      T_DOMAIN_RATIO_025
-1      T_DOMAIN_RATIO_029
-1      T_DOMAIN_RATIO_032
-1      T_DOMAIN_RATIO_033
-1      T_DOMAIN_RATIO_034
-1      T_DOMAIN_RATIO_046
-1      T_DOMAIN_RATIO_047
-1      T_DOMAIN_RATIO_048
-1      T_DOMAIN_RATIO_049
-1      T_RM_BPT_LONGWORDS_7_8_A
-1      UNIQUE_WORDS
-2      HTML_30_40
-2      HTML_IMAGE_ONLY_06
-2      T_BACKHAIR2_4_6
-2      T_BACKHAIR2_6_2
-2      T_BACKHAIR2_6_7
-2      T_BACKHAIR_4_4
-2      T_BACKHAIR_4_6
-2      T_BACKHAIR_6_2
-2      T_BACKHAIR_6_7
-2      T_DOMAIN_RATIO_038
-3      T_BACKHAIR2_4_2
-3      T_BACKHAIR2_4_4
-3      T_BACKHAIR2_6_1
-3      T_BACKHAIR2_6_5
-3      T_BACKHAIR_4_2
-3      T_BACKHAIR_6_1
-4      T_DOMAIN_RATIO_003
-5      T_BACKHAIR_4_5
-6      HTML_20_30
-6      T_BACKHAIR2_3_7
-9      HTML_60_70
-11     HTML_80_90
-14     FREE_PORN
-14     T_BACKHAIR2_4_5
-16     T_BACKHAIR2_1_6
-17     HTML_90_100
-32     HTML_OBFUSCATE_00_10
------- end ----------------------------

and ham changed as follows:

------- start of cut text --------------
10      HTML_OBFUSCATE_00_10
4       HTML_30_40
3       HTML_10_20
3       HTML_70_80
-1      HTML_40_50
-2      HTML_60_70
-3      HTML_20_30
-4      HTML_80_90
------- end ----------------------------
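
For anyone who wants to produce the same kind of comparison, the
following is a rough Python sketch of computing per-rule hit changes
between two runs. It assumes each run is available as a plain list of
rule names, one entry per hit, and that a change means "hits after the
patch minus hits before"; the function hit_deltas is hypothetical and
is not the tool used to generate the lists above.

```python
from collections import Counter

def hit_deltas(before, after):
    """Return (delta, rule) pairs for every rule whose hit count changed.

    before, after: iterables of rule names, one entry per hit.
    Sorted by delta (largest first), then by rule name.
    """
    b, a = Counter(before), Counter(after)
    # Counter returns 0 for rules that never hit in one of the runs.
    changed = [(a[rule] - b[rule], rule) for rule in set(a) | set(b)
               if a[rule] != b[rule]]
    return sorted(changed, key=lambda pair: (-pair[0], pair[1]))

before = ["FREE_PORN", "FREE_PORN", "HTML_40_50", "DEAR_FRIEND"]
after = ["FREE_PORN", "HTML_40_50", "HTML_40_50", "DEAR_FRIEND"]
for delta, rule in hit_deltas(before, after):
    print(f"{delta}\t{rule}")
```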

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.