On Tuesday October 23 2012 22:26:00 Axb wrote:
> Spamc/Spamd's "skip size" method has made a huge *positive* difference
> on FPs, and scan times.
> The FNs wouldn't *ever* have been caught by a chunk method due to the
> kind of content included "above" threshold.

Out of curiosity: during the last 10 days our system detected almost
200 large spam messages (manually confirmed as spam) with sizes above
400 kB (SpamAssassin saw only the first 420 kB of each; the rest was
truncated). Among these there were 55 distinct species:

  17 in the 400..500 kB region
  16 in the 500..700 kB region
   9 in the 700..1000 kB region
  10 in the 1000..2000 kB region
   2 of 2.8 MB
   1 of 3.6 MB

The median spam score (by species) for these was Q2=15.5, with
quartiles Q1=11 and Q3=27, so I'd say SpamAssassin did a good job
with these. The most valuable score contributions seem to have come
from the mail header section (subject, RBL, bayes); attachment
contents were probably less important.

  Mark