Re: Those "Re: good obfupills" spams (bayes scores)

2006-05-02 Thread jdow

From: "jdow" <[EMAIL PROTECTED]>


One is
from my local congressman. I figure that if I include his junk phone
calls in my phone-spam complaints (to him), the email should also be
spam. I doubt I'll whitelist him. He and I don't agree much. I am much
too libertarian for his Republican stance. If he'd start lecturing about
people being responsible for themselves and their own actions, I might
be moved to whitelist him. But that's neither here nor there. The
wasteland concept is working.


This earns a follow-up. I checked the Bayes score on his message. I
must conclude that Bayes is pretty accurate. Since I consider virtually
anybody in office today to be a spamming gasbag, having his message hit
a perfect 1.0 Bayes score is just too perfect.

My faith in Bayes is increased appropriately.

{^_-}


Re: Those "Re: good obfupills" spams (bayes scores)

2006-05-02 Thread jdow

From: "Michael Monnerie" <[EMAIL PROTECTED]>


67 spams scored 5-9.99 points,


OK, for the record with regard to spam and ham: I have had four come
through between 5 and 7.99 points out of about 1600 messages in my
personal mail buckets. Two were from "always-on", which I signed up
for when Powell the Younger was the FCC commissioner pushing BPL.
As a ham radio operator I had a rather strong interest in opposing
this critter. I more or less abandoned the account and let the
Tony Perkins email fall into the spam box. I finally got motivated
to remove that today. One other was from a mailing list: some dweeb
spammed the list saying he could not read some other dweeb's base64
email. It was marginal. But its being marked as spam gave me a chance
to send a private email jab back to the first dweeb about his message
being spam. That leaves one real spam and no hams in the 5.0 to 7.99
wasteland. I have five messages between 8.0 and 10 inclusive. One is
from my local congressman. I figure that if I include his junk phone
calls in my phone-spam complaints (to him), the email should also be
spam. I doubt I'll whitelist him. He and I don't agree much. I am much
too libertarian for his Republican stance. If he'd start lecturing about
people being responsible for themselves and their own actions, I might
be moved to whitelist him. But that's neither here nor there. The
wasteland concept is working.

And during this period no real ham has gotten a BAYES_99 rule hit.
But the sample is still a little small to say anything solid about
the 0.5% theoretical false-alarm ratio yet. Maybe, if I stretch it
a little.

{^_^}   <- Joanne does ramble sometimes, doesn't she?




Re: Those "Re: good obfupills" spams (bayes scores)

2006-05-02 Thread Bart Schaefer

Incidentally, the FAQ answer for "HowScoresAreAssigned" on the SA wiki
is out of date.


Re: Those "Re: good obfupills" spams (bayes scores)

2006-05-02 Thread Michael Monnerie
On Montag, 1. Mai 2006 17:51 Matt Kettler wrote:
> Looking at my own current real-world maillogs, BAYES_99 matched 6,643
> messages last week. Of those, only 24 had total scores under 9.0.
> (with BAYES_99 scoring 3.5, it would take a message with a total
> score of less than 8.5 to drop below the threshold of 5.0 if BAYES_99
> were omitted entirely).

I've looked at a snapshot of 424 spams from these last days, with a
total of 8519 points, making about 20 points per spam on average.
67 spams scored 5-9.99 points, 62 scored 10-14.99 points, and 294
scored >15.

So it's those 67 spams that should worry me most - some of them are
only just over 5 (twice 5.06 points), and I would like them to score
higher, because that's more on the safe side. Unfortunately, I have
no way to check which rules were hit; amavisd-new doesn't log that.
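
A rough way to reproduce this kind of score bucketing straight from the
mail log, assuming amavisd-new's usual "Hits: <score>" log lines (the
log path and the exact pattern are assumptions; adjust to your setup):

awk 'match($0, /Hits: [0-9.]+/) {
       s = substr($0, RSTART + 6, RLENGTH - 6) + 0
       if (s >= 5 && s < 10) low++
       else if (s >= 10 && s < 15) mid++
       else if (s >= 15) high++
     }
     END { printf "5-9.99: %d  10-14.99: %d  >=15: %d\n", low, mid, high }' \
  /var/log/maillog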

mfg zmi
-- 
// Michael Monnerie, Ing.BSc-  http://it-management.at
// Tel: 0660/4156531  .network.your.ideas.
// PGP Key:   "lynx -source http://zmi.at/zmi3.asc | gpg --import"
// Fingerprint: 44A3 C1EC B71E C71A B4C2  9AA6 C818 847C 55CB A4EE
// Keyserver: www.keyserver.net Key-ID: 0x55CBA4EE




Re: Those "Re: good obfupills" spams (bayes scores)

2006-05-01 Thread jdow

From: "Bowie Bailey" <[EMAIL PROTECTED]>


Matt Kettler wrote:

Bowie Bailey wrote:
> 
> The Bayes rules are not individual unrelated rules.  Bayes is a
> series of rules indicating a range of probability that a message is
> spam or ham.  You can argue over the exact scoring, but I can't see
> any reason to score BAYES_99 lower than BAYES_95.  Since a BAYES_99
> message is even more likely to be spam than a BAYES_95 message, it
> should have at least a slightly higher score.

No, it should not. I've given a conclusive reason why it may not
always be higher. My reasoning has a solid statistical basis, and it
is supported by real-world testing and real-world data.

You've given your opinion to the contrary, but no facts to support it
other than declaring the rules to be related and asserting that the
score should therefore correlate with the Bayes-calculated probability
of spam.

While I don't disagree with you that BAYES_99 scoring lower than
BAYES_95 is counter-intuitive, I do not believe intuition alone is a
reason to defy reality.


If there are other rules with better performance (i.e., fewer FPs)
that consistently coincide with the hits of BAYES_99, those rules
should soak up the lion's share of the score. However, if there are a
lot of spam messages with no other rules hit, BAYES_99 should get a
strong boost from those.


The perceptron results show that the former is largely true. BAYES_99
is mostly redundant. To back it up, I'm going to verify it with my
own maillog data. 


Looking at my own current real-world maillogs, BAYES_99 matched 6,643
messages last week. Of those, only 24 had total scores under 9.0. (With
BAYES_99 scoring 3.5, it would take a message with a total score of less
than 8.5 to drop below the threshold of 5.0 if BAYES_99 were omitted
entirely.)

So less than 0.37% of BAYES_99's hits actually mattered on my system
last week.


BAYES_95, on the other hand, hit 468 messages, 20 of which scored less
than 9.0. That's 4.2% of messages with BAYES_95 hits, a considerably
larger percentage. Bring it down to 8.0 to compensate for the score
difference and you still get 17 messages, which is still a much larger
3.6% of its hits.


On my system, BAYES_95 is significant in pushing mail over the spam
threshold 10 times more often than BAYES_99 is.

What are your results?

These are the greps I used, based on MailScanner log formats. Should
work for spamd users, perhaps with slight modifications.

zgrep BAYES_99 maillog.1.gz | wc -l
zgrep BAYES_99 maillog.1.gz | grep -v "score=[1-9][0-9]\." \
  | grep -v "score=9\." | wc -l


I think we are arguing from slightly different viewpoints.

You are saying that higher scores are not needed since the lower score
is made up for by other rules.  I have 13,935 hits for BAYES_99.  412
of them are lower than 9.0.  This seems to be caused by either AWL hits
lowering the score or very few other rules hitting.  BAYES_95 hit 469
times with 18 hits lower than 9.0.  This means that, for me, BAYES_95
is significant slightly more often, percentage-wise, than BAYES_99.
But considering volume, I would say that BAYES_99 is the more useful
rule.

However, that's not what I was arguing about to begin with.  Because
of the way the Bayes algorithm works, I should be able to have more
confidence in a BAYES_99 hit than a BAYES_95 hit.  Therefore, it
should have a higher score.  Otherwise, you get the very strange
occurrence that if you train Bayes too well and the spams go from
BAYES_95 to BAYES_99, the SA score actually goes down.

The better you train your Bayes database, the more confidence it
should have in picking out the spams.  As the scoring moves from
BAYES_50 up to BAYES_99, the SA score should increase to reflect the
higher confidence level of the Bayes engine.


Bingo - the trick that's been tickling my brain, the name not making
it through the fog of old age, is the Kalman filter. You grade inputs
by their confidence factor rather than punish them for being too good.

This might be a better way to put together the rule scores and the
Bayes scores.
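
In its minimal form that idea is inverse-variance weighting: each
input's weight grows with its confidence, so a better-trained (lower
variance) Bayes estimate can only gain influence, never lose it. A
sketch of the combination rule, in notation of my own rather than
anything from SpamAssassin:

    \hat{x} = \frac{\sum_i x_i / \sigma_i^2}{\sum_i 1 / \sigma_i^2},
    \qquad
    w_i = \frac{1 / \sigma_i^2}{\sum_j 1 / \sigma_j^2}

where the x_i are the individual estimates (Bayes, other rules), the
\sigma_i^2 their variances, and the w_i the resulting weights.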

{^_^}


Re: Those "Re: good obfupills" spams (bayes scores)

2006-05-01 Thread jdow

From: "Matt Kettler" <[EMAIL PROTECTED]>


Bowie Bailey wrote:

Matt Kettler wrote:

It is perfectly reasonable to assume that most of the mail matching
BAYES_99 also matches a large number of the stock spam rules that SA
comes with. These highly-obvious mails are the model after which
most SA rules are made in the first place. Thus, these mails need
less score boost, as they already have a lot of score from other
rules in the ruleset. 


However, mails matching BAYES_95 are more likely to be "trickier",
and are likely to match fewer other rules. These messages are more
likely to require an extra boost from BAYES_95's score than those
which match BAYES_99.


I can't argue with this description, but I don't agree with the
conclusion on the scores.

The Bayes rules are not individual unrelated rules.  Bayes is a series
of rules indicating a range of probability that a message is spam or
ham.  You can argue over the exact scoring, but I can't see any reason
to score BAYES_99 lower than BAYES_95.  Since a BAYES_99 message is
even more likely to be spam than a BAYES_95 message, it should have at
least a slightly higher score. 


No, it should not. I've given a conclusive reason why it may not always be
higher. My reasoning has a solid statistical basis, and it is supported by
real-world testing and real-world data.

You've given your opinion to the contrary, but no facts to support it other
than declaring the rules to be related and asserting that the score should
therefore correlate with the Bayes-calculated probability of spam.

While I don't disagree with you that BAYES_99 scoring lower than BAYES_95 is
counter-intuitive, I do not believe intuition alone is a reason to defy
reality.


Matt, as much as I respect you, which is a heck of a lot, I must insist
that your assertion is correct only within a model that does not fit the
real needs of the situation, PARTICULARLY for individual Bayes databases
that are not fed carelessly. You don't want spam crowding just above 5.
You want a score gap around five, with almost all spam scoring well above
10. Now, I have managed to almost sweep that region clean; about 1 or 2%
of my spam falls between 5 and 8. Another 4% falls under 10. This makes
sweeping the spam directory for ham quite easy. (It also serves as a
wry note that some of the magazines to which I subscribe also spam me.
It's rather nifty that their spams are tagged and their hams are not,
mostly. When they are tagged they're still not BAYES_9x, though.)


If there are other rules with better performance (i.e., fewer FPs) that
consistently coincide with the hits of BAYES_99, those rules should soak up
the lion's share of the score. However, if there are a lot of spam messages
with no other rules hit, BAYES_99 should get a strong boost from those.


If any significant number of spams hit ONLY BAYES_99, then BAYES_99
should either very nearly kick them over or actually kick them over.
That said, I have found that clever meta rules combining specific
sources with the BAYES scores have lately allowed me to widen my
wasteland of scores between 4 and 10. This may be an important trick
to employ (see the sketch below).
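
For illustration only, a minimal sketch of that kind of meta rule (the
header pattern, rule names, and score are hypothetical, not from any
real setup):

header   __LOCAL_SUSPECT_SRC  From =~ /\@example\.net/i
meta     LOCAL_SRC_BAYES99    (__LOCAL_SUSPECT_SRC && BAYES_99)
describe LOCAL_SRC_BAYES99    High Bayes probability from a suspect source
score    LOCAL_SRC_BAYES99    3.0

A rule like this adds its points on top of BAYES_99's own score, which
is what pushes such mail out of the 4-10 region.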


The perceptron results show that the former is largely true. BAYES_99 is mostly
redundant. To back it up, I'm going to verify it with my own maillog data.

Looking at my own current real-world maillogs, BAYES_99 matched 6,643 messages
last week. Of those, only 24 had total scores under 9.0. (with BAYES_99 scoring
3.5, it would take a message with a total score of less than 8.5 to drop below
the threshold of 5.0 if BAYES_99 were omitted entirely).

So less than 0.37% of BAYES_99's hits actually mattered on my system last week.


I wish I had that luck. And I have over 40 rule sets in action plus a
large bunch of my own.


BAYES_95, on the other hand, hit 468 messages, 20 of which scored less than
9.0. That's 4.2% of messages with BAYES_95 hits, a considerably larger
percentage. Bring it down to 8.0 to compensate for the score difference and
you still get 17 messages, which is still a much larger 3.6% of its hits.

On my system, BAYES_95 is significant in pushing mail over the spam threshold 10
times more often than BAYES_99 is.

What are your results?


I don't have a script that tells me what BAYES_99 hits on singularly. I
posted what ratio of ham and spam BAYES_99 and BAYES_00 hit on over the
last 10 weeks. What I do NOT see is any benefit from trying to crowd
close to 5 points. This is the reason I see the model itself as broken.
When I ran with the original BAYES scores on 3.0.4 the system leaked
like a sieve. As I upped the score the missed spams decreased. But every
once in a while I seem to hit a lead position on a round of innovative
spams which hit nothing but BAYES_99. Loren responds by writing rules to
catch them. I respond by increasing Bayes. I figure 5.0 is my limit,
though. I figure a good ratio for mismarked ham to mismarked spam is
about 0.1:1. When it gets that bad I make a new meta rule

Re: Those "Re: good obfupills" spams (bayes scores)

2006-05-01 Thread jdow

From: "Bowie Bailey" <[EMAIL PROTECTED]>


jdow wrote:

From: "Bart Schaefer" <[EMAIL PROTECTED]>
> 
> On 4/29/06, Matt Kettler <[EMAIL PROTECTED]> wrote:

> > In SA 3.1.0 they did force-fix the scores of the bayes rules,
> > particularly the high-end. The perceptron assigned BAYES_99 a
> > score of 1.89 in the 3.1.0 mass-check run. The devs jacked it up
> > to 3.50.
> > 
> > That does make me wonder if:

> > 1) When BAYES_9x FPs, it FPs in conjunction with lots of
> > other rules due to the ham corpus being polluted with spam.
> 
> My recollection is that there was speculation that the BAYES_9x

> rules were scored "too low" not because they FP'd in conjunction
> with other rules, but because against the corpus they TRUE P'd in
> conjunction with lots of other rules, and that it therefore wasn't
> necessary for the perceptron to assign a high score to BAYES_9x in
> order to push the total over the 5.0 threshold.
> 
> The trouble with that is that users expect training on their

> personal spam flow to have a more significant effect on the
> scoring.  I want to train bayes to compensate for the LACK of
> other rules matching, not just to give a final nudge when a bunch
> of others already hit.
> 
> I filed a bugzilla some while ago suggesting that the bayes

> percentage ought to be used to select a rule set, not to adjust
> the score as a component of a rule set.

There is one other gotcha. I bet vastly different scores are
warranted for Bayes when run with per user training and rules as
compared to global training and rules.


Ack!  I missed the subject change on this thread prior to my last
reply.  Sorry about the duplication.

I think it is also a matter of manual training vs autolearning.  A
Bayes database that is consistently trained manually will be more
accurate and can support higher scores.


That may be a factor, too, Bowie. But, as igor is experiencing, a
site-wide Bayes faces a singular problem in that one person's ham is
another person's extreme spam. When no two people can agree on what
spam is and what ham is, a global Bayes becomes (relatively)
ineffective very quickly. This is why I included that afterthought,
which probably should have been highlighted up front.

{^_^}


RE: Those "Re: good obfupills" spams (bayes scores)

2006-05-01 Thread Bowie Bailey
Matt Kettler wrote:
> Bowie Bailey wrote:
> > 
> > The Bayes rules are not individual unrelated rules.  Bayes is a
> > series of rules indicating a range of probability that a message is
> > spam or ham.  You can argue over the exact scoring, but I can't see
> > any reason to score BAYES_99 lower than BAYES_95.  Since a BAYES_99
> > message is even more likely to be spam than a BAYES_95 message, it
> > should have at least a slightly higher score.
> 
> No, it should not. I've given a conclusive reason why it may not
> always be higher. My reasoning has a solid statistical basis, and it
> is supported by real-world testing and real-world data.
> 
> You've given your opinion to the contrary, but no facts to support it
> other than declaring the rules to be related and asserting that the
> score should therefore correlate with the Bayes-calculated probability
> of spam.
> 
> While I don't disagree with you that BAYES_99 scoring lower than
> BAYES_95 is counter-intuitive, I do not believe intuition alone is a
> reason to defy reality.
> 
> If there are other rules with better performance (i.e., fewer FPs)
> that consistently coincide with the hits of BAYES_99, those rules
> should soak up the lion's share of the score. However, if there are a
> lot of spam messages with no other rules hit, BAYES_99 should get a
> strong boost from those.
> 
> The perceptron results show that the former is largely true. BAYES_99
> is mostly redundant. To back it up, I'm going to verify it with my
> own maillog data. 
> 
> Looking at my own current real-world maillogs, BAYES_99 matched 6,643
> messages last week. Of those, only 24 had total scores under 9.0.
> (with BAYES_99 scoring 
> 3.5, it would take a message with a total score of less than 8.5 to
> drop below the threshold of 5.0 if BAYES_99 were omitted entirely).
> 
> So less than 0.37% of BAYES_99's hits actually mattered on my system
> last week. 
> 
> BAYES_95, on the other hand, hit 468 messages, 20 of which scored
> less than 9.0. That's 4.2% of messages with BAYES_95 hits, a
> considerably larger percentage. Bring it down to 8.0 to compensate
> for the score difference and you still get 17 messages, which is
> still a much larger 3.6% of its hits.
> 
> On my system, BAYES_95 is significant in pushing mail over the spam
> threshold 10 times more often than BAYES_99 is.
> 
> What are your results?
> 
> These are the greps I used, based on MailScanner log formats. Should
> work for spamd users, perhaps with slight modifications.
> 
> zgrep BAYES_99 maillog.1.gz | wc -l
> zgrep BAYES_99 maillog.1.gz | grep -v "score=[1-9][0-9]\." \
>   | grep -v "score=9\." | wc -l

I think we are arguing from slightly different viewpoints.

You are saying that higher scores are not needed since the lower score
is made up for by other rules.  I have 13,935 hits for BAYES_99.  412
of them are lower than 9.0.  This seems to be caused by either AWL hits
lowering the score or very few other rules hitting.  BAYES_95 hit 469
times with 18 hits lower than 9.0.  This means that, for me, BAYES_95
is significant slightly more often, percentage-wise, than BAYES_99.
But considering volume, I would say that BAYES_99 is the more useful
rule.

However, that's not what I was arguing about to begin with.  Because
of the way the Bayes algorithm works, I should be able to have more
confidence in a BAYES_99 hit than a BAYES_95 hit.  Therefore, it
should have a higher score.  Otherwise, you get the very strange
occurrence that if you train Bayes too well and the spams go from
BAYES_95 to BAYES_99, the SA score actually goes down.

The better you train your Bayes database, the more confidence it
should have in picking out the spams.  As the scoring moves from
BAYES_50 up to BAYES_99, the SA score should increase to reflect the
higher confidence level of the Bayes engine.

-- 
Bowie


Re: Those "Re: good obfupills" spams (bayes scores)

2006-05-01 Thread Matt Kettler
Bowie Bailey wrote:
> Matt Kettler wrote:
>> It is perfectly reasonable to assume that most of the mail matching
>> BAYES_99 also matches a large number of the stock spam rules that SA
>> comes with. These highly-obvious mails are the model after which
>> most SA rules are made in the first place. Thus, these mails need
>> less score boost, as they already have a lot of score from other
>> rules in the ruleset. 
>>
>> However, mails matching BAYES_95 are more likely to be "trickier",
>> and are likely to match fewer other rules. These messages are more
>> likely to require an extra boost from BAYES_95's score than those
>> which match BAYES_99.
> 
> I can't argue with this description, but I don't agree with the
> conclusion on the scores.
> 
> The Bayes rules are not individual unrelated rules.  Bayes is a series
> of rules indicating a range of probability that a message is spam or
> ham.  You can argue over the exact scoring, but I can't see any reason
> to score BAYES_99 lower than BAYES_95.  Since a BAYES_99 message is
> even more likely to be spam than a BAYES_95 message, it should have at
> least a slightly higher score. 

No, it should not. I've given a conclusive reason why it may not always be
higher. My reasoning has a solid statistical basis, and it is supported by
real-world testing and real-world data.

You've given your opinion to the contrary, but no facts to support it other
than declaring the rules to be related and asserting that the score should
therefore correlate with the Bayes-calculated probability of spam.

While I don't disagree with you that BAYES_99 scoring lower than BAYES_95 is
counter-intuitive, I do not believe intuition alone is a reason to defy
reality.

If there are other rules with better performance (i.e., fewer FPs) that
consistently coincide with the hits of BAYES_99, those rules should soak up
the lion's share of the score. However, if there are a lot of spam messages
with no other rules hit, BAYES_99 should get a strong boost from those.

The perceptron results show that the former is largely true. BAYES_99 is mostly
redundant. To back it up, I'm going to verify it with my own maillog data.

Looking at my own current real-world maillogs, BAYES_99 matched 6,643 messages
last week. Of those, only 24 had total scores under 9.0. (with BAYES_99 scoring
3.5, it would take a message with a total score of less than 8.5 to drop below
the threshold of 5.0 if BAYES_99 were omitted entirely).

So less than 0.37% of BAYES_99's hits actually mattered on my system last week.

BAYES_95, on the other hand, hit 468 messages, 20 of which scored less than
9.0. That's 4.2% of messages with BAYES_95 hits, a considerably larger
percentage. Bring it down to 8.0 to compensate for the score difference and
you still get 17 messages, which is still a much larger 3.6% of its hits.

On my system, BAYES_95 is significant in pushing mail over the spam threshold 10
times more often than BAYES_99 is.

What are your results?

These are the greps I used, based on MailScanner log formats. Should work
for spamd users, perhaps with slight modifications.

# count all messages that hit BAYES_99
zgrep BAYES_99 maillog.1.gz | wc -l
# of those, count messages whose total score is under 9.0
# (the two "grep -v" filters strip scores of 10.0 and up, then 9.x)
zgrep BAYES_99 maillog.1.gz | grep -v "score=[1-9][0-9]\." \
  | grep -v "score=9\." | wc -l
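
For spamd users, one possible adaptation, assuming the stock spamd log
lines of the form "result: Y <score> - <rules>" (syslog formats vary,
so treat this as a sketch):

# count all spam-flagged messages that hit BAYES_99
zgrep "result: Y" maillog.1.gz | grep BAYES_99 | wc -l
# of those, count messages with a logged score under 9
zgrep "result: Y" maillog.1.gz | grep BAYES_99 \
  | grep -v "result: Y [1-9][0-9] " | grep -v "result: Y 9 " | wc -l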




RE: Those "Re: good obfupills" spams (bayes scores)

2006-05-01 Thread Bowie Bailey
jdow wrote:
> From: "Bart Schaefer" <[EMAIL PROTECTED]>
> > 
> > On 4/29/06, Matt Kettler <[EMAIL PROTECTED]> wrote:
> > > In SA 3.1.0 they did force-fix the scores of the bayes rules,
> > > particularly the high-end. The perceptron assigned BAYES_99 a
> > > score of 1.89 in the 3.1.0 mass-check run. The devs jacked it up
> > > to 3.50.
> > > 
> > > That does make me wonder if:
> > > 1) When BAYES_9x FPs, it FPs in conjunction with lots of
> > > other rules due to the ham corpus being polluted with spam.
> > 
> > My recollection is that there was speculation that the BAYES_9x
> > rules were scored "too low" not because they FP'd in conjunction
> > with other rules, but because against the corpus they TRUE P'd in
> > conjunction with lots of other rules, and that it therefore wasn't
> > necessary for the perceptron to assign a high score to BAYES_9x in
> > order to push the total over the 5.0 threshold.
> > 
> > The trouble with that is that users expect training on their
> > personal spam flow to have a more significant effect on the
> > scoring.  I want to train bayes to compensate for the LACK of
> > other rules matching, not just to give a final nudge when a bunch
> > of others already hit.
> > 
> > I filed a bugzilla some while ago suggesting that the bayes
> > percentage ought to be used to select a rule set, not to adjust
> > the score as a component of a rule set.
> 
> There is one other gotcha. I bet vastly different scores are
> warranted for Bayes when run with per user training and rules as
> compared to global training and rules.

Ack!  I missed the subject change on this thread prior to my last
reply.  Sorry about the duplication.

I think it is also a matter of manual training vs autolearning.  A
Bayes database that is consistently trained manually will be more
accurate and can support higher scores.
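
A minimal example of such manual training with the stock sa-learn tool
(the mailbox paths are placeholders):

# feed hand-verified spam and ham to the Bayes database
sa-learn --spam --mbox ~/mail/confirmed-spam
sa-learn --ham --mbox ~/mail/confirmed-ham
# show token and message counts for the database
sa-learn --dump magic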

-- 
Bowie


Re: Those "Re: good obfupills" spams (bayes scores)

2006-04-29 Thread jdow

From: "Bart Schaefer" <[EMAIL PROTECTED]>

On 4/29/06, Matt Kettler <[EMAIL PROTECTED]> wrote:

 In SA 3.1.0 they did force-fix the scores of the bayes rules,
particularly the high-end. The perceptron assigned BAYES_99 a score of
1.89 in the 3.1.0 mass-check run. The devs jacked it up to 3.50.

That does make me wonder if:
1) When BAYES_9x FPs, it FPs in conjunction with lots of other rules
due to the ham corpus being polluted with spam.


My recollection is that there was speculation that the BAYES_9x rules
were scored "too low" not because they FP'd in conjunction with other
rules, but because against the corpus they TRUE P'd in conjunction
with lots of other rules, and that it therefore wasn't necessary for
the perceptron to assign a high score to BAYES_9x in order to push the
total over the 5.0 threshold.

The trouble with that is that users expect training on their personal
spam flow to have a more significant effect on the scoring.  I want to
train bayes to compensate for the LACK of other rules matching, not
just to give a final nudge when a bunch of others already hit.

I filed a bugzilla some while ago suggesting that the bayes percentage
ought to be used to select a rule set, not to adjust the score as a
component of a rule set.

<< jdow >> There is one other gotcha. I bet vastly different scores
are warranted for Bayes when run with per user training and rules
as compared to global training and rules.

{^_^}


Re: Those "Re: good obfupills" spams (bayes scores)

2006-04-29 Thread jdow

From: "Matt Kettler" <[EMAIL PROTECTED]>


Bart Schaefer wrote:

On 4/29/06, Matt Kettler <[EMAIL PROTECTED]> wrote:

Besides.. If you want to make a mathematics based argument against me,
start by explaining how the perceptron mathematically is flawed. It
assigned the original score based on real-world data.


Did it?  I thought the BAYES_* scores have been fixed values for a
while now, to force the perceptron to adapt the other scores to fit.


Actually, you're right... I'm shocked and floored, but you're right.

In SA 3.1.0 they did force-fix the scores of the bayes rules,
particularly the high-end. The perceptron assigned BAYES_99 a score of
1.89 in the 3.1.0 mass-check run. The devs jacked it up to 3.50.

That does make me wonder if:
   1) When BAYES_9x FPs, it FPs in conjunction with lots of other rules
due to the ham corpus being polluted with spam. This forces the
perceptron to attempt to compensate.  (Pollution always is a problem
since nobody is perfect, but it occurs to differing degrees).
  -or-
   2) The perceptron is out of whack. (I highly doubt this, because the
perceptron generated the scores for 3.0.x and they were fine)
 -or-
   3) The Real-world FPs of BAYES_99 really do tend to also be cascades
with other rules in the 3.1.x ruleset, and the perceptron is correctly
capping the score. This could differ from 3.0.x due to change in rules,
or change in ham patterns over time.
 -or-
   4) one of the corpus submitters has a poorly trained bayes db.
(possible, but I doubt it)

Looking at statistics-set3 for 3.0.x and 3.1.x there was a slight
increase in ham hits for BAYES_99 and a slight decrease in spam hits.
3.0.x:
OVERALL%   SPAM%     HAM%     S/O     RANK   SCORE   NAME
43.515     89.3888   0.0335   1.000   0.83   1.89    BAYES_99
3.1.x:
OVERALL%   SPAM%     HAM%     S/O     RANK   SCORE   NAME
60.712     86.7351   0.0396   1.000   0.90   3.50    BAYES_99

Also worth considering: set3 of 3.0.x was much closer to a 50/50 mix of
spam/nonspam (48.7/51.3) than 3.1.0's (nearly 70/30).


What happens comes from the basic reality that Bayes and the other
rules are not orthogonal sets. So many other rules hit 95 and 99 that
the perceptron artificially reduced the goodness rating for these rules.

It needs some serious skewing to catch situations where 95 or 99 hit and
very few other rules hit. Those are the times the accuracy of Bayes is
needed the most. I've found, here, that 5.0 is a suitable score. I
suspect if I were more realistic 4.9 would be closer. But I still do
remember learning the score bias and being floored by it when I noticed
99 on some spams that leaked through with ONLY the 99 hit. I am speaking
of dozens of spams hit that way.

So far, over several years, I've found a few special cases that warrant
negative rules. That seems to be pulling the 99 rule's false-alarm
rate down to "I can't see it." (I have, however, been tempted to generate
a BAYES_99p5 rule and a BAYES_99p9 rule to fine-tune the scores up around
4.9 and 5.0 - something like the sketch below.)
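
Such rules could in principle be defined the same way the stock BAYES_*
rules in 23_bayes.cf are, via the check_bayes eval test. A minimal
sketch - the names, cutoffs, and scores are hypothetical, not stock
rules:

body     BAYES_99P5  eval:check_bayes('0.995', '0.999')
describe BAYES_99P5  Bayes spam probability is 99.5 to 99.9%
score    BAYES_99P5  0.4
body     BAYES_99P9  eval:check_bayes('0.999', '1.00')
describe BAYES_99P9  Bayes spam probability is 99.9 to 100%
score    BAYES_99P9  0.5

Since their ranges sit inside BAYES_99's, they fire in addition to it,
so small add-on scores are enough to lift the totals toward 4.9 and 5.0.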

{^_


Re: Those "Re: good obfupills" spams (bayes scores)

2006-04-29 Thread Bart Schaefer

On 4/29/06, Matt Kettler <[EMAIL PROTECTED]> wrote:

 In SA 3.1.0 they did force-fix the scores of the bayes rules,
particularly the high-end. The perceptron assigned BAYES_99 a score of
1.89 in the 3.1.0 mass-check run. The devs jacked it up to 3.50.

That does make me wonder if:
1) When BAYES_9x FPs, it FPs in conjunction with lots of other rules
due to the ham corpus being polluted with spam.


My recollection is that there was speculation that the BAYES_9x rules
were scored "too low" not because they FP'd in conjunction with other
rules, but because against the corpus they TRUE P'd in conjunction
with lots of other rules, and that it therefore wasn't necessary for
the perceptron to assign a high score to BAYES_9x in order to push the
total over the 5.0 threshold.

The trouble with that is that users expect training on their personal
spam flow to have a more significant effect on the scoring.  I want to
train bayes to compensate for the LACK of other rules matching, not
just to give a final nudge when a bunch of others already hit.

I filed a bugzilla some while ago suggesting that the bayes percentage
ought to be used to select a rule set, not to adjust the score as a
component of a rule set.


Re: Those "Re: good obfupills" spams (bayes scores)

2006-04-29 Thread Matt Kettler
Bart Schaefer wrote:
> On 4/29/06, Matt Kettler <[EMAIL PROTECTED]> wrote:
>> Besides.. If you want to make a mathematics based argument against me,
>> start by explaining how the perceptron mathematically is flawed. It
>> assigned the original score based on real-world data.
>
> Did it?  I thought the BAYES_* scores have been fixed values for a
> while now, to force the perceptron to adapt the other scores to fit.
>
Actually, you're right... I'm shocked and floored, but you're right.

In SA 3.1.0 they did force-fix the scores of the bayes rules,
particularly the high-end. The perceptron assigned BAYES_99 a score of
1.89 in the 3.1.0 mass-check run. The devs jacked it up to 3.50.

That does make me wonder if:
1) When BAYES_9x FPs, it FPs in conjunction with lots of other rules
due to the ham corpus being polluted with spam. This forces the
perceptron to attempt to compensate.  (Pollution always is a problem
since nobody is perfect, but it occurs to differing degrees).
   -or-
2) The perceptron is out of whack. (I highly doubt this, because the
perceptron generated the scores for 3.0.x and they were fine)
  -or-
3) The Real-world FPs of BAYES_99 really do tend to also be cascades
with other rules in the 3.1.x ruleset, and the perceptron is correctly
capping the score. This could differ from 3.0.x due to change in rules,
or change in ham patterns over time.
  -or-
4) one of the corpus submitters has a poorly trained bayes db.
(possible, but I doubt it)

Looking at statistics-set3 for 3.0.x and 3.1.x there was a slight
increase in ham hits for BAYES_99 and a slight decrease in spam hits.
3.0.x:
OVERALL%   SPAM%     HAM%     S/O     RANK   SCORE   NAME
43.515     89.3888   0.0335   1.000   0.83   1.89    BAYES_99
3.1.x:
OVERALL%   SPAM%     HAM%     S/O     RANK   SCORE   NAME
60.712     86.7351   0.0396   1.000   0.90   3.50    BAYES_99

Also worth considering: set3 of 3.0.x was much closer to a 50/50 mix of
spam/nonspam (48.7/51.3) than 3.1.0's (nearly 70/30).