Re: Testing Bayes (auto)-learning

2005-03-19 Thread Matt Kettler
Greg Abbas wrote:

> Paul Boven writes:
>> Yes, they're forwarding the messages as attachments, and yes, I'm
>> stripping them out of the message/rfc822 attachments before feeding
>> them to Bayes. And in all the tests I've done so far this seems to work,
>> but now that we've upgraded to SA3.0.2 I can't peek 'under the hood'
>> anymore to see if things are still being learned as they should.
>
> On a related note, if I grab messages from a maildir after
> spamassassin has "quarantined" them ("The original message has
> been attached to this so you can view it... yadda yadda") is
> sa-learn smart enough to realize that the spam is contained in
> the attachment?

sa-learn is smart enough to undo any changes made by spamassassin
itself, so if you use SA to do your tagging, sa-learn will undo it prior
to learning.

However, if you use a tool like amavis, mimedefang, or mailscanner and
use that tool's own encapsulation methods instead of SA's, then sa-learn
won't undo it.
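
For that second case, here is a minimal Python sketch of pulling the original message back out of a wrapper before learning. It assumes the wrapper carries the original as a message/rfc822 part (other tools may encapsulate differently), and the function name is made up for illustration.

```python
import email
from email import policy


def extract_attached_messages(raw_bytes):
    """Return the raw bytes of each message/rfc822 part in a wrapper message.

    Hypothetical helper: it assumes the encapsulating tool attached the
    original mail as a message/rfc822 MIME part.
    """
    wrapper = email.message_from_bytes(raw_bytes, policy=policy.default)
    found = []

    def visit(part):
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a list holding
            # exactly one message object: the original mail.
            found.append(part.get_payload(0).as_bytes())
            return  # do not descend into the original's own attachments
        if part.is_multipart():
            for child in part.get_payload():
                visit(child)

    visit(wrapper)
    return found
```

Each returned item can be written to a file and handed to `sa-learn --spam` or `sa-learn --ham` as usual.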



Re: Testing Bayes (auto)-learning

2005-03-19 Thread Greg Abbas
Paul Boven writes:
> Yes, they're forwarding the messages as attachments, and yes, I'm
> stripping them out of the message/rfc822 attachments before feeding
> them to Bayes. And in all the tests I've done so far this seems to work,
> but now that we've upgraded to SA3.0.2 I can't peek 'under the hood'
> anymore to see if things are still being learned as they should.

On a related note, if I grab messages from a maildir after
spamassassin has "quarantined" them ("The original message has
been attached to this so you can view it... yadda yadda") is
sa-learn smart enough to realize that the spam is contained in
the attachment? Or is this the same situation as a user-forward,
where I would need to write something to strip it out?

And as an aside, I'm curious about "peeking under the hood" too,
but in my case it's because I'm curious how many messages have
been trained. (In order to find out how soon the filter is going
to think the corpus is large enough to start using its bayes
rules.)

TIA. -g.
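
On counting trained messages: `sa-learn --dump magic` prints the Bayes database counters, and with default settings the Bayes rules are not used until at least 200 spam and 200 ham have been learned (`bayes_min_spam_num`, `bayes_min_ham_num`). Below is a rough Python sketch that reads those counters. The column layout it assumes (count in the third field, label after "non-token data:") should be checked against your own output, and the function names are illustrative only.

```python
import subprocess


def parse_dump_magic(text):
    """Pull nspam/nham/ntokens out of `sa-learn --dump magic` output.

    Assumed line layout (verify against your install):
    0.000          0        123          0  non-token data: nspam
    """
    counts = {}
    for line in text.splitlines():
        if "non-token data:" not in line:
            continue
        label = line.split("non-token data:", 1)[1].strip()
        if label in ("nspam", "nham", "ntokens"):
            # Third whitespace-separated field holds the counter value.
            counts[label] = int(line.split()[2])
    return counts


def bayes_is_active(counts, min_spam=200, min_ham=200):
    """True once both learned counts reach the (default) minimums."""
    return (counts.get("nspam", 0) >= min_spam
            and counts.get("nham", 0) >= min_ham)


def read_counts():
    """Run sa-learn as the current user and parse its counters."""
    result = subprocess.run(["sa-learn", "--dump", "magic"],
                            capture_output=True, text=True, check=True)
    return parse_dump_magic(result.stdout)
```

Run it as the same user that does the scanning, since each user can have a separate Bayes database.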




Re: Testing Bayes (auto)-learning

2005-03-17 Thread Paul Boven
Hi Daryl, everyone,

Daryl C. W. O'Shea wrote:
> Paul Boven wrote:
>> My problem is that I have end-users that are basically claiming 'the
>> more I send to the relearn-address, the lower the Bayes score seems to
>> be getting.' The included headers seem to support that claim, so I
>> really want to dig a bit deeper into the whole setup.
>
> That there sounds like your problem.  How are your users sending mail to
> the 'relearn address'?  If they're not forwarding messages as an
> attachment, and you're not stripping out these attached messages then it
> isn't going to work to your benefit, and you'll see the result you
> describe.

Yes, they're forwarding the messages as attachments, and yes, I'm
stripping them out of the message/rfc822 attachments before feeding
them to Bayes. And in all the tests I've done so far this seems to work,
but now that we've upgraded to SA3.0.2 I can't peek 'under the hood'
anymore to see if things are still being learned as they should.

Regards, Paul Boven.


Re: Testing Bayes (auto)-learning

2005-03-17 Thread Daryl C. W. O'Shea
Paul Boven wrote:
> My problem is that I have end-users that are basically claiming 'the
> more I send to the relearn-address, the lower the Bayes score seems to
> be getting.' The included headers seem to support that claim, so I
> really want to dig a bit deeper into the whole setup.

That there sounds like your problem.  How are your users sending mail to
the 'relearn address'?  If they're not forwarding messages as an
attachment, and you're not stripping out these attached messages then it
isn't going to work to your benefit, and you'll see the result you describe.

Daryl