> I am looking for opinions.
> Problems I have with both:
> 1.  What is the best method of obtaining the spam / ham.  I 
> have the server create a spam folder for each user when the 
> user is created. 
> spamassassin will automatically put all mail marked as spam 
> in this folder.  Obviously I will use this folder to run 
> salearn on for spam.  I will also instruct users to move mail 
> that is spam that was not marked as spam to this folder.  My 
> problem is, where do I run salearn for ham. 

Not to mention that you would potentially be learning from all the
false positives that SA may produce, or the fact that, with Bayes
turned on, most of the messages in that folder will already have been
seen by the system.

I guess the ideal solution would be to have a False Positive folder
that people can drag messages into when the filter has got them wrong.
Tell them that this is where you look, and that if a message is not
there, you can't deal with it. Most users will then realise that the
fastest way to stop receiving the spam is to put it in there.

You probably also want to tag the messages that are put into the
spam folder, then maybe once a day run through each user's mailbox
and find the messages in their other folders that have a spam tag, as
it's these messages that SA incorrectly tagged (not the ones in the
spam folder).
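
As a rough sketch of that daily sweep, something like the following
could work. It assumes Maildir-format mailboxes, a spam folder named
"Spam", and that SA tags messages with its usual "X-Spam-Flag: YES"
header; the folder name and the function names are made up for
illustration, so adjust them to your setup.

```python
import mailbox

SPAM_FOLDER = "Spam"  # assumed name of the per-user spam folder


def is_tagged_spam(msg):
    """True if the message carries an X-Spam-Flag: YES header."""
    return str(msg.get("X-Spam-Flag", "")).strip().upper() == "YES"


def find_false_positives(maildir_path, spam_folder=SPAM_FOLDER):
    """Return (folder, key) pairs for tagged messages outside the spam folder.

    A tagged message that the user has moved out of the spam folder is a
    candidate false positive and should be reviewed before learning as ham.
    """
    root = mailbox.Maildir(maildir_path, create=False)
    # The inbox is the top-level Maildir; subfolders are listed separately.
    folders = [("INBOX", root)]
    for name in root.list_folders():
        if name != spam_folder:
            folders.append((name, root.get_folder(name)))

    hits = []
    for name, box in folders:
        for key, msg in box.items():
            if is_tagged_spam(msg):
                hits.append((name, key))
    return hits
```

The result is only a list of candidates; it does not learn anything by
itself, which leaves room for someone to look at the messages first.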

With auto-learning enabled, SA automatically learns messages that
score above or below its defined thresholds, so you don't need to run
these through again. What you actually want to force through Bayes
are the FPs and FNs that occur, and these are best identified by
eyeball (you have no idea how many users will put legitimate things
in as spam!).
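
Once the FPs and FNs have been checked by eye, feeding them to Bayes
could look roughly like this. It is only a sketch: the two directory
arguments are hypothetical places where you have collected the
reviewed messages, and it defaults to a dry run that just prints the
sa-learn commands it would execute.

```python
import subprocess


def build_sa_learn_commands(false_positive_dir, false_negative_dir):
    """Build sa-learn invocations for messages that were reviewed by hand.

    False positives (ham tagged as spam) are learned as ham;
    false negatives (spam that got through) are learned as spam.
    """
    return [
        ["sa-learn", "--ham", false_positive_dir],
        ["sa-learn", "--spam", false_negative_dir],
    ]


def run_commands(commands, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Run it as the user who owns the Bayes database, otherwise the learning
ends up in the wrong place.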

In short, automation is nice, but learn the right things, and
ideally look at them first to be sure.



