Diego Pomatta wrote:
Anthony Peacock wrote:
Well the short answer is, yes you can.
The slightly longer answer is that you won't get as good results
doing this, as the Bayes system uses tokens found in the complete
message. By only learning on the body you will not gain any
advantage for tokens found in headers.
Yep, I know. The problem is precisely that I don't have the original
headers after the mail has been delivered.
My intention was to manually feed the few spam messages that slip
through undetected. By the time I get hold of those, they are in the
recipient's mail client inbox, not on the server.
I was thinking, if I save the mail as EML files, would that preserve
the headers in a way that sa-learn can parse correctly?
Depends on the client.
For instance, Thunderbird stores its folders in mbox format, so
sa-learn can work against those files as-is. Other email clients can
save emails in text format complete with headers.
I use Thunderbird. There are two files for that folder: Junk.msf (7k)
and Junk (53.172k). The msf file must be some kind of index. Do I just
feed the bigger one to sa-learn?
Yes, the .msf file is an index file. I just copy the mbox file (Junk in
your case) to the server and run the following command specifying the
filename (as shown):
/usr/local/bin/spamassassin --report --mbox Junk
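
Since the question was about sa-learn specifically, here is a rough
sketch of the same step done with sa-learn, plus a quick sanity check
that the copied file really looks like an mbox. The two function names
are made up for illustration and are not part of SpamAssassin; the
sa-learn options used (--spam, --mbox) are the standard ones. It assumes
sa-learn is on your PATH and that you run it as the user whose Bayes
database SpamAssassin actually consults.

```shell
#!/bin/sh
# Sketch only: the function names are hypothetical, not part of SpamAssassin.

# Count mbox separator lines ("From " at the start of a line) as a rough
# estimate of how many messages the file holds.
count_mbox_messages() {
    grep -c '^From ' "$1"
}

# Train the Bayes database on every message in an mbox file as spam.
# Refuses to run if the file cannot be read.
learn_spam_mbox() {
    if [ ! -r "$1" ]; then
        echo "cannot read mbox file: $1" >&2
        return 1
    fi
    sa-learn --spam --mbox "$1"
}

# Example (run as the user whose Bayes database should be trained):
#   count_mbox_messages Junk
#   learn_spam_mbox Junk
```

If the count comes back as 0, the file is probably not in mbox format
and is not worth feeding to sa-learn as-is. The count is only an
estimate, since an mbox can still contain messages that were deleted in
the client but not yet compacted out of the file.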
--
Anthony Peacock
CHIME, Royal Free & University College Medical School
WWW: http://www.chime.ucl.ac.uk/~rmhiajp/
"A CAT scan should take less time than a PET scan. For a CAT scan,
they're only looking for one thing, whereas a PET scan could result in
a lot of things." - Carl Princi, 2002/07/19