Re: [AMaViS-user] Suggestion: Modify Amavis to optionally retain a virgin copy of each message processed...

2007-12-18 Thread mouss
Ken Morley wrote:
> I'm using Postfix 2.4.6, Amavisd-new 2.5.2, ClamAV 0.91.2 and
> Mail-SpamAssassin 3.2.3 in a Linux mail filter.  I'm having problems
> conveniently getting enough ham and spam for Bayes training.  I'm aware
> that Bayes is more closely related to SA than Amavisd, but please humor
> me before sending me off to the SA forums :)
> 
> I am currently using the Postfix always_bcc function to copy each email
> coming through the system to postmaster.  From postmaster's mailbox, I
> manually classify and copy each email into separate spam- or
> ham- files.  The problem is that this alters the recipient and adds
> a number of X-Amavis headers that could affect Bayes accuracy.
>  
> It seems to me that it would be better if Amavisd could just make an
> un-altered copy of every e-mail it processes and place them in separate
> disk files.  From that point, it should be fairly easy to write a script
> that would allow postmaster to quickly review and classify the files.
> Then, the script would assign the files an appropriate spam or ham
> filename.  That would take a lot of effort out of building a corpus.
>  
> Any thoughts on that suggestion?
>  

Besides Mark's suggestion, you can use recipient_bcc_maps instead of
always_bcc. The idea is to use a regular expression to "keep" the
original recipient. This looks like:

/^(.*)@(example\.com)$/ [EMAIL PROTECTED]

('+' being configured as the extension delimiter).

This way, you can easily retrieve the "original" recipient.
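
To illustrate the retrieval side, here is a minimal Python sketch of splitting
an extension address at the delimiter. The function name and the sample
address are hypothetical; how the original recipient is actually encoded in
the extension depends on the replacement pattern in the map, which is not
reproduced here.

```python
def split_extension(address, delimiter="+"):
    """Split 'user+ext@domain' into (user, ext, domain).

    Assumes a full address containing '@'; ext is '' when no delimiter
    is present in the local part.
    """
    localpart, _, domain = address.rpartition("@")
    user, _, extension = localpart.partition(delimiter)
    return user, extension, domain


# Hypothetical sample address, only for illustration.
print(split_extension("postmaster+ken@example.com"))
```

The local part is split at the first delimiter only, so an extension that
itself contains '+' is kept whole.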




-
SF.Net email is sponsored by:
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services
for just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
___
AMaViS-user mailing list
AMaViS-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user
AMaViS-FAQ:http://www.amavis.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/howto/


Re: [AMaViS-user] Suggestion: Modify Amavis to optionally retain a virgin copy of each message processed...

2007-12-18 Thread Mark Martinec
Ken,

> I'm using Postfix 2.4.6, Amavisd-new 2.5.2, ClamAV 0.91.2 and
> Mail-SpamAssassin 3.2.3 in a Linux mail filter.  I'm having problems
> conveniently getting enough ham and spam for Bayes training.  I'm aware
> that Bayes is more closely related to SA than Amavisd, but please humor
> me before sending me off to the SA forums :)
>
> I am currently using the Postfix always_bcc function to copy each email
> coming through the system to postmaster.  From postmaster's mailbox, I
> manually classify and copy each email into separate spam- or
> ham- files.  The problem is that this alters the recipient and adds
> a number of X-Amavis headers that could affect Bayes accuracy.
>
> It seems to me that it would be better if Amavisd could just make an
> un-altered copy of every e-mail it processes and place them in separate
> disk files.  From that point, it should be fairly easy to write a script
> that would allow postmaster to quickly review and classify the files.
> Then, the script would assign the files an appropriate spam or ham
> filename.  That would take a lot of effort out of building a corpus.

Use archival quarantining:
  $archive_quarantine_method = 'local:archive/%m';

or a separate archive for clean and spam:
  $clean_quarantine_method = 'local:clean/%m.gz';
  $spam_quarantine_method  = 'local:spam/%m.gz';

and place the following in a SA config file (local.cf):

bayes_ignore_header X-Envelope-To-Blocked
bayes_ignore_header X-Quarantine-ID
bayes_ignore_header X-Amavis-Alert
bayes_ignore_header X-Amavis-OS-Fingerprint
bayes_ignore_header X-Amavis-PolicyBank
bayes_ignore_header X-Virus-Scanned

(Other header fields, such as X-Spam-* and Delivered-To, are ignored
by the SpamAssassin learner by default.)

Apart from the few prepended header fields, the archived message
is in its pristine form, as received by amavisd.
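
As a rough sketch of how the archived files could then be fed to the Bayes
learner, the Python below reads a quarantined message (decompressing it if
the name ends in .gz) and pipes it to sa-learn on standard input. The helper
names are hypothetical, sa-learn is assumed to be on the PATH, and the
directory in the usage comment is an assumption that depends on the local
$QUARANTINEDIR setting.

```python
import gzip
import subprocess
from pathlib import Path


def read_message(path):
    """Return the raw bytes of an archived message, decompressing *.gz files."""
    path = Path(path)
    if path.suffix == ".gz":
        with gzip.open(path, "rb") as fh:
            return fh.read()
    return path.read_bytes()


def sa_learn_command(kind):
    """Build the sa-learn command line for 'spam' or 'ham'."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind]


def learn(path, kind):
    """Feed one archived message to sa-learn on standard input."""
    subprocess.run(sa_learn_command(kind), input=read_message(path), check=True)


# Usage (the directory is an assumption, adjust to your quarantine location):
#   for msg in sorted(Path("/var/virusmails/spam").iterdir()):
#       learn(msg, "spam")
```

Note that the clean/spam split reflects amavisd's own verdict, so the files
are best reviewed by hand before training on them.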

  Mark



[AMaViS-user] Suggestion: Modify Amavis to optionally retain a virgin copy of each message processed...

2007-12-18 Thread Ken Morley
I'm using Postfix 2.4.6, Amavisd-new 2.5.2, ClamAV 0.91.2 and
Mail-SpamAssassin 3.2.3 in a Linux mail filter.  I'm having problems
conveniently getting enough ham and spam for Bayes training.  I'm aware
that Bayes is more closely related to SA than Amavisd, but please humor
me before sending me off to the SA forums :)

I am currently using the Postfix always_bcc function to copy each email
coming through the system to postmaster.  From postmaster's mailbox, I
manually classify and copy each email into separate spam- or
ham- files.  The problem is that this alters the recipient and adds
a number of X-Amavis headers that could affect Bayes accuracy.
 
It seems to me that it would be better if Amavisd could just make an
un-altered copy of every e-mail it processes and place them in separate
disk files.  From that point, it should be fairly easy to write a script
that would allow postmaster to quickly review and classify the files.
Then, the script would assign the files an appropriate spam or ham
filename.  That would take a lot of effort out of building a corpus.
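
The renaming step described above might look roughly like the following
Python sketch. The function name and the "spam-"/"ham-" prefix scheme are
assumptions made for illustration; the review step itself (showing the
message to postmaster and asking for a verdict) is left out.

```python
from pathlib import Path


def classify(path, label):
    """Rename a message file to '<label>-<original name>' in the same directory."""
    if label not in ("spam", "ham"):
        raise ValueError("label must be 'spam' or 'ham'")
    path = Path(path)
    target = path.with_name(f"{label}-{path.name}")
    path.rename(target)
    return target
```

Keeping the original name after the prefix means each file can still be
matched against the log entry for the message it came from.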
 
Any thoughts on that suggestion?
 
Thanks!
 
Ken Morley
 
