New DNS white/blacklist + spamassassin rules Re: Please report IPs delivering ham and spam with this script
While I still plan for this to be used primarily via rsync and a SpamAssassin plugin, I've loaded the data into DNS records and created SpamAssassin rules so it can easily be tested now. It updates automatically once a day.

I'm hoping this will encourage people to contribute data, because you should now get an immediate improvement in your spam filtering, based on the data you've provided on which IPs send you ham and spam.

More info, including the script to submit data (either from spam/ham folders, or individual emails piped to standard input), is here: http://www.chaosreigns.com/iprep/

The spamassassin rules:

ifplugin Mail::SpamAssassin::Plugin::DNSEval
header __RCVD_IN_IPREP eval:check_rbl('iprep-firsttrusted', 'iprep.chaosreigns.com.')
tflags __RCVD_IN_IPREP nice net

header RCVD_IN_IPREPDNS_100 eval:check_rbl_sub('iprep-firsttrusted', '127.\d+.\d+.100')
describe RCVD_IN_IPREPDNS_100 Sender listed at http://www.chaosreigns.com/iprep/, 100% ham
tflags RCVD_IN_IPREPDNS_100 nice net

header RCVD_IN_IPREPDNS_50 eval:check_rbl_sub('iprep-firsttrusted', '127.\d+.\d+.50')
describe RCVD_IN_IPREPDNS_50 Sender listed at http://www.chaosreigns.com/iprep/, 50% ham
tflags RCVD_IN_IPREPDNS_50 nice net

header RCVD_IN_IPREPDNS_0 eval:check_rbl_sub('iprep-firsttrusted', '127.\d+.\d+.0')
describe RCVD_IN_IPREPDNS_0 Sender listed at http://www.chaosreigns.com/iprep/, 0% ham
tflags RCVD_IN_IPREPDNS_0 net

meta RCVD_NOT_IN_IPREPDNS ( ! RCVD_IN_IPREPDNS_100 && ! RCVD_IN_IPREPDNS_50 && ! RCVD_IN_IPREPDNS_0 && ! NO_RELAYS )
describe RCVD_NOT_IN_IPREPDNS Sender not listed at http://www.chaosreigns.com/iprep/
tflags RCVD_NOT_IN_IPREPDNS net

score RCVD_IN_IPREPDNS_100 -0.1
score RCVD_IN_IPREPDNS_50 -0.0001
score RCVD_IN_IPREPDNS_0 0.1
score RCVD_NOT_IN_IPREPDNS 0.0001
endif

For people not contributing data, this is not likely to be useful yet.

Out of the 86,899 IPs I have data for, all but 38 are either 100% spam or 100% ham, so past mail is a great predictor of what the next email from a known IP will be.
This is why blacklists and whitelists, including spamassassin's AWL (which is another combination of both), are nothing new. The advantages I'm providing over SA's AWL are:

1) It's based on human-verified ham and spam, not SA's previous opinions of emails.
2) Shared knowledge from other people's email.

What I hope will be an advantage over dnswl.org, which I've been involved in, is increased automation.

Here's a test I ran using only the last 500 of my own emails, all hand-categorized as spam or ham and sorted by received date. One by one it learns each IP as a ham source, spammer, or mix, and, using what it has learned so far, guesses what the next email is. Every 100 emails it reports its success rate for the last 100 emails:

$ ./progress.pl
Rank 100, hit 51.7647058823529% of ham, hit 0% of spam.
Rank 50, hit 0% of ham, hit 0% of spam.
Rank 0, hit 0% of ham, hit 0% of spam.
Rank none, hit 48.2352941176471% of ham, hit 100% of spam.

Rank 100, hit 76% of ham, hit 0% of spam.
Rank 50, hit 0% of ham, hit 0% of spam.
Rank 0, hit 0% of ham, hit 28% of spam.
Rank none, hit 24% of ham, hit 72% of spam.

Rank 100, hit 72.3684210526316% of ham, hit 0% of spam.
Rank 50, hit 0% of ham, hit 0% of spam.
Rank 0, hit 0% of ham, hit 4.17% of spam.
Rank none, hit 27.6315789473684% of ham, hit 95.8% of spam.

Rank 100, hit 79.4520547945205% of ham, hit 0% of spam.
Rank 50, hit 0% of ham, hit 0% of spam.
Rank 0, hit 0% of ham, hit 48.1481481481481% of spam.
Rank none, hit 20.5479452054795% of ham, hit 51.8518518518519% of spam.

Rank 100, hit 79.2682926829268% of ham, hit 0% of spam.
Rank 50, hit 0% of ham, hit 0% of spam.
Rank 0, hit 0% of ham, hit 27.8% of spam.
Rank none, hit 20.7317073170732% of ham, hit 72.2% of spam.

So after 400 emails, RCVD_IN_IPREPDNS_100 is hitting 79% of ham and no spam. I don't think anything else spamassassin uses can do this well.

But I have data from 184,335 emails.
Using all that data, results for the last 10,000 emails were:

Rank 100, hit 94.1176470588235% of ham, hit 0.0101553772722657% of spam.
Rank 50, hit 1.30718954248366% of ham, hit 0.0101553772722657% of spam.
Rank 0, hit 0% of ham, hit 64.2022951152635% of spam.
Rank none, hit 4.57516339869281% of ham, hit 35.7773941301919% of spam.

RCVD_IN_IPREPDNS_100 hits 94% of ham and 0.01% of spam. RCVD_IN_IPREPDNS_0 hits 64% of spam and no ham. Again, I don't think anything else spamassassin uses can do this well.

But results this good can only be expected for people contributing data, at least until we get more people contributing data.

--
The price of freedom is the willingness to do sudden battle, anywhere, at any time, and with utter recklessness. - Robert A. Heinlein
http://www.ChaosReigns.com
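The predict-then-learn test described in this message can be sketched roughly as follows. This is not progress.pl itself, only a minimal Python illustration of the idea. The function names and the (ip, is_spam) input shape are assumptions, and any IP with mixed history is simply put in the "50" bucket, whereas the real data carries an actual ham percentage.

```python
from collections import defaultdict

def rank(ham, spam):
    """Bucket an IP by its history so far: '100', '50', '0', or 'none' if unseen."""
    if ham == 0 and spam == 0:
        return "none"
    if spam == 0:
        return "100"
    if ham == 0:
        return "0"
    return "50"  # simplification: every mixed sender lands in the middle bucket

def evaluate(emails, window=100):
    """emails: iterable of (ip, is_spam) pairs in received order.

    Returns one report per full window, each mapping
    rank -> (percent of that window's ham, percent of that window's spam).
    """
    seen = defaultdict(lambda: [0, 0])  # ip -> [ham_count, spam_count]
    hits = defaultdict(lambda: [0, 0])  # rank -> [ham_hits, spam_hits] in this window
    reports = []
    for i, (ip, is_spam) in enumerate(emails, start=1):
        ham, spam = seen[ip]
        hits[rank(ham, spam)][int(is_spam)] += 1  # guess first ...
        seen[ip][int(is_spam)] += 1               # ... then learn the truth
        if i % window == 0:
            total_ham = sum(h for h, _ in hits.values())
            total_spam = sum(s for _, s in hits.values())
            reports.append({
                r: (100.0 * hits[r][0] / total_ham if total_ham else 0.0,
                    100.0 * hits[r][1] / total_spam if total_spam else 0.0)
                for r in ("100", "50", "0", "none")
            })
            hits.clear()
    return reports
```

Because each email is scored before its IP is updated, the first email from any IP always counts under "none", which is why that bucket dominates the early windows in the output above.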
Re: New DNS white/blacklist + spamassassin rules Re: Please report IPs delivering ham and spam with this script
> eval:check_rbl_sub('iprep-firsttrusted', '127.\d+.\d+.100')

Do not forget to backslash-quote dots in a regular expression if you mean a literal dot instead of 'any character'.

Mark
Re: New DNS white/blacklist + spamassassin rules Re: Please report IPs delivering ham and spam with this script
On 4/1/11 2:34 PM, dar...@chaosreigns.com wrote:
> header RCVD_IN_IPREPDNS_0 eval:check_rbl_sub('iprep-firsttrusted', '127.\d+.\d+.0')
> describe RCVD_IN_IPREPDNS_0 Sender listed at http://www.chaosreigns.com/iprep/, 0% ham
> tflags RCVD_IN_IPREPDNS_0 net

This might actually need a quantity qualifier. If this IP is 0% ham, does that actually mean it is 100% spam? Or does it mean that I have (so far) only seen one email hit it, and it was spam?

Other than this marking 'spam rates' while DCC commercial does the same thing for 'bulk' rates, what is the difference between this and DCC?

Note: DCC uses (for large installs) a local VLDB that it 'syncs' ('flood', they call it) in real time. It tells you not only the bulk rate of the sender's IP, but also the 'bulk hit rate' for the email you just got. That sounds similar, but it is bulk vs. spam (and its inverse: you collect percentages of HAM, they collect percentages of BULK).

Maybe the 2nd or 3rd octet could contain a 'confidence factor', e.g. some sliding scale of how many actual emails you have seen?

--
Michael Scheidell, CTO
o: 561-999-5000 d: 561-948-2259 ISN: 1259*1300
SECNAP Network Security Corporation
Re: New DNS white/blacklist + spamassassin rules Re: Please report IPs delivering ham and spam with this script
On 04/01, Mark Martinec wrote:
> > eval:check_rbl_sub('iprep-firsttrusted', '127.\d+.\d+.100')
>
> Do not forget to backslash-quote dots in a regular expression if you mean a literal dot instead of 'any character'.

Eep. That was copied from existing rules. I believe you're right, and there are a bunch of rules that need more escaping. Thanks.

--
Will I ever learn? I hope not, I'm having too much fun. - Brent Minime Avis, motorcycle.com
http://www.ChaosReigns.com
Re: New DNS white/blacklist + spamassassin rules Re: Please report IPs delivering ham and spam with this script
On 04/01, Michael Scheidell wrote:
> On 4/1/11 2:34 PM, dar...@chaosreigns.com wrote:
> > header RCVD_IN_IPREPDNS_0 eval:check_rbl_sub('iprep-firsttrusted', '127.\d+.\d+.0')
> > describe RCVD_IN_IPREPDNS_0 Sender listed at http://www.chaosreigns.com/iprep/, 0% ham
> > tflags RCVD_IN_IPREPDNS_0 net
>
> might actually need a quantity qualifier. (if this ip is 0 % ham... does that actually mean it is 100% spam?) or does that mean that I (so far) only saw one email hit it, and it is spam?

It means that all of the email seen from that IP so far has been spam, which may only have been one email.

> other than this is marking 'spam rates' and DCC commercial does the same thing for 'bulk' rates, what is the difference between this and DCC?

The commercial part.

> maybe 2nd or 3rd octet could contain 'confidence factor'.. eg:

It does, actually: a logarithm of the count of emails seen from that IP (newer emails weighted more than old emails, and scaled up so small old counts are greater than 0). I haven't studied the data enough to figure out what threshold is best for what, and I don't think the existing rule definition language provides a good way to specify a range. Also, ignoring it is working quite well.

--
I refuse to tip toe through life only to arrive safely at death.
http://www.ChaosReigns.com
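Outside the rule language, thresholding on the count octet is straightforward. The sketch below is only an illustration in Python: the last octet as ham percentage follows from the rules in this thread, but which octet holds the log-scaled count is NOT stated here, so the `count_octet` default, the function names, and the threshold value are all hypothetical.

```python
def decode_iprep(answer, count_octet=2):
    """Split an A-record answer such as '127.0.3.100' into (ham_percent, count_log).

    count_octet is the 0-based index of the octet ASSUMED to hold the
    log-scaled email count; the thread does not say which octet the real
    service uses, so this default is purely illustrative.
    """
    octets = [int(o) for o in answer.split(".")]
    if len(octets) != 4 or octets[0] != 127 or not all(0 <= o <= 255 for o in octets):
        raise ValueError(f"not a 127.x.y.z answer: {answer!r}")
    return octets[3], octets[count_octet]

def confident_ham(answer, min_count_log=2, count_octet=2):
    """True if the IP is listed as 100% ham AND enough mail has been seen from it."""
    ham_percent, count_log = decode_iprep(answer, count_octet)
    return ham_percent == 100 and count_log >= min_count_log
```

A plugin could apply a range check like this directly, which is the part that is awkward to express as a check_rbl_sub regular expression.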
Re: New DNS white/blacklist + spamassassin rules Re: Please report IPs delivering ham and spam with this script
> > Do not forget to backslash-quote dots in a regular expression if you mean a literal dot instead of 'any character'.
>
> Eep. That was copied from existing rules. I believe you're right, and there are a bunch of rules that need more escaping. Thanks.

True, there are a bunch of rules that need more escaping. It is noted somewhere in the bug tracker (but not as a standalone ticket), and needs a volunteer to do the cleaning :)

Mark
Re: New DNS white/blacklist + spamassassin rules Re: Please report IPs delivering ham and spam with this script
On 04/01, Mark Martinec wrote:
> > eval:check_rbl_sub('iprep-firsttrusted', '127.\d+.\d+.100')
>
> Do not forget to backslash-quote dots in a regular expression if you mean a literal dot instead of 'any character'.

Updated rules (thanks again):

ifplugin Mail::SpamAssassin::Plugin::DNSEval
header __RCVD_IN_IPREPDNS eval:check_rbl('iprep-firsttrusted', 'iprep.chaosreigns.com.')
tflags __RCVD_IN_IPREPDNS nice net

header RCVD_IN_IPREPDNS_100 eval:check_rbl_sub('iprep-firsttrusted', '^127\.\d+\.\d+\.100$')
describe RCVD_IN_IPREPDNS_100 Sender listed at http://www.chaosreigns.com/iprep/, 100% ham
tflags RCVD_IN_IPREPDNS_100 nice net

header RCVD_IN_IPREPDNS_50 eval:check_rbl_sub('iprep-firsttrusted', '^127\.\d+\.\d+\.50$')
describe RCVD_IN_IPREPDNS_50 Sender listed at http://www.chaosreigns.com/iprep/, 50% ham
tflags RCVD_IN_IPREPDNS_50 nice net

header RCVD_IN_IPREPDNS_0 eval:check_rbl_sub('iprep-firsttrusted', '^127\.\d+\.\d+\.0$')
describe RCVD_IN_IPREPDNS_0 Sender listed at http://www.chaosreigns.com/iprep/, 0% ham
tflags RCVD_IN_IPREPDNS_0 net

meta RCVD_NOT_IN_IPREPDNS ( ! RCVD_IN_IPREPDNS_100 && ! RCVD_IN_IPREPDNS_50 && ! RCVD_IN_IPREPDNS_0 && ! NO_RELAYS )
describe RCVD_NOT_IN_IPREPDNS Sender not listed at http://www.chaosreigns.com/iprep/
tflags RCVD_NOT_IN_IPREPDNS net

score RCVD_IN_IPREPDNS_100 -0.1
score RCVD_IN_IPREPDNS_50 -0.0001
score RCVD_IN_IPREPDNS_0 0.1
score RCVD_NOT_IN_IPREPDNS 0.0001
endif

--
Go forth, and be excellent to one another. - http://www.jhuger.com/fredski.php
http://www.ChaosReigns.com
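To see why the escaping and anchoring matter, here is a small Python check of the 0% ham pattern before and after the fix. It assumes the subtest is applied as an unanchored regular-expression search against the returned address (the way a Perl =~ match behaves); the helper name and the sample answers are made up for illustration.

```python
import re

LOOSE = r"127.\d+.\d+.0"         # dots unescaped, no anchors (first version)
STRICT = r"^127\.\d+\.\d+\.0$"   # literal dots, anchored (updated version)

def matches(pattern, answer):
    """True if the pattern is found anywhere in the DNS answer string."""
    return re.search(pattern, answer) is not None

for answer in ("127.0.2.0", "127.0.100.50"):
    print(answer, "loose:", matches(LOOSE, answer), "strict:", matches(STRICT, answer))
```

With the loose pattern, an answer such as 127.0.100.50 (a 50% ham listing) also satisfies the 0% ham rule, because an unescaped dot can match the "0" inside "100". The anchored, escaped pattern accepts only answers whose last octet is exactly 0.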
Re: Please report IPs delivering ham and spam with this script
On Fri, 1 Apr 2011 14:34:16 -0400 dar...@chaosreigns.com wrote:
> Out of the 86,899 IPs I have data for, all but 38 are either 100% spam or 100% ham,

That sounds a bit funny. We have data on over 17 million IP addresses (collected using http://mimedefang.org/reputation). Of those, about 9 million report at least one ham or one spam; the remainder either never made it past greylisting or only tried emailing nonexistent recipient addresses.

Of those 9,102,875 hosts:

o 536,596 (5.8%) sent _only_ ham
o 7,821,574 (86%) sent _only_ spam
o The remaining 744,705 (8.2%) sent a mixture. Most Yahoo! servers are in this category.

You saw less than 0.05% sending a mixture, which means you are probably not getting a good sample.

Regards,

David.

PS: If anyone wants to contribute to and download *our* reputation list, please see http://mimedefang.org/reputation and email me off-list. Please be aware that unlike darxus' list, ours is not freely available, though we generally give free downloads to organizations willing to feed us reputation data if they do a statistically useful amount of mail (>= 50K messages/day).
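The shares quoted in this message can be rechecked with a few lines of arithmetic. The snippet below is just that check in Python, using only the counts given in the thread; the variable names are arbitrary.

```python
total = 9_102_875
groups = {"only ham": 536_596, "only spam": 7_821_574, "mixture": 744_705}

# The three groups account for every host in the sample.
assert sum(groups.values()) == total

shares = {name: 100.0 * count / total for name, count in groups.items()}
for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")

# The smaller data set: 38 mixed senders out of 86,899 IPs.
small_mixed = 100.0 * 38 / 86_899
print(f"mixed senders in the small sample: {small_mixed:.3f}%")
```

The ham-only share works out slightly nearer 5.9% than the quoted 5.8%, which does not change the comparison: roughly 8% of hosts in the large sample sent a mixture, against under 0.05% in the small one.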
Re: Please report IPs delivering ham and spam with this script
On 04/01, David F. Skoll wrote:
> o 536,596 (5.8%) sent _only_ ham
> o 7,821,574 (86%) sent _only_ spam
> o The remaining 744,705 (8.2%) sent a mixture. Most Yahoo! servers are in this category.

Sounds reasonable. It's nice to see the numbers, thanks.

> You saw less than 0.05% sending a mixture, which means you are probably not getting a good sample.

Yup. I don't have enough data. That's why I'm asking for more.

--
Life is either a daring adventure or it is nothing at all. - Helen Keller
http://www.ChaosReigns.com
Please report IPs delivering ham and spam with this script
My plan is to create another free reputation service, like a combination of a whitelist and a blacklist, except providing the actual data instead of just yes/no/maybe. To help SpamAssassin filtering, obviously.

The data I'm planning to provide is, for every IP address, the percentage of email from it which was ham (normalized like the S/O value in SpamAssassin ruleqa), and the total count of recent emails from that IP (a logarithm of it).

Output data based on my own email: http://www.chaosreigns.com/iprep/iprep.txt

With my 2618 hams and 2956 spams, there were only *two* IP addresses that were not 100% spam or 100% ham (both belong to google). This kind of thing is why blacklists and whitelists are useful for predicting if an email is spam or ham. The highest ranked test in SpamAssassin is RCVD_IN_XBL, a spamhaus.org blacklist. #7 is RCVD_IN_PSBL, and #11 is RCVD_IN_DNSWL_HI, which is also the highest ranking nice rule.

To do this, I need data from you. Create a folder containing only email you've confirmed is ham, and another containing what you've confirmed is spam.

http://www.chaosreigns.com/iprep/dl/iprep.pl

./iprep.pl ham:dir:~/masscheckwork/ham spam:dir:~/masscheckwork/spam/

The arguments are the same as the targets used by SpamAssassin's mass-check (using its perl modules):

class:format:location

class is spam or ham
format is dir, file, mbx, mbox, or detect
location is a file or directory name. globbing of ~ and * is supported

You can specify many targets at once. Please run it as a daily cron job.
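For anyone scripting around the class:format:location targets described above, a minimal Python sketch of splitting and validating one target might look like this. It is not how iprep.pl or mass-check parse their arguments; `parse_target` is a hypothetical helper, and it only expands ~ (it does not glob *).

```python
import os

CLASSES = {"ham", "spam"}
FORMATS = {"dir", "file", "mbx", "mbox", "detect"}

def parse_target(target):
    """Split 'class:format:location' into its three fields and expand '~'."""
    try:
        # maxsplit=2 keeps any further colons inside the location intact
        klass, fmt, location = target.split(":", 2)
    except ValueError:
        raise ValueError(f"expected class:format:location, got {target!r}") from None
    if klass not in CLASSES:
        raise ValueError(f"class must be ham or spam, got {klass!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unknown format {fmt!r}")
    return klass, fmt, os.path.expanduser(location)
```

For example, parse_target("ham:dir:~/masscheckwork/ham") yields the class "ham", the format "dir", and the directory path with ~ replaced by the home directory.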
The required ~/.ipreprc config file:

$trusted_networks = 'space delimited list of trusted hosts';
$user = 'username';
$pass = 'password';

$trusted_networks is very important, and needs to contain everything from both your trusted_networks and internal_networks values from SpamAssassin, which are documented here:

http://spamassassin.apache.org/full/3.3.x/doc/Mail_SpamAssassin_Conf.html#network_test_options
http://wiki.apache.org/spamassassin/TrustPath

This is to prevent reporting the IP of your trusted relays instead of the actual IP sending the email.

Email me to get an account to upload the data. Please email me from a non-freemail account, one not listed in http://svn.apache.org/repos/asf/spamassassin/trunk/rules/20_freemail_domains.cf

Major examples of freemail accounts, which I don't want you to email me from, are gmail.com, yahoo.com, and hotmail.com. This is just to make it slightly harder for spammers to send me bad data. And if you're on this list, I know you have a non-freemail account.

I won't tell anybody your email address, and I consider the uploaded data confidential.

I'm thinking about providing the data only via rsync, instead of via DNS, because I think that should reduce network load. I'd create a plugin that would grab the data directly.

Just as a disclosure, I have been involved with dnswl.org since November 2006. I have no plan to use any of their data, other than to look for problems in my data.

--
Let's just say that if complete and utter chaos was lightning, then he'd be the sort to stand on a hilltop in a thunderstorm wearing wet copper armour and shouting 'All gods are bastards'. - The Color of Magic
http://www.ChaosReigns.com
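The reason $trusted_networks matters can be shown with a small sketch: walk the relay IPs from the most recent hop backwards and report the first one that is not in a trusted network. This Python version is only an illustration, not the script's actual logic; `first_untrusted` is a hypothetical name, and it accepts plain addresses and CIDR blocks only, not every notation SpamAssassin allows.

```python
import ipaddress

def first_untrusted(relay_ips, trusted_networks):
    """Return the first relay IP that is outside all trusted networks.

    relay_ips: IPs taken from Received headers, most recent hop first.
    trusted_networks: space-delimited string of addresses or CIDR blocks.
    """
    nets = [ipaddress.ip_network(n, strict=False) for n in trusted_networks.split()]
    for ip in relay_ips:
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in nets):
            return ip
    return None  # every hop was trusted, so there is nothing to report
```

If a trusted relay is missing from the list, the walk stops one hop too early and the relay's own IP is reported for every message it handled, mixing that mail's ham and spam under one address.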
learn spam as ham or ham as spam
so I have to make two calls to learn? ie. one to forget and another to relearn? never is learning combined with forgetting right?
Re: learn spam as ham or ham as spam
On Tue, Aug 14, 2007 at 10:32:56AM +0700, Robert Nicholson wrote:
> ie. one to forget and another to relearn? never is learning combined with forgetting right?

One call is sufficient. If the message was previously learned the wrong way, it'll be forgotten for you, then learned the way you specify.

--
Randomly Selected Tagline: Yeah, that's it! I was right! It's reality that has it wrong! - Jim Toth
Re: HAM and SPAM mailboxes
OK, Chris, I think I'll go with your suggestion. It seems simpler, and a lower load for my busted servers. However, I'm not a Perl guru myself, so would you mind clarifying what you meant by

> In that case, Perl's Mail::Box::Manager is your friend.

How do I extract the original mail from the forwarded one?

Thanks,

Luis

2007/3/2, Chris St. Pierre [EMAIL PROTECTED]:
> On Fri, 2 Mar 2007, Luis Hernán Otegui wrote:
> > Hi, people, I am currently researching, trying to implement a way for my POP3 users to train SA via message forwarding. I've read in the list that the messages should be forwarded as attachments. My question is how do you make SA process them. I was thinking of creating two accounts ( [EMAIL PROTECTED], and [EMAIL PROTECTED]), but frankly, I don't understand the way to hand the forwarded messages to SA...
>
> Instead of forwarding as an attachment, I have my users bounce/redirect/resend their mail, which maintains the message in its original state and is a lot easier to process than messages in attachments. That way, I can just have a cron job go through the [EMAIL PROTECTED] and [EMAIL PROTECTED] mailboxes and have sa-learn learn each message.
>
> Otherwise, you'll have to strip the attachments and pipe them into sa-learn, which is a lot less trivial. In that case, Perl's Mail::Box::Manager is your friend.
>
> Chris St. Pierre
> Unix Systems Administrator
> Nebraska Wesleyan University
> Never send mail to [EMAIL PROTECTED]

--
- GNU-GPL: May The Source Be With You... -
Re: HAM and SPAM mailboxes
On Mon, 5 Mar 2007, Luis Hernán Otegui wrote:
> OK, Chris, I think I'll go with your suggestion. It seems simpler, and a lower load for my busted servers. However, I'm not a Perl guru myself, so would you mind clarifying what you meant by
>
> > In that case, Perl's Mail::Box::Manager is your friend.
>
> How do I extract the original mail from the forwarded one?

No idea -- I'm not doing things that way myself. My suggestion was actually to bounce or resend the FPs and FNs, since it's a lot simpler. You can just call sa-learn on the bounced messages themselves rather than extracting the forwarded messages from the attachments.

If you decide to have your users forward as an attachment, though, Mail::Box[::Manager] is a Perl module for doing magic with mailboxes. You'll probably want to do something like this, assuming you're using Maildir:

    use Mail::Box::Manager;

    my $mgr = Mail::Box::Manager->new();
    my $folder = $mgr->open(folder => "/path/to/spam/mailbox", fix_headers => 1);
    foreach my $msg ($folder->messages()) {
        # magic with $msg->parts()
    }

If you're not using Maildir, you'll have to figure out what to do from there. I know Mail::Box supports MH, Mbox, and who knows what else, but haven't used those myself. http://search.cpan.org/~markov/Mail-Box-2.069/lib/Mail/Box-Overview.pod should get you started.

Chris St. Pierre
Unix Systems Administrator
Nebraska Wesleyan University
Never send mail to [EMAIL PROTECTED]
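For the "extract the original mail from the forwarded one" step, here is a minimal sketch using Python's standard email package instead of Mail::Box (a swapped-in technique, not what anyone in the thread runs). It assumes the mail client attaches the original as a message/rfc822 part; `extract_forwarded` is a hypothetical helper name.

```python
import email
from email import policy

def extract_forwarded(raw_bytes):
    """Return the raw bytes of every message/rfc822 part attached to a message."""
    wrapper = email.message_from_bytes(raw_bytes, policy=policy.default)
    originals = []
    for part in wrapper.walk():
        if part.get_content_type() == "message/rfc822":
            inner = part.get_payload(0)  # the attached message object itself
            originals.append(inner.as_bytes())
    return originals
```

Each returned item is a complete message that could then be piped into sa-learn, as described above. Clients that forward inline rather than as an attachment produce no message/rfc822 part, so the function returns an empty list for those.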
Re: HAM and SPAM mailboxes
On Mon, Mar 05, 2007 at 10:58:00AM -0300, Luis Hernán Otegui wrote:
> OK, Chris, I think I'll go with your suggestion. It seems simpler, and a lower load for my busted servers. However, I'm not a Perl guru myself, so would you mind clarifying what you meant by
>
> > In that case, Perl's Mail::Box::Manager is your friend.
>
> How do I extract the original mail from the forwarded one?

I have written a small program in OCaml which I use for that purpose. It extracts emails that were forwarded as attachments and puts them into a separate directory, from where they can be processed. At the moment the directories are hardcoded, but I can adapt it for more generic situations if there is a need.

If someone is interested, let me know and I will try to make it available.

Regards
Johann

--
Johann Spies Telefoon: 021-808 4036
Informasietegnologie, Universiteit van Stellenbosch

The LORD is my light and my salvation; whom shall I fear? the LORD is the strength of my life; of whom shall I be afraid? Psalms 27:1
HAM and SPAM mailboxes
Hi, people,

I am currently researching, trying to implement a way for my POP3 users to train SA via message forwarding. I've read in the list that the messages should be forwarded as attachments. My question is how do you make SA process them. I was thinking of creating two accounts ( [EMAIL PROTECTED], and [EMAIL PROTECTED]), but frankly, I don't understand the way to hand the forwarded messages to SA...

Currently, I run two production servers with virtual users, and they have separate HAM and SPAM IMAP folders for each user. Via a cron job, I teach the system the spam messages (I've instructed my users to move the spam messages there via our webmail). But now I'm looking to expand the service to my POP3 users. Any suggestions will be welcome.

BTW, I run SA (v 3.1.7) through AMaViS over Postfix, on a Debian Sarge based install.

Thanks in advance,

Luis

--
- GNU-GPL: May The Source Be With You... -
Re: HAM and SPAM mailboxes
On Fri, 2 Mar 2007, Luis Hernán Otegui wrote:
> Hi, people, I am currently researching, trying to implement a way for my POP3 users to train SA via message forwarding. I've read in the list that the messages should be forwarded as attachments. My question is how do you make SA process them. I was thinking of creating two accounts ( [EMAIL PROTECTED], and [EMAIL PROTECTED]), but frankly, I don't understand the way to hand the forwarded messages to SA...

Instead of forwarding as an attachment, I have my users bounce/redirect/resend their mail, which maintains the message in its original state and is a lot easier to process than messages in attachments. That way, I can just have a cron job go through the [EMAIL PROTECTED] and [EMAIL PROTECTED] mailboxes and have sa-learn learn each message.

Otherwise, you'll have to strip the attachments and pipe them into sa-learn, which is a lot less trivial. In that case, Perl's Mail::Box::Manager is your friend.

Chris St. Pierre
Unix Systems Administrator
Nebraska Wesleyan University
Never send mail to [EMAIL PROTECTED]
ham and spam
I am sure this has been posted before; however, I am having a hard time finding it in the archives.

How does one feed bayes ham and spam on an SMTP gateway (no local delivery)? All the server does is accept mail for one or two domains, scrub it for viruses and spam, and then forward it to its nasty little Exchange server.

Regards,

Michael Di Martino
Director of MIS
The telx Group, Inc
17 State St 33rd Floor
New York, NY 10004
p: 212.480.3300 m: 646.207.6603
www.telx.com

-----Original Message-----
From: Dirk Bonengel [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 20, 2006 11:57 AM
To: users@spamassassin.apache.org
Subject: Re: How to install iXhash

Marc,

just drop both files (.cf and .pm) into the directory where your local.cf is.

One important piece of (missing) info: You must be running SA v 3.1.0 or higher (not 3.0 as stated). If this is a problem I can easily post a version working with 3.0.x

Dirk

Marc Perkel schrieb:
> Matt Kettler wrote:
> > Marc Perkel wrote:
> > > Here's the link to the wiki, but I don't know what to do with it.
> > > http://wiki.apache.org/spamassassin/iXhash
> >
> > Disclaimer: I've never tried this. However, the following is a fairly well educated guess at how to install it.
> >
> > 1) copy-paste the bottom half into a file called iXhash.pm
> > 2) copy-paste the top half into a file called ixhash.cf
> > 3) place iXhash.pm somewhere that is global r/x
> > 4) edit the ixhash.cf to reflect where iXhash.pm is.
> > 5) copy ixhash.cf into /etc/mail/spamassassin
> > 6) run spamassassin --lint
> > 7) if it passes, restart spamd or any other persistent daemons that use the spamassassin perl API.
>
> Thanks Matt - what directory would you put iXhash.pm in? If I get this to work I'll update the wiki.
RE: ham and spam
Michael,

There are a couple of ways of doing this. It really depends on how easy you want to make it for your users/admins. It also depends on your configuration.

We use MySQL for bayes and AWL. This makes it easy for us, as we have an internal machine running Cyrus and SA. We have a local account with an IMAP account on it that we copy the email to. From there we have a script that runs it against the ham/spam folders (including unlearn for the same).

If you are running bayes on a local DB, your options are a little different.

* You can create an IMAP account on the gateway, move mail to it, and learn through it.
* You can create another non-gateway machine, install SA on it, and set up the spamd servers to listen to additional subnets beyond localhost.
* You can take the messages and SFTP them to the gateway and run some type of automated job there (this is what we did in the very beginning, some years ago).

There are other ways; these are just the ones that come off the top of my head.

Gary

-----Original Message-----
From: Michael Di Martino [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 20, 2006 12:36 PM
To: users@spamassassin.apache.org
Subject: ham and spam

I am sure this has been posted before; however, I am having a hard time finding it in the archives.

How does one feed bayes ham and spam on an SMTP gateway (no local delivery)? All the server does is accept mail for one or two domains, scrub it for viruses and spam, and then forward it to its nasty little Exchange server.
RE: ham and spam
> > How does one feed bayes ham and spam on an SMTP gateway (no local delivery)? All the server does is accept mail for one or two domains, scrub it for viruses and spam, and then forward it to its nasty little Exchange server.
>
> Can you set up shared Exchange folders that can be exported to mbox format? If so, set up learn-ham and learn-spam folders, tell people to train to them, then periodically export them, transfer them to the SA host, and run sa-learn on them.
>
> Perhaps someone sufficiently motivated could write an sa-learn - IMAP client utility to train from arbitrary IMAP folders hosted remotely...

Actually, a few people have created various versions of that already. I modified one based on the IMAP interface found here http://gagravarr.org/code/ and use that. I found another here http://www.dmzs.com/tools/files/spam.phtml.

Essentially, they just use IMAP to retrieve messages from the learn-spam and learn-ham folders (or whatever you want to call them), and pass them to SA to learn.

I don't use Exchange, but it is my understanding that it supports IMAP access to its folders...

Bret
Re: ham and spam
John D. Hardin wrote:
> On Tue, 20 Jun 2006, Michael Di Martino wrote:
> > How does one feed bayes ham and spam on an SMTP gateway (no local delivery)? All the server does is accept mail for one or two domains, scrub it for viruses and spam, and then forward it to its nasty little Exchange server.
>
> Can you set up shared Exchange folders that can be exported to mbox format? If so, set up learn-ham and learn-spam folders, tell people to train to them, then periodically export them, transfer them to the SA host, and run sa-learn on them.
>
> Perhaps someone sufficiently motivated could write an sa-learn - IMAP client utility to train from arbitrary IMAP folders hosted remotely...

We have trained users to put misclassified ham and spam into two public folders, should-be-spam and should-be-ham. We created an Exchange user, spamiam, that has full rights to these folders. At the top of every hour, this script is run on the one MX server:

# more get_ham_spam
#! /bin/sh
rm -f /var/spool/mail/spamiam
touch /var/spool/mail/spamiam
chown spamiam:mail /var/spool/mail/spamiam
su spamiam -c 'fetchmail -a -K -f /usr/local/scripts/spamiam.fetchmailrc -r "Public Folders/should-be-spam"'
cat /var/spool/mail/spamiam >> /var/www/html/spamstuff/should-be-spam
sa-learn --spam --mbox /var/www/html/spamstuff/should-be-spam

rm -f /var/spool/mail/spamiam
touch /var/spool/mail/spamiam
chown spamiam:mail /var/spool/mail/spamiam
su spamiam -c 'fetchmail -a -K -f /usr/local/scripts/spamiam.fetchmailrc -r "Public Folders/should-be-ham"'
cat /var/spool/mail/spamiam >> /var/www/html/spamstuff/should-be-ham
sa-learn --ham --mbox /var/www/html/spamstuff/should-be-ham

# more spamiam.fetchmailrc
poll exchange..com proto imap user spamiam password x is spamiam here

At 15 past each hour, the two other mail servers use wget to grab the should-be files to their local /tmp and run sa-learn. The files are included in logrotate, so they get zeroed every Sunday morning.

--
Steve
Re: enabable x-spam-report in all emails ham or spam
Keith Amling wrote:
> > Fascinating the man page seems to indicate this is not one of the options for add_header. They mention other headers but not Report. I guess you found a cheat.
>
> What is not one of the options? 'add_header', 'all', and '_REPORT_' are all mentioned directly in the perldoc for Conf. How is my suggestion a 'cheat'?

All the options are documented, and using one configuration option to override another is also documented.

However, using a configuration option to override a hard-coded setting from Conf.pm is definitely NOT documented. The fact that it depends on the specific, and undocumented, way the developers chose to implement adding the header for report_safe 0 makes it a cheat.

I wouldn't say it's a particularly egregious cheat, but it's certainly not documented. It's just taking advantage of the fact that the developers (sensibly) made use of existing functionality to do this.
Re: enabable x-spam-report in all emails ham or spam
> Fascinating the man page seems to indicate this is not one of the options for add_header. They mention other headers but not Report. I guess you found a cheat.
> {^_^}

What is not one of the options? 'add_header', 'all', and '_REPORT_' are all mentioned directly in the perldoc for Conf. How is my suggestion a 'cheat'?

Keith
Re: enabable x-spam-report in all emails ham or spam
From: Keith Amling [EMAIL PROTECTED]
> > Fascinating the man page seems to indicate this is not one of the options for add_header. They mention other headers but not Report. I guess you found a cheat.
> > {^_^}
>
> What is not one of the options? 'add_header', 'all', and '_REPORT_' are all mentioned directly in the perldoc for Conf. How is my suggestion a 'cheat'?

I was looking for Report in and around add_header on the 3.04 docs I have here.

{^_^}
Re: enabable x-spam-report in all emails ham or spam
> is it possible to enable the addition of the x-spam-report in all emails?
> -matt

I note

    $self->{headers_spam}->{Report} = "_REPORT_";

in Conf.pm, which amounts to the configuration

    add_header spam Report _REPORT_

I wanted the exact same thing you want, and

    add_header all Report _REPORT_

has worked perfectly for me. YMMV, esp. wrt. report_safe.

Keith
Re: enabable x-spam-report in all emails ham or spam
works perfectly. thanks dude! -Matt - Original Message - From: Keith Amling [EMAIL PROTECTED] To: users@spamassassin.apache.org Sent: Sunday, September 25, 2005 1:33 AM Subject: Re: enabable x-spam-report in all emails ham or spam is it possible to enable the addition of the x-spam-report in all emails? I note $self->{headers_spam}->{Report} = _REPORT_; in Conf.pm which amounts to the configuration add_header spam Report _REPORT_ I wanted the exact same thing you want and add_header all Report _REPORT_ has worked perfectly for me. YMMV, esp. wrt. report_safe. -matt Keith
Re: enabable x-spam-report in all emails ham or spam
From: Keith Amling [EMAIL PROTECTED] is it possible to enable the addition of the x-spam-report in all emails? I note $self->{headers_spam}->{Report} = _REPORT_; in Conf.pm which amounts to the configuration add_header spam Report _REPORT_ I wanted the exact same thing you want and add_header all Report _REPORT_ has worked perfectly for me. YMMV, esp. wrt. report_safe. Fascinating the man page seems to indicate this is not one of the options for add_header. They mention other headers but not Report. I guess you found a cheat. {^_^}
Re: enabable x-spam-report in all emails ham or spam
is it possible to enable the addition of the x-spam-report in all emails? Depends on what you are using to integrate SA. If you are using spamd, yes. Some of the other tools that make their own headers, no. Loren
enabable x-spam-report in all emails ham or spam
is it possible to enable the addition of the x-spam-report in all emails? -matt
Re: enabable x-spam-report in all emails ham or spam
Matthew Lenz wrote: is it possible to enable the addition of the x-spam-report in all emails? -matt Well, there is no X-Spam-Report header made by SA's default configuration. By default SA does add X-Spam-Status to all messages, which would include the score and list of rules that hit. However, in general all you'd need to do is modify the add_header spam Report ... command to use all instead of spam. Of course, that's assuming your X-Spam-Report header is being made by SA. If you're using an integration tool like MailScanner, qmail or Mimedefang you may have to change their configuration, not SA. Post some more details about your configuration if you're still having problems.
Re: enabable x-spam-report in all emails ham or spam
- Original Message -
From: Matt Kettler [EMAIL PROTECTED]
To: Matthew Lenz [EMAIL PROTECTED]
Cc: users@spamassassin.apache.org
Sent: Saturday, September 24, 2005 6:59 PM
Subject: Re: enabable x-spam-report in all emails ham or spam

> Matthew Lenz wrote:
>> is it possible to enable the addition of the x-spam-report in all emails?
>> -matt
> Well, there is no X-Spam-Report header made by SA's default configuration.

huh? it adds it to spam by default

X-Spam-Status: Yes, score=6.4 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DSBL,
    RCVD_IN_XBL,URIBL_SBL,URIBL_WS_SURBL autolearn=no version=3.0.3
X-Spam-Report:
    * 0.0 HTML_MESSAGE BODY: HTML included in message
    * 2.8 RCVD_IN_DSBL RBL: Received via a relay in list.dsbl.org
    *     [http://dsbl.org/listing?218.234.40.38]
    * 2.5 RCVD_IN_XBL RBL: Received via a relay in Spamhaus XBL
    *     [218.234.40.38 listed in sbl-xbl.spamhaus.org]
    * 0.6 URIBL_SBL Contains an URL listed in the SBL blocklist
    *     [URIs: grounansho.com]
    * 0.5 URIBL_WS_SURBL Contains an URL listed in the WS SURBL blocklist
    *     [URIs: grounansho.com]

unless that's my imagination

> By default SA does add X-Spam-Status to all messages, which would include the score and list of rules that hit.

yep sure does.

> However, in general all you'd need to do is modify the add_header spam Report ... command to use all instead of spam.

what is ... ? add_header all Report doesn't do squat.

> Of course, that's assuming your X-Spam-Report header is being made by SA. If you're using an integration tool like MailScanner, qmail or Mimedefang you may have to change their configuration, not SA.

i gave you everything.. yes i'm only using SA and sending mail through spamc using procmail. i don't have it encapsulate spams by default. by putting report_safe 0 in my local.cf

> Post some more details about your configuration if you're still having problems.
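For anyone who wants the score and rule list from the X-Spam-Status header that SA adds to every message, here is a minimal Python sketch. It is not part of SpamAssassin. The function name `parse_spam_status` is made up for illustration, and the sample value is the header quoted in the message above. It assumes the 3.0-style `score=` / `required=` / `tests=` fields are present and will raise a KeyError otherwise.

```python
import re


def parse_spam_status(value: str) -> dict:
    """Split an X-Spam-Status header value into verdict, scores, and rule names."""
    # Undo header folding: collapse newlines and tabs into single spaces.
    flat = " ".join(value.split())
    # The verdict ("Yes" or "No") comes before the first comma.
    verdict, _, rest = flat.partition(",")
    # Folding can leave a space inside the comma-separated tests list; remove it.
    rest = re.sub(r",\s+", ",", rest)
    # Everything else is a run of key=value tokens separated by spaces.
    fields = dict(tok.split("=", 1) for tok in rest.split() if "=" in tok)
    tests = fields.get("tests", "")
    return {
        "is_spam": verdict.strip().lower() == "yes",
        "score": float(fields["score"]),        # assumes a score= field exists
        "required": float(fields["required"]),  # assumes a required= field exists
        "tests": [] if tests in ("", "none") else tests.split(","),
    }


# Header value copied from the message above, including the folded line.
SAMPLE = (
    "Yes, score=6.4 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DSBL,\n"
    "\tRCVD_IN_XBL,URIBL_SBL,URIBL_WS_SURBL autolearn=no version=3.0.3"
)

if __name__ == "__main__":
    print(parse_spam_status(SAMPLE))
```

This only reads X-Spam-Status; the per-rule descriptions live in X-Spam-Report, which is the header this thread is trying to get onto ham as well.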
Re: enabable x-spam-report in all emails ham or spam
From: Matthew Lenz [EMAIL PROTECTED] From: Matt Kettler [EMAIL PROTECTED] Matthew Lenz wrote: is it possible to enable the addition of the x-spam-report in all emails? -matt Well, there is no X-Spam-Report header made by SA's default configuration. huh? it adds it to spam by default Can't be done according to the docs. The X-Spam-Report header is only added to spam. How it is added is handled by the report_safe option. If you want this for testing then use spamassassin -t. If you want it for everybody then you cannot use spamc/spamd. You'd need to use spamassassin itself and add the -t option explicitly. Please RTFM, man Mail::SpamAssassin::Conf, for more details. {^_^}
Re: Spamassassin only autolearning ham, not spam after upgrade to 3.0.2
Kelly Corbin wrote: I have 4 machines configured identically (with the exception of the -m option due to differences in resources on each machine) with SpamAssassin and spamass-milter. I recently upgraded to 3.0.2 from 2.64 and everything seems to be working pretty good with the exception of one machine. After watching the mail log, I noticed that it is not autolearning any spam, no matter how high it scores. It does autolearn ham however, and the other 3 machines autolearn spam fine. I've looked at everything I can think of (configuration files, file permissions, checked FAQ's, searched list archives, etc.) and can't figure out why it won't autolearn any spam. Any ideas? Take an email with lots of hits and save it as 'spam-email', then run 'spamassassin -t -D spam-email' and see what the debug has to say about it. Feel free to post the Bayes-specific parts of the debug here if you aren't sure of how to read it. Thanks! Kelly
Re: Spamassassin only autolearning ham, not spam after upgrade to 3.0.2
Here's my auto-learn lines from the machine that doesn't work:

debug: auto-learn: currently using scoreset 3, recomputing score based on scoreset 1.
debug: auto-learn: message score: 23.316, computed score for autolearn: 24.06
debug: auto-learn? ham=0.1, spam=10, body-points=16.82, head-points=9.84, learned-points=-2.599
debug: auto-learn? no: scored as spam but learner indicated ham (-2.599 < -1)
debug: is spam? score=23.316 required=6

And here's my output from the machine that's learning OK:

debug: auto-learn: currently using scoreset 3, recomputing score based on scoreset 1.
debug: auto-learn: message score: 25.916, computed score for autolearn: 24.06
debug: auto-learn? ham=0.1, spam=10, body-points=16.82, head-points=9.84, learned-points=0.001
debug: auto-learn? yes, spam (24.06 > 10)
debug: Learning Spam

What is this 'learned-points'? Is my database poisoned on the affected machine?

Thanks!
Kelly

Kevin Peuhkurinen wrote:
> Kelly Corbin wrote:
>> I have 4 machines configured identically (with the exception of the -m option due to differences in resources on each machine) with SpamAssassin and spamass-milter. I recently upgraded to 3.0.2 from 2.64 and everything seems to be working pretty good with the exception of one machine. After watching the mail log, I noticed that it is not autolearning any spam, no matter how high it scores. It does autolearn ham however, and the other 3 machines autolearn spam fine. I've looked at everything I can think of (configuration files, file permissions, checked FAQ's, searched list archives, etc.) and can't figure out why it won't autolearn any spam. Any ideas?
> Take an email with lots of hits and save it as 'spam-email', then run 'spamassassin -t -D spam-email' and see what the debug has to say about it. Feel free to post the Bayes-specific parts of the debug here if you aren't sure of how to read it.
>> Thanks! Kelly

--
Kelly Corbin
Network Administrator
http://www.theiqgroup.com
The IQ Group, Inc.
6740 Antioch Suite 260
Merriam, KS 66204
(913)722-6700 x105
Fax (913)722-7264
Re: Spamassassin only autolearning ham, not spam after upgrade to 3.0.2
Kelly Corbin wrote: Here's my auto-learn lines from the machine that doesn't work: debug: auto-learn: currently using scoreset 3, recomputing score based on scoreset 1. debug: auto-learn: message score: 23.316, computed score for autolearn: 24.06 debug: auto-learn? ham=0.1, spam=10, body-points=16.82, head-points=9.84, learned-points=-2.599 debug: auto-learn? no: scored as spam but learner indicated ham (-2.599 < -1) debug: is spam? score=23.316 required=6 And here's my output from the machine that's learning OK: debug: auto-learn: currently using scoreset 3, recomputing score based on scoreset 1. debug: auto-learn: message score: 25.916, computed score for autolearn: 24.06 debug: auto-learn? ham=0.1, spam=10, body-points=16.82, head-points=9.84, learned-points=0.001 debug: auto-learn? yes, spam (24.06 > 10) debug: Learning Spam What is this 'learned-points'? Is my database poisoned on the affected machine? I'm guessing here that the email is hitting BAYES_00 (which has a score of -2.599 by default, and which is the learned points). SA now has some code to ensure that emails that hit low BAYES scores will not be autolearned as spam and emails that hit high BAYES scores will not be autolearned as ham, no matter what they score otherwise. I'm assuming, then, that all or most of your emails are hitting BAYES_00 to BAYES_40 only. This means that indeed your Bayes database is pooched. The easiest solution is likely to just delete the database from this machine and copy over the database from one of your other systems, provided that they are handling similar types of emails.
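The veto described above can be sketched in a few lines of Python. This is a simplified illustration, not SpamAssassin's actual code. The function name `autolearn_decision` is hypothetical. The ham=0.1 and spam=10 thresholds and the -1 cutoff come from the debug output in this thread, the +1 cutoff on the ham side is an assumption that mirrors it, and the body-points and head-points values shown in the debug lines are left out entirely.

```python
def autolearn_decision(
    score: float,
    learned_points: float,
    ham_threshold: float = 0.1,   # "ham=0.1" in the debug output
    spam_threshold: float = 10.0,  # "spam=10" in the debug output
    ham_veto: float = -1.0,        # "(-2.599 < -1)" in the debug output
    spam_veto: float = 1.0,        # assumption: mirror image of ham_veto
) -> str:
    """Return 'spam', 'ham', or 'no' for a message's autolearn outcome."""
    if score >= spam_threshold:
        # Scored as spam, but Bayes already leans clearly towards ham: skip.
        if learned_points < ham_veto:
            return "no"
        return "spam"
    if score < ham_threshold:
        # Scored as ham, but Bayes already leans clearly towards spam: skip.
        if learned_points > spam_veto:
            return "no"
        return "ham"
    # Scores between the two thresholds are never autolearned.
    return "no"


if __name__ == "__main__":
    # Numbers from the two machines in this thread.
    print(autolearn_decision(24.06, -2.599))  # the machine that will not learn
    print(autolearn_decision(24.06, 0.001))   # the machine that learns OK
```

With a database that puts nearly everything in BAYES_00, every high-scoring message lands in the first branch and is skipped, which matches the symptom reported here.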
Re: Spamassassin only autolearning ham, not spam after upgrade to 3.0.2
That did the trick! I just copied over the databases from one of the good machines and right away it started doing the autolearn=spam. Thanks for all your help. Kelly Kevin Peuhkurinen wrote: Kelly Corbin wrote: Here's my auto-learn lines from the machine that doesn't work: debug: auto-learn: currently using scoreset 3, recomputing score based on scoreset 1. debug: auto-learn: message score: 23.316, computed score for autolearn: 24.06 debug: auto-learn? ham=0.1, spam=10, body-points=16.82, head-points=9.84, learned-points=-2.599 debug: auto-learn? no: scored as spam but learner indicated ham (-2.599 < -1) debug: is spam? score=23.316 required=6 And here's my output from the machine that's learning OK: debug: auto-learn: currently using scoreset 3, recomputing score based on scoreset 1. debug: auto-learn: message score: 25.916, computed score for autolearn: 24.06 debug: auto-learn? ham=0.1, spam=10, body-points=16.82, head-points=9.84, learned-points=0.001 debug: auto-learn? yes, spam (24.06 > 10) debug: Learning Spam What is this 'learned-points'? Is my database poisoned on the affected machine? I'm guessing here that the email is hitting BAYES_00 (which has a score of -2.599 by default, and which is the learned points). SA now has some code to ensure that emails that hit low BAYES scores will not be autolearned as spam and emails that hit high BAYES scores will not be autolearned as ham, no matter what they score otherwise. I'm assuming, then, that all or most of your emails are hitting BAYES_00 to BAYES_40 only. This means that indeed your Bayes database is pooched. The easiest solution is likely to just delete the database from this machine and copy over the database from one of your other systems, provided that they are handling similar types of emails. -- -- Kelly Corbin -- Network Administrator -- -- http://www.theiqgroup.com -- -- The IQ Group, Inc. -- 6740 Antioch Suite 260 -- Merriam, KS 66204 -- (913)722-6700 x105 -- Fax (913)722-7264
Re: Spamassassin only autolearning ham, not spam after upgrade to 3.0.2
Kelly Corbin wrote: What is this 'learned-points'? That's what score the BAYES_* rules would have given this message based on existing learning. This is basically used to prevent SA from automatically learning anything that noticeably contradicts the existing training. Is my database poisoned on the affected machine? Possibly. It's either poisoned, or it's just not trained on a wide enough variety of spam. It looks like SA's existing training tells it to regard that message as BAYES_00. (ie: less than 1% chance of being spam). I'm basing the BAYES_00 claim on the learned points being -2.599, which matches the score of the BAYES_00 rule.
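To compare learned-points across several machines without reading the debug output by eye, the "auto-learn?" line can be pulled apart with a short Python sketch like the one below. The helper name `parse_autolearn_line` is hypothetical, and the line format is assumed to match the 3.0.2 debug output posted in this thread. Other versions may print it differently.

```python
import re

# Matches name=number pairs such as "learned-points=-2.599" or "spam=10".
_PAIR = re.compile(r"([a-z-]+)=(-?\d+(?:\.\d+)?)")


def parse_autolearn_line(line: str) -> dict:
    """Extract the name=value pairs from an 'auto-learn?' debug line as floats."""
    return {name: float(value) for name, value in _PAIR.findall(line)}


# Debug line from the machine that would not autolearn spam.
BAD = ("debug: auto-learn? ham=0.1, spam=10, body-points=16.82, "
       "head-points=9.84, learned-points=-2.599")

if __name__ == "__main__":
    fields = parse_autolearn_line(BAD)
    print(fields)
    # -2.599 is the BAYES_00 score mentioned above, so this prints True.
    print(fields["learned-points"] == -2.599)
```

Feeding the same line from each machine through this makes it easy to spot the one whose learned-points value is stuck at the BAYES_00 score.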
Spamassassin only autolearning ham, not spam after upgrade to 3.0.2
I have 4 machines configured identically (with the exception of the -m option due to differences in resources on each machine) with SpamAssassin and spamass-milter. I recently upgraded to 3.0.2 from 2.64 and everything seems to be working pretty good with the exception of one machine. After watching the mail log, I noticed that it is not autolearning any spam, no matter how high it scores. It does autolearn ham however, and the other 3 machines autolearn spam fine. I've looked at everything I can think of (configuration files, file permissions, checked FAQ's, searched list archives, etc.) and can't figure out why it won't autolearn any spam. Any ideas? Thanks! Kelly -- -- Kelly Corbin -- Network Administrator -- -- http://www.theiqgroup.com -- -- The IQ Group, Inc. -- 6740 Antioch Suite 260 -- Merriam, KS 66204 -- (913)722-6700 x105 -- Fax (913)722-7264
List number of ham and spam in Bayes
Good day, I'm sorry if that question has been answered before, but I could not find an answer. Is there a command / way that will show how many spams and hams have been learned by the Bayesian filter? -- Mathieu Nantel, RHCE - Systems Manager Ecopia BioSciences Inc. (514) 336-2724 x434
Re: List number of ham and spam in Bayes
Mathieu Nantel wrote: Good day, I'm sorry if that question has been answered before, but I could not find an answer. Is there a command / way that will show how many spams and hams have been learned by the Bayesian filter? sa-learn --dump magic -- Adam Lanier Bernard L. Madoff Investment Securities LLC
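If the counts are needed in a script, the output of that command can be parsed. A minimal Python sketch follows, assuming the usual layout in which each "non-token data" line carries its value in the third column. The function name `bayes_counts` is made up for illustration, and the sample dump uses invented numbers rather than real output from anyone's database.

```python
def bayes_counts(dump_text: str) -> dict:
    """Return the nspam and nham counters from `sa-learn --dump magic` output."""
    counts = {}
    for line in dump_text.splitlines():
        if "non-token data:" not in line:
            continue
        columns, _, label = line.partition("non-token data:")
        label = label.strip()
        if label in ("nspam", "nham"):
            # Assumed layout: the counter value sits in the third column.
            counts[label] = int(columns.split()[2])
    return counts


# Hypothetical sample with made-up numbers, for illustration only.
SAMPLE_DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0       1200          0  non-token data: nspam
0.000          0       3400          0  non-token data: nham
0.000          0     150000          0  non-token data: ntokens
"""

if __name__ == "__main__":
    print(bayes_counts(SAMPLE_DUMP))
```

In practice the text would come from running sa-learn --dump magic as the user that owns the Bayes database and passing its standard output to the function.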
Re: Manually learnt HAM as SPAM. Can I undo?
Hey,

On Sat, 16 Oct 2004 09:46:36 +0200, Nicolas wrote:
> This morning I made a mistake with spamassassin. I manually learnt (/usr/bin/sa-learn --spam) an IMPORTANT message as SPAM. I immediately learnt it as HAM, but is that sufficient?
> Do I have to delete all of the bayes tokens I accumulated for over 1 year?

Nope, you're fine. If you RTM, you'd know that:

--ham
    Learn the input message(s) as ham. If you have previously learnt any of the messages as spam, SpamAssassin will forget them first, then re-learn them as ham. Alternatively, if you have previously learnt them as ham, it'll skip them this time around. If the messages have already been filtered through SpamAssassin, the learner will ignore any modifications SpamAssassin may have made.

tobias