Re: pyzor: check failed: internal error, python traceback seen in response
On 06/30/2014 08:58 PM, Steve Bergman wrote:
I'm getting: pyzor: check failed: internal error, python traceback seen in response
I'm running Ubuntu 10.04 on the server, with the Ubuntu provided packages.

On 30.06.14 21:15, Axb wrote:
time to update... pyzor 1:0.5.0-0ubuntu2 ancient, buggy, EOL version

For both issues, you should ask Ubuntu for help. I have no idea whether 10.04 is still supported (is that an LTS version?), but Ubuntu should take care of such issues if it is (well, that's what support means).

-- Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
REALITY.SYS corrupted. Press any key to reboot Universe.
Re: Bayes Filter Not Working
On 25.06.2014 00:42, Bruce Sackett wrote:
I apologize, I'm sure it's been covered, but I have not been successful finding results in searches on the web or through the history of the list. I get no BAYES results in the headers, so I don't see any sign that it is working. The part that gets me is below:

Jun 24 13:47:53.165 [3245] dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_toks
Jun 24 13:47:53.166 [3245] dbg: bayes: tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_seen
Jun 24 13:47:53.167 [3245] dbg: bayes: found bayes db version 3
Jun 24 13:47:53.167 [3245] warn: plugin: eval failed: Insecure dependency in sprintf while running with -T switch at /usr/local/share/perl/5.14.2/Mail/SpamAssassin/Logger.pm line 241.
Jun 24 13:47:53.168 [3245] dbg: config: score set 0 chosen.

That seems to be the last time Bayes is referenced in a spamassassin -D --lint run. Has anyone else run into this? I am using an Ubuntu 12.04 server, if that makes any difference.

I have the same problem on FreeBSD:

Jul 1 05:33:51.765 [43144] dbg: bayes: learner_new self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x805b09f78), bayes_store_module=Mail::SpamAssassin::BayesStore::DBM
Jul 1 05:33:51.778 [43144] dbg: bayes: learner_new: got store=Mail::SpamAssassin::BayesStore::DBM=HASH(0x806108798)
Jul 1 05:33:51.779 [43144] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_toks
Jul 1 05:33:51.779 [43144] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_seen
Jul 1 05:33:51.779 [43144] dbg: bayes: found bayes db version 3
Jul 1 05:33:51.779 [43144] warn: plugin: eval failed: Insecure dependency in sprintf while running with -T switch at /usr/local/lib/perl5/site_perl/5.16/Mail/SpamAssassin/Logger.pm line 241.
Jul 1 05:33:51.799 [43144] warn: plugin: eval failed: Insecure dependency in sprintf while running with -T switch at /usr/local/lib/perl5/site_perl/5.16/Mail/SpamAssassin/Logger.pm line 241.

Running 'sa-learn --force-expire' seems to resolve the issue temporarily.

Jul 1 09:35:06.084 [49647] dbg: bayes: learner_new self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x805b09f78), bayes_store_module=Mail::SpamAssassin::BayesStore::DBM
Jul 1 09:35:06.097 [49647] dbg: bayes: learner_new: got store=Mail::SpamAssassin::BayesStore::DBM=HASH(0x806108798)
Jul 1 09:35:06.098 [49647] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_toks
Jul 1 09:35:06.098 [49647] dbg: bayes: tie-ing to DB file R/O /var/amavis/.spamassassin/bayes_seen
Jul 1 09:35:06.098 [49647] dbg: bayes: found bayes db version 3
Jul 1 09:35:06.099 [49647] dbg: bayes: DB journal sync: last sync: 0
Jul 1 09:35:06.570 [49647] dbg: bayes: DB journal sync: last sync: 0
Jul 1 09:35:06.570 [49647] dbg: bayes: corpus size: nspam = 120857, nham = 664988

After a while the error returns. Do I have to wipe my bayes DB?

-- Herbert
Changes in Spamhaus DBL DNSBL return codes
As per: http://www.spamhaus.org/news/article/713/

Return Code    Type                                       Note
127.0.1.2      spam domain
127.0.1.3      spammed redirector / url shortener         (Phased out on January 7th, 2015)
127.0.1.4      phish domain
127.0.1.5      malware domain
127.0.1.6      Botnet CC domain
127.0.1.102    abused legit spam
127.0.1.103    abused legit redirector / url shortener
127.0.1.104    abused legit phish
127.0.1.105    abused legit malware
127.0.1.106    abused legit botnet CC
127.0.1.255    IP queries prohibited!

Rules have been updated (SA Bug 7056 - 2014-06-17) to reflect this. Please run sa-update to get the updated rules scores.

Axb
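For anyone post-processing DBL answers outside of SpamAssassin, the return codes above can be held in a small lookup table. The following is only an illustrative Python sketch: the codes and descriptions are copied from the list in this message, while the names DBL_CODES, describe_dbl_code and is_abused_legit are made up for the example and are not part of SpamAssassin or any Spamhaus tool.

```python
# Return codes copied from the announcement above.
# All identifiers here are hypothetical names chosen for this example.
DBL_CODES = {
    "127.0.1.2": "spam domain",
    "127.0.1.3": "spammed redirector / url shortener",
    "127.0.1.4": "phish domain",
    "127.0.1.5": "malware domain",
    "127.0.1.6": "Botnet CC domain",
    "127.0.1.102": "abused legit spam",
    "127.0.1.103": "abused legit redirector / url shortener",
    "127.0.1.104": "abused legit phish",
    "127.0.1.105": "abused legit malware",
    "127.0.1.106": "abused legit botnet CC",
    "127.0.1.255": "IP queries prohibited!",
}


def describe_dbl_code(answer):
    """Return the listed meaning of a DBL answer, or a fallback string."""
    return DBL_CODES.get(answer, "unknown return code")


def is_abused_legit(answer):
    """True for the 127.0.1.102 to 127.0.1.106 'abused legit' range."""
    prefix, _, last = answer.rpartition(".")
    return prefix == "127.0.1" and last.isdigit() and 102 <= int(last) <= 106
```

Treating the "abused legit" range separately reflects the split shown in the list; how those answers should be scored is left to the updated rules.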
Re: pyzor: check failed: internal error, python traceback seen in response
Hmmm... My original question was where's the traceback. Not whether this or that project chooses to abandon its stable releases. Ubuntu 10.04 LTS Server is supported until May 2015. And similar time-frame releases of SA and Pyzor are supported until 2020 in RHEL/Scientific Linux/Centos. I'm sure that bugs have been fixed, and new ones introduced, in later versions of both packages. All I really want is to find some diagnostic output. When I run Pyzor from the command line on the same emails it returns without an error. -Steve Bergman
Re: pyzor: check failed: internal error, python traceback seen in response
Axb wrote:
pyzor 1:0.5.0-0ubuntu2 ancient, buggy, EOL version

Interestingly, pyzor 0.7.0 (the latest stable version) gives the same error. And SA is not preserving the diagnostic output from it for the admin to view, even with debugging turned on in both packages. Looks like the bugs are in SpamAssassin. I guess I'm not sure why such buggy software would ever have been released as gold in the first place.

-Steve
Re: pyzor: check failed: internal error, python traceback seen in response
On 06/30/2014 02:15 PM, Axb wrote:
As you don't mention what gue you use with SA it's hard to guess where your Pyzor config files should be.

I guess I'm not quite sure what gue I am using with SA. Where would I find that?
Re: pyzor: check failed: internal error, python traceback seen in response
On 07/01/2014 02:57 PM, Steve Bergman wrote:
On 06/30/2014 02:15 PM, Axb wrote:
As you don't mention what gue you use with SA it's hard to guess where your Pyzor config files should be.

I guess I'm not quite sure what gue I am using with SA. Where would I find that?

phatfingers meant "glue": how do you interface SpamAssassin with your MTA/MUA (amavisd, procmail, some milter, etc.), running under what user, etc.
Re: getting tons of SPAM
Hello,

I am trying to manipulate SpamAssassin scores; I am getting lots of spam with very low scores.

X-Virus-Scanned: amavisd-new at fqdn.com
X-Spam-Flag: NO
X-Spam-Score: 0.003
X-Spam-Level:
X-Spam-Status: No, score=0.003 tagged_above=-999 required=5.3 tests=[DKIM_SIGNED=0.001, HTML_IMAGE_RATIO_06=0.001, HTML_MESSAGE=0.001, T_DKIM_INVALID=0.01, T_RP_MATCHES_RCVD=-0.01] autolearn=unavailable
Authentication-Results: maria.fqdn.com (amavisd-new); dkim=fail (1024-bit key) reason=fail (body has been altered) header.d=dttusa.com

Please help,
Thanks

On Fri, Jun 27, 2014 at 8:16 AM, Matus UHLAR - fantomas uh...@fantomas.sk wrote:
On 27.06.14 07:50, motty cruz wrote:
I can't figure out why spammy email gets a very low score,

X-Quarantine-ID: 4QFxoaNchYOk
X-Virus-Scanned: amavisd-new at fqdn.com
X-Amavis-Alert: BAD HEADER SECTION, MIME error: error: unexpected end of header

This might explain much. It seems that the mail was broken somehow. Did you use default configs for spamassassin and amavis?

X-Spam-Flag: NO
X-Spam-Score: 0.102
X-Spam-Level:
X-Spam-Status: No, score=0.102 tagged_above=-999 required=5.3 tests=[AWL=0.311, DKIM_SIGNED=0.001, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VERIFIED=-0.001, HTML_MESSAGE=0.001, T_RP_MATCHES_RCVD=-0.01]
---
Received: by bell.cuxrrb.com id hllmas0e97ct for mo...@fdqn.com; Fri, 27 Jun 2014 08:58:12 -0400 (envelope-from life-motty+5F=f...@cuxrrb.com)
From: Pimsleur Approach l...@cuxrrb.com
Date: Fri, 27 Jun 2014 08:58:12 -0400
Subject: Want to speak a foreign language but don't have a lot of time?
Reply-To: reply-b89161365ddc621bf5b4340f26597...@cuxrrb.com
Message-ID: b89161365ddc621bf5b4340f2659783e095437-2598-hINbimNU@ cuxrrb.com
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=b89161365ddc621bf5b4340f2659783e69.692014062755451

-- Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Re: getting tons of SPAM
On 27.06.14 07:50, motty cruz wrote:
X-Quarantine-ID: 4QFxoaNchYOk
X-Virus-Scanned: amavisd-new at fqdn.com
X-Amavis-Alert: BAD HEADER SECTION, MIME error: error: unexpected end of header

On Fri, Jun 27, 2014 at 8:16 AM, Matus UHLAR - fantomas uh...@fantomas.sk wrote:
This might explain much. seems that the mail was broken somehow. Did you use default configs for spamassassin and amavis?

On 01.07.14 07:48, motty cruz wrote:
Hello, I am trying to manipulate spamassassin scores, I am getting lots of SPAM with very low score.

you haven't answered my question, have you?

-- Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Chernobyl was an Windows 95 beta test site.
Re: getting tons of SPAM
Maybe I missed your question. Was this it: "Did you use default configs for spamassassin and amavis?" If so, I replied immediately; here is my response again: yes, I was using the default configurations, except for language scores I added some time ago.

Thanks,

On Tue, Jul 1, 2014 at 8:49 AM, Matus UHLAR - fantomas uh...@fantomas.sk wrote:
you haven't answered my question, have you?
Re: pyzor: check failed: internal error (strace to the rescue)
OK. So I replaced pyzor with a dash script to run it under strace and log the output to a file. What it was complaining about was (drum roll, please) the permissions on /home/pyzor/servers.

Which is odd, because I'm pretty sure I set that file to be world readable and world writable for testing purposes. But when I checked again it was owned by root with 600 permissions. If we assume that I had a temporary brain aneurysm or mini-stroke or something when I thought I was doing that, it explains both why it wasn't working then, and why it wasn't working with aliases earlier, since aliases don't have home directories to even have servers files with permissions on them.

Or perhaps I didn't have a brain stroke, and those permissions changed. I'll be monitoring that. But at least I'm past the "it doesn't work and I have no idea why" stage. And that's very nice, indeed.

-Steve Bergman
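Since the root cause turned out to be ownership and mode bits on one file, a quick check like the one below can confirm what a given account actually sees. This is a minimal Python sketch, not the strace wrapper described in the message; the function name describe_permissions is hypothetical, and the path you pass in (for example /home/pyzor/servers) should be checked while running as the same user that the mail filter runs as.

```python
import os
import stat


def describe_permissions(path):
    """Summarise owner and mode bits of a file (hypothetical helper)."""
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)  # permission bits only, e.g. 0o600
    return {
        "owner_uid": st.st_uid,
        "mode": oct(mode),
        "world_readable": bool(mode & stat.S_IROTH),
        # os.access answers for the real uid/gid of the calling process
        "readable_by_me": os.access(path, os.R_OK),
    }
```

A file owned by root with mode 600 would report world_readable as False, and readable_by_me as False for any non-root caller, which matches the failure described above.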
Re: getting tons of SPAM
Hey motty cruz,

I just moved our 100 users over from our ISP's mail servers to our own. Apparently, the ISP's mail servers were doing remarkably well, because it turns out that we get some 5000 spams a day, and users were getting essentially no spam. Then I upgraded us to a new OS on our Debian/X2Go/MATE desktop server and moved us to our own mail server, and the spam was coming through like water through the sluice gates of a dam. It didn't help that I'd moved everyone from Evolution to Thunderbird, so the client Bayesian spam filters were completely untrained.

So I installed SA on the server. That helped. But it wasn't enough. I compiled DCC and installed Pyzor, and that helped some. (Though SA's Pyzor support had some teething problems, as you can see from my recent posts, which I think may now be resolved.)

What SA really needs is for its own Bayesian filter to kick in. But for it to be used at all, you need at least 200 ham and 200 spam messages registered with it, i.e. you have to have a way to train the filter. I don't really have much confidence in autolearn. And I'm a little scared of it. So I turned it off.

We use Dovecot. So I used the dovecot-antispam plugin to automatically train SA when mail gets moved in or out of the junk folder. (It handles the moving of mail from Junk into Trash or regular folders intelligently and appropriately.) But that only solved half the problem. You need 200 hams and 200 spams. Mail was not getting marked as ham when it went into the Inboxes. So I wrote a script that could be called from the users' .forward files to mark messages as ham. Then if the user, or Thunderbird's own spam filter, chooses to move it to Junk, it gets relearned as spam.

Finally, to deal with many of the false positives I was getting with SA, I wrote a script, executed from cron, which takes new mail in the users' Sent folders and whitelists the recipients with SpamAssassin in the users' own individual user_prefs files.

This is what it took before I was really happy with the performance of SA. Well... that and adding a 1 second sleep after connection in the Postfix configuration. That made a huge difference. But our mail volume is small enough that the 1 second sleep doesn't cause any problems as it would on a really high volume server.

I hope that rough outline is helpful to you in some way. However, having come through all that, I find myself wondering if we should simply impose capital punishment for the crime of spamming, or if more drastic action is indicated. ;-)
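The Sent-folder whitelisting idea described above might look roughly like the following. Steve's actual script is not shown in the thread, so this is only a Python sketch under assumptions: the function names are hypothetical, reading the Sent folder and writing user_prefs are left out, and it emits SpamAssassin's whitelist_from directive, which matches on the From address only (stricter directives exist and may be preferable).

```python
from email import message_from_string
from email.utils import getaddresses


def sent_recipients(raw_message):
    """Collect lower-cased To/Cc addresses from one sent message."""
    msg = message_from_string(raw_message)
    fields = msg.get_all("To", []) + msg.get_all("Cc", [])
    return {addr.lower() for _, addr in getaddresses(fields) if "@" in addr}


def new_whitelist_lines(recipients, user_prefs_text):
    """Return whitelist_from lines for recipients not already in user_prefs."""
    existing = set()
    for line in user_prefs_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "whitelist_from":
            existing.update(p.lower() for p in parts[1:])
    missing = sorted(set(recipients) - existing)
    return ["whitelist_from " + addr for addr in missing]
```

A cron job could run this over messages newer than its last run and append the returned lines to each user's user_prefs, so repeated runs do not add duplicates.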
Re: getting tons of SPAM
No mention of RBLs or greylisting ...

-- Jeremy McSpadden
Flux Labs | http://www.fluxlabs.net/ | Endless Solutions
Office : 850-250-5590x501 | Cell : 850-890-2543 | Fax : 850-254-2955

On Jul 1, 2014, at 2:06 PM, Steve Bergman sbergma...@gmail.com wrote:
Hey motty cruz, I just moved our 100 users over from our ISP's mail servers to our own.
[...]
Re: getting tons of SPAM
nor, if using Postfix, postscreen

On 07/01/2014 09:17 PM, Jeremy McSpadden wrote:
No mention of RBLs or greylisting ...
Re: getting tons of SPAM
Hello Jeremy,

I have the following RBL entries in Postfix's main.cf:

reject_rbl_client b.barracudacentral.org,
reject_rbl_client zen.spamhaus.org,
reject_rbl_client bl.spamcop.net,
reject_rbl_client all.spamrats.com

The RBLs are very nice and help me block lots of spam, but a lot of spam is still making it through with a very low score. I trained SA with about 700 spam emails and about 258 ham emails.

X-Virus-Scanned: amavisd-new at fqdn.com
X-Spam-Flag: NO
X-Spam-Score: 0.003
X-Spam-Level:
X-Spam-Status: No, score=0.003 tagged_above=-999 required=5.3 tests=[DKIM_SIGNED=0.001, HTML_IMAGE_RATIO_06=0.001, HTML_MESSAGE=0.001, T_DKIM_INVALID=0.01, T_RP_MATCHES_RCVD=-0.01] autolearn=no

The email header is very spammy. I need help stopping this attack.

Thanks for your support,

On Tue, Jul 1, 2014 at 12:17 PM, Jeremy McSpadden jer...@fluxlabs.net wrote:
No mention of RBLs or greylisting ...
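As background on what the reject_rbl_client restrictions in the Postfix configuration above do: an IP-based DNSBL is queried by reversing the client's IPv4 octets and appending the list's zone, and an answer in 127.0.0.0/8 is conventionally read as "listed". The Python sketch below only builds the query name and interprets answers; it performs no DNS lookups, and the function names are hypothetical.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 client in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return reversed_octets + "." + zone


def is_listed(answers):
    """Treat any A-record answer inside 127.0.0.0/8 as a listing."""
    loopback = ipaddress.ip_network("127.0.0.0/8")
    return any(ipaddress.ip_address(a) in loopback for a in answers)
```

For example, a client at 192.0.2.99 checked against zen.spamhaus.org would be looked up as 99.2.0.192.zen.spamhaus.org. Individual lists assign their own meanings to specific return codes, so a production check should follow each list's documentation.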
Re: getting tons of SPAM
On 07/01/2014 02:23 PM, Axb wrote:
nor, if using Postfix, postscreen

Indeed. I've looked at that. It's probably better than the sleep. But it's not yet an option for us. And at 7000 emails per day or whatever we get, I'm not sure there's that much difference. (There may be. I haven't looked at postscreen all that closely.)

We'll be doing an OS upgrade on our server to Ubuntu 14.04 LTS within the next year. Possibly even within the next few weeks. I'd actually kind of like to move to Debian 7. But I really can't justify all the extra complication when I can do an in-place upgrade of the Ubuntu 10.04.

-Steve
Re: getting tons of SPAM
On 07/01/2014 02:33 PM, motty cruz wrote:
I trained SA with about 700 SPAM emails and with about 258 HAM emails.

In case I missed this, are you the single user, or does this server handle many mail accounts? I have many, and took the conservative approach of giving each user their own filedb database of tokens, and training them with the stream of emails which are actually coming into the users' Inboxes. Doing it that way, it takes a while for the training to mature. But my thinking is that they will mature into more accurate Bayesian classifiers that way.

-Steve
Re: getting tons of SPAM
On Tue, 2014-07-01 at 19:17 +, Jeremy McSpadden wrote:
No mention of RBLs or greylisting ...

Quite. When my ISP switched on greylisting my mail immediately went from a spam:ham ratio of 80:20 to one of 20:80, which is pretty much where it has stayed ever since.

The spam:ham ratio is reported on a daily basis by my spamkiller filter, which I wrote and installed immediately downstream of my local copy of SA. The filter quarantines spam for a week before deleting it. I don't use my ISP's copy of SA because it didn't do a good job on some of the mailing lists I get; my local SA does, but it has a ruleset that is highly customised for my mail stream.

Martin
Re: getting tons of SPAM
On 07/01/2014 03:29 PM, Martin Gregorie wrote:
On Tue, 2014-07-01 at 19:17 +, Jeremy McSpadden wrote:
No mention of RBLs or greylisting ...

Quite. When my ISP switched on greylisting my mail immediately went from a spam:ham ratio of 80:20 to one of 20:80

But the variable delay, which is not under your control? My users complained loudly about that minority of mails which took an hour to arrive. I had to turn it off.

Yes, I'm sure the autowhitelist features help with time. But we're always receiving mail from new customers whom our mail server has never heard from before. And you really don't want to not receive a mail from a new customer for an hour or more when you are a service company advertising fast and efficient service of your customers' restaurant kitchen equipment during the lunch hours.

I did not find greylisting viable for our use case. And I suspect many businesses would have similar incompatibilities with the strategy.

-Steve
Re: getting tons of SPAM
Yes, I guess I could change the variable delay. I will do a quick search to see how it would affect users; some users are very sensitive to these issues.

Thanks a bunch,

On Tue, Jul 1, 2014 at 1:37 PM, Steve Bergman sbergma...@gmail.com wrote:
But the variable delay, which is not under your control? My users complained loudly about that minority of mails which took an hour to arrive. I had to turn it off.
[...]
Re: Bayes Filter Not Working
On Tue, 01 Jul 2014 09:37:17 +0200 Herbert J. Skuhra wrote:
Running 'sa-learn --force-expire' seems to resolve the issue temporarily.
[...]
After a while the error returns. Do I have to wipe my bayes DB?

I wiped my bayes DB and learned more than 200 spam and ham messages each. While nham and nspam were below 200 messages, the error was gone. But now it is back.

% spamassassin -t OvwTlDIfJxAe
[...]
3.5 BAYES_99 BODY: Bayes spam probability is 99 to 100% [score: 1.]
[...]
0.2 BAYES_999 BODY: Bayes spam probability is 99.9 to 100% [score: 1.]
[...]

Running the same command with the -D switch, the error appears and I don't see the BAYES score. There is also no BAYES score in the amavisd log. :-(

Any ideas? Thanks.

-- Herbert
Re: getting tons of SPAM
On 07/01/2014 04:00 PM, motty cruz wrote:
Yes, I guess I could change the variable delay. I will do a quick search to see how it would affect users; some users are very sensitive to these issues.

What mail server and version of it are you using? There was a good suggestion made about postscreen earlier, if you are using version 2.8 or later of Postfix, IIRC.

I was a bit confused about what greylisting means until a few days ago. Basically, your server maintains a database of servers that it has previously talked to and received legitimate emails from. If it has not talked to a server at this IP before (generally a block of 256 or so addresses), then it says, "Hey, I'd love to accept your incoming mail, but I'm *really* busy right now. Could you come back in 5 minutes and I'll be able to take it then?"

Normal mail servers will come back. Spam servers do a sort of "wham, bam, thank you ma'am" version of sending mail. Except that they don't even bother to say thank you, and they certainly don't come back in 5 minutes. They just move on to their next victim. They don't waste time with slow receiving servers.

The problem is that saying "come back in 5 minutes" does not mean that even legitimate servers are going to come back in 5 minutes. They might wait 10 minutes. Or 15. Or 20. Or 30. Or an hour. Some of the sites listed in postgrey's default whitelist file delay as much as 12 hours between retries. And you cannot necessarily trust whitelists to cover all the important senders on an ongoing basis.

-Steve
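The greylisting behaviour described above can be modelled in a few lines. This is a simplified Python sketch that follows the description in this message (one entry per /24 block of client addresses, a fixed 5 minute delay); the class and method names are hypothetical, and real implementations such as postgrey typically also take the sender and recipient into account and persist their state.

```python
import ipaddress


class Greylist:
    """Toy greylist keyed on the client's /24 network (illustrative only)."""

    def __init__(self, delay_seconds=300):
        self.delay = delay_seconds
        self.first_seen = {}  # network -> time of first delivery attempt
        self.known = set()    # networks that have retried after the delay

    @staticmethod
    def _block(ip):
        # strict=False lets us pass a host address and get its /24 network
        return ipaddress.ip_network(ip + "/24", strict=False)

    def check(self, ip, now):
        """Return 'accept' or 'defer' for a delivery attempt at time now."""
        block = self._block(ip)
        if block in self.known:
            return "accept"
        first = self.first_seen.setdefault(block, now)
        if now - first >= self.delay:
            self.known.add(block)
            return "accept"
        return "defer"  # the SMTP server would answer with a temporary 4xx
```

The sketch also shows why the delay is outside the receiver's control: the server only sets the minimum wait, and the message arrives whenever the sending server decides to retry after that.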
Re: getting tons of SPAM
On Tue, 2014-07-01 at 15:37 -0500, Steve Bergman wrote: On 07/01/2014 03:29 PM, Martin Gregorie wrote: On Tue, 2014-07-01 at 19:17 +, Jeremy McSpadden wrote: No mention of RBLs or greylisting ... Quite. When my ISP switched on greylisting my mail immediately went from a spam:ham ratio of 80:20 to one of 20:80 But the variable delay, which is not under your control? You're right: its not. My users complained loudly about that minority of mails which took an hour to arrive. I had to turn it off. I know what can happen, and also that those complaints can arise from a total misunderstanding of what e-mail is designed to do: that it is *not* an instant messaging medium but it is a reliable one despite delivering over sometimes flaky networks. IOW demanding instant e-mail delivery is quite unreasonable. Yes, I'm sure the autowhitelist features help with time. But we're always receiving mail from new customers whom our mail server has never heard from before. And you really don't want to not receive a mail from a new customer for an hour or more when you are a service company advertising fast and efficient service of your customers' restaurant kitchen equipment during the lunch hours. I think that specific whitelisting could help here: I run a mail archive that takes an automatic BCC feed of both incoming and outgoing mail from Postfix. This just works and has an important secondary use. SA uses a special-purpose plugin to query the mail archive: any incoming mail received from an e-mail address I've previously sent mail to gets whitelisted. It was simple to do because the archive is held in a PostgreSQL database and has almost zero maintenance costs. As I run it, the whitelist is assembled automatically from outgoing mail, but it would not be hard to accept an address feed from, say, a sales system or a guarantee registration database which would allow customer addresses to be whitelisted as their orders are confirmed. 
For that matter, if a new customer is always sent an e-mail to ensure they have your address and to confirm that theirs is correctly entered, then they'd be automatically whitelisted by that e-mail. Martin I did not find greylisting viable for our use case. And I suspect many businesses would have similar incompatibilities with the strategy. -Steve
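A rough Python sketch of the "whitelist anyone I have already written to" idea follows. Martin's setup keeps the archive in PostgreSQL and queries it from an SA plugin; here an in-memory set stands in for the database, and the class and method names are made up for illustration. It also trusts the From: header, which is forgeable, so treat it as a sketch of the lookup only.

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr

class SentToWhitelist:
    """Whitelist built from the recipients of outgoing mail (in-memory stand-in)."""

    def __init__(self):
        self.addresses = set()

    def record_outgoing(self, raw_message):
        """Remember every To:/Cc: address of a message we sent."""
        msg = message_from_string(raw_message)
        recipients = msg.get_all("To", []) + msg.get_all("Cc", [])
        for _name, addr in getaddresses(recipients):
            if addr:
                self.addresses.add(addr.lower())

    def is_whitelisted(self, raw_message):
        """True if the From: address of an incoming message was written to before."""
        msg = message_from_string(raw_message)
        _name, addr = parseaddr(msg.get("From", ""))
        return addr.lower() in self.addresses

wl = SentToWhitelist()
wl.record_outgoing("From: sales@example.com\n"
                   "To: New Customer <chef@example.net>\n"
                   "Subject: Welcome\n\nThanks for your order.\n")
print(wl.is_whitelisted("From: Chef <Chef@Example.NET>\n\nThe fryer is dead.\n"))
```

Feeding `record_outgoing` from a sales or registration system instead of the outgoing mail stream would be the variation suggested above.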
Re: getting tons of SPAM
On Tue, 1 Jul 2014, Martin Gregorie wrote: On Tue, 2014-07-01 at 15:37 -0500, Steve Bergman wrote: On 07/01/2014 03:29 PM, Martin Gregorie wrote: On Tue, 2014-07-01 at 19:17 +, Jeremy McSpadden wrote: No mention of RBLs or greylisting ... Quite. When my ISP switched on greylisting my mail immediately went from a spam:ham ratio of 80:20 to one of 20:80 But the variable delay, which is not under your control? You're right: its not. My users complained loudly about that minority of mails which took an hour to arrive. I had to turn it off. I know what can happen, and also that those complaints can arise from a total misunderstanding of what e-mail is designed to do: that it is *not* an instant messaging medium but it is a reliable one despite delivering over sometimes flaky networks. IOW demanding instant e-mail delivery is quite unreasonable. +1 And if your business is predicated on instant e-mail you are setting yourself up for pain. If it needs to be *instant*, have them visit a web page to enter service requests. -- John Hardin KA7OHZhttp://www.impsec.org/~jhardin/ jhar...@impsec.orgFALaholic #11174 pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79 --- It is criminal to teach a man not to defend himself when he is the constant victim of brutal attacks. -- Malcolm X (1964) --- 3 days until the 238th anniversary of the Declaration of Independence
Re: getting tons of SPAM
Today I build a new Spam filter with latest release, I leave all default configuration except a few changes, for now seem to be doing better at blocking really spammy emails. Thanks for all your help, On Tue, Jul 1, 2014 at 2:39 PM, John Hardin jhar...@impsec.org wrote: On Tue, 1 Jul 2014, Martin Gregorie wrote: On Tue, 2014-07-01 at 15:37 -0500, Steve Bergman wrote: On 07/01/2014 03:29 PM, Martin Gregorie wrote: On Tue, 2014-07-01 at 19:17 +, Jeremy McSpadden wrote: No mention of RBLs or greylisting ... Quite. When my ISP switched on greylisting my mail immediately went from a spam:ham ratio of 80:20 to one of 20:80 But the variable delay, which is not under your control? You're right: its not. My users complained loudly about that minority of mails which took an hour to arrive. I had to turn it off. I know what can happen, and also that those complaints can arise from a total misunderstanding of what e-mail is designed to do: that it is *not* an instant messaging medium but it is a reliable one despite delivering over sometimes flaky networks. IOW demanding instant e-mail delivery is quite unreasonable. +1 And if your business is predicated on instant e-mail you are setting yourself up for pain. If it needs to be *instant*, have them visit a web page to enter service requests. -- John Hardin KA7OHZhttp://www.impsec.org/~jhardin/ jhar...@impsec.orgFALaholic #11174 pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79 --- It is criminal to teach a man not to defend himself when he is the constant victim of brutal attacks. -- Malcolm X (1964) --- 3 days until the 238th anniversary of the Declaration of Independence
Re: getting tons of SPAM
On 07/01/2014 04:31 PM, Martin Gregorie wrote: I know what can happen, and also that those complaints can arise from a total misunderstanding of what e-mail is designed to do: that it is *not* an instant messaging medium but it is a reliable one despite delivering over sometimes flaky networks. IOW demanding instant e-mail delivery is quite unreasonable. I disagree. Email is what it is. If the sender and the receiver both agree that they should be able to expect messages between them to arrive in a short time, then that is the implied personal contract. The real issue, here, is that we're applying a technological kluge to try to combat the social problem of massive abuse of the email system. If the goal of spammers is to wreck the (admittedly rather naive) email system, they've won. Of course, that was never their goal. But they've still wrecked email. And we admins trying to stop spam are also damaging the email system. We just hope we're doing more good than harm. In short... try to explain that email isn't an instant messaging system to a customer with a dead fryer at 11AM emailing for a tech to help before the lunch crowd arrives. That's how email is used in the real world. And no amount of our saying you shouldn't do that is going to change the fact. People do expect all sorts of things that email was never designed to handle. Protection for abuse by spammers is one. The sending of DVD attachments is another. And our own abuse of the system to try to prevent others' abuse of the system results in a certain collateral damage which is quite real. I think that specific whitelisting could help here It can help. But I cannot think of a whitelisting system, in tandem with a kluge like greylisting, which would not do more harm than good. At least not for a service organization like ours. That said. I have plenty of kluges in place myself. I'm far from being authorized to speak from a holier than thou position. ;-) -Steve
Re: getting tons of SPAM
On Wednesday 02 July 2014 at 00:12:07, Steve Bergman wrote: In short... try to explain that email isn't an instant messaging system to a customer with a dead fryer at 11AM emailing for a tech to help before the lunch crowd arrives. That's how email is used in the real world. And no amount of our saying you shouldn't do that is going to change the fact. This may be true, but in the example that you give, tech support should really have provided a better (ie: more reliable) mechanism for contact than email if the customer is entitled to (expect) a prompt response. I don't agree with blaming users of email for expecting it to work the way it used to some years ago (remember the days of open relays, without problems, and delivery notifications?) when it's the technology, and the security systems which have been imposed on the system, which have changed, without the users necessarily realising or being told, and which have made it work differently. People do expect all sorts of things that email was never designed to handle. Protection for abuse by spammers is one. The sending of DVD attachments is another. And our own abuse of the system to try to prevent others' abuse of the system results in a certain collateral damage which is quite real. People with email accounts should be (continually) informed of what service they're being offered and can thereby reasonably expect to receive. I cannot think of a whitelisting system, in tandem with a kluge like greylisting, which would not do more harm than good. At least not for a service organization like ours. Horses for courses - what works well for others may not work well for you - but that's no reason to dismiss it outright. (Yes, I agree that you didn't, but the point remains that for some people both whitelisting and greylisting are very effective.) That said. I have plenty of kluges in place myself. I'm far from being authorized to speak from a holier than thou position. ;-) Me too :) Antony. 
-- There is no reason for any individual to have a computer in their home. - Ken Olsen, President of Digital Equipment Corporation (DEC, later consumed by Compaq, later merged with HP) Please reply to the list; please don't CC me.
Re: getting tons of SPAM
On Tue, 01 Jul 2014 14:06:14 -0500 Steve Bergman wrote: What SA really needs is for its own Bayesian filter to kick in. But to be used at all, you need at least 200 ham and 200 spam messages registered with it. i.e. you have to have a way to train the filter. I don't really have much confidence in autolearn. And I'm a little scared of it. So I turned it off. We use Dovecot. So I used the dovecot-antispam plugin to automatically train SA when mail gets moved in or out of the junk folder. (It handles the moving of mail from Junk into Trash or regular folders intelligently and appropriately.) I'm sceptical about the use of Dovecot-Antispam with Spamassassin. The problem is that it trains on SpamAssassin errors rather than Bayes errors. It may be possible to get sufficient spam this way, but ham is learned very slowly through avoidable FPs. But that only solved half the problem. You need 200 hams and 200 spams. You need several thousand hams and spams for it to work optimally.
Re: getting tons of SPAM
On 07/01/2014 05:35 PM, Antony Stone wrote: This may be true, but in the example that you give, tech support should really have provided a better (ie: more reliable) mechanism for contact than email if the customer is entitled to (expect) a prompt response. There are multiple methods. But customers (new, as well as previously existing) do expect that their emails will be received promptly. This is the way people use email. We have no way of controlling what new customers contacting us expect. Our minority opinions, as admins, of what email should be today don't count for that much. We as admins, our predecessors, and their predecessors, are responsible for the way that we have failed our users today. The email system we use is incredibly naive. There's got to be a way to do this right. SPF is a big step forward. But we don't trust it. An SPF fail gets, what? 2.0 points in spamassassin? An SPF fail should be end of game for spam. But no one trusts SPF well enough to do that. Because sending email server admins don't take SPF seriously enough for receiving servers to take *them* seriously. (And which of us does not handle sending and receiving?) SPF could go a long way toward forming a basis for fixing email. It can't do it by itself. All it does is give basic assurance that "My sending domain is what I claim it is." But that is exactly the rock solid basis that a real solution could be based upon. The email system is and always has been, from a security standpoint, a joke. And SA is an amazing and wonderful kluge that tries to sweep the fact under the rug as best it can. No disrespect intended to SA. But if it could absolutely identify the sending domain, with confidence, that would be a big step forward. Spammers could still abuse. But then reputation would really mean something. I know that it's a complicated problem. And it's a social problem. We admins can't agree on a solution.
A month ago I was pretty ignorant, didn't know about sender protection schemes, etc. So I still kind of have a foot in both the camps of enlightenment and ignorance. But I can report that there are still a lot of people in that ignorance camp, over there. And I certainly cannot claim to be entirely in the enlightenment camp. But such is life. ;-) That said, I *do* think that it is possible for email to work just the way users expect it to work, based upon their experience with it an arbitrary number of years ago. It all depends upon admins as a community. Or we might fail. I dunno. I'm just glad SA is here to fill in the gap, and perhaps to herd in a better future. -Steve
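To make the SPF point above concrete, here is a deliberately small Python sketch of how a receiver might evaluate an SPF record against a connecting IP. It is an illustration only: the record is passed in as a string rather than fetched from DNS, and only the ip4, ip6 and all mechanisms are handled; a, mx, include and the rest are skipped. The function name is hypothetical.

```python
import ipaddress

def check_spf(record, client_ip):
    """Evaluate only the ip4:/ip6: and `all` mechanisms of an SPF record string."""
    qualifiers = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"                       # not an SPF record at all
    ip = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"                     # the default qualifier is "pass"
        if term[0] in qualifiers:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name == "all":
            return qualifiers[qualifier]
        if name in ("ip4", "ip6"):
            net = ipaddress.ip_network(value, strict=False)
            if ip.version == net.version and ip in net:
                return qualifiers[qualifier]
        # mechanisms that need DNS lookups (a, mx, include, ...) are ignored here
    return "neutral"                        # nothing matched

record = "v=spf1 ip4:192.0.2.0/24 -all"
print(check_spf(record, "192.0.2.25"), check_spf(record, "203.0.113.9"))
```

Whether a "fail" result should be "end of game" or just a couple of points is a local policy decision on the receiving side; the sketch only produces the result.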
Re: getting tons of SPAM
On 07/01/2014 06:09 PM, RW wrote: I'm sceptical about the use of Dovecot-Antispam with Spamassassin. The problem is that it trains on SpamAssassin errors rather than Bayes errors. It may be possible to get sufficient spam this way, but ham is learned very slowly through avoidable FPs. We currently (early days for this installation) get plenty of spam for the users to train by moving it to the junk folder. Ham was the problem. Dovecot does nothing about training ham. That's why I have a line in the users' default .forward file to train incoming mail as ham. Then if they or Thunderbird decide to move the mail to Junk, it gets re-trained as spam. dovecot-antispam is *not* a complete solution, so far as I can see. At this early stage, it *is* painful to watch all that spam coming in over the weekend getting trained as ham. I tell my users to mark it as spam on Monday morning. And if they don't, I just figure it's not my fault. Once the token databases get larger there won't be so much potential flux back and forth, I guess. -Steve
Re: getting tons of SPAM
On Tue, 2014-07-01 at 12:33 -0700, motty cruz wrote: I trained SA with about 700 SPAM emails and with about 258 HAM emails. X-Spam-Status: No, score=0.003 tagged_above=-999 required=5.3 tests=[DKIM_SIGNED=0.001, HTML_IMAGE_RATIO_06=0.001, HTML_MESSAGE=0.001, T_DKIM_INVALID=0.01, T_RP_MATCHES_RCVD=-0.01] autolearn=no There's no BAYES_* rule hit. That means your manual training of ham and spam has been done as the wrong user. You need to do the training as the same user Amavis / SA runs as. Earlier header pastes suggest you are using catch-all. Just, don't. Not using catch-all will *significantly* reduce the amount of spam, simply by completely eliminating the bulk of spam to otherwise false addresses. -- char *t=\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1: (c=*++x); c128 (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
Bayes, Manual and Auto Learning Strategies (was: Re: getting tons of SPAM)
On Tue, 2014-07-01 at 18:43 -0500, Steve Bergman wrote: On 07/01/2014 06:09 PM, RW wrote: I'm sceptical about the use of Dovecot-Antispam with Spamassassin. The problem is that it trains on SpamAssassin errors rather than Bayes errors. It may be possible to get sufficient spam this way, but ham is learned very slowly through avoidable FPs. We currently (early days for this installation) get plenty of spam for the users to train by moving it to the junk folder. Ham was the problem. Dovecot does nothing about training ham. Dovecot (and its antispam plugin) does nothing about training ham, either. It offers target folders and triggers, for easy manual (re-) classification -- and thus training -- of ham and spam. That's why I have a line in the users' default .forward file to train incoming mail as ham. That's pretty bad practice. Fundamentally, you are implementing a custom auto-learn flavor, overruling the SA configurable auto-learn behavior and ignoring all safety concepts implemented by SA. There's a reason for the ham and spam learning thresholds, and the ham threshold to be 0.1 by default, *not* equaling required_score's default of 5.0. Then if they or Thunderbird decide to move the mail to Junk, it gets re-trained as spam. So if a user in a hurry simply deletes some spam, it will remain ham, as far as Bayes is concerned. dovecot-antispam is *not* a complete solution, so far as I can see. At this early stage, it *is* painful to watch all that spam coming in over the weekend getting trained as ham. I tell my users to mark it as spam on Monday morning. And if they don't, I just figure it's not my fault. It is your fault to implement a broken training strategy. Once the token databases get larger there won't be so much potential flux back and forth, I guess. -- char *t=\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? 
c=1: (c=*++x); c128 (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
Re: getting tons of SPAM
On 07/01/2014 05:07 PM, motty cruz wrote: If it needs to be *instant*, have them visit a web page to enter service requests. Because there's no way that web-based email forms can be abused. Please. The whole delay thing is about the ridiculous greylisting kluge. There are plenty of other spam avoidance kluges which don't involve significant delay. I really can't believe what I'm hearing here. It has little to nothing to do with reality. Spam is a problem. But you don't have to make your users wait hours for important emails by making your mail servers play hard to get games with each other. This is just silly. If I forwarded this conversation to my email users, they'd be ROTFL over what the experts are saying about the tool they use daily. It has problems. But long delays would be unacceptable. And HTTP can't really replace all its functionality. Web email forms are slow, limiting, and annoying. No offense intended. But that's honestly the way I see it. -Steve
Re: Funky HARP Spam
On Jun 27, 2014, at 12:34 PM, Philip Prindeville philipp_s...@redfish-solutions.com wrote: On Jun 27, 2014, at 7:30 AM, RW rwmailli...@googlemail.com wrote: As I mentioned before, the real violation is in the previous mime section, which claims 7bit, but contains octets with the high-bit set. Yup. Just submitted a patch for this: https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7063 Loving this filter! It’s catching 50% or more of our SPAM
Re: Bayes, Manual and Auto Learning Strategies
On 07/01/2014 07:32 PM, Karsten Bräckelmann wrote: That's pretty bad practice. Fundamentally, you are implementing a custom auto-learn flavor, overruling the SA configurable auto-learn behavior SA's autolearn behavior doesn't make much sense. I have no confidence in it. This method shields the user from the worst of the spam, while giving them full control of what gets relearned as spam. and ignoring all safety concepts implemented by SA. What safety concepts? autolearn is a complete joke. Even the docs explain that it's only there as a last resort method of kinda sorta training the spam filter. So if a user in a hurry simply deletes some spam, it will remain ham, as far as Bayes is concerned. Same as with Thunderbird, I think. And it's working very well for them. If they act irresponsibly, they'll get more spam. It takes no longer to highlight the spam and click Junk than it does to highlight the spam and click Delete. I've pretty much decided at this point that if the users don't do what I tell them to do, repeatedly, then what results is not my responsibility. And it's not. The alternative is to not mark incoming mail as ham, and allow the SA Bayesian filter to remain inactive forever. I opted to give the users the choice of being responsible for sorting, and reaping the benefits of that if they do. And yes, I know that some are not going to. I'd be interested if you have a better solution in mind. -Steve
Re: Bayes, Manual and Auto Learning Strategies
On 07/01/2014 07:32 PM, Karsten Bräckelmann wrote: That's pretty bad practice. Fundamentally, you are implementing a custom auto-learn flavor, overruling the SA configurable auto-learn behavior BTW, that reminds me of a question I had been meaning to ask on the list. Autolearn. There's very little written about it, so far as I am aware. But from what I have gleaned, from old posts, is that it is system-wide and in-memory. Now, I have Spamass-milter set to run SA 3.3 as the recipient user, using the filedb backend. So in 3.3, is autolearn system wide and in memory, or per user and on disk? This makes a difference regarding what Karsten and I are discussing. I don't suppose I would object to being wrong. But I have a feeling that I'm right. -Steve
Re: getting tons of SPAM
--As of July 1, 2014 7:39:43 PM -0500, Steve Bergman is alleged to have said: On 07/01/2014 05:07 PM, motty cruz wrote: If it needs to be *instant*, have them visit a web page to enter service requests. Because there's not way that web-based email forms can be abused. Please. The whole delay thing is about the ridiculous greylisting kluge. There are plenty of other spam avoidance kluges which don't involve significant delay. I really can't believe what I'm hearing here. It has little to nothing to do with reality. Spam is a problem. But you don't have to make your users wait hours for important emails by making your mail servers play hard to get games with each other. This is just silly. If I forwarded this conversation to my email users, they'd be ROTFL over what the experts are saying about the tool they use daily. It has problems. But long delays would be unacceptable. And http can't really replace all it's functionality. Web email forms are the slow, limiting, and annoying. --As for the rest, it is mine. 95+% of the time, email is immediate, true. But it is not uncommon for mail to be delayed for hours or days either, even without greylisting. It happens in the wild all the time, even (especially...) with the big providers. Email is also not 100% reliable: It is a best-effort service and can and does drop messages on occasion. (With varying degrees of notification: By the spec, notification should always happen, but experience says that causes backscatter, so it's not always by the spec.) If you need an immediate, reliable communication method email will appear to work - but will randomly fail, and there will be *nothing you can do about it.* If that's what your users are expecting you are doing a *disservice* to your users, because it *won't work.* There are solutions that will, which have higher overhead costs than email. A password-protected web form is better - it won't fail silently. Or there are specialist messaging protocols. 
But if your users are expecting email to be that solution you are going to give yourself headaches. Now, if 'most of the time' immediate communication is enough, that's fine. It may not be worth it for you to implement a higher reliability protocol - they cost time and money. (I used to work for a company whose sole product was a 100% reliable communication protocol.) But don't complain when it fails, because it will, and both you and the users need to expect that. Daniel T. Staal --- This email copyright the author. Unless otherwise noted, you are expressly allowed to retransmit, quote, or otherwise use the contents for non-commercial purposes. This copyright will expire 5 years after the author's death, or in 30 years, whichever is longer, unless such a period is in excess of local copyright law. ---
Re: getting tons of SPAM
95+% of the time, email is immediate, true. More like 99%+ of the time. When it's not, I hear about it. But it is not uncommon for mail to be delayed for hours or days either, It's uncommon enough that when it does happen I get a phone call about a user not being able to receive email. even without greylisting. Greylisting is an ugly hack that I'm hesitant to even dignify by making it the topic of serious conversation. I'm not at all sure what you're talking about regarding email vs web form reliability. What are the links in that chain? The email client can malfunction in some way. But then again, so can a browser. The sending server can malfunction in some way. But so can the web proxy. The WAN link can go down on the sending side. But then, that can happen with both web and email. The receiving side's WAN can go down too. But in the case of a mail server it tries and tries and tries to get the message through as quickly as possible. The browser and proxy server certainly don't. They just drop it if anything goes wrong. You tell me that email is unreliable. And yet anyone can see that it *is* quite reliable, until you, as a mail admin, foolishly introduce the self-DOSing technique of greylisting, and fall on your own sword. You can go on about how it makes sense to fall on your sword. But I'm a realist, and not buying it. Have fun in your ivory tower. I'll also be typing this post up, putting a stamp on it, and mailing it. It might reach you there faster. ;-) How many people here actually use greylisting and don't get complaints? Our ISP, who previously handled our email certainly didn't introduce any noticeable delays. And nobody ever got a noticeable amount of spam, or reported to me a missed or late email. Amazing, IMO. But it was obviously done without the ridiculous and unacceptable practice of greylisting. I want to achieve the results that Windstream does. -Steve
Re: getting tons of SPAM
I said: Have fun in your ivory tower. Please permit me to retroactively back this line out of my previous post. The smiley on the next line was intended to cover it. But it just came out sounding nasty. My amygdala's been acting up lately. ;-) -Steve
Re: Bayes, Manual and Auto Learning Strategies
On Tue, 2014-07-01 at 20:36 -0500, Steve Bergman wrote: On 07/01/2014 07:32 PM, Karsten Bräckelmann wrote: That's pretty bad practice. Fundamentally, you are implementing a custom auto-learn flavor, overruling the SA configurable auto-learn behavior SA's autolearn behavior doesn't make much sense. I have no confidence in it. The auto-learning feature is NOT meant to be a fully automated training system. It's an aid for the user to eliminate the need to care about the extremes, while focusing on the close-calls. There are options to tweak to your specific needs, and there even is no single SA autolearn behavior as you stated, but different flavors. And an option to turn it off. Frankly, it appears you don't understand what auto-learning is. This method shields the user from the worst of the spam, while giving them full control of what gets relearned as spam. Wrong. It is not this (your) method, that shields the user from the worst of the spam. That's SA. Not your style of auto-training. And unless you disabled Bayes auto-learning in SA (dunno, might have been mentioned deep in the thread), the user does not have full control of what gets relearned as spam. and ignoring all safety concepts implemented by SA. What safety concepts? autolearn is a complete joke. Even the docs explain that it's only there as a last resort method of kinda sorta training the spam filter. You are doing (custom) auto-learning as ham of any message with a score less than required_score of 5.0. *That* is a joke. (Besides, you *are* doing auto-learning, which you just claimed to be a complete joke.) At this point I won't get into details. It should suffice to highlight that a default ham auto-learning threshold of 0.1 is part of the safety concepts. (See the M::SA::Plugin::AutoLearnThreshold man-page for more.) So if a user in a hurry simply deletes some spam, it will remain ham, as far as Bayes is concerned. Same as with Thunderbird, I think. 
I never checked the TB internal Bayes implementation and auto-learn strategy, but I'd be surprised if they do train on black/white, without any gray area in between. You stated it. Please back up your claim. And it's working very well for them. If they act irresponsibly, they'll get more spam. It takes no longer to highlight the spam and click Junk than it does to highlight the spam and click Delete. While I am aware I'm not the average user -- there's a delete action key on my keyboard. There's no junk equivalent. Yes, I avoid using the mouse if keyboard interaction is more productive... I've pretty much decided at this point that if the users don't do what I tell them to do, repeatedly, then what results is not my responsibility. And it's not. Do you hate your users or your job? (Sorry, snide-remark I couldn't resist. Feel free to ignore.) The alternative is to not mark incoming mail as ham, and allow the SA Bayesian filter to remain inactive forever. No. I can only guess, but it appears there are some mis-interpretations in that conclusion. The SA Bayesian classifier to remain inactive forever can only refer to insufficient initial training. Manual training. Of at least 200 ham and spam each (by default, you can lower that to 0). You will easily get that by manual training of existing messages. And even default auto- learning would eventually cross the ham number. Less than forever. More importantly, SA still marks (classifies) incoming mail as ham. Just because its overall score is less than 5.0. It just does not *learn* all of them as ham. Because there's a chance it might not actually be ham, but a FN. That area, between (default) auto-learning as ham and classifying as spam is the gray area, where actual user input is of much value. For both, learning spam AND ham, for that matter. In particular, because generally (and as SA principle), a FP is *much* worse than a FN. Your approach of force learning those as ham, is biasing your Bayes DB. 
At the very least temporarily (unless a fresh spam campaign has been re-trained by your users on Monday). At worst, until you clear it. Btw, is that per-user, or are you gambling a site-wide Bayes DB? I opted to give the users the choice of being responsible for sorting, and reaping the benefits of that if they do. And yes, I know that some are not going to. I'd be interested if you have a better solution in mind. Do not auto-learn ham every message that scores below required_score. Introduce train-on-error for your users, with an extended manual training option. Specific ham and spam folders, where moving or copying mail into trains the Bayes classifier. Kind of optional for the user, unless they feel there's too much mis-classification. -- char *t=\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1: (c=*++x); c128 (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
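The difference between the two training strategies argued over above can be shown with a short Python sketch. It models only the score thresholds: 0.1 for ham is the default quoted in this thread, and 12.0 for spam is, to my knowledge, the documented default of bayes_auto_learn_threshold_spam. The real AutoLearnThreshold plugin applies further conditions that are ignored here, and both function names are hypothetical.

```python
def sa_autolearn(score, ham_below=0.1, spam_above=12.0):
    """Threshold-only model of SA auto-learning: learn just the clear extremes."""
    if score < ham_below:
        return "ham"
    if score > spam_above:
        return "spam"
    return None          # gray area: left for manual training

def forward_file_training(score, required_score=5.0):
    """Model of the .forward approach: everything delivered is learned as ham."""
    return "ham" if score < required_score else None

# A message scoring 3.2 is delivered either way, but only one strategy learns it.
for score in (-1.0, 3.2, 15.0):
    print(score, sa_autolearn(score), forward_file_training(score))
```

The scores between 0.1 and 5.0 are where the two functions disagree, and that is the range in which a false negative ends up in the Bayes database as ham unless a user re-trains it.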
Re: getting tons of SPAM
It seems to me that grey listing could be useful for small non time critical email servers, such as hobbyist setups, but for business, grey listing is not the way to go. On Jul 1, 2014 10:48 PM, Steve Bergman sbergma...@gmail.com wrote: I said: Have fun in your ivory tower. Please permit me to retroactively back this line out of my previous post. The smiley on the next line was intended to cover it. But it just came out sounding nasty. My amygdala's been acting up lately. ;-) -Steve
Re: Bayes, Manual and Auto Learning Strategies
On 07/01/2014 09:53 PM, Karsten Bräckelmann wrote: Frankly, it appears you don't understand what auto-learning is. So please specify, explicitly, what it is. I asked some specific questions about it. And I'm very interested in the answers. Is auto-learn still system-wide? I'd need it to apply to individual users. Is it in-memory only? Or can I have it update the users' filedb token databases? If it's now per user and uses the user databases, then I am more than ready to reconsider my opinion. But I've not been able to get a clear answer to this. I haven't had an opportunity to test. And I'd want confirmation from someone in the know anyway, before I changed strategies. This method shields the user from the worst of the spam, while giving them full control of what gets relearned as spam. Wrong. It is not this (your) method, that shields the user from the worst of the spam. That's SA. Not your style of auto-training. Mine is not autotraining at all. it's giving the user a way of explicitly training the backend spam filter. And unless you disabled Bayes auto-learning in SA (dunno, might have been mentioned deep in the thread), the user does not have full control of what gets relearned as spam. I have disabled autolearning. I thought I mentioned that to you. (Besides, you *are* doing auto-learning, which you just claimed to be a complete joke.) No. The messages are assumed ham until the user classifies it as spam. It is explicit learning. Under user control, At this point I won't get into details. It should suffice to highlight that a default ham auto-learning threshold of 0.1 is part of the safety concepts. (See the M::SA::Plugin::AutoLearnThreshold man-page for more.) I really don't think you understand what it is I'm doing. Anything below a score of 5.0 goes into their mailbox and learned as ham. If it's ham, that's great. If it's spam, they move it to Junk and it gets learned as spam. auto-learn is as brain dead as the defunct AWL. 
I never checked the TB internal Bayes implementation and auto-learn strategy, but I'd be surprised if they do train on black/white, without any gray area in between. Optimally, I would have an incoming folder and then the user could manually move the messages from there to spam or ham. But considering that this was not even remotely necessary with our old email provider, I don't feel that I can put my users to that level of extra trouble that they never even thought about having to deal with before, just because SA is not performing as well as the spam filter they are used to. The mail needs to go into the inbox directly. And for SA's bayesian to work, it needs to be assumed to be ham initially. The only thing I see which might change my view would be explicit details about where autolearn stores its data and how it is used on a per user basis. -Steve
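The difference between the two training policies argued over in this message can be sketched in a few lines of Python. This is an illustration only, not SpamAssassin's code. The 0.1 figure is the default ham threshold quoted above, 12.0 is the documented default spam threshold of the AutoLearnThreshold plugin, 5.0 is the delivery cutoff described above, and the function names are made up for the example. The real plugin applies further conditions (it scores the message without the Bayes rules and wants points from both header and body before learning spam), which are left out here.

```python
from typing import Optional

HAM_THRESHOLD = 0.1     # bayes_auto_learn_threshold_nonspam (default)
SPAM_THRESHOLD = 12.0   # bayes_auto_learn_threshold_spam (default)
DELIVERY_CUTOFF = 5.0   # score below which mail lands in the inbox in this thread


def stock_autolearn(score: float) -> Optional[str]:
    """Simplified stock policy: learn only from clear-cut messages."""
    if score < HAM_THRESHOLD:
        return "ham"
    if score >= SPAM_THRESHOLD:
        return "spam"
    return None  # gray area: nothing is learned


def inbox_trained_as_ham(score: float) -> Optional[str]:
    """Policy described in this message: every inbox delivery is learned as ham."""
    if score < DELIVERY_CUTOFF:
        return "ham"
    return None  # the thread does not say what is trained above the cutoff


if __name__ == "__main__":
    for score in (-1.0, 3.2, 4.9, 15.0):
        print(score, stock_autolearn(score), inbox_trained_as_ham(score))
```

A message scoring 3.2 or 4.9 sits in the gray area: the stock policy learns nothing from it, while the inbox policy reinforces it as ham until a user moves it to Junk.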
Re: Bayes, Manual and Auto Learning Strategies
On Tue, 2014-07-01 at 20:53 -0500, Steve Bergman wrote: On 07/01/2014 07:32 PM, Karsten Bräckelmann wrote: That's pretty bad practice. Fundamentally, you are implementing a custom auto-learn flavor, overruling the SA configurable auto-learn behavior. BTW, that reminds me of a question I had been meaning to ask on the list. Autolearn. There's very little written about it, so far as I am aware. http://spamassassin.apache.org/doc/Mail_SpamAssassin_Conf.html http://spamassassin.apache.org/doc/Mail_SpamAssassin_Plugin_AutoLearnThreshold.html But what I have gleaned from old posts is that it is system-wide and in-memory. It depends on how you call SA (SMTP or MDA level). SA itself is a filter, called by your mail-processing chain. Thus, there is no SA default context of system-wide or per-user. It depends on how you call it. Now, I have Spamass-milter set to run SA 3.3 as the recipient user, using the filedb backend. So in 3.3, is autolearn system wide and in memory, or per user and on disk? Milter usually means system-wide. (But since you just asked, it is.) Which, referring to my previous post, also means, a single sloppy user deleting your custom-auto-learned FN ham messages affects all your other users. Or a non-sloppy, but on-vacation-mode user. Moreover, there is no in-memory only, not on-disk mode. Unless you don't have to ask about it. This makes a difference regarding what Karsten and I are discussing. I don't suppose I would object to being wrong. But I have a feeling that I'm right. Irrespective of your feeling -- cheers! /me having a beer -- char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4"; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1: (c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
Re: getting tons of SPAM
On 07/01/2014 10:11 PM, Daniel Reynolds wrote: It seems to me that grey listing could be useful for small non time critical email servers, such as hobbyist setups, but for business, grey listing is not the way to go. Indeed. We should always remember that our workloads are *not* the only ones out there. There are many different kinds. Greylisting with postgrey might even work for us, after a teething period of building up the necessary (and rather large) whitelist of sending servers. I might even do it if I didn't feel I was being compared side-by-side with WindstreamHosting, who delivered neither spam nor delays nor noticeable false positives. The gods only know how they manage that. But I'm learning. And I've gotten some very helpful posts from members of this list today, both on the list and privately. I should be able to do this without ugly hacks like greylisting. That said, for my own home use, I'm perfectly fine with ugly hacks. I do them all the time. I've had the whole place done over with Ugly Hack wallpaper. It's great. :-) -Steve
Re: Bayes, Manual and Auto Learning Strategies
On 07/01/2014 10:21 PM, Karsten Bräckelmann wrote: http://spamassassin.apache.org/doc/Mail_SpamAssassin_Conf.html http://spamassassin.apache.org/doc/Mail_SpamAssassin_Plugin_AutoLearnThreshold.html I've read those over and over. It never says anything about where the data is maintained, or whether it's per-user or not. The *only* solid claim I have is a ten year old (yes, at the dawn of SA Bayes) post which specifically says it's in memory, system-wide, and lost upon SA restart. Milter usually means system-wide. (But since you just asked, it is.) I'm using spamass-milter. It suid's to the recipient user for most mails. For aliases it defaults to a particular user who gets an unbelievable amount of spam at the gate, and whom I know sorts his ham/spam religiously. Which, referring to my previous post, also means, a single sloppy user deleting your custom-auto-learned FN ham messages affects all your other users. No. I make sure to keep each user solely responsible for their own email welfare. Irrespective of your feeling -- cheers! /me having a beer Whew! After the conversations I've had here, today, I need one, too! ;-) -Steve
Re: getting tons of SPAM
On Tue, 1 Jul 2014, Steve Bergman wrote: I'm not at all sure what you're talking about regarding email vs web form reliability. What are the links in that chain? The email client can malfunction in some way. But then again, so can a browser. The sending server can malfunction in some way. But so can the web proxy. Then WAN link can go down on the sending side. But then, that can happen with both web and email. The receiving side's WAN can go down too. But in the case of a mail server it tries and tries and tries to get the message through as quickly as possible. The browser and proxy server certainly don't. They just drop it if anything goes wrong. But the user *sees* that failure *immediately* and can fall back to an alternate method of communication, say, a telephone call, if the situation is as urgent as you portray. Email is store-and-forward best-effort with *no guarantees* of timely delivery, no matter how well it performs 99% of the time. An email message can get stuck for a day or more at any (or even all) of the intermediate hops, and the system is *working properly* if it is ultimately delivered, or a notification is eventually sent back to the user that it cannot be delivered. And greylisting is a perfectly valid way to behave within the defined communications protocol. It fails because poor admins set the delivery retry time to an absurdly-long period, or poor programmers write MTAs that don't even know *how* to retry. FWIW, I did not say, and did not have in mind a web-email form when I made my suggestion. I had in mind a more-direct interface to the trouble ticket management system. Of course, I may be assuming a more-sophisticated operation than is the case. 
-- John Hardin KA7OHZ http://www.impsec.org/~jhardin/ jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79 --- News flash: Lowest Common Denominator down 50 points --- 3 days until the 238th anniversary of the Declaration of Independence
Re: Bayes, Manual and Auto Learning Strategies
On Tue, 1 Jul 2014, Steve Bergman wrote: On 07/01/2014 10:21 PM, Karsten Bräckelmann wrote: http://spamassassin.apache.org/doc/Mail_SpamAssassin_Conf.html http://spamassassin.apache.org/doc/Mail_SpamAssassin_Plugin_AutoLearnThreshold.html I've read those over and over. It never says anything about where the data is maintained, or whether it's per-user or not. The *only* solid claim I have is a ten year old (yes, at the dawn of SA Bayes) post which specifically says it's in memory, system-wide, and lost upon SA restart. Autolearn trains the bayes database. The bayes data is stored wherever you configured it to be stored, in a DBM database or SQL or redis, and it's per-user if you configure per-user Bayes databases and scan emails using different usernames (vs. a global user like root or amavis). -- John Hardin KA7OHZ http://www.impsec.org/~jhardin/ jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79 --- News flash: Lowest Common Denominator down 50 points --- 3 days until the 238th anniversary of the Declaration of Independence
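To make the "per-user if you scan as different users" point concrete, here is a small Python sketch of where the DBM store would look for its files. It assumes the documented default bayes_path of ~/.spamassassin/bayes and the bayes_toks / bayes_seen file names that appear in the debug logs earlier in this archive. The function name and the /home/alice and /home/bob directories are hypothetical, and this only mimics the path expansion, it is not SpamAssassin's own code.

```python
from typing import List


def bayes_db_files(home: str, bayes_path: str = "~/.spamassassin/bayes") -> List[str]:
    """Return the token and seen DB paths for a given user's home directory.

    A leading "~" is expanded against the home of whichever user SA runs as,
    so different scanning users end up with different databases.
    """
    if bayes_path.startswith("~"):
        base = home.rstrip("/") + bayes_path[1:]
    else:
        base = bayes_path  # absolute path: shared by every scanning user
    return [base + "_toks", base + "_seen"]


if __name__ == "__main__":
    # Global user, as in the amavis logs quoted earlier in the archive.
    print(bayes_db_files("/var/lib/amavis"))
    # Two hypothetical per-user invocations get two separate databases.
    print(bayes_db_files("/home/alice"))
    print(bayes_db_files("/home/bob"))
```

With an absolute bayes_path the result no longer depends on the home directory, which is the site-wide case. Autolearn and sa-learn both write to whatever these paths resolve to.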
Re: Bayes, Manual and Auto Learning Strategies
On Tue, 2014-07-01 at 22:18 -0500, Steve Bergman wrote: On 07/01/2014 09:53 PM, Karsten Bräckelmann wrote: Frankly, it appears you don't understand what auto-learning is. So please specify, explicitly, what it is. I asked some specific questions about it. And I'm very interested in the answers. If you want my opinion, please re-phrase your questions. I locally deleted most of this previous (originally unrelated) thread. Is auto-learn still system-wide? I'd need it to apply to individual users. Is it in-memory only? Or can I have it update the users' filedb token databases? SA itself never was system-wide, neither user-specific. It is both, can be either. It depends on the context of calling SA. If it's now per user and uses the user databases, then I am more than ready to reconsider my opinion. But I've not been able to get a clear answer to this. I haven't had an opportunity to test. And I'd want confirmation from someone in the know anyway, before I changed strategies. It does not depend on SA, but on how you invoke SA. We cannot give you a clear answer. It depends on your system, your SMTP, glue, system wide calling of SA, and possibly per-user invocations even after system-wide. To be clear: SA is a filter. It does nothing itself, other than classification. Being called, and at which point, is outside the scope of SA. Rejecting, deleting, delivering or any other kind of action is outside the scope of SA. That's actions performed by the calling layer, based on the result of SA evaluation. This method shields the user from the worst of the spam, while giving them full control of what gets relearned as spam. Wrong. It is not this (your) method, that shields the user from the worst of the spam. That's SA. Not your style of auto-training. Mine is not autotraining at all. it's giving the user a way of explicitly training the backend spam filter. Quoting your previous post, you have a line in the users' default .forward file to train incoming mail as ham. 
That is auto-training. (Besides, you *are* doing auto-learning, which you just claimed to be a complete joke.) No. The messages are assumed ham until the user classifies it as spam. It is explicit learning. Under user control, Being assumed is not the same as being treated and automatically reinforced. The latter is what you do. (And btw, Yes. You are auto-learning.) At this point I won't get into details. It should suffice to highlight that a default ham auto-learning threshold of 0.1 is part of the safety concepts. (See the M::SA::Plugin::AutoLearnThreshold man-page for more.) I really don't think you understand what it is I'm doing. Anything below a score of 5.0 goes into their mailbox and learned as ham. If it's ham, that's great. If it's spam, they move it to Junk and it gets learned as spam. auto-learn is as brain dead as the defunct AWL. I perfectly understood what you are doing. You didn't understand why that is bad. Failing to explain might be my bad, though I'll leave re-explaining for tomorrow my timezone. Or you carefully re-reading my posts. I never checked the TB internal Bayes implementation and auto-learn strategy, but I'd be surprised if they do train on black/white, without any gray area in between. Optimally, I would have an incoming folder and then the user could manually move the messages from there to spam or ham. But considering Which is basically what you came from, using Dovecot antispam plugin with SA, and dedicated folders where the user could manually move the messages to. Why didn't you just set that up? (Hint: That's your set-up without auto-learning ham Inbox deliveries.) that this was not even remotely necessary with our old email provider, I don't feel that I can put my users to that level of extra trouble that they never even thought about having to deal with before, just because SA is not performing as well as the spam filter they are used to. The Do initial manual training. Then get back to us. 
mail needs to go into the inbox directly. And for SA's bayesian tp work, it needs to be assumed as ham initially. No. It seems your previous email provider, whatever that might be, had some sort of spam filtering service. Now you're on your own. Which you are, unless you decide to ask for free (as in beer) support by the community providing the software for free (as in speech) to help you weed out the spam. You did ask, which is just fine, but your assumptions are kind of hostile. Like your previous email provider would not use SA internally. He most likely does. The only thing I see which might change my view would be explicit details about where autolearn stores its data and how it is used on a per user basis. So the only thing that might change your view would be reading the docs. Go read them. Auto-learn stores its data exactly where Bayes generally stores its data. In fact, it is the same. Just being triggered
Re: getting tons of SPAM
--As of July 1, 2014 9:40:05 PM -0500, Steve Bergman is alleged to have said: 95+% of the time, email is immediate, true. More like 99%+ of the time. When it's not, I hear about it. But it is not uncommon for mail to be delayed for hours or days either, It's uncommon enough that when it does happen I get a phone call about a user not being able to receive email. It's common enough that I saw it every day in my last job. 99.9% of the time the users didn't notice, or care. On the other hand there were the times I had to show them the log files showing exactly when we got and sent the message, and had to have a talk about expectations. (Nearly always the message had gone through our system in seconds.) even without greylisting. Greylisting is an ugly hack that I'm hesitant to even dignify by having the topic of serious conversation. I won't defend it. I've never used it. ;) I'm not at all sure what you're talking about regarding email vs web form reliability. What are the links in that chain? The email client can malfunction in some way. But then again, so can a browser. The sending server can malfunction in some way. But so can the web proxy. Then WAN link can go down on the sending side. But then, that can happen with both web and email. The receiving side's WAN can go down too. But in the case of a mail server it tries and tries and tries to get the message through as quickly as possible. The browser and proxy server certainly don't. They just drop it if anything goes wrong. I only said that it won't fail silently: If you are depending on it for immediate communications, you'll know when you didn't get that, while with email it'll be hidden. Maybe 'better' wasn't the right word: It's a trade off. If you want the message to go through, email is set up to keep trying. 
If you want the message to go *now*, the web form will tell you if it did (making the assumption that the form returns a 'message delivered' screen once it has delivered the message), and the user can try for another form of communication if it fails. You tell me that email is unreliable. And yet anyone can see that it *is* quite reliable, until you, as a mail admin, foolishly introduce the self-DOSing technique of greylisting, and fall on your own sword. You can go on about how it makes sense to fall on your sword. But I'm a realist, and not buying it. As I said: I've never used greylisting. I have seen mail queues regularly holding messages for hours or days. Email is fairly reliable - but I wouldn't let a user treat it as 100% reliable and immediate, because I know it isn't. Better a few hard conversations about expectations and options than lost business due to using the wrong tool for the job. I'll also be typing this post up, putting a stamp on it, and mailing it. It might reach you there faster. ;-) Not faster, but probably more reliable. ;) How many people here actually use greylisting and don't get complaints? Our ISP, who previously handled our email certainly didn't introduce any noticeable delays. And nobody ever got a noticeable amount of spam, or reported to me a missed or late email. Then they didn't notice them. In the normal course of things, most mail gets through in seconds, and most of the delays are in the range of minutes to hours - short enough that people don't see them unless they are paying close attention. (And they may not be checking mail that often anyway.) Amazing, IMO. But it was obviously done without the ridiculous and unacceptable practice of greylisting. I want to achieve the results that Windstream does. You probably can. ;) But I'm sure Windstream didn't get you every piece of mail immediately after it was sent - just as soon as they could after they got it. 
I'm not even saying I like greylisting - I'm just saying you should work to set user expectations to reality, which is that email sometimes takes time to get delivered and (rarely) gets lost. If something is absolutely time-critical, they should treat email as a backup, not the primary form of communication. If it can spare an hour or two on occasion, email's fine. Daniel T. Staal --- This email copyright the author. Unless otherwise noted, you are expressly allowed to retransmit, quote, or otherwise use the contents for non-commercial purposes. This copyright will expire 5 years after the author's death, or in 30 years, whichever is longer, unless such a period is in excess of local copyright law. ---
Re: getting tons of SPAM
On 07/01/2014 11:09 PM, John Hardin wrote: FWIW, I did not say, and did not have in mind a web-email form when I made my suggestion. I had in mind a more-direct interface to the trouble ticket management system. Of course, I may be assuming a more-sophisticated operation than is the case. John, What my users expect is the level of speed and reliability of email that they have always had over the years with our ISP, up until 2 months ago when I took over with our new server. It was fast, reliable, mostly spam-free, and free of false positives (that they ever noticed, anyway). I can't go in and try to convince them that in the last 2 months that I've been in charge of the mail server that the world's email has become slow, unreliable, and spammy. I've got to come up with a solution that is as good as what our ISP provided. The good news is that by conservatively (OK, maybe not always so conservatively. I was a little desperate at first) adding strategies in Postfix and SA, I guess I'm nearly at parity with our old ISP. Allowing for a bit of sugar-coating of their descriptions of the good old days, maybe I'm even already there. Until we do our server OS upgrade, I don't have postscreen. But the 1 second sleep after smtpd connection seems to have been the finishing touch on our spam control. It seems about as effective as postgrey. Personally, I detest those web mail forms. I, too, expect to be able to compose an email, send it, and have it received within a minute. And I do not think that to be an unreasonable expectation at all, as long as we administrators keep our feet on the ground and don't start doing stupid stuff like greylisting. Though, as has been pointed out by Daniel, greylisting may be appropriate in certain contexts. -Steve
Re: Bayes, Manual and Auto Learning Strategies
On Tue, 2014-07-01 at 22:40 -0500, Steve Bergman wrote: On 07/01/2014 10:21 PM, Karsten Bräckelmann wrote: http://spamassassin.apache.org/doc/Mail_SpamAssassin_Conf.html http://spamassassin.apache.org/doc/Mail_SpamAssassin_Plugin_AutoLearnThreshold.html I've read those over and over. It never says anything about where the data is maintained, or whether it's per-user or not. The *only* solid claim I have is a ten year old (yes, at the dawn of SA Bayes) post which specifically says it's in memory, system-wide, and lost upon SA restart. Those do not tell you about using file or SQL based databases? You never thought about googling for spamassassin per user and friends? You never checked the SA wiki? FWIW, the links given do NOT refer to in-memory only at all. An in-memory only Bayes database definitely is much more than ten years ago. If it ever existed. No need for me to even check. Milter usually means system-wide. (But since you just asked, it is.) I'm using spamass-milter. It suid's to the recipient user for most mails. For aliases it defaults to a particular user who gets an unbelievable amount of spam at the gate, and whom I know sorts his ham/spam religiously. So you want to check back with your specific setup and its docs. Suid'ing is pretty likely to be per-user, though the definition of user is not specifically clear in the context of a milter (and the final recipient). In either case, that is not SA specific. (SA happily uses both, per-user or site-wide config AND bayes database, depending on context.) Refer to your milter's docs. Irrespective of your feeling -- cheers! /me having a beer Whew! After the conversations I've had here, today, I need one, too! ;-) Don't see this as an attack on you. It isn't. Just pointers on helping your understanding of the situation and your issues. Not always gentle, but that also reflects the initial stance. 
-- char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4"; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1: (c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
Re: getting tons of SPAM
On 07/01/2014 11:15 PM, Daniel Staal wrote: You probably can. ;) But I'm sure Windstream didn't get you every piece of mail immediately after it was sent - just as soon as they could after they got it. Yeah. I'm conservatively holding myself to higher standards than is perhaps warranted. But I think that those standards are along the lines of what my long-time customer thought they were getting from Windstream. And if Windstream had too many issues, I think I would have heard about it. And their servers *did* become unavailable for short periods from time to time. But once I'm satisfied that I've reached parity, the real fun starts. We were on POP3. Now we're on our own IMAP. And there is Dovecot full text search in our near future. It will be fun to be able to go beyond and show off a little. My client company's CEO does a lot of full text searching over his email history. I'm not even saying I like greylisting - I'm just saying you should work to set user expectations to reality, When trust died on the Internet, telnet died, but somehow the unbelievably naive email system did not. It was never prepared for spammer abuse. And we're still accommodating to 7 bit systems for crying out loud. If it were material I suppose it would make a fine antique in someone's collection. Right alongside the PDP-11. which is that email sometimes takes time to get delivered and (rarely) gets lost. If something is absolutely time-critical, they should treat email as a backup, I think that it's largely a matter of *people's* expectations and understanding. If a mail gets missed, folks can understand an occasional I never got your email, we'll send someone over right away. What I object to is the idea of regular and unpredictable delays as introduced by greylisting. And it's just plain ugly from an aesthetic standpoint. But then so are our current email protocols. But I do think that can be fixed. Never did like texting. And that's the alternative. -Steve
Re: Bayes, Manual and Auto Learning Strategies
On 07/01/2014 11:49 PM, Karsten Bräckelmann wrote: Those do not tell you about using file or SQL based databases? They do. But not specifically with respect to autolearn. You never thought about googling for spamassassin per user and friends? You never checked the SA wiki? I have, indeed. No reference to autolearn and persistent storage. The lack of mention is notable. I'd expect people to be lining up to tell me I'm mistaken if I absolutely were. Can you point me to a change log somewhere documenting autolearn moving from in-memory and system-wide to per user and persistent? I don't hold a strong opinion on this. It would be nice if I were wrong. It would open more options. I'm just waiting for evidence that it's the case. My perception is that it's not. -Steve
Re: Bayes, Manual and Auto Learning Strategies
On 07/01/2014 11:14 PM, John Hardin wrote: Autolearn trains the bayes database. The bayes data is stored wherever you configured it to be stored, in a DBM database or SQL or redis, and it's per-user if you configure per-user Bayes databases and scan emails using different usernames (vs. a global user like root or amavis). That is interesting. How sure are you of this? Because if you're pretty sure, it's a piece of information I've been keen to confirm for a while. Odd, though, that before I set up .forward to train incoming mails as ham and disabled autolearn, no nhams were showing up in sa-learn --dump magic for the individual users. Just nspams. -Steve
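The nham / nspam check mentioned above is easy to script per user. The following Python sketch parses text in the layout that sa-learn --dump magic prints, where each counter is the third column of a "non-token data:" line. The sample text and its numbers are invented for illustration (they are not output from anyone's system in this thread), and the function name is hypothetical.

```python
from typing import Dict

# Invented sample in the column layout of `sa-learn --dump magic`.
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0        412          0  non-token data: nspam
0.000          0          0          0  non-token data: nham
0.000          0      51873          0  non-token data: ntokens
"""


def parse_magic(text: str) -> Dict[str, int]:
    """Map each 'non-token data' label to the integer in its third column."""
    counts: Dict[str, int] = {}
    for line in text.splitlines():
        head, sep, label = line.partition("non-token data:")
        if not sep:
            continue  # not a magic line
        fields = head.split()
        if len(fields) < 3:
            continue  # unexpected layout, skip rather than guess
        counts[label.strip()] = int(fields[2])
    return counts


if __name__ == "__main__":
    counts = parse_magic(SAMPLE)
    print("nspam:", counts["nspam"], "nham:", counts["nham"])
```

Running this against each user's real dump before and after a batch of mail would show whether autolearn is adding to that user's nham and nspam counters, which is the evidence being asked for in this part of the thread.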
Re: Bayes, Manual and Auto Learning Strategies
On 07/02/2014 07:19 AM, Steve Bergman wrote: On 07/01/2014 11:49 PM, Karsten Bräckelmann wrote: Those do not tell you about using file or SQL based databases? They do. But not specifically with respect to autolearn. You never thought about googling for spamassassin per user and friends? You never checked the SA wiki? I have, indeed. No reference to autolearn and persistent storage. The lack of mention is notable. I'd expect people to be lining up to tell me I'm mistaken if I absolutely were. Can you point me to a change log somewhere documenting autolearn moving from in-memory and system-wide to per user and persistent? I don't hold a strong opinion on this. It would be nice if I were wrong. It would open more options. I'm just waiting for evidence that it's the case. My perception is that it's not. Let's turn this around. Can you prove autolearn was ever done to memory? If you mean autolearn to journal, this is also file based. I've been using SA since before it was an Apache project, when it was developed by McAfee and the sources were on Sourceforge, and back then it was already file based.
Re: Bayes, Manual and Auto Learning Strategies
Let's turn this around. Can you prove autolearn was ever done to memory? I'm not really interested in proving anything. I'm interested in being convinced that autolearn is individual file-based when spamc is run as the individual user. I'm not quite sure how that would affect my strategy. But it might (or might not) make autolearn useful. -Steve
Re: Bayes, Manual and Auto Learning Strategies
On 07/02/2014 07:37 AM, Steve Bergman wrote: Let's turn this around. Can you prove autolearn was ever done to memory? I'm not really interested in proving anything. I'm interested in being convinced that autolearn is individual file-based when spamc is run as the individual user. It's in the code... but yes, autolearn is always file based and respects the per user settings unless you run spamd with -x. I'm not quite sure how that would affect my strategy. But it might (or might not) make autolearn useful. More importantly, you may need to reconsider whether per-user Bayes will give you the level of quality you're aiming for. From experience I can tell you: it won't. Site-wide Bayes works VERY well even under such ugly conditions as traffic with multiple languages, for ham as well as spam.