Re: Issues with Yahoo/AOL emails and RCVD_NUMERIC_HELO
On 29/07/18 19:21, RW wrote:
> On Sun, 29 Jul 2018 19:00:56 +0100 Dominic Raferd wrote:
>> On Sun, 29 Jul 2018 at 18:33, RW wrote:
>>> On Sun, 29 Jul 2018 12:28:08 +0200 Antony Stone wrote:
>>>> On Sunday 29 July 2018 at 12:17:07, Sebastian Arcus wrote yet another email that's guaranteed to fail DMARC with a reject when posted through a mailing list, and consequently I didn't receive: ...
>>> Ditto, and I haven't received (and won't receive) any of his subsequent postings either (opendmarc is - quite rightly - blocking them).
>> More strangely, I didn't receive this message (above) except apparently when quoted in reply by RW. Note to OP: when posting to mailing lists, use a domain that does not have DMARC with p=reject (and preferably not p=quarantine either).
> Actually it's worse than that, the main problem (the last I looked) is that his DKIM signs some List-* headers which guarantees a DKIM fail when he posts through a mailing list.
I had no idea that DKIM signing can be such a nightmare. I have disabled all DKIM for the time being until I can get my head around how to configure it properly - if that is even possible. Thank you for pointing it out - I wasn't aware of the issue.
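The DKIM point above can be checked mechanically. Mailing lists typically add or rewrite List-* headers, so a signature whose h= tag covers them will usually stop verifying once the list has handled the message. The following is a minimal sketch, not anything posted in the thread: the function name and the sample signature (domain example.org, selector mail, elided bh=/b= values) are made up for illustration.

```python
import re


def signed_list_headers(dkim_signature: str) -> list[str]:
    """Return the List-* header names found in the h= tag of a DKIM-Signature value."""
    # Unfold the header: continuation lines begin with whitespace.
    unfolded = re.sub(r"\r?\n[ \t]+", " ", dkim_signature)
    tags = {}
    for part in unfolded.split(";"):
        if "=" in part:
            name, value = part.split("=", 1)
            tags[name.strip()] = value.strip()
    # h= is a colon-separated list of the header fields covered by the signature.
    signed = [h.strip() for h in tags.get("h", "").split(":") if h.strip()]
    return [h for h in signed if h.lower().startswith("list-")]


# Hypothetical signature value, for illustration only.
example = (
    "v=1; a=rsa-sha256; d=example.org; s=mail;\n"
    "\th=From:To:Subject:Date:List-Id:List-Unsubscribe;\n"
    "\tbh=...; b=..."
)
print(signed_list_headers(example))
```

If the returned list is non-empty, the signer's header list is worth trimming before posting through a list; how to do that depends on the signing software in use.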
Re: Issues with Yahoo/AOL emails and RCVD_NUMERIC_HELO
On 29/07/18 19:00, Dominic Raferd wrote:
> On Sun, 29 Jul 2018 at 18:33, RW wrote:
>> On Sun, 29 Jul 2018 12:28:08 +0200 Antony Stone wrote:
>>> On Sunday 29 July 2018 at 12:17:07, Sebastian Arcus wrote yet another email that's guaranteed to fail DMARC with a reject when posted through a mailing list, and consequently I didn't receive: ...
>> Ditto, and I haven't received (and won't receive) any of his subsequent postings either (opendmarc is - quite rightly - blocking them).
> More strangely, I didn't receive this message (above) except apparently when quoted in reply by RW. Note to OP: when posting to mailing lists, use a domain that does not have DMARC with p=reject (and preferably not p=quarantine either).
Thank you for highlighting this - I wasn't aware of the problem. I had no idea that enabling DMARC fixes one set of problems while creating a whole different one! I've disabled DMARC for the time being, until I find a workable solution.
Re: Issues with Yahoo/AOL emails and RCVD_NUMERIC_HELO
On 29/07/18 14:36, Matus UHLAR - fantomas wrote:
> On Sunday 29 July 2018 at 12:17:07, Sebastian Arcus wrote:
>> I've been having a number of emails recently from Yahoo and AOL senders hitting the RCVD_NUMERIC_HELO rule. I'm trying to understand what is going on: 1. First off, the rule hits on the EHLO line - which means that it is an authenticated SMTP submission.
> On 29/07/18 11:28, Antony Stone wrote:
>> Er, what? No, EHLO simply means "Hello, I'm capable of doing ESMTP".
> On 29.07.18 12:29, Sebastian Arcus wrote:
>> Looking again at it - the 82.132.242.82 is registered as O2/Telefonica wireless broadband. I wonder if this is a 3G/4G connection - which in UK always has a private IP address - at the mobile phone level. Maybe that's why the confusion - the MUA on the mobile phone thinks it is 10.7.54.227 (which it is), but the Yahoo server can only see the public IP 82.132.242.82, which belongs to the O2 gateway. Could that explain that particular header?
> it does.
> Received: from 82.132.242.82 (EHLO [10.7.54.227]) ([82.132.242.82]) by smtp409.mail.ir2.yahoo.com (Oath Hermes SMTP Server) with ESMTPA ID 84be422cfd662692400891131b957bd8 for ; Mon, 23 Jul 2018 13:59:41 + (UTC)
> Looking at /usr/share/perl5/Mail/SpamAssassin/Plugin/RelayEval.pm I guess it should not match:
>   my $rcvd = $pms->{relays_untrusted_str};
>   if ($rcvd) {
>     my $IP_ADDRESS = IPV4_ADDRESS;
>     my $IP_PRIVATE = IP_PRIVATE;
>     local $1;
>     if ($rcvd =~ /\bhelo=($IP_ADDRESS)(?=[\000-\040,;\[()<>]|\z)/i # Bug 5878
>         && $1 !~ /$IP_PRIVATE/) {
>       return 1;
>     }
> but maybe I read wrong. Which SA version do you have?
I have:
# spamassassin --version
SpamAssassin version 4.0.0-r1823176 running on Perl version 5.26.2
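The logic of the quoted RelayEval.pm excerpt can be restated as a small sketch: the check fires only when the HELO value is a bare IPv4 address that is not private. This is an approximation rather than SpamAssassin's code. Python's `ipaddress.is_private` stands in for SpamAssassin's IP_PRIVATE pattern (the two may not cover exactly the same ranges), and the relay strings in the usage lines are a simplified, assumed format.

```python
import ipaddress
import re

# A "helo=" value that is a dotted quad, followed by a delimiter or end of string.
HELO_RE = re.compile(r"\bhelo=(\d{1,3}(?:\.\d{1,3}){3})(?=[\s,;\[()<>]|$)", re.I)


def numeric_helo_hit(relays_untrusted: str) -> bool:
    """Approximate the quoted check: numeric HELO that is not a private address."""
    match = HELO_RE.search(relays_untrusted)
    if not match:
        return False  # HELO was a hostname, or absent
    try:
        addr = ipaddress.ip_address(match.group(1))
    except ValueError:
        return False  # looked like a dotted quad but is not a valid address
    # Stand-in for SpamAssassin's IP_PRIVATE test (assumption: similar coverage).
    return not addr.is_private


# Assumed, simplified relay metadata strings for illustration.
print(numeric_helo_hit("[ ip=82.132.242.82 helo=10.7.54.227 by=example ]"))
print(numeric_helo_hit("[ ip=82.132.242.82 helo=82.132.242.82 by=example ]"))
```

Read this way, a private HELO such as 10.7.54.227 should not trigger the rule, which matches the doubt expressed in the quoted reply.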
Re: Issues with Yahoo/AOL emails and RCVD_NUMERIC_HELO
On 29/07/18 11:28, Antony Stone wrote:
> On Sunday 29 July 2018 at 12:17:07, Sebastian Arcus wrote:
>> I've been having a number of emails recently from Yahoo and AOL senders hitting the RCVD_NUMERIC_HELO rule. I'm trying to understand what is going on: 1. First off, the rule hits on the EHLO line - which means that it is an authenticated SMTP submission.
> Er, what? No, EHLO simply means "Hello, I'm capable of doing ESMTP".
Thank you - I clearly got that one wrong. Looking again at it - the 82.132.242.82 is registered as O2/Telefonica wireless broadband. I wonder if this is a 3G/4G connection - which in UK always has a private IP address - at the mobile phone level. Maybe that's why the confusion - the MUA on the mobile phone thinks it is 10.7.54.227 (which it is), but the Yahoo server can only see the public IP 82.132.242.82, which belongs to the O2 gateway. Could that explain that particular header?
>> After all, if it is EHLO, it probably is an MUA,
> No; MTAs also speak E/SMTP to each other, and some of those Received headers indicating handover of the mail from one server to another will contain the HELO or EHLO greetings.
>> 2. Or maybe this is caused by Yahoo's end - in which case would some sort of exception be a good idea?
> Yes, I would do that.
>> Or maybe I am misunderstanding completely what is going on? I've uploaded a set of headers here: https://pastebin.com/KDV1f0wW
> Given that the example you've posted is from a machine with a public IP 82.132.242.82, but thinks it has a private IP 10.7.54.227, I'm not entirely surprised there is no rDNS set up for the private address.
Issues with Yahoo/AOL emails and RCVD_NUMERIC_HELO
I've been having a number of emails recently from Yahoo and AOL senders hitting the RCVD_NUMERIC_HELO rule. I'm trying to understand what is going on: 1. First off, the rule hits on the EHLO line - which means that it is an authenticated SMTP submission. Is the correct HELO format important when the client actually does authenticated SMTP? After all, if it is EHLO, it probably is an MUA, which can't be expected to have proper DNS etc. 2. Or maybe this is caused by Yahoo's end - in which case would some sort of exception be a good idea? Or maybe I am misunderstanding completely what is going on? I've uploaded a set of headers here: https://pastebin.com/KDV1f0wW Thank you for any useful hints.
Re: SPF_HELO_FAIL triggers on domain with valid SPF record and HELO settings
On 11/06/18 08:56, Sebastian Arcus wrote: I am running SA 4.0.0-r1823176 on Perl 5.26.2. On a number of domains I administer, outbound mail triggers the SPF_HELO_FAIL rule - but the regular SPF check passes. I am struggling to see why this is happening, as the HELO name is set to the same value as the name of the server/dns name, it has rDNS - and it clearly passes during the regular SPF check - but not the SPF_HELO check. I have re-checked the domain settings at mxtoolbox.com - and there doesn't seem to be any problem. Any ideas please? It turns out that it is indeed something I did. Somehow in all this time since I started to use SPF, I never realised that SPF checks are also done on the HELO hostname itself, not only the sending domain - and the need to have a separate SPF record for it. I actually had a separate SPF record for mail.sinclair-accounting.co.uk, in which I denied everything - as my understanding was that there will never be an address of the type u...@mail.sinclair-accounting.co.uk - so I wouldn't need to allow anything on SPF. All corrected now - thank you for the input.
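To illustrate the resolution above: SPF can be evaluated against the HELO hostname as well as the envelope sender domain, so a deny-everything record published for the HELO name fails even though the sender domain passes. The sketch below evaluates only a tiny subset of SPF (the ip4 and all mechanisms) and is not a real SPF implementation. The records in the usage lines are assumptions: the thread's debug output only shows that '-all' matched, and it does not say what the corrected record looks like.

```python
import ipaddress


def eval_spf(record: str, client_ip: str) -> str:
    """Evaluate a tiny subset of SPF: only the ip4 and all mechanisms are handled."""
    results = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"
    ip = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"  # default qualifier when none is given
        if term[0] in results:
            qualifier, term = term[0], term[1:]
        mech = term.lower()
        if mech == "all":
            return results[qualifier]
        if mech.startswith("ip4:"):
            if ip in ipaddress.ip_network(mech[4:], strict=False):
                return results[qualifier]
        # Other mechanisms (a, mx, include, ...) are ignored in this sketch.
    return "neutral"


# Hypothetical records for the HELO hostname.
print(eval_spf("v=spf1 -all", "80.229.84.190"))
print(eval_spf("v=spf1 ip4:80.229.84.190 -all", "80.229.84.190"))
```

The first record rejects every client, which is consistent with the SPF_HELO_FAIL hit; a record that authorises the server's own address would be one way to make the HELO check pass.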
Re: SPF_HELO_FAIL triggers on domain with valid SPF record and HELO settings
On 11/06/18 10:20, Reindl Harald wrote: Am 11.06.2018 um 10:57 schrieb Sebastian Arcus: On 11/06/18 09:39, Matus UHLAR - fantomas wrote: On 11.06.18 08:56, Sebastian Arcus wrote: I am running SA 4.0.0-r1823176 on Perl 5.26.2. On a number of domains I administer, outbound mail triggers the SPF_HELO_FAIL rule - but the regular SPF check passes. I am struggling to see why this is happening, as the HELO name is set to the same value as the name of the server/dns name, it has rDNS - and it clearly passes during the regular SPF check - but not the SPF_HELO check. I have re-checked the domain settings at mxtoolbox.com - and there doesn't seem to be any problem. Any ideas please? do users use SMTP authentication? Messages submitted over SMTP are authenticated. Other messages are generated locally on the sending server and passed on the command line to Exim. All messages hit SPF_HELO_FAIL Is that visible in headers? I'm not really sure. Which bit of the headers should contain the authentication data? look if exim has a similar feature http://www.postfix.org/postconf.5.html#smtpd_sasl_authenticated_header My question is, is this header a requirement? Both servers at both ends are configured by me, so I know the smtp submission is authenticated. Is the SPF check at the receiving end supposed to fail if it can't find a specific header showing the authenticated user at the sending end? What is the connection between SPF HELO checks at the receiving server, and the user which is submitting the message to the sending server? I'm not really following I'm afraid - but I could be missing the point.
Re: SPF_HELO_FAIL triggers on domain with valid SPF record and HELO settings
On 11/06/18 09:39, Matus UHLAR - fantomas wrote: On 11.06.18 08:56, Sebastian Arcus wrote: I am running SA 4.0.0-r1823176 on Perl 5.26.2. On a number of domains I administer, outbound mail triggers the SPF_HELO_FAIL rule - but the regular SPF check passes. I am struggling to see why this is happening, as the HELO name is set to the same value as the name of the server/dns name, it has rDNS - and it clearly passes during the regular SPF check - but not the SPF_HELO check. I have re-checked the domain settings at mxtoolbox.com - and there doesn't seem to be any problem. Any ideas please? do users use SMTP authentication? Messages submitted over SMTP are authenticated. Other messages are generated locally on the sending server and passed on the command line to Exim. All messages hit SPF_HELO_FAIL Is that visible in headers? I'm not really sure. Which bit of the headers should contain the authentication data? # spamassassin -D 2>&1 < /test.eml | grep -i spf we need to see the Received: header. Sure: Received: from mail.sinclair-accounting.co.uk ([80.229.84.190]:47700) by mail.open-t.co.uk with esmtps (TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256) (Exim 4.90) (envelope-from ) id 1fSIEL-0001Wn-P4 for email_removed; Mon, 11 Jun 2018 09:31:16 +0100 Received: from jucara ([192.168.71.82]) by mail.sinclair-accounting.co.uk with esmtpsa (TLSv1.2:ECDHE-RSA-AES128-GCM-SHA256:128) (Exim 4.90_1) (envelope-from ) id 1fSIEG-0007bx-Lw for email_removed; Mon, 11 Jun 2018 09:31:10 +0100
SPF_HELO_FAIL triggers on domain with valid SPF record and HELO settings
I am running SA 4.0.0-r1823176 on Perl 5.26.2. On a number of domains I administer, outbound mail triggers the SPF_HELO_FAIL rule - but the regular SPF check passes. I am struggling to see why this is happening, as the HELO name is set to the same value as the name of the server/dns name, it has rDNS - and it clearly passes during the regular SPF check - but not the SPF_HELO check. I have re-checked the domain settings at mxtoolbox.com - and there doesn't seem to be any problem. Any ideas please?
# spamassassin -D 2>&1 < /test.eml | grep -i spf
Jun 11 08:46:30.177 [5534] dbg: spf: checking to see if the message has a Received-SPF header that we can use
Jun 11 08:46:30.341 [5534] dbg: spf: using Mail::SPF for SPF checks
Jun 11 08:46:30.342 [5534] dbg: spf: found Envelope-From in first external Received header
Jun 11 08:46:30.342 [5534] dbg: spf: checking EnvelopeFrom (helo=mail.sinclair-accounting.co.uk, ip=80.229.84.190, envfrom=)
Jun 11 08:46:30.519 [5534] dbg: spf: query for /80.229.84.190/mail.sinclair-accounting.co.uk: result: pass, comment: , text: Mechanism 'mx' matched
Jun 11 08:46:30.758 [5534] dbg: spf: already checked for Received-SPF headers, proceeding with DNS based checks
Jun 11 08:46:30.758 [5534] dbg: spf: checking HELO (helo=mail.sinclair-accounting.co.uk, ip=80.229.84.190)
Jun 11 08:46:30.776 [5534] dbg: spf: query for /80.229.84.190/mail.sinclair-accounting.co.uk: result: fail, comment: Please see http://www.openspf.org/Why?s=helo;id=mail.sinclair-accounting.co.uk;ip=80.229.84.190;r=obelisk.open-t.lan, text: Mechanism '-all' matched
Jun 11 08:46:30.836 [5534] dbg: spf: def_whitelist_from_spf: ser...@sinclair-accounting.co.uk is not in DEF_WHITELIST_FROM_SPF
Jun 11 08:46:30.846 [5534] dbg: rules: ran eval rule SPF_PASS ==> got hit (1)
Jun 11 08:46:30.853 [5534] dbg: rules: ran eval rule SPF_HELO_FAIL ==> got hit (1)
Re: FP with URI_TRY_3LD on get.adobe.com
On 27/04/18 16:22, John Hardin wrote:
> On Fri, 27 Apr 2018, Sebastian Arcus wrote:
>> On 27/04/18 10:49, Sebastian Arcus wrote:
>>> I am getting some FP's with URI_TRY_3LD hitting the url get.adobe.com in the body of emails: Apr 27 10:45:39.330 [32173] dbg: rules: ran uri rule URI_TRY_3LD ==> got hit: "http://get.adobe.com" Would it be possible to add some exception to this rule - as many legitimate emails containing invoice attachments in pdf include the above url in the body.
>> It also appears to not like some DHL url's for some reason: Apr 27 11:02:05.148 [32339] dbg: rules: ran uri rule URI_TRY_3LD ==> got hit: "https://mybill.dhl.com"
> my{mumble}.mumble.com is targeted. I'll think about that one; the rule isn't scored highly and I could see that helping out to detect DHL phishing.
If it is detecting DHL phishing, that is good - but if it is triggering on both legitimate DHL emails and phishing emails, I'm not sure it is that useful?
Re: FP with URI_TRY_3LD on get.adobe.com
On 27/04/18 16:19, John Hardin wrote:
> On Fri, 27 Apr 2018, Sebastian Arcus wrote:
>> I am getting some FP's with URI_TRY_3LD hitting the url get.adobe.com in the body of emails: Apr 27 10:45:39.330 [32173] dbg: rules: ran uri rule URI_TRY_3LD ==> got hit: "http://get.adobe.com" Would it be possible to add some exception to this rule - as many legitimate emails containing invoice attachments in pdf include the above url in the body.
> Fixed.
Thank you
Re: FP with URI_TRY_3LD on get.adobe.com
On 27/04/18 10:49, Sebastian Arcus wrote:
> I am getting some FP's with URI_TRY_3LD hitting the url get.adobe.com in the body of emails: Apr 27 10:45:39.330 [32173] dbg: rules: ran uri rule URI_TRY_3LD ==> got hit: "http://get.adobe.com" Would it be possible to add some exception to this rule - as many legitimate emails containing invoice attachments in pdf include the above url in the body.
It also appears to not like some DHL url's for some reason: Apr 27 11:02:05.148 [32339] dbg: rules: ran uri rule URI_TRY_3LD ==> got hit: "https://mybill.dhl.com"
FP with URI_TRY_3LD on get.adobe.com
I am getting some FP's with URI_TRY_3LD hitting the url get.adobe.com in the body of emails: Apr 27 10:45:39.330 [32173] dbg: rules: ran uri rule URI_TRY_3LD ==> got hit: "http://get.adobe.com" Would it be possible to add some exception to this rule - as many legitimate emails containing invoice attachments in pdf include the above url in the body.
Re: URI_TRY_3LD fp's with QuickBooks Intuit emails
On 13/04/18 16:39, John Hardin wrote: On Fri, 13 Apr 2018, John Hardin wrote: On Fri, 13 Apr 2018, John Hardin wrote: On Fri, 13 Apr 2018, Giovanni Bechis wrote: On 04/13/18 09:06, Sebastian Arcus wrote: But when it hits, it still adds 2.0 to the score (and I haven't customized the score anywhere else). Is this a special form of SA syntax? The score in the current update is 0.001 across the board. Are you up-to-date and are you *sure* you don't have any overrides anywhere? 72_scores.cf:score URI_TRY_3LD 0.001 0.001 0.001 0.001 OK - after more digging it surfaced that the original report with 2.0 score is from a different server than the one I am testing on. That server has 2.0 scores in 4.00/updates_spamassassin_org/72_active.cf When trying to run sa-update on that server, I am getting errors, so it must be that SA stopped updating a while ago there. I will dig in and find out why. Thank you for flagging the fact that the default score on the current configs is not supposed to be 2.0!
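When chasing an unexpected score like this, it can help to list every active score line for a rule across the rule files, ignoring commented-out ones such as "#score URI_TRY_3LD 2.000 # limit". The helper below is a hypothetical sketch, not an SpamAssassin tool. It treats everything after "#" as a comment, which is a simplification, and the sample text simply combines the two lines quoted in this thread.

```python
def active_scores(cf_text: str, rule: str) -> list[list[float]]:
    """Return the values of every uncommented 'score RULE ...' line in a .cf text."""
    found = []
    for raw in cf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # simplification: '#' starts a comment
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "score" and fields[1] == rule:
            found.append([float(value) for value in fields[2:]])
    return found


# The two lines quoted in the thread, combined into one sample text.
sample = """\
#score URI_TRY_3LD 2.000 # limit
score URI_TRY_3LD 0.001 0.001 0.001 0.001
"""
print(active_scores(sample, "URI_TRY_3LD"))
```

Running something like this over the local rule directories on both servers would show quickly whether an outdated update directory or a local override is supplying the 2.0.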
Re: URI_TRY_3LD fp's with QuickBooks Intuit emails
On 13/04/18 11:36, Giovanni Bechis wrote:
> On 04/13/18 09:06, Sebastian Arcus wrote:
>> Hello all. I am getting some fp's with emails from QuickBooks / Intuit with the above rule: Apr 13 08:00:30.853 [5768] dbg: rules: ran uri rule URI_TRY_3LD ==> got hit: "https://myturbotax.intuit.com" On a slightly different note, and mainly for my curiosity to understand SA rules syntax, in 72_active.cf, the score seems to be commented out: #score URI_TRY_3LD 2.000 # limit But when it hits, it still adds 2.0 to the score (and I haven't customized the score anywhere else). Is this a special form of SA syntax?
> the score is present in rulesrc/sandbox/jhardin/20_misc_testing.cf with tflags publish.
Is that a location on the SA server - or am I supposed to have those dirs locally here? I can't seem to find them anywhere locally.
URI_TRY_3LD fp's with QuickBooks Intuit emails
Hello all. I am getting some fp's with emails from QuickBooks / Intuit with the above rule: Apr 13 08:00:30.853 [5768] dbg: rules: ran uri rule URI_TRY_3LD ==> got hit: "https://myturbotax.intuit.com" On a slightly different note, and mainly for my curiosity to understand SA rules syntax, in 72_active.cf, the score seems to be commented out: #score URI_TRY_3LD 2.000 # limit But when it hits, it still adds 2.0 to the score (and I haven't customized the score anywhere else). Is this a special form of SA syntax? Thank you for any answers
[OT] Re: Check for valid MX of sender and rspamd testing
On 10/04/18 08:41, Daniele Duca wrote:
> On 09/04/2018 20:40, Sebastian Arcus wrote:
>> This might not really answer your question, but I've had really good results leaving all this to the MTA (Exim in my case). I actually go for the whole hog full callout verification - checking with the MX that the sender really exists. I know that some people are against this and say that you get blacklisted - but I've been doing this for about 8 months on 4 sites and it has worked very well. I have a local full callout verification whitelist - to skip callout verification mainly for Microsoft operated domains - which will blacklist you at the drop of the hat.
> Hello Sebastian, I'm curious about this approach. I never tried it, but, assuming that you check the MX of the envelope from domain, how do you deal with poorly-configured-but-legit VPS that use, for example, www-d...@hostname.of.the.server ? I have live examples of wordpress and vbulletin installations that have non-existent envelope from mailboxes or VPS hostnames without MX records. There are also other services that actively send email in the form of "nore...@domain.com". If I understood correctly, your approach would heavily penalize these senders. I know that in the ideal world everyone should configure their systems neatly, but unfortunately we are far from ideal conditions in real life :/ I'm happy to discuss this technique but I can't really afford the administrative overhead I would have with users complaining about rejected emails..
> https://www.exim.org/exim-html-current/doc/html/spec_html/ch-access_control_lists.html
Hi Daniele. I agree that configuring a real life system is often a balancing act between having a standards compliant and efficient system on one side - but at the same time compromising so that the users are not too inconvenienced. I started with a configuration which was as strict as I preferred, and then gradually loosened things up. 
I also think that there is some scope to penalizing badly configured systems - if time and circumstances allow. Accepting crap often means condoning it - and encouraging systems administrators in sloppy practices. Of course, if you can find the time to do this - and not end up inconveniencing your own users too much :-) Generally if emails come from poorly configured servers and they are relatively small providers or organisations, I try and liaise with them and get them to implement better settings. Fortunately I can do this as most of the setups at my end are relatively small - but in larger ones that is probably not possible. For larger providers and domains at the sending end, sometimes I have to implement local workarounds and whitelists - as there isn't usually much chance to get any cooperation from them. I believe (but I could be wrong) that the envelope from address should be able to receive bounce messages - so I don't think an address of the type www-data@server_hostname is acceptable. Also, I found that most noreply@ type of addresses from clued-up providers seem to react correctly to callout verifications and confirm the address is real and valid (although they might return a bounceback message if you actually try to email them). I think this should be the correct way to configure noreply@ addresses. The exception to this is pretty much all Microsoft controlled domains and systems - which seem to be rubbish at both following standards and also configuring a decent email setup. Hence why I have to have a local whitelist and skip verification for all MX's of the form *.outlook.com (which include Microsoft cloud hosted domains).
Re: Check for valid MX of sender and rspamd testing
On 09/04/18 15:24, David Jones wrote:
> I was wondering if anyone knows of an SA plugin or another method to determine if the envelope-from domain has a valid MX record that is listening on TCP port 25. I don't think it would be a major scorer but it could be useful in meta rules.
This might not really answer your question, but I've had really good results leaving all this to the MTA (Exim in my case). I actually go for the whole hog full callout verification - checking with the MX that the sender really exists. I know that some people are against this and say that you get blacklisted - but I've been doing this for about 8 months on 4 sites and it has worked very well. I have a local full callout verification whitelist - to skip callout verification mainly for Microsoft operated domains - which will blacklist you at the drop of the hat. Pretty much everybody else on the internet seems to understand that full callout verification has more advantages than disadvantages in fighting spam. I also use Exim to keep count of how many callout verifications have failed for an origin IP address and then start rejecting connections after 10 failures in 24 hours - to stop spammers from using my boxes as dictionary-attack proxies against other domains (and getting me blacklisted in the process). All of this seems to have worked out very well so far - but I realise that it will depend on the size of the email system and number of mailboxes and all sorts of other things - so it might not work so well elsewhere.
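The per-IP limit described above (reject once an origin address has accumulated 10 failed callout verifications within 24 hours) amounts to a sliding-window counter. The sketch below shows that bookkeeping only; it is not the Exim configuration used on those servers, and the class and method names are invented for illustration.

```python
from collections import defaultdict, deque


class CalloutFailureLimiter:
    """Track failed callout verifications per origin IP in a sliding time window."""

    def __init__(self, max_failures: int = 10, window_seconds: int = 24 * 3600):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # ip -> timestamps of failures

    def record_failure(self, ip: str, now: float) -> None:
        self._failures[ip].append(now)

    def should_reject(self, ip: str, now: float) -> bool:
        timestamps = self._failures.get(ip)
        if not timestamps:
            return False
        # Drop failures that have aged out of the window.
        while timestamps and now - timestamps[0] > self.window_seconds:
            timestamps.popleft()
        return len(timestamps) >= self.max_failures


limiter = CalloutFailureLimiter()
for second in range(10):
    limiter.record_failure("192.0.2.1", now=float(second))
print(limiter.should_reject("192.0.2.1", now=60.0))
```

192.0.2.1 is a documentation address used as a placeholder. Timestamps are passed in explicitly so the behaviour is easy to test; a real deployment would keep this state in whatever store the MTA provides.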
Re: MSGID_SPAM_CAPS fp's hitting messages from The Pension Regulator in UK
On 08/04/18 13:41, David Jones wrote: On 04/07/2018 10:42 AM, Sebastian Arcus wrote: I'm not entirely sure what is the cause of this - notification emails from The Pension Regulator in UK (a government body overseeing pensions) have the destination email in upper case as part of the Message-ID. I don't know if the user has input their email address in caps when creating the account with TPR, and the system at TPR just preserves caps - or maybe their email software does that on purpose somehow. In all events, all email notifications from them go straight to the Junk folder. Do the standards really require a message id to be in all lower case? I've enclosed one of the messages received here: https://pastebin.com/9Bmu3pj1 I added this to the 60_whitelist_auth.cf to trust this sender: def_whitelist_auth *@*.tpr.gov.uk This will get pushed out in a couple of days by sa-update. I know it's not directly addressing your question about the rule's high score but this is how I address these types of issues. If you create a "fast lane" for trusted senders then this allows for more aggressive tactics/scores for new and untrusted senders. Thank you David. It sounds like a reasonable solution to me.
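The address in a def_whitelist_auth entry is a file-glob-style pattern in which * and ? act as wildcards. The sketch below approximates that matching so the reach of `*@*.tpr.gov.uk` is easier to see. The conversion function is written for this note and is not SpamAssassin's own code, and the sender addresses are made up. The entry also only takes effect when the message authenticates via SPF or DKIM, which this sketch does not model.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob-style address pattern (* and ? only) into an anchored regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("^" + "".join(parts) + "$", re.IGNORECASE)


entry = glob_to_regex("*@*.tpr.gov.uk")
# Hypothetical sender addresses, for illustration only.
for sender in ("noreply@mail.tpr.gov.uk", "noreply@tpr.gov.uk", "x@tpr.gov.uk.example.com"):
    print(sender, bool(entry.match(sender)))
```

Under this reading the pattern covers senders on subdomains of tpr.gov.uk but not an address directly at tpr.gov.uk, which may matter if the regulator sends from both.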
Re: MSGID_SPAM_CAPS fp's hitting messages from The Pension Regulator in UK
On 07/04/18 21:20, Bill Cole wrote:
> On 7 Apr 2018, at 11:42 (-0400), Sebastian Arcus wrote:
>> Do the standards really require a message id to be in all lower case?
> Of course not, and that's also not an accurate description of MSGID_SPAM_CAPS. A small minority of rules in SA are based on any external standard. They are empirical and pragmatic, not legalistic. There is a complex analysis of multiple mail streams used to generate scores for the rules and to decide which rules are good enough to publish in updates, run on a daily basis because it takes most of a day to run. The fact that MSGID_SPAM_CAPS exists with that name (and not with a 'T_' or developer's tag prefix) implies that at some point in the past it was reliable enough as an indicator of spam to be part of the default set.
Thank you Bill. That is useful to know.
Re: MSGID_SPAM_CAPS fp's hitting messages from The Pension Regulator in UK
On 07/04/18 17:22, Antony Stone wrote: On Saturday 07 April 2018 at 18:10:18, Sebastian Arcus wrote: On 07/04/18 16:52, Reindl Harald wrote something. Thank you for answering, but really, in effect you haven't answered at all my question. And the way I customise the scores are based on the type of emails received at this particular site. It might seem "idiotic" to you, but there are reasons for those scores. Not everyone receives the same mix of email - so it isn't constructive to start calling other people's scoring "idiotic" just because they are not the same as your own or the defaults. Please note that there are good reasons why you received only a private response from this person, and that he is no longer permitted to post to the list. My personal recommendation is to consider carefully anything he says, judge whether you find it useful, and not to reply. Hi Antony. Thank you kindly for the information. I didn't notice that the message was private and not from the list - as the message CC'ed the list - so it looked like a regular reply. I will take your advice - thank you.
Re: MSGID_SPAM_CAPS fp's hitting messages from The Pension Regulator in UK
On 07/04/18 17:14, Reindl Harald wrote:
> Am 07.04.2018 um 18:10 schrieb Sebastian Arcus:
>> And the way I customise the scores are based on the type of emails received at this particular site. It might seem "idiotic" to you, but there are reasons for those scores. Not everyone receives the same mix of email - so it isn't constructive to start calling other people's scoring "idiotic" just because they are not the same as your own or the defaults
> if a single misfired rule make a BAYES_00 message to a spam message it's idiotic - it's that easy - with or without MSGID_SPAM_CAPS that can happen at every moment in time and when you trust your bayes -0.2 is not justified and if you don't trust your bayes train it
A default score of 3.1 for MSGID_SPAM_CAPS is pretty high - even compared with some of the DNS blacklists rules - and some of those are pretty powerful IMHO. Hence why I was trying to understand why this rule is assigned such a high score and what is the significance of it. Secondly, I found in the past that a high negative score for BAYES_00 is counter-productive, because:
1. As soon as you receive a spam message with a new type of content, it essentially has a free ride until it gets put through the bayes training - as the high negative on BAYES_00 counteracts any other rule it hits - even pretty effective rules, such as Pyzor and blacklists.
2. Spammers have learned from the above, and I get a lot of spam which changes the wording all the time, so that bayes becomes essentially ineffective against it - but at the same time it stops other rules from working - because of the high negative scores on low BAYES.
3. Spammers have also learned from no. 1, and I see a lot of extremely short spam messages - just one short line of few words. Bayes seems to be extremely ineffective on these very short messages, no matter how much you train it - because of the small amount of data to work on, and with a little bit of cunning and varying the words used - they all score as BAYES_00. Again, the high negative score gives these spammers a guaranteed free ride, as it overrides any other rules.
So at least from the type of spam that I see, BAYES_00 with a large negative score is really counter-productive and it makes SA far less efficient at picking spam. BAYES_00 doesn't necessarily mean "I am sure this is not spam" - as a good quality whitelist rule would, for example. It merely means "I haven't really seen this type of spam before", or simply "this message is too short and I really can't say anything useful about it". For these reasons, I don't think low BAYES scores should be given large negative scores - and hence why I changed them on my systems - with really good results.
Re: MSGID_SPAM_CAPS fp's hitting messages from The Pension Regulator in UK
On 07/04/18 16:52, Reindl Harald wrote:
>> Content analysis details: (5.1 points, 4.0 required)
> who did set the *non default* required score to 4.0? why did the person not adjust -0.2 for BAYES_00 too? the scoring of this system is idiotic! required score here is 5.5 and BAYES_00 is scored to -3.5 while milter reject starts with 8.0 so nothing would happen just because *one single* rule hit wrongly
Thank you for answering, but really, in effect you haven't answered my question at all. I was merely trying to understand the MSGID_SPAM_CAPS rule - and what rationale it is based on. I know I can alter the score just for it - I was trying to understand what other implications this might have. I didn't even suggest that SA default config or scoring needs to change! And the way I customise the scores are based on the type of emails received at this particular site. It might seem "idiotic" to you, but there are reasons for those scores. Not everyone receives the same mix of email - so it isn't constructive to start calling other people's scoring "idiotic" just because they are not the same as your own or the defaults.
> Am 07.04.2018 um 17:42 schrieb Sebastian Arcus:
>> I'm not entirely sure what is the cause of this - notification emails from The Pension Regulator in UK (a government body overseeing pensions) have the destination email in upper case as part of the Message-ID. I don't know if the user has input their email address in caps when creating the account with TPR, and the system at TPR just preserves caps - or maybe their email software does that on purpose somehow. In all events, all email notifications from them go straight to the Junk folder. Do the standards really require a message id to be in all lower case? I've enclosed one of the messages received here: https://pastebin.com/9Bmu3pj
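The disagreement about scoring comes down to simple arithmetic on the figures quoted in this thread: a total of 5.1 against a required 4.0, BAYES_00 at -0.2 on the reporting site, and the alternative of 5.5 required with BAYES_00 at -3.5. The sketch below recomputes the outcome under each setup. It assumes MSGID_SPAM_CAPS contributed the 3.1 default mentioned elsewhere in the thread, which the messages do not confirm.

```python
def rescore(total: float, old_score: float, new_score: float) -> float:
    """Replace one rule's contribution in a message total."""
    return round(total - old_score + new_score, 2)


reported_total = 5.1          # "5.1 points, 4.0 required"
site_required = 4.0
site_bayes_00 = -0.2
other_required = 5.5          # the alternative setup described in the reply
other_bayes_00 = -3.5
msgid_spam_caps = 3.1         # assumption: the default score was in effect

# A message is tagged as spam when its total reaches the required score.
spam_on_site = reported_total >= site_required
other_total = rescore(reported_total, site_bayes_00, other_bayes_00)
spam_on_other = other_total >= other_required
total_without_caps = rescore(reported_total, msgid_spam_caps, 0.0)

print(spam_on_site, other_total, spam_on_other, total_without_caps)
```

Both views are visible in the numbers: with the site's own scores the single MSGID_SPAM_CAPS hit is enough to cross the threshold, while a strongly negative BAYES_00 would absorb it, at the cost of the free ride for unseen spam described in the other reply.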
MSGID_SPAM_CAPS fp's hitting messages from The Pension Regulator in UK
I'm not entirely sure what is the cause of this - notification emails from The Pension Regulator in UK (a government body overseeing pensions) have the destination email in upper case as part of the Message-ID. I don't know if the user has input their email address in caps when creating the account with TPR, and the system at TPR just preserves caps - or maybe their email software does that on purpose somehow. In all events, all email notifications from them go straight to the Junk folder. Do the standards really require a message id to be in all lower case? I've enclosed one of the messages received here: https://pastebin.com/9Bmu3pj1
Re: FUZZY_XPILL FP hitting all Travelodge emails
On 02/04/18 14:58, RW wrote: On Mon, 2 Apr 2018 08:26:27 -0500 David Jones wrote: On 04/02/2018 07:18 AM, Sebastian Arcus wrote: Thank you - one example here: https://pastebin.com/UGStfCys It found "xon, OX" in "Aylesbury Road, Thame, Oxon, OX9 3AT" It's an aggressive rule that finds anything that might be an obfuscated Xanax. It only scores 0.8 points because it can produce FPs like this. Actually that is my private, custom score. I think the default is 2.8 or something like that.
Re: FUZZY_XPILL FP hitting all Travelodge emails
On 02/04/18 14:26, David Jones wrote:
> On 04/02/2018 07:18 AM, Sebastian Arcus wrote:
>> Thank you - one example here: https://pastebin.com/UGStfCys
>>
>> On 02/04/18 13:10, Kevin A. McGrail wrote:
>>> Pastebin a sample(s).
>>>
>>> On Mon, Apr 2, 2018, 08:06 Sebastian Arcus <s.ar...@open-t.co.uk> wrote:
>>>> I have a client which handles a lot of hotel bookings as part of their work - and all hotel booking confirmations coming from Travelodge (a UK hotel chain) hit FUZZY_XPILL. I've tried looking at the regex of the rule, but can't quite get my head around what it is supposed to do, and can't figure out why it triggers on all the Travelodge emails either. Could anybody provide some hints - or have others seen this as well? I can provide some sample mail, if it helps. Thank you.
>
> I have added an entry to 60_whitelist_auth.cf to help with this in all SA instances that run sa-update regularly. This will be out there in a couple of days trusting email from that sender when there is an SPF_PASS or DKIM_VALID_AU hit.
>
>     def_whitelist_auth *@travelodge.co.uk
>
> These emails from Travelodge are important enough to be DKIM signed as well for http://dkimwl.org which I would eventually like to get added to the default SA ruleset.

Thank you very much for the fix and for the quick replies.
Re: FUZZY_XPILL FP hitting all Travelodge emails
On 02/04/18 13:35, Pedro David Marco wrote:
> Sebastian, can you run
>
>     spamassassin -D -t < [message file] 2>&1 | grep got | grep FUZZY_XPILL
>
> and post the result, please?

Hi Pedro. Please find the output below:

    Apr 2 15:45:59.961 [6928] dbg: rules: ran body rule FUZZY_XPILL ==> got hit: "xon, OX"
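To see how a fuzzy rule can produce a hit like "xon, OX", here is a minimal Python sketch. The pattern is a toy approximation written only for this illustration (any vowel or "@" where "Xanax" has an "a", and up to two non-word characters between letters). It is not the regex shipped in the SpamAssassin ruleset, and the names TOY_FUZZY_XANAX and fuzzy_hit are made up.

```python
import re

# Toy approximation of an "obfuscated Xanax" matcher.
# ASSUMPTION: this is NOT the real FUZZY_XPILL regex, just an illustration.
GAP = r"\W{0,2}"      # up to two non-word characters between letters
VOWEL = r"[aeiou@]"   # anything that could stand in for the letter "a"
TOY_FUZZY_XANAX = re.compile(
    "x" + GAP + VOWEL + GAP + "n" + GAP + VOWEL + GAP + "x",
    re.IGNORECASE,
)


def fuzzy_hit(text):
    """Return the matched substring, or None if the toy pattern does not match."""
    match = TOY_FUZZY_XANAX.search(text)
    return match.group(0) if match else None


address = "Aylesbury Road, Thame, Oxon, OX9 3AT"
print(fuzzy_hit(address))  # prints: xon, OX
```

The county abbreviation followed by the postcode area supplies the letters x-o-n-O-X with only a comma and a space in between, which is why a UK postal address in the Oxford area can trip a rule of this kind.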
Re: FUZZY_XPILL FP hitting all Travelodge emails
Thank you - one example here: https://pastebin.com/UGStfCys

On 02/04/18 13:10, Kevin A. McGrail wrote:
> Pastebin a sample(s).
>
> On Mon, Apr 2, 2018, 08:06 Sebastian Arcus <s.ar...@open-t.co.uk> wrote:
>> I have a client which handles a lot of hotel bookings as part of their work - and all hotel booking confirmations coming from Travelodge (a UK hotel chain) hit FUZZY_XPILL. I've tried looking at the regex of the rule, but can't quite get my head around what it is supposed to do, and can't figure out why it triggers on all the Travelodge emails either. Could anybody provide some hints - or have others seen this as well? I can provide some sample mail, if it helps. Thank you.
FUZZY_XPILL FP hitting all Travelodge emails
I have a client which handles a lot of hotel bookings as part of their work - and all hotel booking confirmations coming from Travelodge (a UK hotel chain) hit FUZZY_XPILL. I've tried looking at the regex of the rule, but can't quite get my head around what it is supposed to do, and can't figure out why it triggers on all the Travelodge emails either. Could anybody provide some hints - or have others seen this as well? I can provide some sample mail, if it helps. Thank you.
Re: BODY custom rule not working if text and html parts are different?
On 01/04/18 19:18, John Hardin wrote:
> On Sun, 1 Apr 2018, John Hardin wrote:
>> On Sun, 1 Apr 2018, Matus UHLAR - fantomas wrote:
>>> On 01.04.18 05:47, Pedro David Marco wrote:
>>>> This is a problem I see often... what if the URL is only in the TEXT part and not in the HTML? many email applications show those URLs as clickable as if they were valid HTML HREFs when they are not...
>>>
>>> in this case, body rule matches, but uri does not.
>>
>> I think there are heuristics to pull (non-obfuscated) URIs out of body text.
>
> Yeah, just confirmed. A non-obfuscated URI in a plain-text body part is recognized and extracted for uri rules.

That's great - thank you for testing this out and letting us know.
Re: BODY custom rule not working if text and html parts are different?
On 01/04/18 07:10, Matus UHLAR - fantomas wrote:
> On 01.04.18 05:47, Pedro David Marco wrote:
>> This is a problem I see often... what if the URL is only in the TEXT part and not in the HTML? many email applications show those URLs as clickable as if they were valid HTML HREFs when they are not...
>
> in this case, body rule matches, but uri does not.

I wonder if RAWBODY would match the URL both in the text part and in the HTML part? Does anybody know?
Re: BODY custom rule not working if text and html parts are different?
On 31/03/18 22:39, John Hardin wrote:
> On Sat, 31 Mar 2018, Sebastian Arcus wrote:
>> I have a really simple rule looking for custom text string contained in spam urls in the body of the email, like so:
>>
>>     body      SHORT_BITCOIN_DATING  /specific_string_here/i
>>     score     SHORT_BITCOIN_DATING  3.0
>>     describe  SHORT_BITCOIN_DATING  Body URL signature of spam
>>
>> I just realised that it is only working if the URL exists in both the text and html versions. If the text version doesn't have the url, it isn't working. Do "body" rules only work on the html part of the message? I've tried searching through the documentation, but I can't see that being the case. Maybe there is something else having an effect here?
>
> "body" includes the *rendered* part of HTML. If the URL only appears within the markup of the HTML part then "body" will not see it.
>
> If you are looking for URLs, you should probably be using a "uri" rule. There are heuristics to pull those out of the body text, as well as out of HTML tags.

Thank you for the suggestions - much appreciated. As my original rule worked initially, I didn't realise the subtle difference between using BODY and URI rules. It is working fine now. Thank you again!
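The distinction between rendered text and link targets can be sketched outside SpamAssassin. The following Python uses only the standard library's html.parser to separate what a reader would see (roughly what a "body" rule is matched against) from the href targets (roughly what a "uri" rule is matched against). It illustrates the idea only and is not SpamAssassin's actual rendering code; the class name, the sample HTML, and the domain are made up.

```python
from html.parser import HTMLParser


class RenderedTextAndLinks(HTMLParser):
    """Collect visible text and <a href> targets separately."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Link targets live in tag attributes, not in the rendered text.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text_parts.append(data)


def split_html(html):
    """Return (rendered_text, list_of_link_targets) for an HTML fragment."""
    parser = RenderedTextAndLinks()
    parser.feed(html)
    parser.close()
    rendered = " ".join(" ".join(parser.text_parts).split())
    return rendered, parser.links


# Hypothetical spam fragment: the signature string is only in the href.
sample = '<p>Meet singles <a href="http://spam.example/specific_string_here">click here</a></p>'
rendered, links = split_html(sample)
print("specific_string_here" in rendered)               # False
print(any("specific_string_here" in u for u in links))  # True
```

A pattern aimed at the signature string finds nothing in the rendered text but does match the link target, which mirrors why the rule in this thread started working once it matched against URIs.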
BODY custom rule not working if text and html parts are different?
I have a really simple rule looking for custom text string contained in spam urls in the body of the email, like so:

    body      SHORT_BITCOIN_DATING  /specific_string_here/i
    score     SHORT_BITCOIN_DATING  3.0
    describe  SHORT_BITCOIN_DATING  Body URL signature of spam

I just realised that it is only working if the URL exists in both the text and html versions. If the text version doesn't have the url, it isn't working. Do "body" rules only work on the html part of the message? I've tried searching through the documentation, but I can't see that being the case. Maybe there is something else having an effect here?

Many thanks for any hints.
Re: T_DKIM_INVALID false positives with Gmail
On 19/03/18 15:53, Bill Cole wrote:
> On 19 Mar 2018, at 11:29, Sebastian Arcus wrote:
>> I've been seeing a number of false positives recently from T_DKIM_INVALID with Gmail emails. Are some Gmail servers misconfigured, or could something be going on at my end? The DKIM signature which is flagged as invalid is below:
>>
>> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
>>         d=googlemail.com; s=20161025;
>>         h=mime-version:from:date:message-id:subject:to;
>>         bh=8wlgvdpEOmUO2ugslPxRkFYA/ZThwu2bWy5VmlR76ug=;
>>         b=gRcnOIzmENqS8a91mSdETdXvyH6df7u0tSwsadk6CMD0KtAbzuM3ojHW+kPEo7AB1i
>>          vnbCDc/vsR6H7pP0k3hZmF7z/dAaeZWD4RVzqM+Fv70oHy4af64j+fGSekOCM9o4ShRQ
>>          Vk3KyF+69sKTK3rRWEnfrcgi/pN2DJWDvrIBRjmFOZYKNVN+8elaVM9DOO7tEMLYuw7T
>>          +sVaUMNt8MuPxRhrskJYOIxK8zzkcJHYV+1TuWJuqZAHRVwgnDWX7q3Wx0GwrX+3lKpm
>>          3A1+F5dBVjH4dXvdfIESm5XpV8b9uBn9daGWrUgkR+PB23XsL9QkxEqCRXdgII3FRxtQ
>>          Ps6A==
>
> There are LOTS of ways to break a DKIM signature. Whether that one is broken can't be checked and how it might have been broken can't be guessed at without the full *unmodified* headers and body of the message.

I use Exim to pass stuff directly to SA. Could I attach the DKIM header in a text file and send it to the list?
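Although a signature cannot be verified from the header alone, its tags can at least be read to see what was signed. Below is a small Python sketch that splits a DKIM-Signature value into its tag=value pairs (the tag-list format from RFC 6376) and lists the signed headers. The function name parse_dkim_tags is made up for this example, the bh= and b= values are shortened placeholders, and the sketch performs no cryptographic verification.

```python
def parse_dkim_tags(header_value):
    """Split a DKIM-Signature header value into a dict of tag -> value."""
    tags = {}
    for part in header_value.split(";"):
        if "=" not in part:
            continue  # skip empty trailing segments
        name, value = part.split("=", 1)
        # Folding whitespace inside values (e.g. in b=) is not significant,
        # so it is removed here.
        tags[name.strip()] = "".join(value.split())
    return tags


# Same tags as the header in this thread; bh= and b= are placeholders.
signature = (
    "v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20161025; "
    "h=mime-version:from:date:message-id:subject:to; "
    "bh=PLACEHOLDER_BODY_HASH=; b=PLACEHOLDER SIGNATURE DATA=="
)
tags = parse_dkim_tags(signature)
print(tags["d"], tags["s"], tags["c"])
print(tags["h"].split(":"))
```

A change to any header listed in h=, or to the body hashed in bh=, beyond what the c= canonicalization tolerates, invalidates the signature. That is why the complete, unmodified message is needed to diagnose a failure.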
T_DKIM_INVALID false positives with Gmail
I've been seeing a number of false positives recently from T_DKIM_INVALID with Gmail emails. Are some Gmail servers misconfigured, or could something be going on at my end? The DKIM signature which is flagged as invalid is below:

DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=googlemail.com; s=20161025;
        h=mime-version:from:date:message-id:subject:to;
        bh=8wlgvdpEOmUO2ugslPxRkFYA/ZThwu2bWy5VmlR76ug=;
        b=gRcnOIzmENqS8a91mSdETdXvyH6df7u0tSwsadk6CMD0KtAbzuM3ojHW+kPEo7AB1i
         vnbCDc/vsR6H7pP0k3hZmF7z/dAaeZWD4RVzqM+Fv70oHy4af64j+fGSekOCM9o4ShRQ
         Vk3KyF+69sKTK3rRWEnfrcgi/pN2DJWDvrIBRjmFOZYKNVN+8elaVM9DOO7tEMLYuw7T
         +sVaUMNt8MuPxRhrskJYOIxK8zzkcJHYV+1TuWJuqZAHRVwgnDWX7q3Wx0GwrX+3lKpm
         3A1+F5dBVjH4dXvdfIESm5XpV8b9uBn9daGWrUgkR+PB23XsL9QkxEqCRXdgII3FRxtQ
         Ps6A==
Re: Extremely persistent sex/make money spam with very little text in the body
On 07/03/18 11:25, Leandro wrote:
> 2018-03-07 5:52 GMT-03:00 Sebastian Arcus <s.ar...@open-t.co.uk>:
>> 6. The links they include in the body of the email are almost never flagged up either by Clam or Spamassassin - and they point to a different domain in every single message.
>
> Although they use multiple domains in the URLs in the body, many of these URLs resolve to the same IPv4/IPv6 address or IP ranges, that is, just one shared web server or a group of shared web servers belonging to the spammer. The key to solving this problem is that you all start to cross-reference the data and start scoring the URL host IP, which is the exact physical place they want you to visit, even when the mail is sent through many hacked mail servers around the world and many distinct domains. The mail servers and domains are very dispersed but the web servers are very concentrated.

As far as I can tell, the URLs in the spam I see point to PHP scripts on various compromised servers - which, maybe, further redirect to the final payment servers. But thank you for the suggestion - I will keep an eye on it.
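The idea of looking at the hosting IP rather than the ever-changing domain can be sketched as follows. This Python example groups URL hostnames by the address they resolve to. The resolver is passed in as a function so the sketch runs without network access; the hostnames, the addresses (taken from documentation ranges), and the helper name group_hosts_by_ip are all invented for illustration. In live use, socket.gethostbyname could be passed instead of the lookup table.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_hosts_by_ip(urls, resolve):
    """Map each resolved IP to the set of URL hostnames pointing at it."""
    by_ip = defaultdict(set)
    for url in urls:
        host = urlsplit(url).hostname
        if host is None:
            continue  # not an absolute URL
        try:
            by_ip[resolve(host)].add(host)
        except (KeyError, OSError):
            continue  # unresolvable host: skip it
    return dict(by_ip)


# Hypothetical data standing in for DNS lookups.
fake_dns = {
    "one.example": "192.0.2.10",
    "two.example": "192.0.2.10",
    "three.example": "198.51.100.7",
}
spam_urls = [
    "http://one.example/a.php",
    "http://two.example/b.php",
    "http://three.example/c.php",
]
groups = group_hosts_by_ip(spam_urls, fake_dns.__getitem__)
for ip, hosts in sorted(groups.items()):
    print(ip, sorted(hosts))
```

If the links point to scripts on unrelated compromised servers, as suspected in the reply above, the groups will stay small and this signal will be weak.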
Re: Extremely persistent sex/make money spam with very little text in the body
On 07/03/18 09:08, Daniele Duca wrote:
> On 07/03/2018 09:52, Sebastian Arcus wrote:
>> I have this one email account receiving, for more than a year, a very specific type of spam which I find very difficult to block:
>>
>> 1. The messages are all kept very short, generally below 20 words - I assume so that Bayes is less efficient at classifying them?
>> 2. Although they are all invitations to sex, or making money - they are phrased differently every time and use different words - so Bayes scores are consistently low.
>
> Hi Sebastian,
>
> I perfectly know what type of email you are talking about, I've seen them written at least in Italian, English and Spanish. If you click the link you are being redirected to shady dating websites or bitcoin/investment scam sites (at least in my experience).
>
> Since I get the majority of these emails in Italian, I've written a meta rule that takes into account:
>
> - Common misspelled words/phrases
> - Body lines must be < 5
> - The common pattern in all the URLs. Take a close look at them, there IS a pattern, not writing it here for obvious reasons :)

Thank you so much for that! The emails I see don't usually have spelling mistakes, but you are right, it seems that the URL is the way to go. I've been looking for patterns in the headers and source servers all along - it never crossed my mind to check the body! Thanks again
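The combination described above (very few body lines plus a recurring URL pattern) can be sketched in Python to show the logic of such a meta rule. The URL pattern here is a placeholder, since the real one was deliberately not published in the thread; the names URL_PATTERN, BODY_LINE_LIMIT and looks_like_short_link_spam are hypothetical, and the misspelling check is left out.

```python
import re

# PLACEHOLDER pattern: the real URL signature was not disclosed in the thread.
URL_PATTERN = re.compile(r"https?://\S+/placeholder_path", re.IGNORECASE)
BODY_LINE_LIMIT = 5  # "Body lines must be < 5"


def looks_like_short_link_spam(body):
    """True only when both conditions hold, like an AND in a meta rule."""
    lines = [line for line in body.splitlines() if line.strip()]
    few_lines = len(lines) < BODY_LINE_LIMIT
    has_pattern = URL_PATTERN.search(body) is not None
    return few_lines and has_pattern


print(looks_like_short_link_spam("Hi there\nhttp://host.example/placeholder_path?id=1"))
print(looks_like_short_link_spam("Hi there\nhttp://host.example/other"))
```

Requiring both conditions keeps either one from scoring on its own, which is the point of combining weak signals rather than raising the score of a single rule.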
Extremely persistent sex/make money spam with very little text in the body
I have this one email account receiving, for more than a year, a very specific type of spam which I find very difficult to block:

1. The messages are all kept very short, generally below 20 words - I assume so that Bayes is less efficient at classifying them?
2. Although they are all invitations to sex, or making money - they are phrased differently every time and use different words - so Bayes scores are consistently low.
3. They come from servers all around the world - possibly compromised, or maybe quickly set up and taken down - so they are usually not flagged by blacklists.
4. Pyzor tends to flag most of them up though.
5. In most cases, DKIM is correct, SPF is fine, and the headers are all correct - so they don't hit any other rules.
6. The links they include in the body of the email are almost never flagged up either by Clam or Spamassassin - and they point to a different domain in every single message.

The bizarre thing is that I only see them coming to this one particular email account, at a single domain of all the ones I administer. Based on the above, whoever sends them really knows what they are doing, and must have significant resources at their disposal - but I still have no idea why they only hit this particular email address.

I can only assume that greylisting wouldn't help much, as they seem to arrive from properly configured SMTP servers, which would retry like any other regular SMTP server and bypass greylisting.

Has anybody else seen these, and is there anything else that I could try to block them?
Re: IADB whitelist - again
On 01/03/18 19:50, David Jones wrote:
> On 03/01/2018 12:29 PM, Sebastian Arcus wrote:
>> I know I have brought up this issue on this list before, and sorry for the persistence, but having 7 different rules adding scores for the IADB whitelist still seems either ridiculous, or outright suspect:
>>
>> -0.2 RCVD_IN_IADB_RDNS     RBL: IADB: Sender has reverse DNS record
>>                            [199.127.240.84 listed in iadb.isipp.com]
>> -0.1 RCVD_IN_IADB_SPF      RBL: IADB: Sender publishes SPF record
>> -0.1 RCVD_IN_IADB_OPTIN    RBL: IADB: All mailing list mail is opt-in
>> -0.0 RCVD_IN_IADB_SENDERID RBL: IADB: Sender publishes Sender ID record
>> -0.0 RCVD_IN_IADB_LISTED   RBL: Participates in the IADB system
>> -0.1 RCVD_IN_IADB_DK       RBL: IADB: Sender publishes Domain Keys record
>> -0.1 RCVD_IN_IADB_VOUCHED  RBL: ISIPP IADB lists as vouched-for sender
>>
>> It really raises some very uncomfortable questions regarding the impartiality of SA and/or its anti-spam capabilities. And by the way, this message is definitely unsolicited, and in no way did we give any sort of permission or consent to this company or its "affiliates" to email us - so the whole "All mailing list mail is opt-in" is nonsense. And why have "Sender has reverse DNS record" and "Sender publishes SPF record" as separate IADB rules - when SA itself already checks for these? Isn't this just a glaring way of pumping up SA scores for the IADB subscribers?
>
> Once in a while, even the best senders can get a bad customer of theirs that obtained email addresses by a violation of their terms and conditions. Just block that sender with a local "blacklist_from *@example.com" entry and report it to SpamCop. If the message headers have any abuse reporting information then send the headers there too. They should do their own internal investigation and shutdown that bad customer of theirs.

That is still beside the point. There is simply no reason in the interest of SA as an antispam solution to publish all those rules. One or two rules would be more than enough.

I know I can block this and that in SA, and tweak rules all the time - but I am concerned when the default settings in SA effectively help marketing companies stuff my Inbox full of junk. In that case you would achieve better results not using SA at all. As to reporting bad senders and "internal investigation" - my experience shows that doesn't get very far with any providers.
Re: IADB whitelist - again
On 01/03/18 19:04, John Hardin wrote:
> On Thu, 1 Mar 2018, Sebastian Arcus wrote:
>> I know I have brought up this issue on this list before, and sorry for the persistence, but having 7 different rules adding scores for the IADB whitelist still seems either ridiculous, or outright suspect:
>>
>> -0.2 RCVD_IN_IADB_RDNS     RBL: IADB: Sender has reverse DNS record
>>                            [199.127.240.84 listed in iadb.isipp.com]
>> -0.1 RCVD_IN_IADB_SPF      RBL: IADB: Sender publishes SPF record
>> -0.1 RCVD_IN_IADB_OPTIN    RBL: IADB: All mailing list mail is opt-in
>> -0.0 RCVD_IN_IADB_SENDERID RBL: IADB: Sender publishes Sender ID record
>> -0.0 RCVD_IN_IADB_LISTED   RBL: Participates in the IADB system
>> -0.1 RCVD_IN_IADB_DK       RBL: IADB: Sender publishes Domain Keys record
>> -0.1 RCVD_IN_IADB_VOUCHED  RBL: ISIPP IADB lists as vouched-for sender
>>
>> It really raises some very uncomfortable questions regarding the impartiality of SA and/or its anti-spam capabilities. And by the way, this message is definitely unsolicited, and in no way did we give any sort of permission or consent to this company or its "affiliates" to email us - so the whole "All mailing list mail is opt-in" is nonsense. And why have "Sender has reverse DNS record" and "Sender publishes SPF record" as separate IADB rules - when SA itself already checks for these? Isn't this just a glaring way of pumping up SA scores for the IADB subscribers?
>
> Don't assume malice right off the bat. More likely it is that IADB provides all those status codes and SA exposes a rule for each, with minimal scores, to allow local tuning if desired.

But why does SA have to expose a rule for each and every code IADB provides? SA is an antispam solution, IADB is a facilitator for the marketing industry (in spite of their continuous protestations on this list). The goals of the two are not the same. Surely SA can decide by itself what is really useful from a spam filtering point of view - not churn out whatever it gets passed by marketing whitelists? SA uses other whitelists (some, may I say, a lot more useful than IADB), and it only exposes one or two rules for each.

> Also, there is RCVD_IN_IADB_DOPTIN, so RCVD_IN_IADB_OPTIN may be "someone somehow gave us your name somewhere" (i.e. "single opt-in") rather than "we confirmed you actually want to receive our garbage" ("double opt-in").

So effectively pretty useless: if you ever made the mistake of forgetting to untick the "receive email from our carefully selected partners" box in the past, you will never be able to take that consent back, as your email address gets passed from entity to entity. Consent to be emailed marketing material is a joke - and SA shouldn't be a facilitator - otherwise its value as a spam filter is gone.

> The scores appear hardcoded (50_scores.cf) vs. from masscheck (72_scores.cf) so they may be *very* stale.

In that case maybe at least some of the rules should be removed.
IADB whitelist - again
I know I have brought up this issue on this list before, and sorry for the persistence, but having 7 different rules adding scores for the IADB whitelist still seems either ridiculous, or outright suspect:

-0.2 RCVD_IN_IADB_RDNS     RBL: IADB: Sender has reverse DNS record
                           [199.127.240.84 listed in iadb.isipp.com]
-0.1 RCVD_IN_IADB_SPF      RBL: IADB: Sender publishes SPF record
-0.1 RCVD_IN_IADB_OPTIN    RBL: IADB: All mailing list mail is opt-in
-0.0 RCVD_IN_IADB_SENDERID RBL: IADB: Sender publishes Sender ID record
-0.0 RCVD_IN_IADB_LISTED   RBL: Participates in the IADB system
-0.1 RCVD_IN_IADB_DK       RBL: IADB: Sender publishes Domain Keys record
-0.1 RCVD_IN_IADB_VOUCHED  RBL: ISIPP IADB lists as vouched-for sender

It really raises some very uncomfortable questions regarding the impartiality of SA and/or its anti-spam capabilities. And by the way, this message is definitely unsolicited, and in no way did we give any sort of permission or consent to this company or its "affiliates" to email us - so the whole "All mailing list mail is opt-in" is nonsense. And why have "Sender has reverse DNS record" and "Sender publishes SPF record" as separate IADB rules - when SA itself already checks for these? Isn't this just a glaring way of pumping up SA scores for the IADB subscribers?
Re: Spamassassin DNS problems
On 10/01/18 12:14, ter...@web.de wrote:
> Hi. I found your spamassassin problem while looking for answers to my problem: http://spamassassin.1065346.n5.nabble.com/Dns-Blocklists-always-returning-0-records-td124564.html
>
> It seems the problem you have/had is exactly the problem I have! Sadly there is no solution in the thread. Did you manage to find a solution?

Hi. I did reply to the list at the time in case others would find the information useful. For some strange reason, the SA mailing list archive doesn't seem to include my last reply. I've enclosed it below:

On 17/05/17 18:11, Sebastian Arcus wrote:
> Just a follow-up and clarification on this issue - after more testing, it seems that it was the Spamassassin version which was the problem. I have had to downgrade SA on 7 servers running 3.4.1 on Slackware - as the DNS RBLs weren't working on any of them. The only server I had with SA 3.4.0 *was* actually working correctly. After downgrading all the boxes to 3.4.0, the DNS RBLs are now working correctly. I have *not* changed any configuration options in SA - I left all the servers as they were in this respect - so it seems it was not a configuration issue.
>
> I'm afraid I haven't been able to narrow it down further than this. The servers were all running various kernels, both x86 and x86_64 architectures, and several different versions of Perl - so I would guess the SA version was the common factor and the likely culprit.
Re: IADB whitelist
On 25/12/17 23:57, Bill Cole wrote:
> On 25 Dec 2017, at 3:28 (-0500), Sebastian Arcus wrote:
>> Also, any idea why there are 6 different rules associated with this particular whitelist?
>
> IADB has many independent return codes that each have distinct meaning. See http://www.isipp.com/email-accreditation/about-the-codes/list-of-codes/ for details.
>
> If you get mail from an IADB-listed sender that you are 100% sure is spam (i.e. not "I would never ask for such mail" but "the recipient absolutely did not consent to receiving this mail.") then you should report that to ISIPP. "ab...@suretymail.com" is the reporting address listed on their website and while I've not had cause to use it, people I trust with no reason to lie say that reports to that address do actually work to either change sender behavior or eliminate listings. Anne Mitchell (head of ISIPP) is an ex-coworker of mine whose integrity and dedication to the anti-spam fight (which is dependent on keeping *wanted* mail deliverable) I can personally vouch for.
>
> However, the different responses from IADB are VERY nuanced and the two strongest rules you listed (RCVD_IN_IADB_OPTIN and RCVD_IN_IADB_VOUCHED) are essentially "good intentions" markers. Due to unfortunate terminology choices by ISIPP and a willingness to engage in nuance and estimate intentions, those aren't really as worthwhile as they might seem. The IADB definition of "All mailing list mail is opt-in" is (effectively) "we believe that this ESP believes in good faith that every recipient has chosen to receive this mail." Their "vouching" for a record is an assertion that either the ESP is personally known to ISIPP staff as competent and honest OR has maintained stable positive listings for >6 months. I'm pretty sure I don't want ANY score for a non-vouched record and unlike ISIPP (and some valuable SA contributors!) I really don't care much about ESPs' intentions or responsiveness to complaints, only about actual spamming behavior.
>
> So I have made substantial modification on my own system to how IADB results are scored, but those specific adjustments are probably not fit for most other sites.

Thank you for a detailed reply. Like you, I don't put much weight on what ESPs say they do or intend to do. I'm afraid the email marketing industry is rather murky and the line between legitimate marketing and spamming is often pretty much non-existent - with apologies to those few operators who actually run an honest operation. I see daily examples of supposedly legit operators who don't actually act on unsubscribe requests, or 'magically' re-subscribe after a while, or simply get around rules by creating a new list and re-subscribing everybody who unsubscribed.

And frankly, the whole issue of consent is blurred beyond any usefulness. If you have ever made the mistake of leaving the tick box selected for "receive offers from our carefully selected partners", it is virtually impossible to take that consent back, as your email address gets passed from database to database, never to be removed again. Besides, with most people purchasing things from so many different sources, and creating accounts on so many websites, how many would actually be able to say for sure (and prove it) that they never gave consent to be emailed by "carefully selected partners"?

So you will excuse me if I take any whitelist which helps marketing emailing lists "improve deliverability" with a very big dollop of salt.
Re: IADB whitelist
On 25/12/17 10:45, Reindl Harald wrote:
> On 25.12.2017 at 09:28, Sebastian Arcus wrote:
>> On 23/12/17 10:01, Kevin A. McGrail wrote:
>>> The 1st step is that a representative of the rbl asks us to consider for inclusion.
>>
>> Thank you. If enough people receive spam sanctioned by a particular whitelist, will the minus scores associated with their rule(s) be reduced over time?
>
> maybe, but why not just override the score in local.cf
>
> /etc/mail/spamassassin/local-*.cf
> score RCVD_IN_IADB_DK -0.3
> score RCVD_IN_IADB_DOPTIN -1.0
> score RCVD_IN_IADB_DOPTIN_GT50 -0.5
> score RCVD_IN_IADB_DOPTIN_LT50 -0.1
> score RCVD_IN_IADB_LISTED -0.001
> score RCVD_IN_IADB_ML_DOPTIN -2.5
> score RCVD_IN_IADB_OPTIN -0.05
> score RCVD_IN_IADB_OPTIN_GT50 -0.2
> score RCVD_IN_IADB_OPTIN_LT50 -0.1
> score RCVD_IN_IADB_RDNS -0.05
> score RCVD_IN_IADB_SENDERID -0.5
> score RCVD_IN_IADB_SPF -0.1
> score RCVD_IN_IADB_VOUCHED -2.0

I know I can override the scores for all sorts of things in local.cf. The reason I was raising the question is because I was wondering if whitelists can be used by unscrupulous marketing organisations to effectively undo what is one of the main functions of SA - to reduce or stop unsolicited email.

>> Also, any idea why there are 6 different rules associated with this particular whitelist?
>
> these are 6 different lists, just read the description you even posted on the right side of the score

Well, they might be technically 6 different lists, but IADB is one single entity, and including 6 different whitelists from them only looks like a way to reduce the SA score for email from their "certified" senders further. After all SA already checks separately for things like RDNS, DKIM, SPF.

>>> On December 23, 2017 3:03:26 AM EST, Sebastian Arcus <s.ar...@open-t.co.uk> wrote:
>>>> What is the process of including whitelists in SA default configs? It is not the first time I see pretty obvious mailing list spam which has quite high minus scores from 2-3 whitelists included in SA:
>>>>
>>>> -1.5 RCVD_IN_IADB_OPTIN       RBL: IADB: All mailing list mail is opt-in
>>>>                               [205.201.128.83 listed in iadb.isipp.com]
>>>> -0.1 RCVD_IN_IADB_DK          RBL: IADB: Sender publishes Domain Keys record
>>>> -0.2 RCVD_IN_IADB_RDNS        RBL: IADB: Sender has reverse DNS record
>>>> -0.0 RCVD_IN_IADB_SENDERID    RBL: IADB: Sender publishes Sender ID record
>>>> -2.2 RCVD_IN_IADB_VOUCHED     RBL: ISIPP IADB lists as vouched-for sender
>>>> -0.1 RCVD_IN_IADB_SPF         RBL: IADB: Sender publishes SPF record
>>>> -0.0 RCVD_IN_IADB_LISTED      RBL: Participates in the IADB system
>>>> -0.0 RCVD_IN_IADB_OPTIN_GT50  RBL: IADB: Opt-in used more than 50% of the time
>>>>
>>>> For the same message, Razor2 has a high score - which is correct:
>>>>
>>>>  2.5 RAZOR2_CF_RANGE_51_100   Razor2 gives confidence level above 50%
>>>>                               [cf: 100]
>>>>  2.5 RAZOR2_CHECK             Listed in Razor2 (http://razor.sf.net/)
Re: IADB whitelist
On 23/12/17 10:01, Kevin A. McGrail wrote:
> The 1st step is that a representative of the rbl asks us to consider for inclusion.

Thank you. If enough people receive spam sanctioned by a particular whitelist, will the minus scores associated with their rule(s) be reduced over time?

Also, any idea why there are 6 different rules associated with this particular whitelist?

> Regards,
> KAM
>
> On December 23, 2017 3:03:26 AM EST, Sebastian Arcus <s.ar...@open-t.co.uk> wrote:
>> What is the process of including whitelists in SA default configs? It is not the first time I see pretty obvious mailing list spam which has quite high minus scores from 2-3 whitelists included in SA:
>>
>> -1.5 RCVD_IN_IADB_OPTIN       RBL: IADB: All mailing list mail is opt-in
>>                               [205.201.128.83 listed in iadb.isipp.com]
>> -0.1 RCVD_IN_IADB_DK          RBL: IADB: Sender publishes Domain Keys record
>> -0.2 RCVD_IN_IADB_RDNS        RBL: IADB: Sender has reverse DNS record
>> -0.0 RCVD_IN_IADB_SENDERID    RBL: IADB: Sender publishes Sender ID record
>> -2.2 RCVD_IN_IADB_VOUCHED     RBL: ISIPP IADB lists as vouched-for sender
>> -0.1 RCVD_IN_IADB_SPF         RBL: IADB: Sender publishes SPF record
>> -0.0 RCVD_IN_IADB_LISTED      RBL: Participates in the IADB system
>> -0.0 RCVD_IN_IADB_OPTIN_GT50  RBL: IADB: Opt-in used more than 50% of the time
>>
>> For the same message, Razor2 has a high score - which is correct:
>>
>>  2.5 RAZOR2_CF_RANGE_51_100   Razor2 gives confidence level above 50%
>>                               [cf: 100]
>>  2.5 RAZOR2_CHECK             Listed in Razor2 (http://razor.sf.net/)
IADB whitelist
What is the process of including whitelists in SA default configs? It is not the first time I see pretty obvious mailing list spam which has quite high minus scores from 2-3 whitelists included in SA:

-1.5 RCVD_IN_IADB_OPTIN       RBL: IADB: All mailing list mail is opt-in
                              [205.201.128.83 listed in iadb.isipp.com]
-0.1 RCVD_IN_IADB_DK          RBL: IADB: Sender publishes Domain Keys record
-0.2 RCVD_IN_IADB_RDNS        RBL: IADB: Sender has reverse DNS record
-0.0 RCVD_IN_IADB_SENDERID    RBL: IADB: Sender publishes Sender ID record
-2.2 RCVD_IN_IADB_VOUCHED     RBL: ISIPP IADB lists as vouched-for sender
-0.1 RCVD_IN_IADB_SPF         RBL: IADB: Sender publishes SPF record
-0.0 RCVD_IN_IADB_LISTED      RBL: Participates in the IADB system
-0.0 RCVD_IN_IADB_OPTIN_GT50  RBL: IADB: Opt-in used more than 50% of the time

For the same message, Razor2 has a high score - which is correct:

 2.5 RAZOR2_CF_RANGE_51_100   Razor2 gives confidence level above 50%
                              [cf: 100]
 2.5 RAZOR2_CHECK             Listed in Razor2 (http://razor.sf.net/)
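The arithmetic behind this report can be checked with a short Python sketch. It sums the IADB scores and the two Razor2 scores exactly as they appear in the post; the helper name total and the trimmed report layout (score and rule name only) are choices made for this illustration. Decimal is used so that the sums are exact.

```python
import re
from decimal import Decimal

# Scores copied from the report in this post (descriptions omitted).
REPORT = """\
-1.5 RCVD_IN_IADB_OPTIN
-0.1 RCVD_IN_IADB_DK
-0.2 RCVD_IN_IADB_RDNS
-0.0 RCVD_IN_IADB_SENDERID
-2.2 RCVD_IN_IADB_VOUCHED
-0.1 RCVD_IN_IADB_SPF
-0.0 RCVD_IN_IADB_LISTED
-0.0 RCVD_IN_IADB_OPTIN_GT50
 2.5 RAZOR2_CF_RANGE_51_100
 2.5 RAZOR2_CHECK
"""

LINE = re.compile(r"^\s*(-?\d+\.\d+)\s+(\S+)", re.MULTILINE)


def total(report, prefix):
    """Sum the scores of all rules whose name starts with the given prefix."""
    return sum(
        (Decimal(score) for score, rule in LINE.findall(report) if rule.startswith(prefix)),
        Decimal("0"),
    )


iadb = total(REPORT, "RCVD_IN_IADB_")
razor = total(REPORT, "RAZOR2_")
print(iadb)          # -4.1
print(razor)         # 5.0
print(iadb + razor)  # 0.9
```

With these scores the whitelist rules cancel out 4.1 of the 5.0 points contributed by Razor2, leaving a net 0.9 from the two groups.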
Re: HTML_IMAGE_ONLY_* generating too many FP's
On 02/12/17 18:45, David Jones wrote:
> On 12/02/2017 11:22 AM, Sebastian Arcus wrote:
>> On 02/12/17 13:06, Matus UHLAR - fantomas wrote:
>>>>> On 12/01/2017 11:17 AM, Sebastian Arcus wrote:
>>>>>> -0.2 RCVD_IN_MSPIKE_H2   RBL: Average reputation (+2)
>>>>>>                          [212.227.126.131 listed in wl.mailspike.net]
>>>>>>  0.4 MIME_HTML_MOSTLY    BODY: Multipart message mostly text/html MIME
>>>>>>  1.6 HTML_IMAGE_ONLY_24  BODY: HTML: images with 2000-2400 bytes of words
>>>>>>  2.0 BAYES_50            BODY: Bayes spam probability is 40 to 60%
>>>>>>                          [score: 0.4808]
>>>>>>  0.8 MPART_ALT_DIFF      BODY: HTML and text parts are different
>>>>>>  0.0 HTML_MESSAGE        BODY: HTML included in message
>>>>>>  2.5 PYZOR_CHECK         Listed in Pyzor (http://pyzor.sf.net/)
>>>>>> -0.0 RCVD_IN_DNSWL_NONE  RBL: Sender listed at http://www.dnswl.org/, no trust
>>>>>>                          [212.227.126.131 listed in list.dnswl.org]
>>>
>>>> On 01/12/17 10:54, Axb wrote:
>>>>> you've changed SA default scores and now complain about one which hasn't been touched as cause for FPs?
>>>>>
>>>>> compare the defaults with yours...
>>>>>
>>>>> score PYZOR_CHECK 0 1.985 0 1.392 # n=0 n=2
>>>>> score BAYES_50 0 0 2.0 0.8
>>>>>
>>>>> maybe you should rethink those changes.
>>>
>>> On 01.12.17 12:23, Sebastian Arcus wrote:
>>>> Indeed, I did amend some of the default SA scores, to catch more spam for the type of email received at this particular site. That doesn't change the fact that 1.6 seems to me a pretty high score for a rule which would be triggered on such a large number of ham emails. Just saying.
>>>
>>> You should understand that when you start tuning scores, you can get to hell very fast. unless you do your own mass-checks and tune according to them.
>>
>> I'm not too sure I understand this attitude. The whole reason I started to tweak the scores for certain rules is that too much spam was going through. The false negatives have gone down considerably since I have altered the scores - and yes, I do keep an eye on them constantly and adjust depending on the number of false positives and negatives, and what triggers what. I also use network tests / RBLs as well and Bayes.
>>
>> The simple fact of the matter is that on plenty of spam emails, only one significant rule might get triggered - be it a high bayes score, one of the DNS RBLs or something else. If the rule doesn't have a high enough score, the email passes through. Spammers change their tactics and content of their emails all the time - and the rule scores haven't been updated in months - because of the problems with the updating system (which is not a criticism - I understand the situation). So for people to advise sticking religiously to the default scores, well, frankly I don't get it.
>
> The rulesets and dynamic scores in 72_scores.cf are updating again for the past 2 weeks. I recommend only changing a few of the default scores and make meta rules that combine the hits to add points when you see a pattern of 2 or more rules being hit.
>
> If you add enough add-ons to your SA instance, then you shouldn't be impacted too much by the default scores. SA has to be generic out of the box to cover all types of mail flow. You have to tune it a bit for your particular recipients, language, and location. See my email moments ago about tuning suggestions.
>
> I used to constantly adjust scores to react to new spam campaigns but found I was always behind the spammers. The more RBLs and meta rules you can setup, the more you can stay ahead of them.
>
> Compromised accounts are the exception to this with zero-hour spam that is very difficult to block so try to keep that separate in your mind and not chase after those with score adjustments. These tend to stop automatically after 30 minutes or so when RBLs and DCC catch up to them or the account gets locked or its password changed. I report these to Spamcop as quickly as I can.

Thank you David. Those are useful tips.
Re: HTML_IMAGE_ONLY_* generating too many FP's
On 02/12/17 13:06, Matus UHLAR - fantomas wrote: On 12/01/2017 11:17 AM, Sebastian Arcus wrote: -0.2 RCVD_IN_MSPIKE_H2 RBL: Average reputation (+2) [212.227.126.131 listed in wl.mailspike.net] 0.4 MIME_HTML_MOSTLY BODY: Multipart message mostly text/html MIME 1.6 HTML_IMAGE_ONLY_24 BODY: HTML: images with 2000-2400 bytes of words 2.0 BAYES_50 BODY: Bayes spam probability is 40 to 60% [score: 0.4808] 0.8 MPART_ALT_DIFF BODY: HTML and text parts are different 0.0 HTML_MESSAGE BODY: HTML included in message 2.5 PYZOR_CHECK Listed in Pyzor (http://pyzor.sf.net/) -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at http://www.dnswl.org/, no trust [212.227.126.131 listed in list.dnswl.org] On 01/12/17 10:54, Axb wrote: you've changed SA default scores and now complain about one which hasn't been touched as cause for FPs? compare the defaults with yours... score PYZOR_CHECK 0 1.985 0 1.392 # n=0 n=2 score BAYES_50 0 0 2.0 0.8 h maybe you should rethink those changes. On 01.12.17 12:23, Sebastian Arcus wrote: Indeed, I did amend some of the default SA scores, to catch more spam for the type of email received at this particular site. That doesn't change the fact that 1.6 seems to me a pretty high score for a rule which would be triggered on such a large number of ham emails. Just saying. You should understand that when you start tuning scores, you can get to hell very fast. unless you do your own mass-checks and tune according to them. I'm not too sure I understand this attitude. The whole reason I started to tweak the scores for certain rules is that too much spam was going through. The false negatives have gone down considerably since I have altered the scores - and yes, I do keep an eye on them constantly and adjust depending on the number of false positive and negatives, and what triggers what. I also use network tests / RBL's as well and Bayes. 
The simple fact of the matter is that on plenty of spam emails, only one significant rule might get triggered - be it a high bayes score, one of the DNS RBL's or something else. If the rule doesn't have a high enough score, the email passes through. Spammers change their tactics and content of their emails all the time - and the rule scores haven't been updated in months - because of the problems with the updating system (which is not a criticism - I understand the situation). So for people to advise sticking religiously to the default scores, well, frankly I don't get it.
Re: HTML_IMAGE_ONLY_* generating too many FP's
On 01/12/17 10:54, Axb wrote: On 12/01/2017 11:17 AM, Sebastian Arcus wrote: On 30/11/17 12:45, Matus UHLAR - fantomas wrote: On 28.11.17 19:39, Sebastian Arcus wrote: I'm having more and more problems with the HTML_IMAGE_ONLY_* set of rules recently generating false positives. Plenty of business emails will include a logo at the bottom - and not everybody is a graphics expert to make their logo a tiny optimised gif or png - so some of these are slightly bigger than they should be. However, this seems to be sufficiently widespread. Also, many business emails can be just a few-word reply - so the ratio of words to images triggers the filter in SA. Could the scores on HTML_IMAGE_ONLY_* set of rules be lowered a bit - or is there anything else to be done - aside from educating all the internet on optimising logos in the email signatures? :-) those have lower scores with BAYES and network rules enabled. configure BAYES and enable network rules... Hi. I have BAYES enabled and DNSBL's enabled (I assume that's what you mean by network rules?). I still think that a score of 1.6 is quite a lot, considering that so many emails nowadays contain either an embedded logo in the signature, with just a few words (in a quick email reply, for example), or even images inserted, instead of attached to the email. 
Please see below an example of a SA report: -0.2 RCVD_IN_MSPIKE_H2 RBL: Average reputation (+2) [212.227.126.131 listed in wl.mailspike.net] 0.4 MIME_HTML_MOSTLY BODY: Multipart message mostly text/html MIME 1.6 HTML_IMAGE_ONLY_24 BODY: HTML: images with 2000-2400 bytes of words 2.0 BAYES_50 BODY: Bayes spam probability is 40 to 60% [score: 0.4808] 0.8 MPART_ALT_DIFF BODY: HTML and text parts are different 0.0 HTML_MESSAGE BODY: HTML included in message 2.5 PYZOR_CHECK Listed in Pyzor (http://pyzor.sf.net/) -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at http://www.dnswl.org/, no trust [212.227.126.131 listed in list.dnswl.org] you've changed SA default scores and now complain about one which hasn't been touched as cause for FPs? compare the defaults with yours... score PYZOR_CHECK 0 1.985 0 1.392 # n=0 n=2 score BAYES_50 0 0 2.0 0.8 h maybe you should rethink those changes. Indeed, I did amend some of the default SA scores, to catch more spam for the type of email received at this particular site. That doesn't change the fact that 1.6 seems to me a pretty high score for a rule which would be triggered on such a large number of ham emails. Just saying.
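The comparison Axb is pointing at can be made concrete with a little arithmetic. The Python sketch below sums the scores from the report, once as posted and once with the two default values quoted in the `score` lines (the fourth number in each line is the one used when both Bayes and network tests are enabled). It assumes that every other score in the report is unchanged from the defaults, which the thread does not confirm.

```python
from decimal import Decimal

# Scores exactly as shown in the report above (site-tuned values).
report = {
    "RCVD_IN_MSPIKE_H2": "-0.2",
    "MIME_HTML_MOSTLY": "0.4",
    "HTML_IMAGE_ONLY_24": "1.6",
    "BAYES_50": "2.0",
    "MPART_ALT_DIFF": "0.8",
    "HTML_MESSAGE": "0.0",
    "PYZOR_CHECK": "2.5",
    "RCVD_IN_DNSWL_NONE": "-0.0",
}

# Defaults quoted by Axb: fourth value of each `score` line (Bayes + network tests).
defaults = {"PYZOR_CHECK": "1.392", "BAYES_50": "0.8"}

def total(scores):
    """Add up the scores using Decimal to avoid float rounding noise."""
    return sum(Decimal(value) for value in scores.values())

tuned_total = total(report)
default_total = total({**report, **defaults})

print(tuned_total, default_total)
```

Under those assumptions the message totals 7.1 with the tuned scores and 4.792 with the two defaults restored, so with the stock threshold of 5.0 it would only be tagged because of the local changes, even with HTML_IMAGE_ONLY_24 left at 1.6.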
Re: HTML_IMAGE_ONLY_* generating too many FP's
On 30/11/17 12:45, Matus UHLAR - fantomas wrote: On 28.11.17 19:39, Sebastian Arcus wrote: I'm having more and more problems with the HTML_IMAGE_ONLY_* set of rules recently generating false positives. Plenty of business emails will include a logo at the bottom - and not everybody is a graphics expert to make their logo a tiny optimised gif or png - so some of these are slightly bigger than they should be. However, this seems to be sufficiently widespread. Also, many business emails can be just a few-word reply - so the ratio of words to images triggers the filter in SA. Could the scores on HTML_IMAGE_ONLY_* set of rules be lowered a bit - or is there anything else to be done - aside from educating all the internet on optimising logos in the email signatures? :-) those have lower scores with BAYES and network rules enabled. configure BAYES and enable network rules... Hi. I have BAYES enabled and DNSBL's enabled (I assume that's what you mean by network rules?). I still think that a score of 1.6 is quite a lot, considering that so many emails nowadays contain either an embedded logo in the signature, with just a few words (in a quick email reply, for example), or even images inserted, instead of attached to the email. Please see below an example of a SA report:
-0.2 RCVD_IN_MSPIKE_H2 RBL: Average reputation (+2) [212.227.126.131 listed in wl.mailspike.net]
 0.4 MIME_HTML_MOSTLY BODY: Multipart message mostly text/html MIME
 1.6 HTML_IMAGE_ONLY_24 BODY: HTML: images with 2000-2400 bytes of words
 2.0 BAYES_50 BODY: Bayes spam probability is 40 to 60% [score: 0.4808]
 0.8 MPART_ALT_DIFF BODY: HTML and text parts are different
 0.0 HTML_MESSAGE BODY: HTML included in message
 2.5 PYZOR_CHECK Listed in Pyzor (http://pyzor.sf.net/)
-0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at http://www.dnswl.org/, no trust [212.227.126.131 listed in list.dnswl.org]
HTML_IMAGE_ONLY_* generating too many FP's
I'm having more and more problems with the HTML_IMAGE_ONLY_* set of rules recently generating false positives. Plenty of business emails will include a logo at the bottom - and not everybody is a graphics expert to make their logo a tiny optimised gif or png - so some of these are slightly bigger than they should be. However, this seems to be sufficiently widespread. Also, many business emails can be just a few-word reply - so the ratio of words to images triggers the filter in SA. Could the scores on HTML_IMAGE_ONLY_* set of rules be lowered a bit - or is there anything else to be done - aside from educating all the internet on optimising logos in the email signatures? :-)
Re: The rise of highly targeted spam emails
On 16/11/17 12:16, Martin Gregorie wrote: On Thu, 2017-11-16 at 09:15 +, Sebastian Arcus wrote: On 15/11/17 18:11, Martin Gregorie wrote: On Wed, 2017-11-15 at 14:44 +, Sebastian Arcus wrote: I initially decided that an archive was A Good Thing to have, simply because retrieving mail from it should be a lot faster than searching through huge mail folders. This turned out to be true in practice: the archive currently holds 183,000 emails and a worst case search takes around 30 seconds to return a list of hits (running on a 3 GHz dual Athlon system with 4GB RAM and Fedora 25 as its OS). Thank you for the details. How do you search the archive? With grep directly on the server? Using SQL queries. The two main tables in the database hold e-mail addresses and messages respectively plus there are many-to-many links between the two that are implemented with a third table that holds the link type ('To' or 'From') and an additional table containing subject text - this has a one-to-many relationship with the messages. The SA plugin just looks at the From header in the message being checked and, if it finds that address in the database, sees if there are any 'To' links associated with it. If there are, then the message gets negative points. As I said, this SQL query is actually run against a database view that combines the address and link tables. Since the rows on these tables are small and the tables are indexed on address and link type, the query is very fast. If you want to know more about the archive, look here: http://www.libelle-systems.c3487738.myzen.co.uk/mailarchive/ Ignore the licensing stuff: I initially thought I might be onto a revenue source, but remarkably few people use mail archives. I should remove the license management code and open source the archive but so far haven't got round to doing that. Thank you for the info. I haven't considered it before, but it makes sense to store large mail archives in SQL databases. 
I suppose it is one of the few ways to efficiently search such a large volume of data - much faster than searching Maildir or MBOX archives. I guess one aspect that is less than ideal is the fact that it wouldn't be possible to give archive access to users through their normal mail software interface - such as Thunderbird for example.
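The schema Martin describes (addresses, messages, a link table tagged 'To' or 'From', and a view joining addresses to links) can be sketched as follows. This is a minimal illustration only: it uses SQLite from the Python standard library rather than Postgres, it leaves out the subject table, and every table, column and function name is hypothetical rather than taken from the real archive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL);
CREATE TABLE message (id INTEGER PRIMARY KEY, sent_at TEXT);
-- Many-to-many link between addresses and messages, tagged 'To' or 'From'.
CREATE TABLE link (
    address_id INTEGER NOT NULL REFERENCES address(id),
    message_id INTEGER NOT NULL REFERENCES message(id),
    kind TEXT NOT NULL CHECK (kind IN ('To', 'From'))
);
CREATE INDEX link_lookup ON link (address_id, kind);
-- View combining the address and link tables, as described above.
CREATE VIEW correspondent AS
    SELECT a.email AS email, l.kind AS kind
    FROM address a JOIN link l ON l.address_id = a.id;

INSERT INTO address (id, email) VALUES
    (1, 'client@example.com'), (2, 'stranger@example.net');
INSERT INTO message (id, sent_at) VALUES (10, '2017-11-01'), (11, '2017-11-02');
INSERT INTO link VALUES (1, 10, 'To');    -- we once wrote to the client
INSERT INTO link VALUES (2, 11, 'From');  -- the stranger only ever wrote to us
""")

def is_known_correspondent(conn, sender):
    """True if the archive holds at least one message sent *to* `sender`."""
    row = conn.execute(
        "SELECT 1 FROM correspondent WHERE email = ? AND kind = 'To' LIMIT 1",
        (sender.lower(),),
    ).fetchone()
    return row is not None

print(is_known_correspondent(conn, "client@example.com"))
print(is_known_correspondent(conn, "stranger@example.net"))
```

A filter plugin would call something like `is_known_correspondent` with the address from the From header and apply a negative score when it returns true. Because the lookup touches only two narrow, indexed tables, it stays cheap even when the message table itself is large.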
Re: The rise of highly targeted spam emails
On 15/11/17 18:11, Martin Gregorie wrote: On Wed, 2017-11-15 at 14:44 +, Sebastian Arcus wrote: I initially decided that an archive was A Good Thing to have, simply because retrieving mail from it should be a lot faster than searching through huge mail folders. This turned out to be true in practice: the archive currently holds 183,000 emails and a worst case search takes around 30 seconds to return a list of hits (running on a 3 GHz dual Athlon system with 4GB RAM and Fedora 25 as its OS). Thank you for the details. How do you search the archive? With grep directly on the server?
Re: The rise of highly targeted spam emails
On 15/11/17 15:16, Reindl Harald wrote: On 15.11.2017 at 15:47, Sebastian Arcus wrote: On 15/11/17 09:56, Reindl Harald wrote: On 15.11.2017 at 09:41, Sebastian Arcus wrote: I can't really train the bayesian filter on these emails, as it would start to affect ham emails classification. this is an unproven claim! we have phishing mails in bayes here which are classified with BAYES_99 where my human eyes can hardly distinguish them from the original messages classified with BAYES_00 - you just need to train both and bayes will find the differences over time. I'm not sure I understand this? In my limited knowledge of how bayesian filters work, I assumed that if the words are the same/similar between emails, they should produce similar bayes scores, no? Do you have any links to explanations of how this would work - as I am keen not to affect the wrong way the bayes databases I built over time. bayes also takes headers into account as well as a lot of invisible stuff. fact is that we block all the DHL phishing mails which existed in the last years, and a short while ago i saw some apparently new ones with a foreign envelope/from address failing SPF, where a dhl.com server sent on behalf of the customer, and that thing was correctly classified with BAYES_00 even without whitelist_auth. and yes, i have QA scripts iterating over all the spam and ham samples collected since 2014, which test the current bayes classification and alert if spam does not get BAYES_99 or ham not BAYES_00, and in that case "sa-retrain.sh sample-path" makes 5 copies with some modified headers like message-id and retrains them. Interesting - thank you for the details. Is this your personal mailbox(es) - or a larger setup?
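The point that bayes can separate two messages with identical wording, because header-derived tokens also count, can be shown with a toy classifier. The sketch below is a deliberately simplified naive Bayes in Python and is not how SpamAssassin's Bayes module is implemented; the `relay` pseudo-header, the class `ToyBayes` and the sample text are all hypothetical.

```python
import math
from collections import Counter

def tokens(headers, body):
    """Body words plus header-derived tokens, prefixed so they cannot collide."""
    toks = body.lower().split()
    toks += [f"H{name}:{value.lower()}" for name, value in headers.items()]
    return toks

class ToyBayes:
    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, label, toks):
        # Count each token once per message.
        self.counts[label].update(set(toks))
        self.docs[label] += 1

    def spam_probability(self, toks):
        # Sum of per-token log-odds with add-one smoothing.
        log_odds = 0.0
        for t in set(toks):
            p_spam = (self.counts["spam"][t] + 1) / (self.docs["spam"] + 2)
            p_ham = (self.counts["ham"][t] + 1) / (self.docs["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))

# Identical body text in both classes; only the (hypothetical) relay differs.
body = "your parcel is waiting please confirm delivery"
model = ToyBayes()
for _ in range(3):
    model.train("ham", tokens({"relay": "dhl.com"}, body))
    model.train("spam", tokens({"relay": "unknown.example"}, body))

p_genuine = model.spam_probability(tokens({"relay": "dhl.com"}, body))
p_phish = model.spam_probability(tokens({"relay": "unknown.example"}, body))
print(round(p_genuine, 3), round(p_phish, 3))
```

The body words appear equally often in both classes, so they contribute nothing to the result; the header token alone moves the two messages to opposite sides of 0.5. That is the sense in which training both the phishing mail and the genuine mail lets the filter find the differences over time.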
Re: The rise of highly targeted spam emails
On 15/11/17 09:56, Reindl Harald wrote: On 15.11.2017 at 09:41, Sebastian Arcus wrote: I can't really train the bayesian filter on these emails, as it would start to affect ham emails classification. this is an unproven claim! we have phishing mails in bayes here which are classified with BAYES_99 where my human eyes can hardly distinguish them from the original messages classified with BAYES_00 - you just need to train both and bayes will find the differences over time. I'm not sure I understand this? In my limited knowledge of how bayesian filters work, I assumed that if the words are the same/similar between emails, they should produce similar bayes scores, no? Do you have any links to explanations of how this would work - as I am keen not to affect the wrong way the bayes databases I built over time.
Re: The rise of highly targeted spam emails
On 15/11/17 09:55, Martin Gregorie wrote: On Wed, 2017-11-15 at 08:41 +, Sebastian Arcus wrote: The emails often contain links to various popular cloud platforms - such as SharePoint, DropBox etc. Most of the emails come from clean domains, or from large webmail providers. I'd say there is not a lot you can do if the legit solicitors and accountants you and your clients deal with normally use these public dropboxes to deliver documents. OTOH, if they don't do that, then if the mail claims to be from a solicitor or accountant you can use the presence of those links as a spam recogniser, or go even further and treat any link that *doesn't* point to the sender's own domain as a spam indication. Whether doing this is safe or not depends pretty much on what's in your normal mail stream and on what is seen as normal practice for the solicitors and accountants your users deal with. I use a mail archive as another way of finding spam: anybody in the archive who I've sent mail to gets tagged by a negative-scoring rule, but this may not work for you and your users. However, system performance isn't an issue. My archive is in a Postgres database and the view it uses to recognise addresses that have received mail from my domain is fast because my DB schema was designed to support this type of query. Thank you - that is an interesting idea. Do you use a software to extract the emails from the Sent archives, or do you add them to the database on-the-fly, when the sent emails go out through your MTA? If you have any links or example scripts available I would be very much interested. I suppose one side risk is that if the domain of one of your regular correspondents gets compromised, the spam coming from it will almost be guaranteed to arrive in the Inbox?
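The suggestion to treat any link that does not point to the sender's own domain as a spam indication could be prototyped roughly as below. This is a sketch under assumptions: the function name `foreign_link_hosts`, the URL pattern and the example addresses are hypothetical, and a real filter would also have to handle HTML parts, redirectors and senders who legitimately use third-party hosting.

```python
import re
from email.utils import parseaddr
from urllib.parse import urlsplit

# Deliberately simple pattern for http/https links in plain text.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)

def foreign_link_hosts(from_header, body):
    """Return hosts of links in `body` that are outside the sender's domain."""
    sender_domain = parseaddr(from_header)[1].rpartition("@")[2].lower()
    foreign = []
    for url in URL_RE.findall(body):
        host = (urlsplit(url).hostname or "").lower()
        # Accept the sender's domain itself and any subdomain of it.
        same = host == sender_domain or host.endswith("." + sender_domain)
        if host and not same:
            foreign.append(host)
    return foreign

sample = (
    "Please review https://files.solicitors.example/doc1 "
    "and sign at https://dropbox.example/s/abc today"
)
print(foreign_link_hosts("Jane Doe <jane@solicitors.example>", sample))
```

As noted in the reply above, whether a non-empty result deserves any score depends entirely on what is normal in the local mail stream, so the output is better used as one input to a combined rule than as a verdict on its own.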
The rise of highly targeted spam emails
Over the last half a year or so I have noticed a rise in much more focused email campaigns. I have some solicitor and accountant clients who receive these scam emails, which are a notch above the rest. The English is good and correctly spelled. The footers look professional and just like the ones from other offices in the trade. The wording is very similar to the usual emails they receive - such as reminders for payments, or an enquiry about documentation for the sale of a house. The emails often contain links to various popular cloud platforms - such as SharePoint, DropBox etc. Most of the emails come from clean domains, or from large webmail providers. I can't really train the bayesian filter on these emails, as it would start to affect ham emails classification. I also assume that RBL's can't do much, as they would have to block everything from DropBox or SharePoint if they start blacklisting these emails. Is there anything else that could be done to block this stuff? Have others seen these types of emails?
Re: OT - Hotmail/Outlook.com marking most of our email as Junk
On 21/09/17 11:13, Zulma Pape wrote: It means that your IP is greylisted at their end. There are many solutions to fix this issue, but the easiest and cheapest one is to get a new IP, refill the form and see their feedback about it. If it qualifies for mitigation then you'll start out on friendly terms with them, and they'll build a new reputation on your history. If not, you can get a new IP and repeat the same steps until you get a friendly IP. Thank you for the suggestions. I'm afraid we can't just keep on changing IP addresses, as there is other infrastructure tied to this IP address (vpn, external laptops etc.) - so it would involve quite a bit of reconfiguration. Also, I doubt that it would do much good, as we've had this IP address for 5 years - so it is clean. There is the possibility that Hotmail doesn't like our IP address because it is a consumer/ADSL/end-user IP - although I've removed it from the Spamhaus PBL database. I guess Hotmail must be using an internal database. In this case changing to another end-user IP wouldn't do much good. Another solution: since your volume is very low at the moment, it should be quite easy for you to ask the people on your list to add your sender address to their contact list. This will prevent your emails from going to the junk folder, and at the same time it will increase the reputation of your IP. I will ask a number of contacts to mark our emails as safe - who knows, maybe it will help. Thank you.
Re: MISSING_SUBJECT not triggered if subject contains whitespace
On 19/09/17 15:05, Kevin A. McGrail wrote: On 9/19/2017 9:11 AM, David Jones wrote: I have had these in place for years. Maybe Kevin can consolidate and integrate this into his KAM.cf so I could remove them or we could eventually get them into the default SA ruleset after some testing. Hi David, Thanks. In addition to KAM.cf, I maintain a nonKAMrules.cf which I've added these attributing them with the idea to test. It's where I throw rules in the PD from lists and things like that so I'm not claiming ownership but like the ideas. Note, I lowered the score on the 1st two. I'm pretty sure those might cause more FPs than intended. https://www.pccc.com/downloads/SpamAssassin/contrib/nonKAMrules.cf That looks like really useful stuff. Is it likely that any of these rules will make their way into SA - or should we include them ourselves in local.cf?
Re: OT - Hotmail/Outlook.com marking most of our email as Junk
On 21/09/17 10:28, Zulma Pape wrote: Here is the link to the forms I talked about. Good luck! https://support.microsoft.com/en-us/getsupport?oaspworkflow=start_1.0.0.0=capsub=edfsmsbl3=en-us=635622755123113400 Thank you for that - I've just managed to find that form in the maze of the MS website about an hour ago. I've filled it out and submitted it - and just received an email saying that the IP address doesn't qualify for mitigation. I'm not sure if that means that the IP address is already clean at their end, or it is blacklisted or greylisted, but they don't want to unblock it. On Thu, Sep 21, 2017 at 8:40 AM, Sebastian Arcus <s.ar...@open-t.co.uk> wrote: On 19/09/17 10:29, Zulma Pape wrote: There are tons of ways to get your IP a good reputation with Hotmail. Start by setting up SNDS, this will help you monitor your reputation directly with Microsoft. Hi - thank you for the suggestions. I have signed up for the SNDS programme - which looks potentially useful. Unfortunately, SNDS does not show mail and spam traffic stats for IP addresses sending less than 100 mails per day - which seems to be the case of this particular site. The IP Status page lists our IP as having normal status - so I guess it's all good there. You should also try filling in their support forms; they will check your IP's historic reputation and act accordingly, and since your background is good, the feedback should be positive for you. I have seen references to this form on various historic forum posts - but the links I followed are all dead. Has this form been removed? On Tue, Sep 19, 2017 at 7:25 AM, Sebastian Arcus <s.ar...@open-t.co.uk> wrote: This is a bit off topic as it is not directly related to SA, but I'm hoping that with the email and spam expertise on this group, someone might throw in a useful idea - which would be much appreciated. 
I have this problem on one site where most emails we send to Hotmail/Outlook.com/Live.com email addresses end up in Junk at the recipient's end. Things I have tried: 1. I've setup SPF, DKIM, DMARC (and set it to 'reject'). 2. We used to smart relay outbound email through the hosting provider (1and1), but now changed to send directly from our own IP address, so that we can control the reputation of the sending IP - no change. 3. I've checked our public IP and the domain name at mxtoolbox.com - all tests pass (the public IP has been delisted from the Spamhaus non-MX/end-user IP database). 4. I've setup forward and reverse DNS entries for our IP address. 5. I've checked with all DNS blocklists/blacklists I could find - our domain or IP address is not flagged up anywhere. 6. This is a small network which I've been managing for years - the domain name has not been used to send marketing/lists email of any sort - so the historic reputation should be fine. 7. I've setup a monitor and block on port 25 outbound on the network firewall - in case there is a trojan on a machine on the network sending out spam and ruining the reputation of our IP - it's never been triggered. 8. I've checked the contents of outgoing emails - this is an accountants practice - the email content is standard, there is nothing there which should trigger bayesian filters. 9. I've sent emails to other servers under my control running SA - the scores come out perfect at the receiving end. 10. The emails we send are operational and notices emails to customers - who need them. They call on the phone and complain they haven't received them - just to discover they were sent, but ended up in the junk. 11. Emails we send to any other domains are never a problem spam-wise. I can't really think of anything else to try - have I missed anything? Are Hotmail/Outlook.com spam filters a complete lottery?
Re: OT - Hotmail/Outlook.com marking most of our email as Junk
On 19/09/17 17:17, Jerry Malcolm wrote: My recommendation as a first step is to go to mail-tester.com. They will tell you to send an email to a temp email address, and they will analyze and grade your email as to 'spammy-ness'. Outlook, gmail, etc were flagging a lot of my emails. After I finally fixed everything and got mail-tester.com to give me a perfect score, I haven't had any problem with getting flagged. Hi - and thanks for the suggestion. I've tried in the past another similar service - and now I've tried mail-tester.com - it returned a score of 10/10. Jerry On 9/19/2017 1:44 AM, G Roach wrote: Microsoft use their own methods of detection including based on reputation and 'length of service' - ie, if you have only just started sending emails out from your own address (which you have) then they may well consider you suspicious. There's not much you can do about it. More info here: https://mail.live.com/mail/troubleshooting.aspx On 19/09/2017 07:25, Sebastian Arcus wrote: This is a bit off topic as it is not directly related to SA, but I'm hoping that with the email and spam expertise on this group, someone might throw in a useful idea - which would be much appreciated. I have this problem on one site where most emails we send to Hotmail/Outlook.com/Live.com email addresses end up in Junk at the recipient's end. Things I have tried: 1. I've setup SPF, DKIM, DMARC (and set it to 'reject'). 2. We used to smart relay outbound email through the hosting provider (1and1), but now changed to send directly from our own IP address, so that we can control the reputation of the sending IP - no change. 3. I've checked our public IP and the domain name at mxtoolbox.com - all tests pass (the public IP has been delisted from the Spamhaus non-MX/end-user IP database). 4. I've setup forward and reverse DNS entries for our IP address. 5. I've checked with all DNS blocklists/blacklists I could find - our domain or IP address is not flagged up anywhere. 6. 
This is a small network which I've been managing for years - the domain name has not been used to send marketing/lists email of any sort - so the historic reputation should be fine. 7. I've setup a monitor and block on port 25 outbound on the network firewall - in case there is a trojan on a machine on the network sending out spam and ruining the reputation of our IP - it's never been triggered. 8. I've checked the contents of outgoing emails - this is an accountants practice - the email content is standard, there is nothing there which should trigger bayesian filters. 9. I've sent emails to other servers under my control running SA - the scores come out perfect at the receiving end. 10. The emails we send are operational and notices emails to customers - who need them. They call on the phone and complain they haven't received them - just to discover they were sent, but ended up in the junk. 11. Emails we send to any other domains are never a problem spam-wise. I can't really think of anything else to try - have I missed anything? Are Hotmail/Outlook.com spam filters a complete lottery?
Re: OT - Hotmail/Outlook.com marking most of our email as Junk
On 19/09/17 10:29, Zulma Pape wrote: There are tons of ways to get your IP a good reputation with Hotmail. Start by setting up SNDS, this will help you monitor your reputation directly with Microsoft. Hi - thank you for the suggestions. I have signed up for the SNDS programme - which looks potentially useful. Unfortunately, SNDS does not show mail and spam traffic stats for IP addresses sending less than 100 mails per day - which seems to be the case of this particular site. The IP Status page lists our IP as having normal status - so I guess it's all good there. You should also try filling in their support forms; they will check your IP's historic reputation and act accordingly, and since your background is good, the feedback should be positive for you. I have seen references to this form on various historic forum posts - but the links I followed are all dead. Has this form been removed? On Tue, Sep 19, 2017 at 7:25 AM, Sebastian Arcus <s.ar...@open-t.co.uk> wrote: This is a bit off topic as it is not directly related to SA, but I'm hoping that with the email and spam expertise on this group, someone might throw in a useful idea - which would be much appreciated. I have this problem on one site where most emails we send to Hotmail/Outlook.com/Live.com email addresses end up in Junk at the recipient's end. Things I have tried: 1. I've setup SPF, DKIM, DMARC (and set it to 'reject'). 2. We used to smart relay outbound email through the hosting provider (1and1), but now changed to send directly from our own IP address, so that we can control the reputation of the sending IP - no change. 3. I've checked our public IP and the domain name at mxtoolbox.com - all tests pass (the public IP has been delisted from the Spamhaus non-MX/end-user IP database). 4. I've setup forward and reverse DNS entries for our IP address. 5. 
I've checked with all DNS blocklists/blacklists I could find - our domain or IP address is not flagged up anywhere. 6. This is a small network which I've been managing for years - the domain name has not been used to send marketing/lists email of any sort - so the historic reputation should be fine. 7. I've setup a monitor and block on port 25 outbound on the network firewall - in case there is a trojan on a machine on the network sending out spam and ruining the reputation of our IP - it's never been triggered. 8. I've checked the contents of outgoing emails - this is an accountants practice - the email content is standard, there is nothing there which should trigger bayesian filters. 9. I've sent emails to other servers under my control running SA - the scores come out perfect at the receiving end. 10. The emails we send are operational and notices emails to customers - who need them. They call on the phone and complain they haven't received them - just to discover they were sent, but ended up in the junk. 11. Emails we send to any other domains are never a problem spam-wise. I can't really think of anything else to try - have I missed anything? Are Hotmail/Outlook.com spam filters a complete lottery?
MISSING_SUBJECT not triggered if subject contains whitespace
I've had a number of emails with no subject not triggering the MISSING_SUBJECT rule - only to discover that the spammers have added a white space after 'Subject:' - which appears to fool the code into thinking that there is an actual subject. Would it be possible to 'smarten up' the code a bit to recognise this?
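The check being asked for amounts to treating a Subject header that holds only whitespace the same as a header that is absent. The Python sketch below shows that logic on a raw message using the standard `email` package; it illustrates the idea only and is not the SpamAssassin rule code, and the function name and sample messages are hypothetical.

```python
from email import message_from_string

def subject_is_effectively_missing(raw_message):
    """True if there is no Subject header or it holds only whitespace."""
    msg = message_from_string(raw_message)
    subject = msg.get("Subject")
    return subject is None or subject.strip() == ""

# Three minimal test messages: blank subject, no subject, real subject.
blank = "From: a@example.com\nSubject:  \n\nbody text\n"
absent = "From: a@example.com\n\nbody text\n"
normal = "From: a@example.com\nSubject: Invoice 42\n\nbody text\n"

print(
    subject_is_effectively_missing(blank),
    subject_is_effectively_missing(absent),
    subject_is_effectively_missing(normal),
)
```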
OT - Hotmail/Outlook.com marking most of our email as Junk
This is a bit off topic as it is not directly related to SA, but I'm hoping that with the email and spam expertise on this group, someone might throw in a useful idea - which would be much appreciated. I have this problem on one site where most emails we send to Hotmail/Outlook.com/Live.com email addresses end up in Junk at the recipient's end. Things I have tried:
1. I've setup SPF, DKIM, DMARC (and set it to 'reject').
2. We used to smart relay outbound email through the hosting provider (1and1), but now changed to send directly from our own IP address, so that we can control the reputation of the sending IP - no change.
3. I've checked our public IP and the domain name at mxtoolbox.com - all tests pass (the public IP has been delisted from the Spamhaus non-MX/end-user IP database).
4. I've setup forward and reverse DNS entries for our IP address.
5. I've checked with all DNS blocklists/blacklists I could find - our domain or IP address is not flagged up anywhere.
6. This is a small network which I've been managing for years - the domain name has not been used to send marketing/lists email of any sort - so the historic reputation should be fine.
7. I've setup a monitor and block on port 25 outbound on the network firewall - in case there is a trojan on a machine on the network sending out spam and ruining the reputation of our IP - it's never been triggered.
8. I've checked the contents of outgoing emails - this is an accountants practice - the email content is standard, there is nothing there which should trigger bayesian filters.
9. I've sent emails to other servers under my control running SA - the scores come out perfect at the receiving end.
10. The emails we send are operational and notices emails to customers - who need them. They call on the phone and complain they haven't received them - just to discover they were sent, but ended up in the junk.
11. Emails we send to any other domains are never a problem spam-wise.
I can't really think of anything else to try - have I missed anything? Are Hotmail/Outlook.com spam filters a complete lottery?
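Item 1 in the list mentions a DMARC policy of 'reject'. That setting tells receivers to refuse mail that fails DMARC checks, which also affects messages relayed through forwarders or mailing lists, so it helps to confirm what the published record actually says. The sketch below parses a DMARC TXT record string in Python; the record shown and the function names are hypothetical examples, and fetching the record from DNS is left out.

```python
def parse_dmarc(record):
    """Split a DMARC TXT record into a dict of lowercase tag -> value."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, _, value = part.partition("=")
            tags[key.strip().lower()] = value.strip()
    return tags

def risky_for_mailing_lists(record):
    """True when the policy asks receivers to reject or quarantine failures."""
    tags = parse_dmarc(record)
    policy = tags.get("p", "").lower()
    return tags.get("v") == "DMARC1" and policy in {"reject", "quarantine"}

# Hypothetical record for illustration; not the real record of any domain.
strict = "v=DMARC1; p=reject; rua=mailto:postmaster@example.com"
relaxed = "v=DMARC1; p=none"

print(parse_dmarc(strict)["p"], risky_for_mailing_lists(strict))
print(parse_dmarc(relaxed)["p"], risky_for_mailing_lists(relaxed))
```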
Re: FORGED_YAHOO_RCVD still causing false positives
On 15/09/17 14:34, Kevin A. McGrail wrote: On 9/15/2017 8:26 AM, RW wrote: The rule was created and scored when spoofing Yahoo was very common, but it isn't any more. I don't think it's worth keeping as it is - high maintenance and error prone. Agreed. Score FORGED_YAHOO_RCVD to zero locally and will get a bug open to deprecate it. Regards, KAM Much appreciated - thank you both!
Re: SA not receiving fixed FORGED_MUA_MOZILLA update?
On 15/09/17 12:21, Kevin A. McGrail wrote: On 9/15/2017 6:54 AM, Sebastian Arcus wrote: Thank you for the reply. Does that mean that no new rules have been pushed to SA installations in the past 5 months - or only some rules get pushed through? The system has been "down" since March 15 in that everything is working but we are purposefully not changing the DNS entries. We've resurrected it a few times and Dave Jones has done some work to get a new system running but it published incorrect score files. He did some massaging and published a new rule set with the old score file a few months ago. But since then we've been battling the machine randomly hanging. And work to resurrect the old boxes failed. We are looking for some little darn insidious issue not processing rule scores right. Thank you for the update Kevin - and for the hard work of everyone involved. Let's hope the rules updates will be operational again soon.
Re: SA not receiving fixed FORGED_MUA_MOZILLA update?
On 15/09/17 11:41, Kevin A. McGrail wrote: On 9/15/2017 6:11 AM, Sebastian Arcus wrote: I am having problems with false positives for FORGED_MUA_MOZILLA for Yahoo emails. I see this has been already dealt with here and pushed to the 3.4 and trunk branches: https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7411 However, even after running sa-update, the file 20_meta_tests.cf still doesn't have the changes on my servers. Has this bugfix not been applied for some reason? 20_meta_tests.cf on my machines is dated May 17 and June 24 for the 3.00xx and 4.00xx dirs under /var/lib/spamassassin - which is after the date the bugfix was pushed (2017-04-22). I run SA 3.4.1 and 4.0.0 on different machines Hi Sebastian, The Rule Promotion and Generation of tarballs with correct scores has been an ongoing issue for the SpamAssassin team. I would suggest you simply copy the rule to your local.cf at this time. Hi Kevin, Thank you for the reply. Does that mean that no new rules have been pushed to SA installations in the past 5 months - or only some rules get pushed through?
FORGED_YAHOO_RCVD still causing false positives
I see this has come up again and again. Since FORGED_YAHOO_RCVD seems to work by checking the address of the Yahoo smtp server in the headers against a predefined list of Yahoo servers in SA, and Yahoo seems to add new servers all the time - which causes false positives - is there much point to this check? If not, maybe the default score should at least be lowered to something like 0.2 or 0.3 (I think it is at 1.5 at the moment).
SA not receiving fixed FORGED_MUA_MOZILLA update?
I am having problems with false positives for FORGED_MUA_MOZILLA for Yahoo emails. I see this has been already dealt with here and pushed to the 3.4 and trunk branches: https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7411 However, even after running sa-update, the file 20_meta_tests.cf still doesn't have the changes on my servers. Has this bugfix not been applied for some reason? 20_meta_tests.cf on my machines is dated May 17 and June 24 for the 3.00xx and 4.00xx dirs under /var/lib/spamassassin - which is after the date the bugfix was pushed (2017-04-22). I run SA 3.4.1 and 4.0.0 on different machines
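To see at a glance which copies of a rule file sa-update has left on a machine and when each was last written, something like the Python sketch below could be used. The function name rule_file_dates is made up for illustration. Note the assumption: the modification time only says when the file was written locally, not which revision of the rules it contains.

```python
from datetime import datetime, timezone
from pathlib import Path


def rule_file_dates(base_dir, filename="20_meta_tests.cf"):
    """Map every copy of `filename` found under `base_dir` to its mtime (UTC)."""
    result = {}
    for path in sorted(Path(base_dir).rglob(filename)):
        mtime = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        result[str(path)] = mtime
    return result
```

Calling rule_file_dates("/var/lib/spamassassin") and printing the items would list one entry per version directory (the 3.00xx and 4.00xx dirs mentioned above).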
Re: Is anyone else getting 325KB spams from cont...@cron-job.org?
On 14/09/17 19:59, Loren Wilton wrote: Should be easy to block. Just block the cron-job.org domain. As someone else mentioned that address is an obvious joe-job. And scoring it high doesn't help that much. It worked for the first few weeks, then they went to contact@ to presumably get around that. I was surprised to see in the last few that they had gone back to the cron-job.org domain for the fake sender. For some reason these are bypassing SA on my system, I suspect due to the size. I had to add on my systems a while ago an /etc/mail/spamassassin/spamc.conf containing: -s 200 to increase the maximum size of emails passed to SA. It seems some spammers have cottoned onto the fact that 256KB is still hardwired somewhere in SA, and started sending spam just above that threshold to bypass the filter.
Re: Config option to skip pyzor check on empty body emails?
On 12/09/17 12:33, RW wrote: On Tue, 12 Sep 2017 08:41:01 +0100 Sebastian Arcus wrote: The confusing part is that left to its devices, Pyzor creates a .pyzor dir in the home dir of the user it is run as. But if --homedir is specified, it dumps stuff directly there, instead of creating a .pyzor dir. In the end I got rid of the "pyzor_options --homedir" option in local.cf and it worked fine.
It is a bit confusing, but it's not that the .pyzor directory is used inconsistently, it's that pyzor defines "--homedir=HOMEDIR configuration directory", so the default homedir is $HOME/.pyzor/ not $HOME/. If you want to use pyzor_options you could use:
pyzor_options --homedir /var/spool/spamd/.pyzor
Like with everything, it all makes sense after you fully understand what's going on :-) I just made the wrong assumptions about how the option would work. Like Ian says, the word "home" in the option name makes it easy to assume that everything will be arranged as subdirectories under it. No matter - I'm happy I've finally found a solution to the empty bodied emails hitting PYZOR_CHECK :-) Thanks again for all the help.
Re: Config option to skip pyzor check on empty body emails?
On 12/09/17 00:56, RW wrote: On Tue, 12 Sep 2017 00:37:40 +0100 Sebastian Arcus wrote: On 11/09/17 20:20, RW wrote: This is why pyzor has the local_whitelist command. At very least it's a good idea to pipe an empty string through "pyzor local_whitelist" (probably as the user running spamassassin). I have spotted that command in the docs - and if it worked, it would seem like a good solution. But it doesn't seem to. I have added the hash of the empty string to the local whitelist. If I try to re-add the same hash, or the hash of the problem emails - I get a message stating that it is already in the whitelist - so it would appear to be working. But when running the email message through SA, it still hits PYZOR_CHECK. I have found the location of Pyzor's local whitelist - and the permissions are correct. It appears that SA completely ignores the fact that the digest is whitelisted locally:
SA can't ignore it, if a hash is whitelisted pyzor returns a dummy result. e.g.:
$ echo "" | pyzor check
public.pyzor.org:24441 (200, 'OK') 0 0
compared with:
$ echo "" | pyzor --local-whitelist=/nonextistent check
public.pyzor.org:24441 (200, 'OK') 2749671 82562
Thank you for that. I've finally gotten to the bottom of my problem. It was the Pyzor homedir. Although I have set it up in /etc/mail/spamassassin/local.cf, I ended up confusing myself. If I ran as root:
#pyzor local_whitelist < /email.eml
it placed the whitelist in /root/.pyzor/whitelist
When I ran:
#su - spamd -c "pyzor local_whitelist < /email.eml"
it placed it in /var/spool/spamd/.pyzor/whitelist (/var/spool/spamd is the homedir of the 'spamd' user on this system)
But when I ran:
#su - spamd -c "pyzor --homedir /var/spool/spamd < /email.eml"
it placed it in /var/spool/spamd/whitelist
The confusing part is that left to its devices, Pyzor creates a .pyzor dir in the home dir of the user it is run as. But if --homedir is specified, it dumps stuff directly there, instead of creating a .pyzor dir.
In the end I got rid of the "pyzor_options --homedir" option in local.cf and it worked fine. I was just tying myself in knots there :-) Thanks again
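The three cases described above boil down to one rule: --homedir names the configuration directory itself, and only the default adds ".pyzor". A small Python sketch of that rule, purely as an illustration of what was observed in this thread (the function name pyzor_whitelist_path is made up, and this is not pyzor's own code):

```python
import posixpath


def pyzor_whitelist_path(user_home, homedir_option=None):
    """Return where the local whitelist ended up in the cases described in the thread.

    Without --homedir the configuration directory is <user_home>/.pyzor;
    with --homedir X, X itself is the configuration directory.
    """
    if homedir_option:
        config_dir = homedir_option
    else:
        config_dir = posixpath.join(user_home, ".pyzor")
    return posixpath.join(config_dir, "whitelist")


print(pyzor_whitelist_path("/root"))
print(pyzor_whitelist_path("/var/spool/spamd"))
print(pyzor_whitelist_path("/var/spool/spamd", "/var/spool/spamd"))
```

So a whitelist created without --homedir and a check run with "--homedir /var/spool/spamd" look at two different files, which matches the behaviour seen here.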
Re: Config option to skip pyzor check on empty body emails?
On 11/09/17 20:20, RW wrote: On Mon, 11 Sep 2017 17:39:16 +0100 Sebastian Arcus wrote: Is there any way to tell SA to skip pyzor checks on emails with an empty body (even if there are attachments). I've noticed for a while now that emails which don't contain any text in their bodies seem to automatically trigger PYZOR_CHECK (even if they have an attachment) - although they are private emails so can't possibly match the digest of spam emails. I can only guess that Pyzor matches the digest of empty emails automatically. It's because pyzor is based only on a simplified version of the body text. This includes stripping any URIs or email addresses from the text. It's not just emails with no body text there are also variants of this that reduce to common phrases such as "Sent from my iPhone" I have clients who receive important emails from their customers just with an attachment and a subject line - and they all seem to go to Junk - because they trigger the PYZOR_CHECK rule - which is causing problems. Any way to deal with this? This is why pyzor has the local_whitelist command. At very least it's a good idea to pipe an empty string through "pyzor local_whitelist" (probably as the user running spamassassin). I have spotted that command in the docs - and if it worked, it would seem like a good solution. But it doesn't seem to. I have added the hash of the empty string to the local whitelist. If I try to re-add the same hash, or the hash of the problem emails - I get a message stating that it is already in the whitelist - so it would appear to be working. But when running the email message through SA, it still hits PYZOR_CHECK. I have found the location of Pyzor's local whitelist - and the permissions are correct. 
It appears that SA completely ignores the fact that the digest is whitelisted locally:
su - spamd -c "spamassassin -D 2>&1 < /test1.eml" | grep -i pyzor
Sep 12 00:31:49.080 [23559] dbg: plugin: loading Mail::SpamAssassin::Plugin::Pyzor from @INC
Sep 12 00:31:49.090 [23559] dbg: pyzor: network tests on, attempting Pyzor
Sep 12 00:31:50.679 [23559] dbg: config: fixed relative path: /var/lib/spamassassin/3.004001/updates_spamassassin_org/25_pyzor.cf
Sep 12 00:31:50.679 [23559] dbg: config: using "/var/lib/spamassassin/3.004001/updates_spamassassin_org/25_pyzor.cf" for included file
Sep 12 00:31:50.680 [23559] dbg: config: read file /var/lib/spamassassin/3.004001/updates_spamassassin_org/25_pyzor.cf
Sep 12 00:31:57.411 [23559] dbg: util: executable for pyzor was found at /usr/bin/pyzor
Sep 12 00:31:57.412 [23559] dbg: pyzor: pyzor is available: /usr/bin/pyzor
Sep 12 00:31:57.413 [23559] dbg: pyzor: opening pipe: /usr/bin/pyzor --homedir /var/spool/spamd check < /tmp/.spamassassin23559DIrl4Ktmp
Sep 12 00:31:58.154 [23559] dbg: pyzor: [23560] finished: exit 1
Sep 12 00:31:58.155 [23559] dbg: pyzor: got response: public.pyzor.org:24441 (200, 'OK') 2749542 82562
Sep 12 00:31:58.156 [23559] dbg: check: tagrun - tag PYZOR is now ready, value: Whitelisted.
Sep 12 00:31:58.157 [23559] dbg: pyzor: listed: COUNT=2749542/5 WHITELIST=82562
Sep 12 00:31:58.159 [23559] dbg: rules: ran eval rule PYZOR_CHECK ==> got hit (1)
* 2.5 PYZOR_CHECK Listed in Pyzor (http://pyzor.sf.net/)
2.5 PYZOR_CHECK Listed in Pyzor (http://pyzor.sf.net/)
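When comparing runs, the "got response" line is the part worth pulling out, since its last two numbers are the ones SA logs as COUNT and WHITELIST. A minimal Python sketch for parsing that line; the names RESPONSE_RE and parse_pyzor_check are made up for illustration, and the field layout is assumed from the log excerpt above rather than from pyzor's documentation.

```python
import re

# Layout assumed from the log: "<server> (<code>, '<status>') <count> <whitelist>"
RESPONSE_RE = re.compile(r"^(\S+)\s+\((\d+),\s*'([^']*)'\)\s+(\d+)\s+(\d+)\s*$")


def parse_pyzor_check(line):
    """Return the fields of a 'pyzor check' response line, or None if it does not match."""
    match = RESPONSE_RE.match(line.strip())
    if match is None:
        return None
    server, code, status, count, whitelist = match.groups()
    return {
        "server": server,
        "code": int(code),
        "status": status,
        "count": int(count),
        "whitelist": int(whitelist),
    }


print(parse_pyzor_check("public.pyzor.org:24441 (200, 'OK') 2749542 82562"))
```

A result with a large count, as here, means the digest was looked up on the server as usual; the "0 0" dummy result mentioned elsewhere in this thread is what a working local whitelist would show.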
Config option to skip pyzor check on empty body emails?
Is there any way to tell SA to skip pyzor checks on emails with an empty body (even if there are attachments). I've noticed for a while now that emails which don't contain any text in their bodies seem to automatically trigger PYZOR_CHECK (even if they have an attachment) - although they are private emails so can't possibly match the digest of spam emails. I can only guess that Pyzor matches the digest of empty emails automatically. I have clients who receive important emails from their customers just with an attachment and a subject line - and they all seem to go to Junk - because they trigger the PYZOR_CHECK rule - which is causing problems. Any way to deal with this?
Re: SA not performing DNSBL queries correctly
On 17/05/17 18:11, Sebastian Arcus wrote: On 17/05/17 16:53, David Mehler wrote: Hi, I don't see your SA issue here, but since your running 3.41 can I get a look at your SA configuration to compare against mine? Thanks. Dave. Yes - you are correct. As I pointed out in my last email, it looks like there might be an issue with the package supplied by Slackware at slackbuilds.org - and I am chasing it up with them there. But thanks to the advice on this list, I've managed to narrow things down - so I am grateful for the hints. Just a follow-up and clarification on this issue - after more testing, it seems that it was the Spamassassin version which was the problem. I have had to upgrade SA on 7 servers running 3.4.1 on Slackware - as the dns rbl's weren't working on any of them. The only server I had with SA 3.4.0 *was* actually working correctly. After upgrading all the boxes to 4.0.0, the dns rbl's are now working correctly. I have *not* changed any configuration options in SA - I left all the servers as they were in this respect - so it seems it was not a configuration issue. I'm afraid I haven't been able to narrow it down further than this. The servers were all running various kernels, both x86 and x86_64 architectures, and several different versions of Perl - so I would guess the SA version was the common factor and the likely culprit.
Re: SA not performing DNSBL queries correctly
On 17/05/17 16:53, David Mehler wrote: Hi, I don't see your SA issue here, but since your running 3.41 can I get a look at your SA configuration to compare against mine? Thanks. Dave. Yes - you are correct. As I pointed out in my last email, it looks like there might be an issue with the package supplied by Slackware at slackbuilds.org - and I am chasing it up with them there. But thanks to the advice on this list, I've managed to narrow things down - so I am grateful for the hints. On 5/17/17, Sebastian Arcus <s.ar...@open-t.co.uk> wrote: On 17/05/17 14:54, Sebastian Arcus wrote: On 17/05/17 14:21, Kevin A. McGrail wrote: On 5/17/2017 8:22 AM, Sebastian Arcus wrote: I have 2 servers with SA 3.4.1 running on Slackware, with Bind in caching/recursive mode. For months one of them has been unable to correctly do dns blocklists (but the queries are not blocked). I have pored over the logs, and the main difference is that, although both of them pick up on the bad urls in the body of the message, the bad server is unable to resolve the url to an IP address for some reason (but dig works fine on the command line on both servers): What version of Net::DNS on the two boxes? Does the 3.4 branch from SVN work? There have been changes to Net::DNS that are my likely first guess. Thank you for the suggestion. I have Net::DNS 1.10. I have just recompiled SA from SVN and it is using dnsrbl's correctly. Have there been some changes in the way SA works recently? A small update to this - I recompiled 3.4.1 by hand - and this is working fine as well. This would suggest that the Slackware package is somehow the problem - unless it is all coincidental and I am somehow chasing my own tail. I will update here if I find out more. Thank you again for the suggestion.
Re: SA not performing DNSBL queries correctly
On 17/05/17 14:54, Sebastian Arcus wrote: On 17/05/17 14:21, Kevin A. McGrail wrote: On 5/17/2017 8:22 AM, Sebastian Arcus wrote: I have 2 servers with SA 3.4.1 running on Slackware, with Bind in caching/recursive mode. For months one of them has been unable to correctly do dns blocklists (but the queries are not blocked). I have pored over the logs, and the main difference is that, although both of them pick up on the bad urls in the body of the message, the bad server is unable to resolve the url to an IP address for some reason (but dig works fine on the command line on both servers): What version of Net::DNS on the two boxes? Does the 3.4 branch from SVN work? There have been changes to Net::DNS that are my likely first guess. Thank you for the suggestion. I have Net::DNS 1.10. I have just recompiled SA from SVN and it is using dnsrbl's correctly. Have there been some changes in the way SA works recently? A small update to this - I recompiled 3.4.1 by hand - and this is working fine as well. This would suggest that the Slackware package is somehow the problem - unless it is all coincidental and I am somehow chasing my own tail. I will update here if I find out more. Thank you again for the suggestion.
Re: SA not performing DNSBL queries correctly
On 17/05/17 14:21, Kevin A. McGrail wrote: On 5/17/2017 8:22 AM, Sebastian Arcus wrote: I have 2 servers with SA 3.4.1 running on Slackware, with Bind in caching/recursive mode. For months one of them has been unable to correctly do dns blocklists (but the queries are not blocked). I have pored over the logs, and the main difference is that, although both of them pick up on the bad urls in the body of the message, the bad server is unable to resolve the url to an IP address for some reason (but dig works fine on the command line on both servers): What version of Net::DNS on the two boxes? Does the 3.4 branch from SVN work? There have been changes to Net::DNS that are my likely first guess. Thank you for the suggestion. I have Net::DNS 1.10. I have just recompiled SA from SVN and it is using dnsrbl's correctly. Have there been some changes in the way SA works recently?
SA not performing DNSBL queries correctly
I have 2 servers with SA 3.4.1 running on Slackware, with Bind in caching/recursive mode. For months one of them has been unable to correctly do dns blocklists (but the queries are not blocked). I have pored over the logs, and the main difference is that, although both of them pick up on the bad urls in the body of the message, the bad server is unable to resolve the url to an IP address for some reason (but dig works fine on the command line on both servers):
On the good server:
dbg: uridnsbl: complete_ns_lookup NS:spamdomain.com
dbg: uridnsbl: got(1) NS for spamdomain.com: spamdomain.com. 45 IN NS ns3.bkdns.vn.
dbg: uridnsbl: complete_a_lookup A:spamdomain.com
dbg: uridnsbl: complete_a_lookup got(1) A for spamdomain.com: spamdomain.com. 45 IN A 1.2.3.4
On the broken server I only get:
dbg: uridnsbl: complete_ns_lookup NS:spamdomain.com
dbg: dns: dns reply 62167 is OK, 0 answer records
dbg: async: calling callback on key A:spamdomain.com
dbg: uridnsbl: complete_a_lookup A:spamdomain.com
dbg: dns: dns reply 36552 is OK, 0 answer records
Would anybody know why the broken server is unable to resolve domains to IPs in SA (but works ok through dig)? There are no error messages anywhere that I can find and spamassassin -D --lint is not complaining of anything.
Re: Dns Blocklists always returning 0 records
On 27/03/17 11:10, Kevin A. McGrail wrote: On 3/27/2017 5:28 AM, Sebastian Arcus wrote: And yet, no dns block lists make it to the final scores
I have only followed the thread briefly but check your versions of Net::DNS.
The good server has Net::DNS 0.83 - so way out of date. The problem server has Net::DNS 1.06 - so not quite latest, but still much newer than the server where SA works fine. I've just upgraded Net::DNS on the problem server to 1.09 - I'm afraid SA is still reporting zero hits from dns blocklists:
Mar 27 21:24:05.900 [31500] dbg: async: calling callback on key dns:A:109.150.73.212.zen.spamhaus.org
Mar 27 21:24:05.930 [31500] dbg: dns: dns reply 17643 is OK, 0 answer records
But dig still gets a hit on the same server:
#dig 109.150.73.212.zen.spamhaus.org
; <<>> DiG 9.10.4-P1 <<>> 109.150.73.212.zen.spamhaus.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55153
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;109.150.73.212.zen.spamhaus.org. IN A
;; ANSWER SECTION:
109.150.73.212.zen.spamhaus.org. 808 IN A 127.0.0.4
;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Mar 27 21:48:08 BST 2017
;; MSG SIZE rcvd: 76
Re: Dns Blocklists always returning 0 records
On 26/03/17 14:12, David Jones wrote: From: Sebastian Arcus <s.ar...@open-t.co.uk> Sent: Sunday, March 26, 2017 4:23 AM To: users@spamassassin.apache.org Subject: Dns Blocklists always returning 0 records
I have a server with SA where I just can't seem to get DNS based block lists / RBL working. I have tested the same email message against another server, and it gets hits from DNS block lists. But on this particular server they just don't seem to work - but the dns queries are not blocked either.
1. Both servers are on SA 3.4.1
2. I've run sa-update on both of them.
3. Both servers have Perl Net::DNS installed
4. Both servers have Bind configured locally and running fine as a caching name server.
5. On the problematic server, the dns based checks are being run, not being blocked, but always returning 0 records.
What else can I check in the SA config or more widely on the server? What could possibly cause this? Any suggestions would be much appreciated. I attach below a snippet of spamassassin -D output from the problem server - but I'm happy to enclose here, or upload the whole thing somewhere else if it helps:
#spamassassin -D 2>&1 < /test_email.eml | grep -i -A 3 "answer records"
Mar 26 10:12:39.060 [7061] dbg: async: calling callback on key dns:A:109.150.73.212.bb.barracudacentral.org
Mar 26 10:12:39.062 [7061] dbg: dns: dns reply 61164 is OK, 0 answer records
Mar 26 10:12:39.062 [7061] dbg: async: calling callback on key dns:A:109.150.73.212.zen.spamhaus.org
Mar 26 10:12:39.064 [7061] dbg: dns: dns reply 20939 is OK, 0 answer records
Mar 26 10:12:39.064 [7061] dbg: async: calling callback on key dns:TXT:109.150.73.212.sa-accredit.habeas.com
Mar 26 10:12:39.066 [7061] dbg: dns: dns reply 56465 is OK, 0 answer records
Mar 26 10:12:39.066 [7061] dbg: async: calling callback on key dns:A:109.150.73.212.iadb.isipp.com
Mar 26 10:12:39.069 [7061] dbg: dns: dns reply 19262 is OK, 0 answer records
I get this response on my working SA servers for the IP address above:
;; ANSWER SECTION:
109.150.73.212.zen.spamhaus.org. 300 IN A 127.0.0.4
What does the output of this command say on your SA server?
dig test.dbl.spamhaus.org
Compare the output on both servers. I suspect this will point you in the right direction. For example, "SERVER:" should point to 127.0.0.1.
On the problem server, if I run:
#dig 109.150.73.212.zen.spamhaus.org
I get:
;; ANSWER SECTION:
109.150.73.212.zen.spamhaus.org. 337 IN A 127.0.0.4
And I can also see it is using 127.0.0.1 as the server. I can even see in the SA debug output (on the problem server):
Mar 27 10:25:12.173 [23914] dbg: dns: hit 127.0.0.4
And yet, no dns block lists make it to the final scores.
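For anyone repeating the dig comparison with other addresses: the name being queried is the IPv4 address with its octets reversed, followed by the blocklist zone. A minimal Python sketch of building that name; the function name dnsbl_query_name is made up for illustration, and the assumption is that the name in the logs above corresponds to the address 212.73.150.109.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNSBL lookup name for an IPv4 address: reversed octets + zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # also validates the address
    return ".".join(reversed(octets)) + "." + zone


print(dnsbl_query_name("212.73.150.109", "zen.spamhaus.org"))
```

The printed name can be passed straight to dig on both servers to compare the answers outside of SA.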
Dns Blocklists always returning 0 records
I have a server with SA where I just can't seem to get DNS based block lists / RBL working. I have tested the same email message against another server, and it gets hits from DNS block lists. But on this particular server they just don't seem to work - but the dns queries are not blocked either.
1. Both servers are on SA 3.4.1
2. I've run sa-update on both of them.
3. Both servers have Perl Net::DNS installed
4. Both servers have Bind configured locally and running fine as a caching name server.
5. On the problematic server, the dns based checks are being run, not being blocked, but always returning 0 records.
What else can I check in the SA config or more widely on the server? What could possibly cause this? Any suggestions would be much appreciated. I attach below a snippet of spamassassin -D output from the problem server - but I'm happy to enclose here, or upload the whole thing somewhere else if it helps:
#spamassassin -D 2>&1 < /test_email.eml | grep -i -A 3 "answer records"
Mar 26 10:12:39.060 [7061] dbg: async: calling callback on key dns:A:109.150.73.212.bb.barracudacentral.org
Mar 26 10:12:39.062 [7061] dbg: dns: dns reply 61164 is OK, 0 answer records
Mar 26 10:12:39.062 [7061] dbg: async: calling callback on key dns:A:109.150.73.212.zen.spamhaus.org
Mar 26 10:12:39.064 [7061] dbg: dns: dns reply 20939 is OK, 0 answer records
Mar 26 10:12:39.064 [7061] dbg: async: calling callback on key dns:TXT:109.150.73.212.sa-accredit.habeas.com
Mar 26 10:12:39.066 [7061] dbg: dns: dns reply 56465 is OK, 0 answer records
Mar 26 10:12:39.066 [7061] dbg: async: calling callback on key dns:A:109.150.73.212.iadb.isipp.com
Mar 26 10:12:39.069 [7061] dbg: dns: dns reply 19262 is OK, 0 answer records
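To compare a good and a bad server over a whole debug log rather than a grep snippet, the callback and reply lines can be paired up and summarised. A rough Python sketch follows; the names KEY_RE, REPLY_RE and answers_per_key are made up for illustration. It assumes, based only on the ordering in the excerpt above, that each "dns reply" line belongs to the "calling callback on key" line just before it.

```python
import re

KEY_RE = re.compile(r"calling callback on key (\S+)")
REPLY_RE = re.compile(r"dns reply (\d+) is (\w+), (\d+) answer records")


def answers_per_key(lines):
    """Return (key, status, answer_count) tuples from spamassassin -D output lines."""
    results = []
    current_key = None
    for line in lines:
        key = KEY_RE.search(line)
        if key:
            current_key = key.group(1)
            continue
        reply = REPLY_RE.search(line)
        if reply and current_key is not None:
            results.append((current_key, reply.group(2), int(reply.group(3))))
            current_key = None  # each reply is paired with one key only
    return results


sample = [
    "Mar 26 10:12:39.062 [7061] dbg: async: calling callback on key dns:A:109.150.73.212.zen.spamhaus.org",
    "Mar 26 10:12:39.064 [7061] dbg: dns: dns reply 20939 is OK, 0 answer records",
]
print(answers_per_key(sample))
```

Running this over the output from both servers and diffing the two lists shows which lookups come back empty only on the problem machine.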
Re: Different bayes results from command line and through MTA
On 23/12/16 17:02, Andrzej A. Filip wrote: Sebastian Arcus <s.ar...@open-t.co.uk> wrote: On 23/12/16 10:12, Sebastian Arcus wrote: I know this hot potato has been discussed before - but I'm afraid it's back to haunt me and I can't fathom it out. I'm getting again different bayes results if I test a message on the command line, compared to it going through exim -> spamassassin.
OK - after staring for a good while at debug logs, I think I finally found the culprit. The saved .eml file which I pass through spamc contains the report embedded by spamassassin in the headers (that's how my Exim is configured). This report includes the first few lines of the actual email body. This in turn has the effect of effectively doubling the Bayes score, as spamassassin tokenizes these sample lines on top of the actual email body. As the email body for these particular spam emails is small - the sample in the header is almost equal in size with the text in the email body itself. As soon as I manually delete the SA headers and report in the .eml file, and pass the message again through spamc, I get identical Bayes scores to the ones when the message passes initially through Exim -> SA.
However, this raises some interesting questions. It would appear that SA is incapable of recognising its own reports in the header of the emails, and tokenizes them as well and adds them to the Bayes report. Is that right? Also, does it mean that, as SA tokenizes all the info in the headers, my own email address, as the recipient of the email, will also be added to the database of spam tokens - when I ask SA to learn a message as spam? I seem to have ended up with more questions than I started :-)
Have you considered using bayes_ignore_header in spamassassin configuration file?
https://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Conf.html Many thanks for the suggestion - I didn't know about bayes_ignore_header One quick question - does anybody know if bayes_ignore_header takes effect both when classifying email *and* when learning spam/ham?
Re: Different bayes results from command line and through MTA
On 23/12/16 17:18, Paul Stead wrote: On 23/12/2016, 13:35, "Sebastian Arcus" <s.ar...@open-t.co.uk> wrote: As soon as I manually delete the SA headers and report in the .eml file, and pass the message again through spamc, I get identical Bayes scores to the ones when the message passes initially through Exim -> SA.
http://svn.apache.org/repos/asf/spamassassin/trunk/rulesrc/sandbox/axb/23_bayes_ignore_header.cf this is a sandbox ruleset but it answers your question here and also prevents other potentially bad signals.
However, this raises some interesting questions. It would appear that SA is incapable of recognising its own reports in the header of the emails, and tokenizes them as well and adds them to the Bayes report. Is that right?
Spamassassin ignores certain headers - http://svn.apache.org/repos/asf/spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/Bayes.pm - note here that within $IGNORED_HDRS we have -
---8<---
|X-Spam(?:-(?:Status|Level|Flag|Report|Hits|Score|Checker-Version))?
---8<---
Really SA should be ignoring the headers it puts there – do the headers match anything in that list?
No - I use my own custom headers. But I will configure SA to ignore them - as per the suggestion from Andrzej.
Also, does it mean that, as SA tokenizes all the info in the headers, my own email address, as the recipient of the email, will also be added to the database of spam tokens - when I ask SA to learn a message as spam?
As above, headers like “X-Envelope-To” and “X-Delivered-To” etc etc are ignored, however the To: header is not as this can be a good indicator – for example, if a ‘spoofed’ To header isn’t matching the actual recipient of the email within your system… *mumble* numbers and things
Thank you very much for the explanation
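A quick way to check whether a given header name would be covered by the fragment quoted from $IGNORED_HDRS is to try it against the same pattern. The Python sketch below uses only the fragment shown above, not the full list in Bayes.pm. Matching the whole header name and ignoring case are assumptions of this sketch, and the custom header names in the example are invented.

```python
import re

# Only the fragment quoted in the thread; the real $IGNORED_HDRS has more alternatives.
X_SPAM_RE = re.compile(
    r"X-Spam(?:-(?:Status|Level|Flag|Report|Hits|Score|Checker-Version))?",
    re.IGNORECASE,
)


def covered_by_quoted_pattern(header_name):
    """True if the whole header name matches the quoted X-Spam fragment."""
    return X_SPAM_RE.fullmatch(header_name) is not None


for name in ("X-Spam-Report", "X-Spam", "X-MySite-Spam-Report", "X-Spam-Custom"):
    print(name, covered_by_quoted_pattern(name))
```

Header names that fall outside the pattern, such as site-specific ones, are the candidates for bayes_ignore_header.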
Re: Different bayes results from command line and through MTA
On 23/12/16 10:12, Sebastian Arcus wrote: I know this hot potato has been discussed before - but I'm afraid it's back to haunt me and I can't fathom it out. I'm getting again different bayes results if I test a message on the command line, compared to it going through exim -> spamassassin.
OK - after staring for a good while at debug logs, I think I finally found the culprit. The saved .eml file which I pass through spamc contains the report embedded by spamassassin in the headers (that's how my Exim is configured). This report includes the first few lines of the actual email body. This in turn has the effect of effectively doubling the Bayes score, as spamassassin tokenizes these sample lines on top of the actual email body. As the email body for these particular spam emails is small - the sample in the header is almost equal in size with the text in the email body itself. As soon as I manually delete the SA headers and report in the .eml file, and pass the message again through spamc, I get identical Bayes scores to the ones when the message passes initially through Exim -> SA.
However, this raises some interesting questions. It would appear that SA is incapable of recognising its own reports in the header of the emails, and tokenizes them as well and adds them to the Bayes report. Is that right? Also, does it mean that, as SA tokenizes all the info in the headers, my own email address, as the recipient of the email, will also be added to the database of spam tokens - when I ask SA to learn a message as spam? I seem to have ended up with more questions than I started :-)
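The manual step described above (deleting the embedded SA headers from the saved .eml before re-testing) can be scripted. A minimal Python sketch, assuming LF line endings and working on the raw text so that the body stays byte-for-byte the same; the function name strip_headers and the header prefix "X-Custom-" are placeholders, since the thread does not name the actual custom headers.

```python
def strip_headers(raw_message, prefixes):
    """Drop header fields whose names start with one of `prefixes`; leave the body untouched."""
    head, sep, body = raw_message.partition("\n\n")
    wanted = tuple(p.lower() for p in prefixes)
    kept = []
    dropping = False
    for line in head.split("\n"):
        if line[:1] in (" ", "\t"):
            # Folded continuation line: it shares the fate of the field it belongs to.
            if not dropping:
                kept.append(line)
            continue
        dropping = line.lower().startswith(wanted)
        if not dropping:
            kept.append(line)
    return "\n".join(kept) + sep + body


raw = (
    "From: a@example.org\n"
    "X-Custom-Report: spam\n"
    "\tfirst lines of the body quoted here\n"
    "Subject: hi\n"
    "\n"
    "body text\n"
)
print(strip_headers(raw, ["X-Custom-"]))
```

The cleaned text can then be written to a new file and passed to spamc, so the test sees roughly what SA saw when the message first arrived.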
Different bayes results from command line and through MTA
I know this hot potato has been discussed before - but I'm afraid it's back to haunt me and I can't fathom it out. I'm again getting different bayes results if I test a message on the command line, compared to it going through exim -> spamassassin. The header of the message received in the Inbox contains the following report:
Content analysis details: (10.5 points, 4.2 required)
 pts rule name                      description
---- ------------------------------ -----------
 0.4 STOX_REPLY_TYPE                No description available.
 3.0 DATE_IN_FUTURE_03_06           Date: is 3 to 6 hours after Received: date
 3.2 BAYES_50                       BODY: Bayes spam probability is 40 to 60% [score: 0.5000]
 0.0 MIME_QP_LONG_LINE              RAW: Quoted-printable line longer than 76 chars
 0.0 UNPARSEABLE_RELAY              Informational: message has unparseable relay lines
 1.8 STOX_REPLY_TYPE_WITHOUT_QUOTES No description available.
 2.1 FREEMAIL_FORGED_REPLYTO        Freemail in Reply-To, but not From
While if I test it on the command line (spamc -R < /test_message.eml), I get really different results:
Content analysis details: (20.2 points, 4.2 required)
 pts rule name                      description
---- ------------------------------ -----------
 4.9 BAYES_99                       BODY: Bayes spam probability is 99 to 100% [score: 1.]
 0.4 STOX_REPLY_TYPE                No description available.
 3.0 DATE_IN_FUTURE_03_06           Date: is 3 to 6 hours after Received: date
 8.0 BAYES_999                      BODY: Bayes spam probability is 99.9 to 100% [score: 1.]
 0.0 MIME_QP_LONG_LINE              RAW: Quoted-printable line longer than 76 chars
 0.0 UNPARSEABLE_RELAY              Informational: message has unparseable relay lines
 1.8 STOX_REPLY_TYPE_WITHOUT_QUOTES No description available.
 2.1 FREEMAIL_FORGED_REPLYTO        Freemail in Reply-To, but not From
On the command line it is hitting BAYES_99 and BAYES_999 - while through Exim it doesn't. I know the first thing to look for is file permissions for the bayes databases. I've checked them. Also, I have spamassassin listening on a TCP port - and both Exim and spamc connect to it this way (I believe) - so permissions shouldn't make a difference between the two methods of testing the email - is that correct?
Also, I use a site-wide bayes database - so only one set of files. I'm running spamd under the "spamd" user - which owns the bayes database files and directory: /usr/bin/spamd -d -l --pidfile=/var/run/spamd/spamd.pid --username=spamd What could possibly account for the large discrepancy in bayes results?
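With longer reports it is easy to miss which rules differ between the two paths. A small Python sketch that pulls the rule names and scores out of two "Content analysis details" reports and lists what only one side hit; it assumes one rule per line in the usual "pts rule name description" layout, and the names RULE_LINE_RE, rules_in_report and only_in are made up for illustration.

```python
import re

# A report row: optional sign, score, then an upper-case rule name.
RULE_LINE_RE = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s+([A-Z][A-Z0-9_]*)")


def rules_in_report(report):
    """Map rule name -> score for every rule row found in a report."""
    rules = {}
    for line in report.splitlines():
        match = RULE_LINE_RE.match(line)
        if match:
            rules[match.group(2)] = float(match.group(1))
    return rules


def only_in(first, second):
    """Rule names present in `first` but not in `second`, sorted."""
    return sorted(set(first) - set(second))


via_exim = rules_in_report(
    " 3.0 DATE_IN_FUTURE_03_06 Date: is 3 to 6 hours after Received: date\n"
    " 3.2 BAYES_50 BODY: Bayes spam probability is 40 to 60%\n"
)
via_spamc = rules_in_report(
    " 4.9 BAYES_99 BODY: Bayes spam probability is 99 to 100%\n"
    " 3.0 DATE_IN_FUTURE_03_06 Date: is 3 to 6 hours after Received: date\n"
    " 8.0 BAYES_999 BODY: Bayes spam probability is 99.9 to 100%\n"
)
print(only_in(via_spamc, via_exim))
print(only_in(via_exim, via_spamc))
```

In the example taken from the two reports above, only the BAYES_* rules differ, which points at the Bayes input rather than at the rest of the rule set.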
Re: Spamassassin uses bayes, but spamd doesn't
On 17/06/16 14:49, RW wrote: On Fri, 17 Jun 2016 14:07:33 +0100 Sebastian Arcus wrote: Site-wide bayes files are owned by spamd. Regarding the daemon, it is started with --socketowner=spamd and socketpath=spamd. Is this enough, or should it be actually started with "su" as "spamd" user?
If you start it as root with the -u spamd (or --username) it will drop privileges to spamd. Starting it as root allows it to bind to a low port should you need that.
"socketpath=spamd" sounds idiotic, however for a site-wide setup there is no point in starting it as root instead of directly as the correct user, see below, can't say anything about "su" in service files since i don't touch sysvinit for 5 years now
That is probably so - I've taken another look at my startup scripts, and I have to say it feels like I've been tying myself in knots with --socketowner and --socketgroup and --username. I was thinking that for my setup using: --username=spamd --socketowner=exim --socketgroup=exim might be the most suitable. Is it better to run it instead with --socketmode=666
You should use -u,--username unless you need to access per user data from unix home directories. You need this even if you start directly as spamd.
and not bother with setting owner and group for the socket?
Is there any particular reason for even using a socket file?
A good point - if I leave them out, spamd will talk on the default IP port, and Exim can do that as well. Thank you for suggesting!
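Once spamd is listening on TCP instead of a socket file, a quick liveness check from the Exim host is possible without spamc. The Python sketch below is a rough illustration, not a reference client: it assumes the spamd protocol's PING command and a reply of the form "SPAMD/<version> 0 PONG", and that the default port is 783; the function names and the protocol version string are this sketch's own choices.

```python
import socket


def build_ping(version="1.5"):
    """Request bytes for a spamd PING (protocol version is an assumption)."""
    return f"PING SPAMC/{version}\r\n\r\n".encode("ascii")


def parse_status_line(line):
    """Split a reply such as b'SPAMD/1.5 0 PONG' into (protocol, code, message)."""
    proto, code, message = line.decode("ascii", "replace").strip().split(" ", 2)
    return proto, int(code), message


def ping_spamd(host="127.0.0.1", port=783, timeout=5.0):
    """Send PING to spamd over TCP and return the parsed first reply line."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_ping())
        reply = conn.makefile("rb").readline()
    return parse_status_line(reply)
```

Calling ping_spamd() on the mail server should return a tuple ending in "PONG" if the daemon is reachable; a connection error means Exim would not be able to reach it on that address and port either.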
Re: Spamassassin uses bayes, but spamd doesn't
On 16/06/16 18:46, Sebastian Arcus wrote:
> I have a particular server running spamd which uses bayes every time I test it by hand, but apparently never when it goes through exim/spamd. I run everything (both the spamd daemon and the manual tests) as user spamd. I checked the permissions on the bayes database. I use a global bayes database in /var/spool/spamd/bayes/. I ran "spamassassin -D --lint" - and I get no failures - both as root and as the user spamd. In spite of all of the above, it looks pretty clear that bayes is only used when I run an email manually through spamassassin, but not when it goes from exim through spamd. Here is the report when run from the command line:
>
> Content analysis details: (5.4 points, 5.0 required)
>
>  pts rule name               description
> ---- ----------------------- --------------------------------------------
>  2.0 BAYES_50                BODY: Bayes spam probability is 40 to 60% [score: 0.5000]
>  0.0 HTML_IMAGE_RATIO_06     BODY: HTML has a low ratio of text to image area
>  0.0 HTML_MESSAGE            BODY: HTML included in message
>  0.0 HTML_FONT_LOW_CONTRAST  BODY: HTML font color similar or identical to background
>  0.8 MPART_ALT_DIFF          BODY: HTML and text parts are different
>  0.0 T_KAM_HTML_FONT_INVALID BODY: Test for Invalidly Named or Formatted Colors in HTML
>  0.7 MIME_HTML_ONLY          BODY: Message only has text/html MIME parts
>  0.1 DKIM_SIGNED             Message has a DKIM or DK signature, not necessarily valid
>  0.2 RDNS_NONE               Delivered to internal network by a host with no rDNS
>  0.0 T_DKIM_INVALID          DKIM-Signature header exists but is not valid
>  0.0 UNPARSEABLE_RELAY       Informational: message has unparseable relay lines
>  0.0 LOTS_OF_MONEY           Huge... sums of money
>  1.5 SUBJ_ILLEGAL_CHARS      Subject: has too many raw illegal characters
>  0.0 MIME_HTML_ONLY_MULTI    Multipart message only has text/html MIME parts
>  0.0 SUBJECT_NEEDS_ENCODING  Subject is encoded but does not specify the encoding
>
> And here is the report included in the same email message when it comes through exim:
>
> Content analysis details: (1.9 points, 5.0 required)
>
>  pts rule name               description
> ---- ----------------------- --------------------------------------------
>  0.7 MPART_ALT_DIFF          BODY: HTML and text parts are different
>  0.0 HTML_IMAGE_RATIO_06     BODY: HTML has a low ratio of text to image area
>  0.0 HTML_MESSAGE            BODY: HTML included in message
>  0.0 HTML_FONT_LOW_CONTRAST  BODY: HTML font color similar or identical to background
>  0.0 T_KAM_HTML_FONT_INVALID BODY: Test for Invalidly Named or Formatted Colors in HTML
>  1.1 MIME_HTML_ONLY          BODY: Message only has text/html MIME parts
> -0.1 DKIM_VALID              Message has at least one valid DKIM or DK signature
> -0.1 DKIM_VALID_AU           Message has a valid DKIM or DK signature from author's domain
>  0.1 DKIM_SIGNED             Message has a DKIM or DK signature, not necessarily valid
>  0.0 LOTS_OF_MONEY           Huge... sums of money
>  0.2 RDNS_NONE               Delivered to internal network by a host with no rDNS
>  0.0 UNPARSEABLE_RELAY       Informational: message has unparseable relay lines
>  0.0 MIME_HTML_ONLY_MULTI    Multipart message only has text/html MIME parts
>
> Bayes is clearly not being used when it goes through spamd. Does anybody know what could be causing this?

OK - thank you to everybody who helped with hints and info. Bayes is finally working now.

What I initially had in place:

1. Site-wide bayes db in /var/spool/spamd/bayes/, owned by spamd.spamd
2. Spamd socket owned by spamd.spamd - which, it turns out, didn't make much sense
3. Spamd ran as root - for some reason I got confused and thought setting the owner/group for the socket meant spamd was run as a non-root user.

What I have now:

1. Spamd socket owned by exim.exim (as Exim is the only piece of software which needs to talk to spamd), with mode set to 0660.
2. Spamd runs as the "spamd" user.
3. Bayes db still in the same place as above and with the same ownership, but I've set the files to mode 0660.

In conclusion, it would appear that running spamd as root was the cause of the problem, although root should have been able to access the bayes database anyway. I'm a little lost on that point, I'm afraid. But I think it's been a good opportunity to straighten out the setup both on the server and in my head :-) Thank you again.
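For reference, the "What I have now" setup could translate into a spamd command line roughly like the one below. This is a sketch under assumptions rather than the exact command used on that server: the socket path is hypothetical, and the other options (-d, --username, --socketowner, --socketgroup, --socketmode, --pidfile) are the ones mentioned elsewhere in this thread. The function only prints the command so it can be reviewed before being put in a startup script.

```shell
#!/bin/sh
# Hypothetical socket location; adjust to wherever Exim is told to look.
SOCKET="${SOCKET:-/var/run/spamd/spamd.sock}"

# Print (do not run) a spamd invocation matching the final setup:
# started as root, drops to user spamd, socket owned by exim.exim with mode 0660.
spamd_cmd() {
    printf '%s\n' "/usr/bin/spamd -d --username=spamd --socketpath=$SOCKET --socketowner=exim --socketgroup=exim --socketmode=0660 --pidfile=/var/run/spamd/spamd.pid"
}
```

Printing rather than executing keeps the sketch harmless on a live mail server; the output of `spamd_cmd` can be pasted into the init script once the paths have been checked.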
Re: Spamassassin uses bayes, but spamd doesn't
On 17/06/16 04:46, Bill Cole wrote:
> On 16 Jun 2016, at 13:46, Sebastian Arcus wrote:
>> I have a particular server running spamd
>
> Which must run on a particular platform. Since SpamAssassin and Exim can run on a decade's worth of versions of at least 9 different OSs and one of those (Linux) has about a half-dozen distinctly different families of distributions that have become quite divergent, it would help to identify your OS and version (or if Linux, which distro & its version) when seeking help from people who don't keep track of what sorts of systems you run. This helps constrain the scope of sane guessing... (However, the ability to run arbitrary programs as 'root' implies a POSIX-y platform with a true-root security model, so I'll assume this isn't some Windows-Frankenstein abomination or El Capitan)
>
>> which uses bayes every time I test it by hand, but apparently never when it goes through exim/spamd. I run everything (both the spamd daemon and the manual tests) as user spamd. I checked the permissions on the bayes database. I use a global bayes database in /var/spool/spamd/bayes/.
>
> Provide `ls -la /var/spool/spamd/bayes/`, please. Or if the problem that reveals is obvious, just fix it and you're welcome. :)
>
>> I ran "spamassassin -D --lint" - and I get no failures - both as root and as the user spamd.
>
> And when you run spamassassin as root, you risk having root steal the Bayes and AWL DBs. Presumably this is why some misguided articles online documenting SA setup for system-wide use recommend deeply wrong things like 'chmod -R 777' on your database directory. Don't do that. Ever. On any directory. Use an ad hoc group, BSD directory setgid semantics or fileflags, ACLs, a script that runs from cron every minute, or whatever else can work on your platform to assure that spamd can always read and write to everything in that directory, but DO NOT 777 it.
>
>> In spite of all of the above, it looks pretty clear that bayes is only used when I run an email manually through spamassassin, but not when it goes from exim through spamd.
>
> Is spamd configured to do any logging? By default on POSIX platforms it logs under the mail facility and if it can't open the BayesDB it will log that fact. If it does so but there's no ownership/permission problem it could also be due to SELinux, running spamd in a chroot jail (bad idea,) or maybe AppArmor (about which I know nothing other than that it's an alternative to SELinux.) These are solvable problems.

Thank you for all the suggestions above - and you are right, I should have been more specific about my setup. I'll report back to the list with progress or when it is solved.
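One way to follow the "group plus setgid directory" advice instead of 'chmod -R 777' is sketched below. It is only an illustration: the function name and the optional group argument are made up here, and it assumes a platform where the setgid bit on a directory makes new files inherit the directory's group and where find supports -maxdepth.

```shell
#!/bin/sh
# Give owner and group read/write access to a Bayes directory without
# making anything world-writable.
# Usage: share_bayes_dir DIR [GROUP]   (GROUP is optional)
share_bayes_dir() {
    dir=$1
    group=${2:-}
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    if [ -n "$group" ]; then
        chgrp -R "$group" "$dir" || return 1
    fi
    # 2770: rwx for owner and group, nothing for others, setgid on the directory.
    chmod 2770 "$dir" || return 1
    # 660 on the regular files directly inside (bayes_toks, bayes_seen, ...).
    find "$dir" -maxdepth 1 -type f -exec chmod 660 {} +
}
```

For the setup in this thread the call would be something like `share_bayes_dir /var/spool/spamd/bayes spamd`, run as root. If root later creates files in the directory by mistake, they at least keep the spamd group.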
Re: Spamassassin uses bayes, but spamd doesn't
On 17/06/16 13:42, Reindl Harald wrote:
> On 17.06.2016 at 14:29, Sebastian Arcus wrote:
>> On 17/06/16 00:03, Reindl Harald wrote:
>>> On 16.06.2016 at 19:46, Sebastian Arcus wrote:
>>>> I have a particular server running spamd which uses bayes every time I test it by hand, but apparently never when it goes through exim/spamd
>>>
>>> then you need to run it as the correct user or train it as the correct user
>>
>> Thank you for the suggestion. There is no training involved, and auto-learn is switched off in local.cf
>
> how do you imagine bayes working then?

These are bayes databases from another server - any training happens there - so training and auto-learning are disabled on this particular server.

>> Site-wide bayes files are owned by spamd. Regarding the daemon, it is started with --socketowner=spamd and socketpath=spamd. Is this enough, or should it be actually started with "su" as "spamd" user?
>
> "socketpath=spamd" sounds idiotic, however for a site-wide setup there is no point in starting it as root instead of directly as the correct user, see below. I can't say anything about "su" in service files since I haven't touched sysvinit for 5 years now

That is probably so - I've taken another look at my startup scripts, and I have to say it feels like I've been tying myself in knots with --socketowner and --socketgroup and --username. I was thinking that for my setup using: --username=spamd --socketowner=exim --socketgroup=exim might be the most suitable. Is it better to run it instead with --socketmode=666 and not bother with setting owner and group for the socket?
Re: Spamassassin uses bayes, but spamd doesn't
On 17/06/16 00:03, Reindl Harald wrote:
> On 16.06.2016 at 19:46, Sebastian Arcus wrote:
>> I have a particular server running spamd which uses bayes every time I test it by hand, but apparently never when it goes through exim/spamd
>
> then you need to run it as the correct user or train it as the correct user

Thank you for the suggestion. There is no training involved, and auto-learn is switched off in local.cf. Site-wide bayes files are owned by spamd. Regarding the daemon, it is started with --socketowner=spamd and socketpath=spamd. Is this enough, or should it be actually started with "su" as "spamd" user?
Re: Spamassassin uses bayes, but spamd doesn't
On 17/06/16 03:46, Yu Qian wrote:
> you can use spamd -D to check the log for exactly what bayes db path your spamd was using.

Thanks, Yu. Based on the output below, it appears to find and use the site-wide bayes files OK:

# spamd -D 2>&1 | grep -i bayes
Jun 17 13:32:51.719 [4380] dbg: plugin: loading Mail::SpamAssassin::Plugin::Bayes from @INC
Jun 17 13:32:52.058 [4380] dbg: config: fixed relative path: /var/lib/spamassassin/3.004001/updates_spamassassin_org/23_bayes.cf
Jun 17 13:32:52.058 [4380] dbg: config: using "/var/lib/spamassassin/3.004001/updates_spamassassin_org/23_bayes.cf" for included file
Jun 17 13:32:52.058 [4380] dbg: config: read file /var/lib/spamassassin/3.004001/updates_spamassassin_org/23_bayes.cf
Jun 17 13:32:53.370 [4380] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0xa936c48) implements 'learner_new', priority 0
Jun 17 13:32:53.371 [4380] dbg: bayes: learner_new self=Mail::SpamAssassin::Plugin::Bayes=HASH(0xa936c48), bayes_store_module=Mail::SpamAssassin::BayesStore::DBM
Jun 17 13:32:53.390 [4380] dbg: bayes: learner_new: got store=Mail::SpamAssassin::BayesStore::DBM=HASH(0xab6a6a0)
Jun 17 13:32:53.391 [4380] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0xa936c48) implements 'learner_is_scan_available', priority 0
Jun 17 13:32:53.391 [4380] dbg: bayes: tie-ing to DB file R/O /var/spool/spamd/bayes/bayes_toks
Jun 17 13:32:53.392 [4380] dbg: bayes: tie-ing to DB file R/O /var/spool/spamd/bayes/bayes_seen
Jun 17 13:32:53.393 [4380] dbg: bayes: found bayes db version 3
Jun 17 13:32:53.394 [4380] dbg: bayes: DB journal sync: last sync: 1466097119
Jun 17 13:32:55.405 [4380] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0xa936c48) implements 'learner_close', priority 0
Jun 17 13:32:55.405 [4380] dbg: bayes: untie-ing
Jun 17 13:32:55.487 [4380] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0xa936c48) implements 'prefork_init', priority 0
Jun 17 13:32:55.492 [4385] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0xa936c48) implements 'spamd_child_init', priority 0
Jun 17 13:32:55.497 [4386] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0xa936c48) implements 'spamd_child_init', priority 0
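The interesting lines in that output are the "tie-ing to DB file" entries, which show which files were opened and whether read-only or read/write. A small filter like the following could pull out just those. It is a sketch that assumes the debug line format shown above; the function name is made up for illustration.

```shell
#!/bin/sh
# Read SpamAssassin debug output on stdin and print each Bayes DB file that
# was opened, prefixed with its access mode (R/O or R/W).
# Assumption: lines look like "... dbg: bayes: tie-ing to DB file R/O /path".
bayes_db_files() {
    sed -n 's/.*bayes: tie-ing to DB file \(R\/[OW]\) \(.*\)$/\1 \2/p' | sort -u
}
```

It would be used on saved output, for example `bayes_db_files < spamd-debug.log` (file name hypothetical), and the same filter can be run over the debug output of a manual test to confirm that both paths open the same files.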
Spamassassin uses bayes, but spamd doesn't
I have a particular server running spamd which uses bayes every time I test it by hand, but apparently never when it goes through exim/spamd. I run everything (both the spamd daemon and the manual tests) as user spamd. I checked the permissions on the bayes database. I use a global bayes database in /var/spool/spamd/bayes/. I ran "spamassassin -D --lint" - and I get no failures - both as root and as the user spamd. In spite of all of the above, it looks pretty clear that bayes is only used when I run an email manually through spamassassin, but not when it goes from exim through spamd. Here is the report when run from the command line:

Content analysis details: (5.4 points, 5.0 required)

 pts rule name               description
---- ----------------------- --------------------------------------------
 2.0 BAYES_50                BODY: Bayes spam probability is 40 to 60% [score: 0.5000]
 0.0 HTML_IMAGE_RATIO_06     BODY: HTML has a low ratio of text to image area
 0.0 HTML_MESSAGE            BODY: HTML included in message
 0.0 HTML_FONT_LOW_CONTRAST  BODY: HTML font color similar or identical to background
 0.8 MPART_ALT_DIFF          BODY: HTML and text parts are different
 0.0 T_KAM_HTML_FONT_INVALID BODY: Test for Invalidly Named or Formatted Colors in HTML
 0.7 MIME_HTML_ONLY          BODY: Message only has text/html MIME parts
 0.1 DKIM_SIGNED             Message has a DKIM or DK signature, not necessarily valid
 0.2 RDNS_NONE               Delivered to internal network by a host with no rDNS
 0.0 T_DKIM_INVALID          DKIM-Signature header exists but is not valid
 0.0 UNPARSEABLE_RELAY       Informational: message has unparseable relay lines
 0.0 LOTS_OF_MONEY           Huge... sums of money
 1.5 SUBJ_ILLEGAL_CHARS      Subject: has too many raw illegal characters
 0.0 MIME_HTML_ONLY_MULTI    Multipart message only has text/html MIME parts
 0.0 SUBJECT_NEEDS_ENCODING  Subject is encoded but does not specify the encoding

And here is the report included in the same email message when it comes through exim:

Content analysis details: (1.9 points, 5.0 required)

 pts rule name               description
---- ----------------------- --------------------------------------------
 0.7 MPART_ALT_DIFF          BODY: HTML and text parts are different
 0.0 HTML_IMAGE_RATIO_06     BODY: HTML has a low ratio of text to image area
 0.0 HTML_MESSAGE            BODY: HTML included in message
 0.0 HTML_FONT_LOW_CONTRAST  BODY: HTML font color similar or identical to background
 0.0 T_KAM_HTML_FONT_INVALID BODY: Test for Invalidly Named or Formatted Colors in HTML
 1.1 MIME_HTML_ONLY          BODY: Message only has text/html MIME parts
-0.1 DKIM_VALID              Message has at least one valid DKIM or DK signature
-0.1 DKIM_VALID_AU           Message has a valid DKIM or DK signature from author's domain
 0.1 DKIM_SIGNED             Message has a DKIM or DK signature, not necessarily valid
 0.0 LOTS_OF_MONEY           Huge... sums of money
 0.2 RDNS_NONE               Delivered to internal network by a host with no rDNS
 0.0 UNPARSEABLE_RELAY       Informational: message has unparseable relay lines
 0.0 MIME_HTML_ONLY_MULTI    Multipart message only has text/html MIME parts

Bayes is clearly not being used when it goes through spamd. Does anybody know what could be causing this?
[Solved] Re: Error when trying to re-use Bayes database from one server to another
On 13/02/16 18:58, Bill Cole wrote:
> On 13 Feb 2016, at 3:49, Sebastian Arcus wrote:
>> Thank you. The donor machine has db42, db44 and db44 packages installed,
>
> Based on the question below, I'll assume the second db44 above was a typo for db48, i.e. a Berkeley DB v4.8.x package.
>
> Tangentially: that's a risky mess. It's a common problem but you should try to fix it to leave just one version, which probably means rebuilding a number of pieces of software. Using db48 for everything isn't a bad choice, despite the current version being 6.something, because there are still perfectly good pieces of software that use db4x but nothing later. In any case, you have a potentially fragile system there which may have different programs using diverse Berkeley DB versions which may be broken by otherwise routine updates. If you choose to leave a working system alone rather than proactively clean it up, be sure to
>
>> while the recipient machine only db42 and db44. Would it be enough to install db48 on the recipient machine, or are there also any glue/library Perl modules involved which SA uses for db access and would need to be updated as well?
>
> Any answer to that has so many conditional branches that I'm unwilling to attempt a definitive one. You definitely need to install db48 on the recipient machine if you want it to be able to read hash files created elsewhere by db48. Depending on what other software is using db42 and db44 there, installing db48 and doing nothing else MIGHT break something. Depending on how Perl was built and/or installed on that machine and how the various db* packages are installed it MIGHT be necessary to rebuild your core Perl package and/or non-core packages which may include BerkeleyDB or (probably not) DB_File and maybe (but most likely not) SpamAssassin itself. Figuring out what exactly depends on which package on a specific system (which you've not described in any detail) is an opportunity to exercise your core system administration skills :).

Thank you to everybody who pitched in with suggestions. Just to confirm that in the end I decided not to mess too much with a working system and didn't upgrade to db48 on the older system. I went down the route of backing up and restoring the bayes database using sa-learn, which worked perfectly fine.

There is still the question of the initial sa-learn error message which started all this. In my opinion it looks like a bug, as it reports a missing file ("No such file or directory"), which is clearly not the case. Something more helpful such as "can't decode file, possible wrong format", or anything else along those lines, would be more relevant. Should I log this as a bug somewhere?
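The backup-and-restore route mentioned above uses sa-learn's --backup and --restore options, which dump the Bayes database to a text file and load it again, so the on-disk Berkeley DB format of the donor machine no longer matters. A minimal sketch follows. The wrapper functions and the SA_LEARN override are illustrative additions, and both commands should be run as the user that owns the database (spamd in this thread). As far as I know, --restore replaces the existing Bayes data rather than merging into it, so treat it as destructive.

```shell
#!/bin/sh
# SA_LEARN can be overridden (e.g. for a dry run); by default the real
# sa-learn found in PATH is used.
SA_LEARN="${SA_LEARN:-sa-learn}"

# On the donor machine: dump the Bayes database to a portable text file.
bayes_backup() {
    "$SA_LEARN" --backup > "$1"
}

# On the recipient machine: load the dump into the local Bayes database,
# which is then written by that machine's own Berkeley DB library.
bayes_restore() {
    [ -s "$1" ] || { echo "backup file missing or empty: $1" >&2; return 1; }
    "$SA_LEARN" --restore "$1"
}
```

In practice that would be `bayes_backup bayes-dump.txt` on the donor, a copy of the file over ssh, and `bayes_restore bayes-dump.txt` on the recipient (file name hypothetical).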
Re: Error when trying to re-use Bayes database from one server to another
On 13/02/16 18:58, Bill Cole wrote:
> On 13 Feb 2016, at 3:49, Sebastian Arcus wrote:
>> Thank you. The donor machine has db42, db44 and db44 packages installed,
>
> Based on the question below, I'll assume the second db44 above was a typo for db48, i.e. a Berkeley DB v4.8.x package.

Yes - sorry, you are right.

> Tangentially: that's a risky mess. It's a common problem but you should try to fix it to leave just one version, which probably means rebuilding a number of pieces of software.

Slackware current comes with all three versions - that's a default install and I checked the package list at Slackware.com. I'm afraid I'm not sure why, but I assume there is some logic to it, as the package choice in Slackware always seems to have some reasoning behind it. I also don't know why Slackware current doesn't include version 6.x (or even 4.6) - maybe something to do with the current politics of Oracle, or some other technical reason.

> Using db48 for everything isn't a bad choice, despite the current version being 6.something, because there are still perfectly good pieces of software that use db4x but nothing later. In any case, you have a potentially fragile system there which may have different programs using diverse Berkeley DB versions which may be broken by otherwise routine updates. If you choose to leave a working system alone rather than proactively clean it up, be sure to
>
>> while the recipient machine only db42 and db44. Would it be enough to install db48 on the recipient machine, or are there also any glue/library Perl modules involved which SA uses for db access and would need to be updated as well?
>
> Any answer to that has so many conditional branches that I'm unwilling to attempt a definitive one. You definitely need to install db48 on the recipient machine if you want it to be able to read hash files created elsewhere by db48. Depending on what other software is using db42 and db44 there, installing db48 and doing nothing else MIGHT break something. Depending on how Perl was built and/or installed on that machine and how the various db* packages are installed it MIGHT be necessary to rebuild your core Perl package and/or non-core packages which may include BerkeleyDB or (probably not) DB_File and maybe (but most likely not) SpamAssassin itself. Figuring out what exactly depends on which package on a specific system (which you've not described in any detail) is an opportunity to exercise your core system administration skills :).

Thank you :-)
Re: Error when trying to re-use Bayes database from one server to another
On 13/02/16 04:32, Bill Cole wrote:
> On 12 Feb 2016, at 17:34, Sebastian Arcus wrote:
>> Thanks for that suggestion. I think we might be getting somewhere. On original machine:
>>
>> # file bayes_seen
>> bayes_seen: Berkeley DB (Hash, version 9, native byte-order)
>> # file bayes_toks
>> bayes_toks: Berkeley DB (Hash, version 9, native byte-order)
>>
>> On the receiver machine, but with bayes files created locally:
>>
>> # file bayes_seen
>> bayes_seen: Berkeley DB (Hash, version 8, native byte-order)
>> # file bayes_toks
>> bayes_toks: Berkeley DB (Hash, version 8, native byte-order)
>>
>> Could the hash version account for the errors I am seeing?
>
> Absolutely. The BDB hash storage version number only changes when a change is NOT backwards-compatible, i.e. *BY DESIGN* a library version that creates v8 files cannot read v9 files. If my recollection is correct, the v8->9 change was in BDB 4.6 and actually provided substantial performance improvements. You probably want to upgrade BDB and anything using it on the machine with the old version.

Thank you. The donor machine has db42, db44 and db44 packages installed, while the recipient machine only db42 and db44. Would it be enough to install db48 on the recipient machine, or are there also any glue/library Perl modules involved which SA uses for db access and would need to be updated as well?
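The check done by hand above (comparing the hash version that `file` reports on each machine) can be scripted before copying databases around. This is a rough sketch with made-up function names; it assumes `file` prints the "Berkeley DB (Hash, version N, ...)" wording shown in this thread, and it encodes only the one-directional rule stated above, that a library writing version 8 files cannot read version 9 files.

```shell
#!/bin/sh
# Extract the hash format version from one line of `file` output, e.g.
# "bayes_toks: Berkeley DB (Hash, version 9, native byte-order)" -> 9.
# Prints nothing if the line does not describe a Berkeley DB hash file.
bdb_hash_version() {
    printf '%s\n' "$1" | sed -n 's/.*Berkeley DB (Hash, version \([0-9][0-9]*\),.*/\1/p'
}

# Succeed if the donor's format ($1) is newer than the format the recipient
# writes ($2), which is the incompatible case described in this thread.
bdb_too_new() {
    [ "$1" -gt "$2" ]
}
```

A typical use would be `donor=$(bdb_hash_version "$(file bayes_toks)")` on the copied file and the same on a locally created file, followed by `bdb_too_new "$donor" "$local" && echo "use sa-learn --backup/--restore instead"`.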
Error when trying to re-use Bayes database from one server to another
As per advice from this list, I have been re-using my bayes databases on several different servers running SA. On one of the servers though, the database is not accepted. I re-transferred the files several times over ssh, to make sure they were not corrupted. The database files are in the correct location, with correct permissions and owned by the correct user:

# ls -l /var/spool/spamd/bayes/
total 5912
-rw-rw-rw- 1 spamd spamd 1310720 2016-02-09 08:42 bayes_seen
-rw-rw-rw- 1 spamd spamd 4739072 2016-02-09 08:43 bayes_toks

The version of SA on both the donor and receiving servers is 3.4.1. When I try to learn a new message on the receiving server (where I moved the bayes files), I get the following error:

# su - spamd -c "/usr/bin/sa-learn -D --spam /New\ UnansweredSexHookup\ Request.eml"
Feb 12 16:20:53.777 [12973] dbg: locker: mode is 438
Feb 12 16:20:53.778 [12973] dbg: locker: safe_lock: created /var/spool/spamd/bayes/bayes.lock.mdr-server.mdrinteriors.co.uk.12973
Feb 12 16:20:53.778 [12973] dbg: locker: safe_lock: trying to get lock on /var/spool/spamd/bayes/bayes with 0 retries
Feb 12 16:20:53.778 [12973] dbg: locker: safe_lock: link to /var/spool/spamd/bayes/bayes.lock: link ok
Feb 12 16:20:53.778 [12973] dbg: bayes: tie-ing to DB file R/W /var/spool/spamd/bayes/bayes_toks
Feb 12 16:20:53.779 [12973] dbg: bayes: untie-ing DB file toks
Feb 12 16:20:53.779 [12973] dbg: locker: safe_unlock: unlink /var/spool/spamd/bayes/bayes.lock
bayes: cannot open bayes databases /var/spool/spamd/bayes/bayes_* R/W: tie failed: No such file or directory
Learned tokens from 0 message(s) (1 message(s) examined)
Feb 12 16:20:53.779 [12973] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0x93106d0) implements 'learner_close', priority 0
ERROR: the Bayes learn function returned an error, please re-run with -D for more information at /usr/bin/sa-learn line 498.
Re: Error when trying to re-use Bayes database from one server to another
On 12/02/16 16:59, Reindl Harald wrote:
> On 12.02.2016 at 17:29, Sebastian Arcus wrote:
>> As per advice from this list, I have been re-using my bayes databases on several different servers running SA. On one of the servers though, the database is not accepted. I re-transferred them several times over ssh, to make sure they were not corrupted. The database files are in the correct location, with correct permissions and owned by the correct user:
>>
>> # ls -l /var/spool/spamd/bayes/
>> total 5912
>> -rw-rw-rw- 1 spamd spamd 1310720 2016-02-09 08:42 bayes_seen
>> -rw-rw-rw- 1 spamd spamd 4739072 2016-02-09 08:43 bayes_toks
>>
>> The version of SA on both the donor and receiving servers is 3.4.1. When I try to learn a new message on the receiving server (where I moved the bayes files), I get the following error:
>
> su - spamd
> stat /var
> stat /var/spool
> stat /var/spool/spamd
> stat /var/spool/spamd/bayes
>
> Linux is not like Windows - if you don't have access to a parent folder you just don't have access

root@mdr-server:/# su - spamd
No directory, logging in with HOME=/
spamd@mdr-server:/$ stat /var
  File: `/var'
  Size: 4096  Blocks: 8  IO Block: 4096  directory
Device: 900h/2304d  Inode: 12  Links: 16
Access: (0755/drwxr-xr-x)  Uid: (0/root)  Gid: (0/ root)
Access: 2016-01-18 09:28:23.0 +
Modify: 2016-01-18 09:22:47.0 +
Change: 2016-01-18 09:28:23.744774236 +
spamd@mdr-server:/$ stat /var/spool
  File: `/var/spool'
  Size: 4096  Blocks: 8  IO Block: 4096  directory
Device: 900h/2304d  Inode: 118  Links: 22
Access: (0755/drwxr-xr-x)  Uid: (0/root)  Gid: (0/ root)
Access: 2015-02-03 14:28:33.0 +
Modify: 2015-12-03 17:41:28.859794403 +
Change: 2015-12-03 17:41:28.859794403 +
spamd@mdr-server:/$ stat /var/spool/spamd
  File: `/var/spool/spamd'
  Size: 4096  Blocks: 8  IO Block: 4096  directory
Device: 900h/2304d  Inode: 15473107  Links: 3
Access: (0770/drwxrwx---)  Uid: ( 1037/ spamd)  Gid: ( 252/ spamd)
Access: 2015-12-03 17:41:28.859794403 +
Modify: 2015-12-03 17:41:32.011239989 +
Change: 2015-12-03 17:46:59.187806044 +
spamd@mdr-server:/$ stat /var/spool/spamd/bayes/
  File: `/var/spool/spamd/bayes/'
  Size: 4096  Blocks: 8  IO Block: 4096  directory
Device: 900h/2304d  Inode: 15473106  Links: 3
Access: (0776/drwxrwxrw-)  Uid: ( 1037/ spamd)  Gid: ( 252/ spamd)
Access: 2015-12-03 17:41:32.011239989 +
Modify: 2016-02-12 16:20:53.778709980 +
Change: 2016-02-12 16:20:53.778709980 +
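The series of stat calls above checks that every parent directory can be entered by the spamd user. The same walk can be automated with a small helper like this one. It is only a sketch: the function name is made up, it expects an absolute path, and it has to be run as the user whose access is in question (for example via `su - spamd`), since root can enter any directory.

```shell
#!/bin/sh
# Walk from / down to the given absolute path and report the first component
# that is missing or that the current user cannot enter.
check_path() {
    rest=${1#/}
    current=
    while [ -n "$rest" ]; do
        part=${rest%%/*}
        case "$rest" in
            */*) rest=${rest#*/} ;;
            *)   rest= ;;
        esac
        [ -n "$part" ] || continue        # skip empty parts from "//"
        current="$current/$part"
        if [ ! -e "$current" ]; then
            echo "missing: $current"
            return 1
        fi
        if [ -d "$current" ] && [ ! -x "$current" ]; then
            echo "cannot enter: $current"
            return 1
        fi
    done
    echo "ok: $1"
}
```

Run as spamd, `check_path /var/spool/spamd/bayes/bayes_toks` would either print "ok: ..." or name the first component that blocks access, which is quicker than reading four stat outputs.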
Re: Error when trying to re-use Bayes database from one server to another
On 12/02/16 19:14, Reindl Harald wrote:
> On 12.02.2016 at 20:06, Marc Perkel wrote:
>> Any chance that the parent directory structure doesn't have enough permissions? The error message says it can't access it so there's your clue. Since the files themselves seem to have good permissions I would look at the directories.
>
> see previous mail - that was already verified
>
> looking closer "No such file or directory" is not a permission problem
>
> there was a hint "please re-run with -D"
>
> at least re-using bayes on different servers, even across different operating systems, is no problem; our bayes has been running on 3 own and 2 foreign machines for a long time now with great results

I've checked and triple-checked everything. Unless I'm missing something blindingly obvious, I don't think that error message is accurate. If I delete the bayes files and restart spamd, then on running sa-learn new ones are created in exactly the same place, with the same names and the same permissions - and they work fine. But the ones brought over from the other server don't work.

PS - Regarding the "re-run with -D for more information" - I guess that message is slightly pointless, as it keeps on saying that even when you run it with "-D"