Re: Dinged for .Date
Hi,

On Mon, Jan 15, 2024 at 05:06:11PM -0800, Cabel Sasser wrote:
> If you believe every new gTLD is garbage (and I get that!), why isn’t
> SpamAssassin automatically dinging, say, 1,200+ of them?

I have to second the advice to send email from a different domain. It's just going to be the case that the .date TLD is abused by people sending shadier dating-related emails and the operators of that TLD have poisoned the well by making it cheap and easy to do so. Even if you somehow got negative scoring in SpamAssassin fixed for your specific domain, there are going to be countless private, non-SpamAssassin-based rule sets out there that penalise .date domains.

It is a similar argument to "why can't I send email out from $ARBITRARY_GHETTO_HOSTER ? I'm not a spammer!" You can argue forever that as you're not a spammer and can prove you've never sent any spam, ever, why would receivers penalise you just for being at a hoster that is popular with a problematic class of clientele? The answer being that recipients are just working with what info they have, and it'll be hard work to convince a significant number of them that you're different. Is the work worth it? Generally not; other options exist.

Thanks,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting
Re: Question about forwarding email (not specifically SA, pointers greatly appreciated)
Hello,

On Wed, Jan 03, 2024 at 01:24:02PM -0600, Thomas Cameron via users wrote:
> On 1/2/24 17:51, Andy Smith wrote:
> > - Have your users collect their your-org email by some means other
> >   than SMTP, such as running an IMAP server and having them view
> >   both their gmail mailbox and their your-org inbox in one place (I
> >   have no idea if that is feasible with gmail).
>
> This is what *I* would do, for sure. But the members of the association are
> incredibly non-technical, and trying to walk them through setting up an
> email client like Thunderbird or Outlook is a recipe for disaster.

I understand their point of view but maybe it needs putting to them from the angle that the org is like any other workplace. They would not expect their employer's internal emails to be forwarded to them at $freemail.

Though that does invite them to ask whether they can have a dedicated device to manage org email. (Which in many ways is not unreasonable either…)

Thanks,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting
Re: Question about forwarding email (not specifically SA, pointers greatly appreciated)
Hi Thomas,

On Tue, Jan 02, 2024 at 04:24:37PM -0600, Thomas Cameron via users wrote:
> I built email servers for a non-profit I volunteer for. If email comes into
> the server for presid...@myassociation.org, I would normally just create an
> alias in /etc/aliases so that emails to president@ get forwarded to the
> president's "real" email address, say presidents_real_em...@gmail.com.

This causes your server to pass on email without changing envelope sender, so your server is purporting to be whoever the email is originally from. Any email authentication measure working on the envelope sender, such as SPF, will then fail, as your server is indistinguishable from a random host forging the original sender's domain.

> How can I make this work? Is there a good way to use something like
> /etc/aliases to forward emails to the domain I manage to another recipient?
> Or is there something better I can do?

You need to give up on /etc/aliases for external routing of email unless you control all the original sender domains and can for example add your server IPs to their authentication mechanisms (e.g. SPF). Since you probably can't do that for any recipient domain that expects to receive Internet email, you need to either:

- Implement Sender Rewriting Scheme (SRS) so that your server takes
  responsibility for forwarded emails with its own envelope sender.

  https://en.wikipedia.org/wiki/Sender_Rewriting_Scheme

Or:

- Have your users collect their your-org email by some means other
  than SMTP, such as running an IMAP server and having them view
  both their gmail mailbox and their your-org inbox in one place (I
  have no idea if that is feasible with gmail).

Thanks,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting
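To make the SRS option a little more concrete, here is a minimal sketch of what a forward rewrite of the envelope sender could look like. It is a simplified illustration and not a conforming SRS implementation: the function names, the hex hash tag and the secret are assumptions made for this example, and a real deployment would normally rely on the MTA's own SRS support or an established library.

```python
import hashlib
import hmac
import time
from typing import Optional

# Base32 alphabet used here for a compact two-character day stamp.
_B32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def srs_timestamp(now: Optional[float] = None) -> str:
    """Encode the current day number (modulo 1024) as two base32 characters."""
    seconds = time.time() if now is None else now
    days = int(seconds // 86400) % 1024
    return _B32[days >> 5] + _B32[days & 31]


def srs_forward(sender: str, forwarder_domain: str, secret: bytes,
                now: Optional[float] = None) -> str:
    """Rewrite an envelope sender so the forwarder's domain takes responsibility.

    Hypothetical helper: the result has the general shape
    SRS0=<hash>=<timestamp>=<original domain>=<original local part>@<forwarder>.
    """
    local, _, domain = sender.rpartition("@")
    stamp = srs_timestamp(now)
    # The keyed hash lets the forwarder recognise its own rewritten addresses
    # when a bounce comes back. (Simplified: hex digest, truncated to 4 chars.)
    material = f"{stamp}{domain}{local}".lower().encode()
    tag = hmac.new(secret, material, hashlib.sha1).hexdigest()[:4]
    return f"SRS0={tag}={stamp}={domain}={local}@{forwarder_domain}"


if __name__ == "__main__":
    print(srs_forward("alice@example.com", "myassociation.org", b"not-a-real-secret"))
```

With a rewrite along these lines, SPF is evaluated against the forwarding domain (here myassociation.org) rather than the original sender's domain, which is the point of the first option above.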
Re: Correct way to allowlist an IP from DNSBL checks when it's not the final Received?
Hello, On Sat, Sep 30, 2023 at 11:52:13AM -0400, Jared Hall wrote: > On 9/29/2023 10:59 AM, Andy Smith wrote: > > 3.4.2. I know, it's ancient. An upgrade is planned but I'd still > > like to know what the behaviour is. I understand if no one wants to > > help and if so I might come back with questions after an upgrade. > My distro packages stopped at v3.4.2 so I'm on SA v3.4.6 via CPAN. Just > inspecting the comments in DNSEval.pm, you'll need SA version 3.4.4 > (minimum). […] > The code for the current versions of DNSEval.pm is clean, much more > code-function oriented, and less-prone to race conditions. This actual > comment in SA 3.4.2's DNSEval.pm module says it all: > > "# Very hacky stuff and direct rbl_evals usage for now, TODO rewrite > everything" > > An upgrade is in order. Okay, thanks Jared! I'll work on that and see what it looks like after an upgrade. Thanks, Andy
Re: Correct way to allowlist an IP from DNSBL checks when it's not the final Received?
Hello,

On Thu, Sep 28, 2023 at 09:08:30PM -0400, Jared Hall wrote:
> 1) Are you using native SA or the spamhaus-dqs plugin?

Just native SA in spamd mode.

> 2) What version of SpamAssassin?

3.4.2. I know, it's ancient. An upgrade is planned but I'd still like to know what the behaviour is. I understand if no one wants to help and if so I might come back with questions after an upgrade.

> 3) Parse the message from the command line. Something like:
> 'cat message | spamassassin -D &> dbgout.txt'
> Then: 'grep external dbgout.txt'
>
> It should show something like "full-external: 170.10.129.124, 66.187.233.73
> untrusted: 170.10.129.124, 66.187.233.73 originating:" if your Internal
> networks are setup properly in SA.

    grep full-external: dbgout.txt

produces 15 lines, all of which are identical (except for timestamps):

    Sep 29 14:36:57.221 [2611] dbg: dns: IPs found: full-external: 170.10.129.124, 66.187.233.73, 10.11.54.8, 10.30.29.100, ::1, 10.11.54.6, 10.11.55.25, 207.211.31.120, 209.85.128.43 untrusted: 170.10.129.124, 66.187.233.73, 207.211.31.120, 209.85.128.43 originating:

66.187.233.73 still seems to be listed in SBL-CSS and is detected as such. I can see from:

    grep 73.233.187.66 dbgout.txt

that it does check 66.187.233.73 against all the usual DNSBLs, e.g.

    Sep 29 14:36:57.157 [2611] dbg: check: tagrun - tag RELAYSUNTRUSTEDREVIP is now ready, value: ARY:[124.129.10.170,73.233.187.66,120.31.211.207,43.128.85.209]
    Sep 29 14:36:57.157 [2611] dbg: check: tagrun - tag RELAYSEXTERNALREVIP is now ready, value: ARY:[124.129.10.170,73.233.187.66,120.31.211.207,43.128.85.209]
    […]
    Sep 29 14:36:57.218 [2611] dbg: async: launching A/73.233.187.66.zen.spamhaus.org for dns:A:73.233.187.66.zen.spamhaus.org
    Sep 29 14:36:57.219 [2611] dbg: dns: providing a callback for id: 31199/IN/A/73.233.187.66.zen.spamhaus.org
    Sep 29 14:36:57.219 [2611] dbg: async: starting: DNSBL-A, dns:A:73.233.187.66.zen.spamhaus.org (timeout 15.0s, min 3.0s)
    Sep 29 14:36:57.378 [2611] dbg: async: calling callback on key dns:A:73.233.187.66.zen.spamhaus.org
    Sep 29 14:36:57.378 [2611] dbg: dns: hit 127.0.0.3

So this is normal behaviour then, for v3.4.2 at least?

Thanks,
Andy
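For anyone reading the debug output above and wondering where names like 73.233.187.66.zen.spamhaus.org come from, the following sketch shows how an IPv4 address is turned into a DNSBL query name and how the 127.0.0.x answer can be read. The function names are made up for this illustration, the code does no DNS lookups, and the return-code table only lists the handful of ZEN codes I am sure about.

```python
import ipaddress

# A few well-known ZEN return codes; deliberately not a complete table.
ZEN_RETURN_CODES = {
    "127.0.0.2": "SBL",
    "127.0.0.3": "SBL CSS",
    "127.0.0.4": "XBL",
    "127.0.0.10": "PBL (ISP maintained)",
    "127.0.0.11": "PBL (Spamhaus maintained)",
}


def dnsbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Build the name that is looked up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError for anything else
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def describe_zen_answer(answer: str) -> str:
    """Translate an A-record answer from ZEN into a list name, if known."""
    return ZEN_RETURN_CODES.get(answer, "unknown return code")


if __name__ == "__main__":
    print(dnsbl_query_name("66.187.233.73"))
    print(describe_zen_answer("127.0.0.3"))
```

The "hit 127.0.0.3" in the log therefore corresponds to the SBL CSS listing reported by the RCVD_IN_SBL_CSS rule.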
Re: Correct way to allowlist an IP from DNSBL checks when it's not the final Received?
Hello,

On Thu, Sep 28, 2023 at 06:48:54AM -0400, Jared Hall wrote:
> Do you mind if I redirect the below back onto the spamassassin list
> and respond to it there?

Well I was going to do that, but fair enough!

> On Thu, Sep 28, 2023 at 12:02:47AM -0400, Jared Hall wrote:
> > SpamAssassin doesn't arbitrarily pick a header to look at. lastexternal is
> > used per the defaults in 20_dnsbl_tests.cf

Okay so here is what I have:

    Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124])
        by barenjager.bitfolk.com with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256)
        (Exim 4.92)
        (envelope-from )
        id 1qlVVV-0001zW-Jc
        for a...@strugglers.net; Wed, 27 Sep 2023 14:27:18 +
    Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73])
        by relay.mimecast.com with ESMTP with STARTTLS
        (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384)
        id us-mta-473-x2wpeAY1NVC4XPDK8dEpYA-1; Wed, 27 Sep 2023 10:27:10 -0400

In the SpamAssassin report is:

    * 3.6 RCVD_IN_SBL_CSS RBL: Received via a relay in Spamhaus SBL-CSS
    * [66.187.233.73 listed in zen.spamhaus.org]

barenjager.bitfolk.com is my MX which is running spamassassin called from Exim using its built in means of calling out to SA from the check_data ACL:

    acl_check_data:
      # …
      warn message = X-barenjager.bitfolk.com-Spam-Report: $spam_report
           spam = Debian-exim:true/defer_ok

What I gathered from Jared's reply is that SA shouldn't be doing DNSBL checks against all of the IPs in all of the Received headers, only the lastexternal one. Here though, the lastexternal one should be 170.10.129.124 as that is not in my internal_networks, but it seems to have done a check of the one before it, 66.187.233.73, and found it in Spamhaus SBL-CSS. Is that expected?

I guess I can allowlist from SPF as the envelope sender will be the mailing list in question (linux-lvm-boun...@redhat.com) and it did get a "SPF_PASS SPF: sender matches SPF record" so redhat.com must have mimecast's relays correctly in it.

Thanks,
Andy
Correct way to allowlist an IP from DNSBL checks when it's not the final Received?
Hi, The IP address of a supplier is currently listed by Spamhaus SBL-CSS. This is not directly causing me to reject their emails, because they are actually sending out through Mimecast. However, SpamAssassin is finding that IP in the headers as the Received line *before* Mimecast's, i.e. their listed host is the one handing off to Mimecast, who are connecting to my MX. How would I go about allowlisting this IP address against DNSBL hits? Ideally for a specified range of from addresses and/or envelope senders, but for every sender if necessary. I think I would be okay with exempting such an IP address from *all* negative DNSBL hits, at least temporarily. My first thought was "allowlist_from rcvd", but I do not think this will work as I think it only checks the first Received header outside of my internal_networks. The employees of the supplier send email with all manner of addresses and the supplier also hosts mailing lists that are open to the public so I cannot predict any from address for allowlisting purposes. I expect they will be delisted by the time I work this out, but it would be good to know for the future! Thanks, Andy
Re: new rule for kam :)
Hi, On Wed, Aug 23, 2023 at 06:14:45PM -0700, John Hardin wrote: > On Wed, 23 Aug 2023, Andy Smith wrote: > > On Wed, Aug 23, 2023 at 03:24:22PM +0200, Benny Pedersen wrote: > > > # test for empty src="" or empty href="" > > > rawbody __HREF_EMPTY /href=\"\"/ > > > rawbody __SRC_EMPTY /src=\"\"/ > > > > I checked this against about 80k of my recent personal emails and it > > matched quite a lot of previously not found spam, but did also match > > on every auto response from one of my suppliers. It seems after > > every customer service interaction they send a "how did we do? fill > > in this survey" email from qualtrics.com which contains: > > > > > > > > It wouldn't be much of a loss, but it's not spam either. > > How did they perform individually? The only non-spam that matched for me was the above, with src="". Everything with href="" was spam. There was some overlap — some spam had both — but some spam had only href="" and some spam had only src="". I'm sure KAM has a much bigger corpus to do automated tests on… Cheers, Andy -- https://bitfolk.com/ -- No-nonsense VPS hosting
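To answer the "how did they perform individually?" question against a corpus, a tally along these lines is enough. This is only a sketch of how the counting could be done, not the script actually used: the two patterns are copied from Benny's rules, the function names are invented here, and real SpamAssassin rawbody matching works on decoded body chunks rather than on whole strings as this does.

```python
import re
from collections import Counter
from typing import Iterable

# Same patterns as the proposed __HREF_EMPTY and __SRC_EMPTY rawbody rules.
HREF_EMPTY = re.compile(r'href=""')
SRC_EMPTY = re.compile(r'src=""')


def classify(body: str) -> str:
    """Say which of the two empty-attribute patterns a message body hits."""
    href = bool(HREF_EMPTY.search(body))
    src = bool(SRC_EMPTY.search(body))
    if href and src:
        return "both"
    if href:
        return "href_only"
    if src:
        return "src_only"
    return "neither"


def tally(bodies: Iterable[str]) -> Counter:
    """Count how many bodies fall into each category."""
    return Counter(classify(body) for body in bodies)


if __name__ == "__main__":
    samples = [
        '<a href="">click</a>',
        '<img src="">',
        '<a href=""><img src=""></a>',
        '<a href="https://example.org/">fine</a>',
    ]
    print(tally(samples))
```

Running this separately over the ham and spam halves of a corpus gives the per-rule and overlap figures in one pass.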
Re: new rule for kam :)
Hello, On Wed, Aug 23, 2023 at 03:24:22PM +0200, Benny Pedersen wrote: > # test for empty src="" or empty href="" > rawbody __HREF_EMPTY /href=\"\"/ > rawbody __SRC_EMPTY /src=\"\"/ I checked this against about 80k of my recent personal emails and it matched quite a lot of previously not found spam, but did also match on every auto response from one of my suppliers. It seems after every customer service interaction they send a "how did we do? fill in this survey" email from qualtrics.com which contains: It wouldn't be much of a loss, but it's not spam either. Cheers, Andy -- https://bitfolk.com/ -- No-nonsense VPS hosting
Re: ip2location.com
Hi Benny,

On Thu, Jan 28, 2021 at 03:06:12PM +0100, Benny Pedersen wrote:
> https://lite.ip2location.com/database/ip-asn
>
> is it possible to use it in spamassassin ?

SpamAssassin already has an IP to ASN plugin:

    https://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Plugin_ASN.html

Is there some other info from the above database(s) that you were interested in? The above plugin only exposes the ASN of the connecting IP address.

The DNS database that it uses (asn.routeviews.org) only provides the ASN:

    $ dig -t txt +noall +answer 232.80.119.85.asn.routeviews.org
    232.80.119.85.asn.routeviews.org. 86400 IN TXT "8943" "85.119.80.0" "21"

but there is the Cymru IP-to-ASN database that does provide more info:

    $ dig -t txt +noall +answer 232.80.119.85.origin.asn.cymru.com
    232.80.119.85.origin.asn.cymru.com. 14400 IN TXT "8943 | 85.119.80.0/21 | GB | ripencc | 2010-03-03"

So perhaps another plugin along the lines of Plugin::ASN could be used to get some of that info?

Cymru database docs at:

    https://team-cymru.com/community-services/ip-asn-mapping/#dns

Cheers,
Andy
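As a rough idea of what such a plugin would have to do with the Cymru answer, here is a small sketch that splits the TXT record shown above into named fields. The class and function names are made up for illustration and the parsing is based only on the one example record in this message.

```python
from typing import NamedTuple


class OriginRecord(NamedTuple):
    """Fields of a Cymru origin TXT answer, in the order they appear."""
    asn: str          # kept as text; may hold more than one AS number
    prefix: str
    country: str
    registry: str
    allocated: str


def parse_cymru_origin(txt: str) -> OriginRecord:
    """Split a pipe-separated origin record into its five fields."""
    fields = [field.strip() for field in txt.strip().strip('"').split("|")]
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {txt!r}")
    return OriginRecord(*fields)


if __name__ == "__main__":
    record = parse_cymru_origin('"8943 | 85.119.80.0/21 | GB | ripencc | 2010-03-03"')
    print(record.asn, record.prefix, record.country)
```

The country code and registry are the extra pieces that asn.routeviews.org does not provide.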
Re: The most efficient SPAM implementation ever
Hello,

On Sun, Oct 11, 2020 at 10:20:32AM -0500, Ramon F Herrera wrote:
> On 10/11/2020 10:07 AM, Marc Roos wrote:
> >Now you can decide to reject email coming from (the whole of) sendgrid.
>
> I am the one who is a client of sendgrid.

Are you aware that you've posted this to a list where, for the last year or so, it has been an ongoing topic of discussion how to block the torrent of spam and phishing from SendGrid without blocking the legitimate email?

> They provide legitimacy. Highly recommended

From my point of view I'd prefer if people didn't use SendGrid as then it would become more feasible to block them entirely. They currently "provide legitimacy" only on the basis of them being "too big to block"; I am not sure if that is something to be encouraged by throwing them more business.

Cheers,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting
Re: mark emails as being spam originating from an ip range owner
Hello,

On Tue, Sep 29, 2020 at 10:49:36AM +0200, Marc Roos wrote:
> How can I mark emails as being spam originating from an ip range owned
> by xserver.ua?
>
> % Abuse contact for '176.103.48.0 - 176.103.63.255' is

I'm not sure if blacklist_from accepts IP addresses or CIDR ranges, but if it does:

    blacklist_from 176.103.48.0/20

Or consider using ASN plugin:

    https://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Plugin_ASN.html

and then adding a rule that penalises everything from ASN 48031:

    header   LOCAL_SPAMMY_ASN_XSERVER X-ASN =~ /\b48031\b/
    score    LOCAL_SPAMMY_ASN_XSERVER 5.0
    describe LOCAL_SPAMMY_ASN_XSERVER Too much spam from xserver.ua (AS48031)

Cheers,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting
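In case it is useful to double check that 176.103.48.0/20 really is the same thing as the whois range 176.103.48.0 - 176.103.63.255, the Python standard library can do the conversion. This is just a sanity-check sketch; the helper name is invented for the example.

```python
import ipaddress
from typing import List


def range_to_cidrs(first: str, last: str) -> List[str]:
    """Return the smallest list of CIDR blocks covering first..last inclusive."""
    networks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last)
    )
    return [str(network) for network in networks]


if __name__ == "__main__":
    # The range quoted from whois in the question above.
    print(range_to_cidrs("176.103.48.0", "176.103.63.255"))
```

A range that does not fall on a single power-of-two boundary comes back as several blocks, each of which would need its own entry.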
SendGrid (Was: Re: Freshdesk (again))
Hello, On Fri, Jun 26, 2020 at 07:32:09PM -0600, Grant Taylor wrote: > I've got to say, between NANOG, SDLU, and SpamAssassin, I see a LOT of > complaints about Sendgrid. Also mailop. Have personally received phishing mails through SendGrid in the last 2 weeks in the name of citrix.com, microsoft.com and netflix.com. The Citrix one was to a hostmaster@ address. It's hard to comprehend how SendGrid could be doing a worse job of this, for so many months now. Yet their list of legit clients is large, so they remain unblockable for me. I just wish those clients knew how little SendGrid would do to prevent their other customers sending out phishing emails in their name. Cheers, Andy
ASN plugin matches IPv6 addresses against IPv4 DNS lists
Hi, I'm subscribed to this long-standing bug and saw it had an update today basically saying that it's still broken in 3.4.2: https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7211 And I agree, it is still broken in 3.4.2. An IPv6 address will be looked up in a DNS list that contains IPv4 addresses and will sometimes match. So, firstly, could the bug be re-opened? Secondly, can we discuss how to fix it? Back in 2017 (comment 6) I proposed adding an "asn_lookup_v6" as well as the existing "asn_lookup" and querying IPv6 addresses only in asn_lookup_v6 if set. I've never developed a plugin before but if that approach is acceptable then I can look into doing it. Cheers, Andy
Re: Scans and Invoice spam containing HREF to something bad
This has literally just come through to me, zero BAYES and got past my custom rule as the HREF URL has changed:

https://pastebin.com/pBfhXd6B

thanks, Andy.

On 19-06-2018 17:33, Kevin A. McGrail wrote:
> Well you are welcome to send me new Spamples to look at. As I noted, I've
> never seen these variants and RBLs aren't hitting them which ALSO means you
> have some new variants.
>
> Regards,
> KAM
Re: Scans and Invoice spam containing HREF to something bad
Hi Kevin, I'm not really getting any joy with the RBLs. I have, for example, a sample from the 14th and, taking away my custom rule, Bayes and KAM scores, the default score would be "0" :( Content here: https://pastebin.com/dthDn8yb thanks, Andy. On 19-06-2018 17:12, Kevin A. McGrail wrote: > The warnings are OK though make sure you have the nonKAMrules.cf as well. > > I'm not seeing really any of these spamples for us and agree. It's scoring > in the 1.2 range for me. > > Clearly seems to be compromised url so RBLs are you likely bet but you might > be a patient 0 for a new engine. > > -- > Kevin A. McGrail
Re: Scans and Invoice spam containing HREF to something bad
Hi Kevin, No I wasn't. I just added it, I get a lot of errors like "meta test KAM_WARRANTY3 has dependency 'CBJ_GiveMeABreak' with a zero score", is this normal? Testing despite these errors the only rule I'm getting a hit on from KAM is JMQ_SPF_NEUTRAL_ALL thanks, Andy. On 19-06-2018 16:51, Kevin A. McGrail wrote: > Are you using the KAM.cf ruleset?
Scans and Invoice spam containing HREF to something bad
Hi all,

for the last week or so we have been having a lot of problems with emails with subjects like either "New Approach Contractors Ltd wants to share Scan" or "Invoice INV-03056 from Encompass Environmental Ltd" which contain an HREF to see your "scan" or "invoice" at a URL ending /share or /directory respectively.

These aren't detected by SpamAssassin. I have Razor and iHash configured, running on SpamAssassin 3.4.1. Even when I have Bayes learn a few examples, subsequent spams can get Bayes as low as 50%.

Example: https://pastebin.com/85v2nHkF

My question is: does anyone have any ideas/tips/rules for catching these? I've created a custom rule that checks for the subject and HREF, but every time a new variant comes out I'll have to update it. Anyone got any better solutions?

thanks in advance, Andy.
Re: what is triggering NO_DNS_FOR_FROM
Thanks to all who replied to my question, sorry for the late reply. It seems this was a temporary error on the sender's DNS servers (I assume, as I've only seen this issue on their email). Rerunning spamassassin on the same message now doesn't trigger NO_DNS_FOR_FROM.

Thanks Matus, yes I know the MX isn't the same as the sender's IP. In Exim, if the PTR of the sending IP doesn't match a subsequent lookup of the FQDN returned in the PTR, then Exim marks the mail as being sent from a server without rDNS (even though a PTR exists), and this triggers RDNS_NONE in spamassassin. Not sure if this behaviour is typical in other SMTP servers.

Thanks also RW for the tips about "-D" and the envelope_sender_header documentation. Noted for future reference!

many thanks, Andy.
what is triggering NO_DNS_FOR_FROM
Hi all,

I have some genuine emails from one particular domain getting marked with NO_DNS_FOR_FROM and I'd like to know exactly why. I've had a dig in the SpamAssassin Dns.pm but I can't work out exactly what process_dnsbl_result is doing. What exactly does it check WRT MX and A records?

I can see that the domain in question does have A and MX records. Possible issues are that the A record doesn't match the PTR for the IP returned by the A record, and that one of the MX records doesn't have a PTR. I'd be keen to know if one or both of these are the issue, and what the RFCs on email DNS say is required for proper operation of email.

I've already had to ask the owners of the domain to correct an issue where their sending server's A record didn't match the PTR and was triggering the RDNS_NONE rule (as detected by Exim), so if I'm going to convince them to make more modifications I'd prefer to know what I am talking about.

thanks, Andy.
ASN plugin and IPv6 addresses
Hi,

I'm using version 3.4.0 on Debian stable.

I noticed that when presented with some IPv6 addresses, the ASN plugin is actually querying them as an IPv4 address e.g. turning 2600:… into 2.0.0.0 and coming back with the wrong ASN.

This appears to already be documented in the bugzilla:

    https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7211

but the conclusion there seems to be that the plugin just needs to be configured correctly. So how would one do that?

Looking at the perldoc I see:

    asn_lookup asn-zone.example.com [ _ASNTAG_ _ASNCIDRTAG_ ]

        Use this to lookup the ASN info in the specified zone for the
        first external IP address and add the AS number to the first
        specified tag and routing info to the second specified tag.

        […]

        If two or more asn_lookups use the same set of template tags,
        the results of their lookups will be appended to each other in
        the template tag values in no particular order. Duplicate
        results will be omitted when combining results. In a similar
        fashion, you can also use the same template tag for both the AS
        number tag and the routing info tag.

The thing is, I can't find one DNS zone that will answer queries for both IPv4 and IPv6. I can add asn_lookup directives for both, e.g.:

    asn_lookup origin.asn.cymru.com  _ASN_ _ASNCIDR_
    asn_lookup origin6.asn.cymru.com _ASN_ _ASNCIDR_

but what then happens is that an erroneous v6-as-v4 result from the first one gets included together with the (correct) answer from origin6.asn.cymru.com.

What is the correct way of configuring this? Doesn't the plugin need two different asn_lookup directives, one for IPv4 and one for IPv6, with only the relevant queries being directed at each?

Cheers,
Andy
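To illustrate the behaviour I'd expect, here is a sketch of picking the lookup zone by address family so that an IPv6 address never reaches the IPv4 zone. This is not the plugin's code (the plugin is Perl, and the function name here is invented); it only shows the routing idea, with IPv4 addresses reversed per octet and IPv6 addresses reversed per nibble.

```python
import ipaddress

# One zone per address family, as in the two asn_lookup lines above.
ASN_ZONES = {4: "origin.asn.cymru.com", 6: "origin6.asn.cymru.com"}


def asn_query_name(ip: str) -> str:
    """Build the query name for an address, using only the matching zone."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        # 85.119.80.232 -> 232.80.119.85
        labels = reversed(str(addr).split("."))
    else:
        # Expand to all 32 hex digits, drop the colons, reverse digit by digit.
        labels = reversed(addr.exploded.replace(":", ""))
    return ".".join(labels) + "." + ASN_ZONES[addr.version]


if __name__ == "__main__":
    print(asn_query_name("85.119.80.232"))
    print(asn_query_name("2600::1"))
```

The important part is the branch on addr.version: each address produces exactly one query, against the zone for its own family, so a v6-as-v4 result cannot end up in the tags.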
collecting mail for sa-learn, how to?
Hi,

for a mail server running email for multiple domains, what is the typical/recommended way to collect emails which aren't detected as spam so that they can be processed by sa-learn?

Users are downloading mail via POP3, so once a user sees a mail and decides that it is in fact spam, it's already been removed from the mail server. If the user forwards the mail to a special mailbox for processing then the mail is obviously now different from the original spam: the user is the sender, etc. Will sa-learn still work using this method? And if not, what else can I implement that would work?

thanks for any comments, Andy :P
Re: collecting mail for sa-learn, how to?
Soz, I just saw that. Until today my attempts to mail the subscribe address on this list weren't resulting in an autoreply etc. I only received confirmation I was subscribed to this list some 20 mins ago; I'm taking a look now at the replies.

thanks Andy.

- Original Message -
From: Karsten Bräckelmann [EMAIL PROTECTED]
To: Andy Smith [EMAIL PROTECTED]
Cc: users@spamassassin.apache.org
Sent: Thursday, July 17, 2008 2:23 PM
Subject: Re: collecting mail for sa-learn, how to?

Are you actually READING this list? Sent Jul 11, Jul 14, and now again Jul 17. Identical text, including typos. Got quite a few replies and discussion. No follow up by you, though.

Please stop sending the same question over and over again, if you are not reading the replies.

guenther

-- 
char *t=[EMAIL PROTECTED]; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1: (c=*++x); c128 (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
Re: collecting mail for sa-learn, how to?
Hi All,

thanks very much for all the replies and discussion around my original post, and apologies for not replying more promptly. I've only just managed to successfully subscribe to the list and managed to confuse myself looking at the forum archives (I think there had been some delays to when my posts appeared blah blah blah :P )

Anyway, thanks for clarifying the requirements of sa-learn. I think the best option sounds like this:

> This is what I do: Forwarding the unrecognised message to an account which
> will process the message through sal-wrapper.pl. You will find further
> informations here: https://po2.uni-stuttgart.de/~rusjako/sal-wrapper

Thanks to Stefan for this suggestion. The reason being that it doesn't impose the need for an IMAP client on the users, and I think it's the simplest option from the point of view of the end users, i.e. "if you receive spam and wish to report it, please simply forward the email and it will be analysed by the spam filter software". Sounds good to me :P

I've downloaded this and will do some eval on my systems.

thanks a lot!! Andy.
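For anyone wondering what the processing account has to do before sa-learn can see the original spam, here is a sketch of one way to recover it. It assumes users forward the message as an attachment (a message/rfc822 part), so the original headers survive; it is not a description of how sal-wrapper.pl itself works, and the function names and addresses are invented for this example.

```python
from email import message_from_string, policy
from email.message import EmailMessage
from typing import List


def extract_forwarded(raw: str) -> List[EmailMessage]:
    """Return every message that was attached whole to a forwarded email."""
    outer = message_from_string(raw, policy=policy.default)
    originals = []
    for part in outer.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-element list
            # holding the attached message, headers and all.
            originals.append(part.get_payload(0))
    return originals


def build_forward_example() -> str:
    """Create a small forward-as-attachment message to try the extractor on."""
    spam = EmailMessage()
    spam["From"] = "spammer@example.net"
    spam["To"] = "user@example.org"
    spam["Subject"] = "Cheap pills"
    spam.set_content("buy now")

    forward = EmailMessage()
    forward["From"] = "user@example.org"
    forward["To"] = "spam-report@example.org"
    forward["Subject"] = "Fwd: Cheap pills"
    forward.set_content("Please learn this as spam.")
    forward.add_attachment(spam)
    return forward.as_string()


if __name__ == "__main__":
    for original in extract_forwarded(build_forward_example()):
        print(original["From"], "/", original["Subject"])
```

Each extracted message could then be written out and handed to sa-learn, so that it learns from the spammer's headers and body rather than from the user's forward.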
Re: problems using haproxy for spamd
On Mon, Apr 30, 2007 at 01:23:23AM +, Andy Smith wrote:
> Hi,
>
> I'm trying to use haproxy (http://haproxy.1wt.eu/) to load balance 3
> spamd servers on the same network.
>
> [...]
>
> Unfortunately I seem to be intermittently getting connection
> failures. The haproxy log looks like this:
>
> Apr 28 05:13:49 localhost haproxy[11683]: Proxy spamd started.
> Apr 28 05:14:57 localhost haproxy[11684]: 212.13.194.70:55827 [28/Apr/2007:05:14:57] spamd corona 0/0/148 765 -- 0/0/0 0/0
> Apr 28 05:14:57 localhost haproxy[11684]: 212.13.194.70:55828 [28/Apr/2007:05:14:57] spamd curacao 0/-1/1 0 CC 0/0/0 0/0

It turned out to be a bug in haproxy. I sent an strace to the author, Willy Tarreau, and he replied in less than 24 hours with a full annotation of the strace and a patch to fix it. That's service!

The bug manifested itself when the client would connect, send all its data and shutdown before haproxy had successfully established a connection with the backend server. If haproxy managed to establish a connection before the client finished sending then it would work fine.

Here's the simple fix:

diff --git a/haproxy.c b/haproxy.c
index 8e57700..357a37a 100644
--- a/haproxy.c
+++ b/haproxy.c
@@ -5589,7 +5589,7 @@ int process_srv(struct session *t) {
     else if (s == SV_STCONN) { /* connection in progress */
        if (c == CL_STCLOSE || c == CL_STSHUTW ||
            (c == CL_STSHUTR &&
-            (t->req->l == 0 || t->proxy->options & PR_O_ABRT_CLOSE))) { /* give up */
+            ((t->req->l == 0 && t->res_sw == RES_SILENT) || t->proxy->options & PR_O_ABRT_CLOSE))) { /* give up */
            tv_eternity(&t->cnexpire);
            fd_delete(t->srv_fd);
            if (t->srv)

Cheers,
Andy
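For anyone who wants to see the client behaviour that tickled the bug, the following sketch reproduces the pattern in isolation: the client sends everything, shuts down its sending side straight away, and only then waits for the reply. It uses a local socket pair instead of Exim, haproxy and spamd, and the function name is made up for the example, so treat it as an illustration of the half-close sequence rather than of any of those programs.

```python
import socket
from typing import Tuple


def half_close_exchange(request: bytes, reply: bytes) -> Tuple[bytes, bytes]:
    """Send a request, half-close, then read the reply on the same connection.

    Returns (what the server side received, what the client side got back).
    Intended for small payloads that fit in the socket buffers.
    """
    client, server = socket.socketpair()
    try:
        client.sendall(request)
        # The client signals "no more data from me" but keeps reading.
        client.shutdown(socket.SHUT_WR)

        # Server side: read until EOF, which arrives because of the shutdown.
        received = b""
        while chunk := server.recv(4096):
            received += chunk

        # The server can still answer on the half-closed connection.
        server.sendall(reply)
        server.shutdown(socket.SHUT_WR)

        answer = b""
        while chunk := client.recv(4096):
            answer += chunk
        return received, answer
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(half_close_exchange(b"message to scan", b"scan result"))
```

A proxy sitting in the middle has to keep such a connection alive after the client's shutdown if there is still buffered request data to deliver, which is what the extra condition in the patch is about.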
problems using haproxy for spamd
Hi,

I'm trying to use haproxy (http://haproxy.1wt.eu/) to load balance 3 spamd servers on the same network. Here's my haproxy config:

    global
        log 127.0.0.1 local0 debug
        maxconn 100
        ulimit-n 512
        uid 999
        gid 999
        daemon
        pidfile /var/run/haproxy-spamd.pid

    listen spamd
        bind 212.13.194.5:783
        mode tcp
        option tcplog
        log global
        balance roundrobin
        source 212.13.194.5:0
        clitimeout 15
        srvtimeout 15
        contimeout 3
        server corona 212.13.194.122:783 weight 5
        server curacao 212.13.194.71:783 weight 5
        server islay 212.13.194.96:783 weight 6

Unfortunately I seem to be intermittently getting connection failures. The haproxy log looks like this:

    Apr 28 05:13:49 localhost haproxy[11683]: Proxy spamd started.
    Apr 28 05:14:57 localhost haproxy[11684]: 212.13.194.70:55827 [28/Apr/2007:05:14:57] spamd corona 0/0/148 765 -- 0/0/0 0/0
    Apr 28 05:14:57 localhost haproxy[11684]: 212.13.194.70:55828 [28/Apr/2007:05:14:57] spamd curacao 0/-1/1 0 CC 0/0/0 0/0
    Apr 28 05:16:07 localhost haproxy[11684]: 212.13.194.70:55858 [28/Apr/2007:05:16:07] spamd islay 0/-1/0 0 CC 0/0/0 0/0
    Apr 28 05:16:08 localhost haproxy[11684]: 212.13.194.70:55859 [28/Apr/2007:05:16:07] spamd corona 0/0/327 4369 -- 0/0/0 0/0
    Apr 28 05:17:04 localhost haproxy[11684]: 212.13.194.70:55863 [28/Apr/2007:05:17:02] spamd curacao 0/0/2419 839 -- 0/0/0 0/0
    Apr 28 05:17:04 localhost haproxy[11684]: 212.13.194.70:55864 [28/Apr/2007:05:17:04] spamd islay 0/-1/0 0 CC 0/0/0 0/0
    Apr 28 05:25:38 localhost haproxy[11684]: 212.13.194.70:54248 [28/Apr/2007:05:25:37] spamd corona 0/0/492 3930 -- 0/0/0 0/0
    Apr 28 05:26:12 localhost haproxy[11684]: 212.13.194.70:54254 [28/Apr/2007:05:26:12] spamd islay 0/-1/4 0 CC 0/0/0 0/0
    Apr 28 05:26:12 localhost haproxy[11684]: 212.13.194.70:54255 [28/Apr/2007:05:26:12] spamd curacao 0/-1/10 0 CC 0/0/0 0/0

According to http://haproxy.1wt.eu/download/1.2/doc/haproxy-en.txt state CC means that the client aborted the connection before it could be passed to any backend server.

As you can see above this does not happen to every connection. Yet on the connections that aborted with status CC, the server did actually receive them and deal with them:

    Apr 28 05:26:12 islay spamd[861]: spamd: connection from 212.13.194.5 [212.13.194.5] at port 48949
    Apr 28 05:26:13 islay spamd[861]: spamd: processing message [EMAIL PROTECTED] aka [EMAIL PROTECTED] for Debian-exim:102
    Apr 28 05:26:17 islay spamd[861]: spamd: clean message (-2.2/5.0) for Debian-exim:102 in 4.9 seconds, 4055 bytes.
    Apr 28 05:26:17 islay spamd[861]: spamd: result: . -2 - AWL,BAYES_00,FORGED_RCVD_HELO scantime=4.9,size=4055,user=Debian-exim,uid=102,required_score=5.0,rhost=212.13.194.5,raddr=212.13.194.5,rport=48949,mid=[EMAIL PROTECTED],rmid=[EMAIL PROTECTED],bayes=0,autolearn=ham

In Exim this was reported as a protocol error though:

    2007-04-28 05:26:12 1HhfRA-00065X-P0 spam acl condition: cannot parse spamd output
    2007-04-28 05:26:12 1HhfRA-00065X-P0 <= [EMAIL PROTECTED] H=murphy.debian.org [70.103.162.31] P=esmtp S=4176 [EMAIL PROTECTED]
    2007-04-28 05:26:14 1HhfRA-00065X-P0 => andy [EMAIL PROTECTED] R=procmail T=procmail_pipe
    2007-04-28 05:26:14 1HhfRA-00065X-P0 Completed

Seems like Exim must have sent data to spamd, but then saw some problem and aborted the connection.

I've tried telnetting to the listen address/port over and over and never see anything other than what I expect. If I give Exim the IPs of the spamd servers directly then it works fine.

I'm using version 3.1.7-1~bpo.1 from Debian backports.

Does anyone have any ideas what I might be doing wrong here? Any tips for getting more info on what might be going wrong?

Alternatively, can anyone recommend some other open source software load balancing solution? Preferably one that will let me direct to least busy server or to set a per-server concurrent connection limit.

Cheers,
Andy
Re: rejectlog
On Thu, Nov 10, 2005 at 04:08:56PM +0100, nick wrote:
> Rejecting the mail after DATA? Spamassassin runs behind my MTA, if the
> sender passes blacklist checks and any other obvious no-nos, it's then
> passed to spamassassin which NEVER discards email, but places them in a
> spam folder. Discarding emails based on a spam score is a bad idea. As
> you can see quite clearly, the reasons behind the discard/tagging
> aren't logged, so false positives can't be corrected.

It is a bad idea if you set it up so it doesn't log anything, yes. Anything done badly is a bad idea.

It is however perfectly possible to set up Exim and sa-exim to use spamassassin to reject mail after DATA, giving a full reason why in the log file and the reject message, and still keeping a copy on disk.

A reject with a useful message combined with keeping the message on disk for a reasonable period of time is in many cases BETTER than accepting and silently filing away in a spam folder, because the entity with the most desire to see the mail delivered -- the sender -- is the one who gets notified via the usual SMTP mechanism that it did not get delivered.

Having the spare time to look through my "spamassassin thinks this is spam" folder for false positives is a thing of the past; I would much rather reject as much as possible and only have to check the borderline stuff.

Andy
Re: Stopping Rules
On Sat, Oct 22, 2005 at 11:05:07AM -0400, Chris L. Franklin wrote:
> For starters AWL, white lists and black lists in my option ar ethe worst things ever. I disable them from the start. If your going to whitelist some one, why would you want them to even go though SA. (I don't)

Because a source that regularly sends you legit email, e.g. a mailing list, might send email that is borderline spammy and the only thing that tips it back into legitimate territory is the autowhitelist and bayes based on what YOUR users consider ham.

> if there blaklisted I don't want them even want the server accepting a email for me / the user if they are black listed.

There are lots of blacklists and DNSBLs that work best as contributors, not as absolute yes/no arbiters of what should be accepted.

> And again negative-scoring is useless if u need to write a negative score you problitly should rethink your positive scoring rules.

I don't understand why you are using SpamAssassin if you really believe the above.

> All this taking into a account Removing AWL, and negative-scoring. There are no real problems. And as a side note about net rules, if your really into using these then you'll probabliy just want to tune the server not to accept email from non-RDNS or invaild dns lookups.

Masses of legitimate email comes from hosts with no reverse DNS, incorrect HELO and other borderline or actual RFC violations.

I don't think you have thought this through and I believe that you would do well to accept some of the wisdom of those that have. If not, well, try it, and report back as to how well that works out for you, so that everyone else can see how wrong they are.
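To make the point about contributors concrete: SpamAssassin adds up the scores of every rule that hits and compares the total with a threshold, so a negative-scoring rule is just one more contribution. Here is a toy calculation. The `EXAMPLE_*` rule names are hypothetical and every score is invented for illustration (the real AWL adjustment is computed per sender, not fixed).

```python
REQUIRED_SCORE = 5.0  # a common default threshold


def classify(hits: dict, required: float = REQUIRED_SCORE):
    """Sum the scores of all rules that hit and compare with the threshold."""
    total = round(sum(hits.values()), 1)
    return total, ("spam" if total >= required else "ham")


# A borderline-spammy mailing list post, positive rules only (made-up scores).
positive_only = {
    "FORGED_RCVD_HELO": 0.1,
    "EXAMPLE_SHOUTY_SUBJECT": 1.6,
    "EXAMPLE_DNSBL_HIT": 2.5,
    "EXAMPLE_NO_RDNS": 1.2,
}

# The same message with the negative contributors left switched on.
with_negative = dict(positive_only, AWL=-1.5, BAYES_00=-2.6)

print(classify(positive_only))   # over the threshold
print(classify(with_negative))   # pulled back under it
```

With these invented numbers the DNSBL hit and the missing reverse DNS each push the total up without deciding the outcome alone, and the list's history with your users is what brings it back under the threshold.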
Individual timings of spamassassin rules?
Hi,

On one of my machines I'm running v3.0.3 under spamd with a fairly default config for Debian sarge. This is a reasonable spec machine, a 3GHz P4 that is not swapping, but I'm seeing that each message seems to take quite a while to check: between 3.5 and 15 seconds each (I'd say averaging about 5 seconds per message).

I have read:

http://wiki.apache.org/spamassassin/FasterPerformance

but I'm wondering if there is any easier way to get an overview of which rules are taking the longest to complete, other than removing the rulesets one by one? I'm sure there is some timeout somewhere that will be easy to fix.

Thanks,
Andy
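For the per-message figures quoted above, the `scantime=` field that spamd writes on its `result:` log lines can be summarised with a few lines of Python. This is a rough sketch that assumes your spamd version logs that field. It gives whole-message timings only, not per-rule ones, and the function names are made up.

```python
import re
import statistics

# Matches e.g. "spamd: result: . -2 - AWL,BAYES_00 scantime=4.9,size=4055,..."
SCANTIME_RE = re.compile(r"spamd: result: .*?\bscantime=(\d+(?:\.\d+)?)")


def scan_times(lines):
    """Extract scantime values (in seconds) from an iterable of log lines."""
    times = []
    for line in lines:
        match = SCANTIME_RE.search(line)
        if match:
            times.append(float(match.group(1)))
    return times


def summarise(times):
    """Return count/min/mean/max, or None if no result lines were seen."""
    if not times:
        return None
    return {
        "count": len(times),
        "min": min(times),
        "mean": round(statistics.mean(times), 2),
        "max": max(times),
    }
```

Usage would be along the lines of `summarise(scan_times(open("/var/log/mail.log")))`, where the log path is an assumption about where syslog sends spamd output on your system.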
Re: Individual timings of spamassassin rules?
On Thu, Oct 13, 2005 at 05:17:49AM -0700, Loren Wilton wrote:
> > On one of my machines I'm running v3.0.3 under spamd with a fairly default config for debian sarge. This is a reasonable spec machine, a 3GHz P4 that is not swapping, but I'm seeing that each message seems to take quite a while to check, between 3.5 and 15 seconds each (I'd say averaging at about 5 seconds per message)
>
> You don't mention if you are CPU bound.

This doesn't appear to be the case: low load average, mostly idle CPU, no significant iowait.

> If you aren't swapping, then generally slow processing for a given email is more likely related to the time it takes to get the responses for network checks than actual CPU time required to scan the mail. Often the correct solution is a local caching name server.

I'm sorry I didn't specify that there is a local caching server too, and network access is otherwise snappy.

I was hoping to be able to get a list of DNS/URIBLs and other external checks (razor2, pyzor, dcc) along with their timings to see where the problems lie. Possibly there is something for me to fix, or DNS lists I could locally host, etc. But I realise the parallel nature of the checks makes this more difficult.
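Short of timings from SpamAssassin itself, the DNS side of this can at least be measured from outside. The sketch below times one lookup per DNSBL zone in parallel and sorts the results slowest first. It is an approximation under several assumptions: it goes through the system resolver (so the local cache affects the numbers), it says nothing about razor2, pyzor or dcc, and the zone names you pass in are your own choice (the ones in the usage note are placeholders).

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor


def dnsbl_name(ip: str, zone: str) -> str:
    """Build the query name: reversed IPv4 octets followed by the zone."""
    return ".".join(reversed(ip.split("."))) + "." + zone


def time_lookup(name: str, resolver=socket.gethostbyname):
    """Resolve one name; return (name, answer or None, seconds taken)."""
    start = time.monotonic()
    try:
        answer = resolver(name)
    except OSError:  # socket.gaierror (not listed, timeout) is a subclass
        answer = None
    return name, answer, time.monotonic() - start


def time_zones(ip: str, zones, resolver=socket.gethostbyname):
    """Query every zone at once and return results sorted slowest first."""
    names = [dnsbl_name(ip, zone) for zone in zones]
    with ThreadPoolExecutor(max_workers=max(len(names), 1)) as pool:
        results = list(pool.map(lambda n: time_lookup(n, resolver), names))
    return sorted(results, key=lambda r: r[2], reverse=True)
```

For example, `time_zones("127.0.0.2", ["dnsbl.example.org", "uribl.example.net"])` would print-ready return one tuple per zone. Because the lookups run concurrently, as SpamAssassin's own network checks do, the slowest entry approximates how much that zone can hold up a scan.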