Re: Dinged for .Date

2024-01-16 Thread Andy Smith
Hi,

On Mon, Jan 15, 2024 at 05:06:11PM -0800, Cabel Sasser wrote:
> If you believe every new gTLD is garbage (and I get that!), why isn’t 
> SpamAssassin automatically dinging, say, 1,200+ of them?

I have to second the advice to send email from a different domain.
It's just going to be the case that the .date TLD is abused by
people sending shadier dating-related emails and the operators of
that TLD have poisoned the well by making it cheap and easy to do
so.

Even if you somehow got negative scoring in SpamAssassin fixed for
your specific domain, there's going to be countless private,
non-SpamAssassin-based rule sets out there that penalise .date
domains.

It is a similar argument to "why can't I send email out from
$ARBITRARY_GHETTO_HOSTER ? I'm not a spammer!" You can argue forever
that as you're not a spammer and can prove you've never sent any
spam, ever, why would receivers penalise you just for being at a
hoster that is popular with a problematic class of clientele? The
answer being that recipients are just working with what info they
have, and it'll be hard work to convince a significant number of
them that you're different. Is the work worth it? Generally not;
other options exist.

Thanks,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting


Re: Question about forwarding email (not specifically SA, pointers greatly appreciated)

2024-01-04 Thread Andy Smith
Hello,

On Wed, Jan 03, 2024 at 01:24:02PM -0600, Thomas Cameron via users wrote:
> On 1/2/24 17:51, Andy Smith wrote:
> > - Have your users collect their your-org email by some means other
> >   than SMTP, such as running an IMAP server and having them view
> >   both their gmail mailbox and their your-org inbox in one place (I
> >   have no idea if that is feasible with gmail).
> 
> This is what *I* would do, for sure. But the members of the association are
> incredibly non-technical, and trying to walk them through setting up an
> email client like Thunderbird or Outlook is a recipe for disaster.

I understand their point of view but maybe it needs putting to them
from the angle that the org is like any other workplace. They would
not expect their employer's internal emails to be forwarded to them
at $freemail.

Though that does invite them to ask if they can have a
dedicated device to manage org email.

(Which in many ways is not unreasonable either…)

Thanks,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting


Re: Question about forwarding email (not specifically SA, pointers greatly appreciated)

2024-01-02 Thread Andy Smith
Hi Thomas,

On Tue, Jan 02, 2024 at 04:24:37PM -0600, Thomas Cameron via users wrote:
> I built email servers for a non-profit I volunteer for. If email comes into
> the server for presid...@myassociation.org, I would normally just create an
> alias in /etc/aliases so that emails to president@ get forwarded to the
> president's "real" email address, say presidents_real_em...@gmail.com.

This causes your server to pass on email without changing envelope
sender, so your server is purporting to be whoever the email is
originally from. Any email authentication measure working on the
envelope sender, such as SPF, will then fail, as your server is
indistinguishable from a random host forging the original sender's
domain.

> How can I make this work? Is there a good way to use something like
> /etc/aliases to forward emails to the domain I manage to another recipient?
> Or is there something better I can do?

You need to give up on /etc/aliases for external routing of email
unless you control all the original sender domains and can, for
example, add your server IPs to their authentication mechanisms (e.g.
SPF).

Since you probably can't do that for any recipient domain that
expects to receive Internet email, you need to either:

- Implement Sender Rewriting Scheme (SRS) so that your server takes
  responsibility for forwarded emails with its own envelope sender.
  https://en.wikipedia.org/wiki/Sender_Rewriting_Scheme

Or:

- Have your users collect their your-org email by some means other
  than SMTP, such as running an IMAP server and having them view
  both their gmail mailbox and their your-org inbox in one place (I
  have no idea if that is feasible with gmail).
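
As a rough illustration of what SRS does to the envelope sender, here is a
minimal Python sketch. The secret, the example sender address and the exact
hash input are assumptions made for illustration; real implementations
differ in detail, so treat this as the shape of the rewrite rather than
something interoperable.

```python
import base64
import hashlib
import hmac
import time

BASE32 = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def srs_timestamp(now=None):
    """Two base32 characters encoding days since the epoch, modulo 1024."""
    days = int((time.time() if now is None else now) // 86400) % 1024
    return BASE32[days >> 5] + BASE32[days & 31]


def srs_forward(sender, forwarder_domain, secret, now=None):
    """Rewrite an envelope sender into SRS0 form owned by forwarder_domain."""
    local, _, domain = sender.rpartition("@")
    stamp = srs_timestamp(now)
    # Illustrative hash input only; real SRS libraries define their own.
    mac = hmac.new(secret, f"{stamp}{domain}{local}".lower().encode(), hashlib.sha1)
    tag = base64.b64encode(mac.digest()).decode()[:4]
    return f"SRS0={tag}={stamp}={domain}={local}@{forwarder_domain}"


# Hypothetical sender, forwarded by the association's own domain.
print(srs_forward("alice@example.com", "myassociation.org", b"not-a-real-secret"))
```

The point is that the rewritten address is in the forwarder's domain, so SPF
is evaluated against the forwarder's own record, while the hash and timestamp
let the forwarder recognise and route any bounce back to the original sender.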

Thanks,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting


Re: Correct way to allowlist an IP from DNSBL checks when it's not the final Received?

2023-09-30 Thread Andy Smith
Hello,

On Sat, Sep 30, 2023 at 11:52:13AM -0400, Jared Hall wrote:
> On 9/29/2023 10:59 AM, Andy Smith wrote:
> > 3.4.2. I know, it's ancient. An upgrade is planned but I'd still
> > like to know what the behaviour is. I understand if no one wants to
> > help and if so I might come back with questions after an upgrade.
> My distro packages stopped at v3.4.2 so I'm on SA v3.4.6 via CPAN. Just
> inspecting the comments in DNSEval.pm, you'll need SA version 3.4.4
> (minimum).

[…]

> The code for the current versions of DNSEval.pm is clean, much more
> code-function oriented, and less-prone to race conditions.  This actual
> comment in SA 3.4.2's DNSEval.pm module says it all:
> 
> "# Very hacky stuff and direct rbl_evals usage for now, TODO rewrite
> everything"
> 
> An upgrade is in order.

Okay, thanks Jared! I'll work on that and see what it looks like
after an upgrade.

Thanks,
Andy


Re: Correct way to allowlist an IP from DNSBL checks when it's not the final Received?

2023-09-29 Thread Andy Smith
Hello,

On Thu, Sep 28, 2023 at 09:08:30PM -0400, Jared Hall wrote:
> 1) Are you using native SA or the spamhaus-dqs plugin?

Just native SA in spamd mode.

> 2) What version of SpamAssassin?

3.4.2. I know, it's ancient. An upgrade is planned but I'd still
like to know what the behaviour is. I understand if no one wants to
help and if so I might come back with questions after an upgrade.

> 3) Parse the message from the command line.  Something like:
> 'cat message | spamassassin -D &> dbgout.txt'
> Then: 'grep external dbgout.txt'
> 
> It should show something like "full-external: 170.10.129.124, 66.187.233.73
> untrusted: 170.10.129.124, 66.187.233.73 originating:" if your Internal
> networks are setup properly in SA.

grep full-external: dbgout.txt

produces 15 lines all of which are identical:

Sep 29 14:36:57.221 [2611] dbg: dns: IPs found: full-external: 170.10.129.124, 
66.187.233.73, 10.11.54.8, 10.30.29.100, ::1, 10.11.54.6, 10.11.55.25, 
207.211.31.120, 209.85.128.43 untrusted: 170.10.129.124, 66.187.233.73, 
207.211.31.120, 209.85.128.43 originating:

(except for timestamps)

66.187.233.73 still seems to be listed in SBL-CSS and is detected
as such.

I can see from:

grep 73.233.187.66 dbgout.txt

that it does check 66.187.233.73 against all the usual DNSBLs,
e.g.

Sep 29 14:36:57.157 [2611] dbg: check: tagrun - tag RELAYSUNTRUSTEDREVIP is now 
ready, value: ARY:[124.129.10.170,73.233.187.66,120.31.211.207,43.128.85.209]
Sep 29 14:36:57.157 [2611] dbg: check: tagrun - tag RELAYSEXTERNALREVIP is now 
ready, value: ARY:[124.129.10.170,73.233.187.66,120.31.211.207,43.128.85.209]
[…]
Sep 29 14:36:57.218 [2611] dbg: async: launching 
A/73.233.187.66.zen.spamhaus.org for dns:A:73.233.187.66.zen.spamhaus.org
Sep 29 14:36:57.219 [2611] dbg: dns: providing a callback for id: 
31199/IN/A/73.233.187.66.zen.spamhaus.org
Sep 29 14:36:57.219 [2611] dbg: async: starting: DNSBL-A, 
dns:A:73.233.187.66.zen.spamhaus.org (timeout 15.0s, min 3.0s)
Sep 29 14:36:57.378 [2611] dbg: async: calling callback on key 
dns:A:73.233.187.66.zen.spamhaus.org
Sep 29 14:36:57.378 [2611] dbg: dns: hit  
127.0.0.3

So this is normal behaviour then, for v3.4.2 at least?
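
For anyone puzzled by the reversed addresses in the debug output, a small
Python sketch of how an IPv4 DNSBL query name is formed. The function name is
made up for this example; it only shows the naming convention, not
SpamAssassin's own code.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Return the DNS name queried to look up an IPv4 address in a DNSBL zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


print(dnsbl_query_name("66.187.233.73", "zen.spamhaus.org"))
# 73.233.187.66.zen.spamhaus.org
```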

Thanks,
Andy


Re: Correct way to allowlist an IP from DNSBL checks when it's not the final Received?

2023-09-28 Thread Andy Smith
Hello,

On Thu, Sep 28, 2023 at 06:48:54AM -0400, Jared Hall wrote:
> Do you mind if I redirect the below back onto the spamassassin list
> and respond to it there?

Well I was going to do that, but fair enough!

> On Thu, Sep 28, 2023 at 12:02:47AM -0400, Jared Hall wrote:
> > SpamAssassin doesn't arbitrarily pick a header to look at. lastexternal is
> > used per the defaults in 20_dnsbl_tests.cf

Okay so here is what I have:

Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124])
  by barenjager.bitfolk.com with esmtps 
(TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256)
  (Exim 4.92)
  (envelope-from )
  id 1qlVVV-0001zW-Jc
  for a...@strugglers.net; Wed, 27 Sep 2023 14:27:18 +
Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73])
  by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2,
  cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
  us-mta-473-x2wpeAY1NVC4XPDK8dEpYA-1; Wed, 27 Sep 2023 10:27:10 -0400

In the SpamAssassin report is:

  *  3.6 RCVD_IN_SBL_CSS RBL: Received via a relay in Spamhaus SBL-CSS
  *  [66.187.233.73 listed in zen.spamhaus.org]

barenjager.bitfolk.com is my MX which is running spamassassin called
from Exim using its built in means of calling out to SA from the
check_data ACL:

acl_check_data:

  # …

  warn message = X-barenjager.bitfolk.com-Spam-Report: $spam_report
       spam = Debian-exim:true/defer_ok

What I gathered from Jared's reply is that SA shouldn't be doing
DNSBL checks against all of the IPs in all of the Received headers,
only the lastexternal one.

Here though, the lastexternal one should be 170.10.129.124 as that
is not in my internal_networks, but it seems to have done a check of
the one before it, 66.187.233.73, and found it in Spamhaus SBL-CSS.

Is that expected?
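
To make the expectation concrete, a Python sketch of "last external" as I
understand it: walking the Received chain newest first, it is the first relay
that is not inside internal_networks. This is an illustration of the idea
only, not SpamAssassin's implementation, and the internal network used below
is a hypothetical placeholder.

```python
import ipaddress


def last_external(relays, internal_networks):
    """relays: relay IPs from the Received headers, newest first.

    Returns the first relay outside internal_networks, or None.
    """
    nets = [ipaddress.ip_network(n) for n in internal_networks]
    for ip in relays:
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in nets):
            return ip
    return None


# The two relays from the headers above; 192.0.2.0/24 stands in for
# internal_networks (assumption).
print(last_external(["170.10.129.124", "66.187.233.73"], ["192.0.2.0/24"]))
# 170.10.129.124
```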

I guess I can allowlist from SPF as the envelope sender will be the
mailing list in question (linux-lvm-boun...@redhat.com) and it did
get a "SPF_PASS SPF: sender matches SPF record", so redhat.com's SPF
record must correctly include Mimecast's relays.

Thanks,
Andy


Correct way to allowlist an IP from DNSBL checks when it's not the final Received?

2023-09-27 Thread Andy Smith
Hi,

The IP address of a supplier is currently listed by Spamhaus
SBL-CSS.

This is not directly causing me to reject their emails,
because they are actually sending out through Mimecast. However,
SpamAssassin is finding that IP in the headers as the Received line
*before* Mimecast's, i.e. their listed host is the one handing off
to Mimecast, who are connecting to my MX.

How would I go about allowlisting this IP address against DNSBL
hits? Ideally for a specified range of from addresses and/or
envelope senders, but for every sender if necessary. I think I would
be okay with exempting such an IP address from *all* negative DNSBL
hits, at least temporarily.

My first thought was "allowlist_from rcvd", but I do not think this
will work as I think it only checks the first Received header
outside of my internal_networks.

The employees of the supplier send email with all manner of
addresses and the supplier also hosts mailing lists that are open to
the public so I cannot predict any from address for allowlisting
purposes.

I expect they will be delisted by the time I work this out, but it
would be good to know for the future!

Thanks,
Andy


Re: new rule for kam :)

2023-08-24 Thread Andy Smith
Hi,

On Wed, Aug 23, 2023 at 06:14:45PM -0700, John Hardin wrote:
> On Wed, 23 Aug 2023, Andy Smith wrote:
> > On Wed, Aug 23, 2023 at 03:24:22PM +0200, Benny Pedersen wrote:
> > > # test for empty src="" or empty href=""
> > > rawbody __HREF_EMPTY /href=\"\"/
> > > rawbody __SRC_EMPTY /src=\"\"/
> > 
> > I checked this against about 80k of my recent personal emails and it
> > matched quite a lot of previously not found spam, but did also match
> > on every auto response from one of my suppliers. It seems after
> > every customer service interaction they send a "how did we do? fill
> > in this survey" email from qualtrics.com which contains:
> > 
> >
> > 
> > It wouldn't be much of a loss, but it's not spam either.
> 
> How did they perform individually?

The only non-spam that matched for me was the above, with src="".
Everything with href="" was spam.

There was some overlap — some spam had both — but some spam had only
href="" and some spam had only src="".
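
A minimal Python sketch of checking the two patterns separately against a raw
body, which is roughly how the per-rule hits above can be tallied. The sample
snippets are invented for illustration.

```python
import re

# Same patterns as the two rawbody rules quoted above.
HREF_EMPTY = re.compile(r'href=""')
SRC_EMPTY = re.compile(r'src=""')


def empty_attr_hits(raw_body):
    """Report which of the two empty-attribute patterns match a raw body."""
    return {
        "href": bool(HREF_EMPTY.search(raw_body)),
        "src": bool(SRC_EMPTY.search(raw_body)),
    }


print(empty_attr_hits('<a href="">click</a>'))      # hypothetical sample
print(empty_attr_hits('<img src="" alt="survey">'))  # hypothetical sample
```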

I'm sure KAM has a much bigger corpus to do automated tests on…

Cheers,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting


Re: new rule for kam :)

2023-08-23 Thread Andy Smith
Hello,

On Wed, Aug 23, 2023 at 03:24:22PM +0200, Benny Pedersen wrote:
> # test for empty src="" or empty href=""
> rawbody __HREF_EMPTY /href=\"\"/
> rawbody __SRC_EMPTY /src=\"\"/

I checked this against about 80k of my recent personal emails and it
matched quite a lot of previously not found spam, but did also match
on every auto response from one of my suppliers. It seems after
every customer service interaction they send a "how did we do? fill
in this survey" email from qualtrics.com which contains:



It wouldn't be much of a loss, but it's not spam either.

Cheers,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting


Re: ip2location.com

2021-01-28 Thread Andy Smith
Hi Benny,

On Thu, Jan 28, 2021 at 03:06:12PM +0100, Benny Pedersen wrote:
> https://lite.ip2location.com/database/ip-asn
> 
> is it possible to use it in spamassassin ?

SpamAssassin already has an IP to ASN plugin:


https://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Plugin_ASN.html

Is there some other info from the above database(s) that you were
interested in?

The above plugin only exposes the ASN of the connecting IP address.
The DNS database that it uses (asn.routeviews.org) only provides the
ASN:

$ dig -t txt +noall +answer 232.80.119.85.asn.routeviews.org
232.80.119.85.asn.routeviews.org. 86400 IN TXT  "8943" "85.119.80.0" "21"

but there is the Cymru IP-to-ASN database that does provide more
info:

$ dig -t txt +noall +answer 232.80.119.85.origin.asn.cymru.com
232.80.119.85.origin.asn.cymru.com. 14400 IN TXT "8943 | 85.119.80.0/21 | GB | 
ripencc | 2010-03-03"

So perhaps another plugin along the lines of Plugin::ASN could be
used to get some of that info?
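
As a sketch of what such a plugin would have to parse, here is the Cymru TXT
answer above split into fields in Python. The field names are my own labels,
not anything defined by the service.

```python
def parse_cymru_origin(txt):
    """Split a Cymru origin TXT answer into its pipe-separated fields."""
    fields = [f.strip() for f in txt.strip('"').split("|")]
    asns, prefix, country, registry, allocated = fields
    return {
        "asns": asns.split(),   # may hold more than one origin ASN
        "prefix": prefix,
        "country": country,
        "registry": registry,
        "allocated": allocated,
    }


print(parse_cymru_origin("8943 | 85.119.80.0/21 | GB | ripencc | 2010-03-03"))
```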

Cymru database docs at:
https://team-cymru.com/community-services/ip-asn-mapping/#dns

Cheers,
Andy


Re: The most efficient SPAM implementation ever

2020-10-11 Thread Andy Smith
Hello,

On Sun, Oct 11, 2020 at 10:20:32AM -0500, Ramon F Herrera wrote:
> On 10/11/2020 10:07 AM, Marc Roos wrote:
> >Now you can decide to reject email coming from (the whole of) sendgrid.
> 
> I am the one who is a client of sendgrid.

Are you aware that you've posted this to a list where, for the last
year or so, an ongoing topic of discussion has been how to block the
torrent of spam and phishing from SendGrid without blocking the
legitimate email?

> They provide legitimacy. Highly recommended

From my point of view I'd prefer if people didn't use SendGrid as
then it would become more feasible to block them entirely. They
currently "provide legitimacy" only on the basis of them being "too
big to block"; I am not sure if that is something to be encouraged
by throwing them more business.

Cheers,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting


Re: mark emails as being spam originating from an ip range owner

2020-09-29 Thread Andy Smith
Hello,

On Tue, Sep 29, 2020 at 10:49:36AM +0200, Marc Roos wrote:
> How can I mark emails as being spam originating from an ip range owned 
> by xserver.ua?
> 
> % Abuse contact for '176.103.48.0 - 176.103.63.255' is 

I'm not sure if blacklist_from accepts IP addresses or CIDR ranges,
but if it does:

blacklist_from 176.103.48.0/20

Or consider using ASN plugin:


https://spamassassin.apache.org/full/3.4.x/doc/Mail_SpamAssassin_Plugin_ASN.html

and then adding a rule that penalises everything from ASN 48031:

header    LOCAL_SPAMMY_ASN_XSERVER  X-ASN =~ /\b48031\b/
score     LOCAL_SPAMMY_ASN_XSERVER  5.0
describe  LOCAL_SPAMMY_ASN_XSERVER  Too much spam from xserver.ua (AS48031)
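
As a sanity check on the range arithmetic, a short Python sketch confirming
that the quoted whois range is exactly 176.103.48.0/20 and testing membership.
The helper name is made up for this example.

```python
import ipaddress

XSERVER_RANGE = ipaddress.ip_network("176.103.48.0/20")


def in_range(ip, network=XSERVER_RANGE):
    """True if ip falls inside the given network."""
    return ipaddress.ip_address(ip) in network


print(XSERVER_RANGE[0], XSERVER_RANGE[-1])  # first and last address of the /20
print(in_range("176.103.50.1"), in_range("176.103.64.0"))
```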

Cheers,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting


SendGrid (Was: Re: Freshdesk (again))

2020-06-26 Thread Andy Smith
Hello,

On Fri, Jun 26, 2020 at 07:32:09PM -0600, Grant Taylor wrote:
> I've got to say, between NANOG, SDLU, and SpamAssassin, I see a LOT of
> complaints about Sendgrid.

Also mailop. Have personally received phishing mails through
SendGrid in the last 2 weeks in the name of citrix.com,
microsoft.com and netflix.com. The Citrix one was to a hostmaster@
address. It's hard to comprehend how SendGrid could be doing a worse
job of this, for so many months now.

Yet their list of legit clients is large, so they remain unblockable
for me. I just wish those clients knew how little SendGrid would do
to prevent their other customers sending out phishing emails in
their name.

Cheers,
Andy


ASN plugin matches IPv6 addresses against IPv4 DNS lists

2018-11-26 Thread Andy Smith
Hi,

I'm subscribed to this long-standing bug and saw it had an update
today basically saying that it's still broken in 3.4.2:

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7211

And I agree, it is still broken in 3.4.2. An IPv6 address will be
looked up in a DNS list that contains IPv4 addresses and will
sometimes match.

So, firstly, could the bug be re-opened?

Secondly, can we discuss how to fix it? Back in 2017 (comment 6) I
proposed adding an "asn_lookup_v6" as well as the existing
"asn_lookup" and querying IPv6 addresses only in asn_lookup_v6 if
set.

I've never developed a plugin before but if that approach is
acceptable then I can look into doing it.

Cheers,
Andy


Re: Scans and Invoice spam containg HREF to something bad

2018-06-19 Thread Andy Smith
This has literally just come through to me, zero BAYES and got past my
custom rule as the HREF URL has changed: 

https://pastebin.com/pBfhXd6B 

thanks, Andy.

On 19-06-2018 17:33, Kevin A. McGrail wrote:

> Well you are welcome to send me new Spamples to look at.  As I noted, I've 
> never seen these variants and RBLs aren't hitting them which ALSO means you 
> have some new variants. 
> 
> Regards, 
> KAM

Re: Scans and Invoice spam containg HREF to something bad

2018-06-19 Thread Andy Smith
Hi Kevin, 

  I'm not really getting any joy with the RBLs. I have, for example, a
sample from the 14th and, taking away my custom rule, Bayes and KAM
scores, the default score would be "0" :(  

Content here: https://pastebin.com/dthDn8yb 

thanks, Andy. 

On 19-06-2018 17:12, Kevin A. McGrail wrote:

> The warnings are OK though make sure you have the nonKAMrules.cf as well. 
> 
> I'm not seeing really any of these spamples for us and agree.  It's scoring 
> in the 1.2 range for me. 
> 
> Clearly seems to be compromised url so RBLs are you likely bet but you might 
> be a patient 0 for a new engine.   
> 
> -- 
> Kevin A. McGrail

Re: Scans and Invoice spam containg HREF to something bad

2018-06-19 Thread Andy Smith
Hi Kevin, 

  No I wasn't. I just added it, I get a lot of errors like "meta test
KAM_WARRANTY3 has dependency 'CBJ_GiveMeABreak' with a zero score", is
this normal? 

Testing despite these errors the only rule I'm getting a hit on from KAM
is JMQ_SPF_NEUTRAL_ALL 

thanks, Andy. 

On 19-06-2018 16:51, Kevin A. McGrail wrote:

> Are you using the KAM.cf ruleset?

Scans and Invoice spam containg HREF to something bad

2018-06-19 Thread Andy Smith
Hi all, 

  for the last week or so we have been having a lot of problems with emails,
either with subjects like "New Approach Contractors Ltd wants to share Scan" or
"Invoice INV-03056 from Encompass Environmental Ltd", which contain an
HREF to see your "scan" or "invoice" at a URL ending /share or
/directory respectively. These aren't detected by SpamAssassin; I have
Razor and iHash configured running on SpamAssassin 3.4.1. Even when I
have Bayes learn a few examples, subsequent spams can get Bayes as low
as 50%. 

Example: https://pastebin.com/85v2nHkF 

My question is: does anyone have any ideas/tips/rules for catching these?
I've created a custom rule that checks for the subject and HREF, but
every time a new variant comes out I'll have to update this. Anyone got
any better solutions? 

thanks in advance, Andy.

Re: what is triggering NO_DNS_FOR_FROM

2017-03-16 Thread Andy Smith
Thanks all who replied to my question, sorry for the late reply. 

It seems this was a temporary error on the sender's DNS servers (I assume,
as I've only seen this issue on their email). Rerunning spamassassin on
the same message now doesn't trigger NO_DNS_FOR_FROM. 

Thanks Matus, yes I know the MX isn't the same as the sender's IP. In
Exim, if the sending IP's PTR doesn't match a subsequent lookup of the
FQDN returned in the PTR, then Exim marks the mail as being sent from a
server without rDNS (even though a PTR exists) and therefore triggers
RDNS_NONE in spamassassin. Not sure if this behaviour is typical in
other SMTP servers. 
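
The check described above (often called forward-confirmed reverse DNS) can be
sketched in Python as follows. This is my reading of the behaviour, not Exim's
code; the lookups are passed in as callables so the sketch runs without any
DNS access, and the names and addresses in the fake tables are made up.

```python
def forward_confirmed_name(ip, lookup_ptr, lookup_addrs):
    """Return the first PTR name whose forward lookup includes ip, else None."""
    for name in lookup_ptr(ip):
        if ip in lookup_addrs(name):
            return name
    return None


# Fake DNS data standing in for real resolver calls (assumption).
PTR = {"192.0.2.10": ["mail.example.org"], "192.0.2.20": ["other.example.org"]}
ADDR = {"mail.example.org": ["192.0.2.10"], "other.example.org": ["192.0.2.99"]}


def fake_ptr(ip):
    return PTR.get(ip, [])


def fake_addrs(name):
    return ADDR.get(name, [])


print(forward_confirmed_name("192.0.2.10", fake_ptr, fake_addrs))  # confirmed
print(forward_confirmed_name("192.0.2.20", fake_ptr, fake_addrs))  # PTR exists, no match
```

The second case is the one that matters here: a PTR exists, but because the
name does not resolve back to the connecting IP the host is treated as having
no rDNS.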

Thanks also RW for the tips about "-D" and envelope_sender_header
documentation. Noted for future reference!

many thanks, Andy.

what is triggering NO_DNS_FOR_FROM

2017-03-13 Thread Andy Smith
Hi all, 

  I have some genuine emails getting marked with NO_DNS_FOR_FROM from
one particular domain and I'd like to know exactly why. I've had a dig
in the SpamAssassin Dns.pm but I can't work out exactly what
process_dnsbl_result is doing. What exactly does it check WRT MX and A
records? 

I can see that the domain in question does have A and MX records.
Possible issues are that the A record doesn't match the PTR for the IP
returned by the A record, and that one of the MX records doesn't have a
PTR. I'd be keen to know if one or both of these are the issue, and what
the RFCs on email DNS say is required for
proper operation of email. 

I've already had to ask the owners of the domain to correct an issue
where their sending server's A record didn't match the PTR and was
triggering the RDNS_NONE rule (as detected by Exim), so if I'm going to
convince them to do more modifications I'd prefer to know what I was
talking about, 

thanks, Andy.

ASN plugin and IPv6 addresses

2017-02-25 Thread Andy Smith
Hi,

I'm using version 3.4.0 on Debian stable.

I noticed that when presented with some IPv6 addresses, the ASN
plugin is actually querying them as an IPv4 address e.g. turning
2600:… into 2.0.0.0 and coming back with the wrong ASN.

This appears to already be documented in the bugzilla:

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7211

but the conclusion there seems to be that the plugin just needs to
be configured correctly. So how would one do that?

Looking at the perldoc I see:

asn_lookup asn-zone.example.com [ _ASNTAG_ _ASNCIDRTAG_ ]
    Use this to lookup the ASN info in the specified zone for the first
    external IP address and add the AS number to the first specified tag and
    routing info to the second specified tag.

    […]

    If two or more asn_lookups use the same set of template tags, the results
    of their lookups will be appended to each other in the template tag values
    in no particular order. Duplicate results will be omitted when combining
    results. In a similar fashion, you can also use the same template tag for
    both the AS number tag and the routing info tag.

The thing is, I can't find one DNS zone that will answer queries
for both IPv4 and IPv6. I can add asn_lookup directives for both,
e.g.:

asn_lookup origin.asn.cymru.com  _ASN_ _ASNCIDR_
asn_lookup origin6.asn.cymru.com _ASN_ _ASNCIDR_

but what then happens is that an erroneous v6-as-v4 result from the
first one gets included together with the (correct) answer from
origin6.asn.cymru.com.

What is the correct way of configuring this? Doesn't the plugin need
two different asn_lookup directives, one for IPv4 and one for IPv6,
with only the relevant queries being directed at each?
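
To illustrate the behaviour I would expect, here is a Python sketch that
routes each address to the zone for its own family and builds the query name
accordingly (reversed octets for IPv4, reversed nibbles for IPv6). It is not
a patch to the plugin, and the function name is made up for this example.

```python
import ipaddress

V4_ZONE = "origin.asn.cymru.com"
V6_ZONE = "origin6.asn.cymru.com"


def asn_query_name(ip):
    """Build the ASN lookup name, using the zone that matches the IP family."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        labels = str(addr).split(".")[::-1]
        return ".".join(labels) + "." + V4_ZONE
    # IPv6: expand to 32 hex digits, reverse them, one label per nibble.
    nibbles = addr.exploded.replace(":", "")[::-1]
    return ".".join(nibbles) + "." + V6_ZONE


print(asn_query_name("85.119.80.232"))
print(asn_query_name("2001:db8::1"))  # documentation prefix, for illustration
```

With this split an IPv6 address can never be sent to the IPv4 zone, which is
the failure described above.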

Cheers,
Andy


collecting mail for sa-learn, how to?

2008-07-17 Thread Andy Smith
Hi, 
for a mail server running email for multiple domains, what is the 
typical/recommended way to collect emails which aren't detected as spam, so 
they can be processed by sa-learn? Users are downloading mail via POP3, so once 
a user sees a mail and decides that it is in fact spam, it has already been 
removed from the mail server. If the user forwards the mail to a special 
mailbox for processing then the mail is obviously now different from the 
original spam: the user is the sender, etc. Will sa-learn still work using this 
method? And if not, what else can I implement that would work? 

thanks for any comments, Andy :P 



Re: collecting mail for sa-learn, how to?

2008-07-17 Thread Andy Smith
Soz, I just saw that. Until today my attempts to mail the subscribe address 
on this list weren't resulting in an autoreply etc.
I only received confirmation I was subscribed to this list some 20 mins ago, 
I'm taking a look now at the replies


thanks Andy.


- Original Message - 
From: Karsten Bräckelmann [EMAIL PROTECTED]

To: Andy Smith [EMAIL PROTECTED]
Cc: users@spamassassin.apache.org
Sent: Thursday, July 17, 2008 2:23 PM
Subject: Re: collecting mail for sa-learn, how to?



Are you actually READING this list?

Sent Jul 11, Jul 14, and now again Jul 17. Identical text, including
typos. Got quite a few replies and discussion. No follow up by you,
though.

Please stop sending the same question over and over again, if you are
not reading the replies.

 guenther


--
char 
*t=[EMAIL PROTECTED];
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? 
c=1:
(c=*++x); c128  (s+=h); if (!(h=1)||!t[s+h]){ 
putchar(t[s]);h=m;s=0; }}}







Re: collecting mail for sa-learn, how to?

2008-07-17 Thread Andy Smith

Hi All,

 thanks very much for all the replies and discussion around my original 
post, and apologies for not replying
more promptly. I've only just managed to successfully subscribe to the list 
and managed to confuse myself looking
at the forum archives (I think there had been some delays to when my posts 
appeared blah blah blah :P )


Anyway, thanks for clarifying the requirements of sa-learn.
I think the best option sounds like this:

This is what I do:
Forwarding the unrecognised message to an account which will process the
message through sal-wrapper.pl. You will find further informations here:
https://po2.uni-stuttgart.de/~rusjako/sal-wrapper

Thanks to Stefan for this suggestion.

Reason being it doesn't impose the need for an IMAP client on the users and I 
think it's the simplest
option from the point of view of the end users, i.e. "if you receive spam and 
wish to report it, please
simply forward the email and it will be analysed by the spam filter 
software". Sounds good to me :P

I've downloaded this and will do some eval on my systems.

thanks a lot!! Andy. 



Re: problems using haproxy for spamd

2007-05-05 Thread Andy Smith
On Mon, Apr 30, 2007 at 01:23:23AM +, Andy Smith wrote:
> Hi,
> 
> I'm trying to use haproxy (http://haproxy.1wt.eu/) to load balance 3
> spamd servers on the same network.

[...]

> Unfortunately I seem to be intermittently getting connection
> failures.  The haproxy log looks like this:
> 
> Apr 28 05:13:49 localhost haproxy[11683]: Proxy spamd started.
> Apr 28 05:14:57 localhost haproxy[11684]: 212.13.194.70:55827 
> [28/Apr/2007:05:14:57] spamd corona 0/0/148 765 -- 0/0/0 0/0
> Apr 28 05:14:57 localhost haproxy[11684]: 212.13.194.70:55828 
> [28/Apr/2007:05:14:57] spamd curacao 0/-1/1 0 CC 0/0/0 0/0

It turned out to be a bug in haproxy.  I sent an strace to the
author, Willy Tarreau, and he replied in less than 24 hours with a
full annotation of the strace and a patch to fix it.  That's
service!

The bug manifested itself when the client would connect, send all
its data and shut down before haproxy had successfully established a
connection with the backend server.  If haproxy managed to establish
a connection before the client finished sending then it would work
fine.  Here's the simple fix:

diff --git a/haproxy.c b/haproxy.c
index 8e57700..357a37a 100644
--- a/haproxy.c
+++ b/haproxy.c
@@ -5589,7 +5589,7 @@ int process_srv(struct session *t) {
     else if (s == SV_STCONN) { /* connection in progress */
         if (c == CL_STCLOSE || c == CL_STSHUTW ||
             (c == CL_STSHUTR &&
-             (t->req->l == 0 || t->proxy->options & PR_O_ABRT_CLOSE))) { /* give up */
+             ((t->req->l == 0 && t->res_sw == RES_SILENT) || t->proxy->options & PR_O_ABRT_CLOSE))) { /* give up */
             tv_eternity(&t->cnexpire);
             fd_delete(t->srv_fd);
             if (t->srv)
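
For anyone wanting to see the client behaviour that triggered this, a small
Python sketch of the "send everything, shut down the write side, then wait
for the reply" pattern, using a local socket pair. No haproxy is involved and
the payload bytes are only illustrative; the point is that a proxy in the
middle has to keep relaying after it sees the client's write-side shutdown.

```python
import socket


def read_until_eof(sock):
    """Read from sock until the peer shuts down its write side."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            return b"".join(chunks)
        chunks.append(chunk)


def half_close_exchange(request, reply):
    """Client sends request and half-closes; server answers after seeing EOF."""
    client, server = socket.socketpair()
    with client, server:
        client.sendall(request)
        client.shutdown(socket.SHUT_WR)   # done writing, still expects a reply
        received = read_until_eof(server)
        server.sendall(reply)
        server.shutdown(socket.SHUT_WR)
        answer = read_until_eof(client)
    return received, answer


print(half_close_exchange(b"PING SPAMC/1.2\r\n\r\n", b"SPAMD/1.1 0 PONG\r\n"))
```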

Cheers,
Andy


signature.asc
Description: Digital signature


problems using haproxy for spamd

2007-04-29 Thread Andy Smith
Hi,

I'm trying to use haproxy (http://haproxy.1wt.eu/) to load balance 3
spamd servers on the same network.

Here's my haproxy config:

global
    log 127.0.0.1 local0 debug
    maxconn 100
    ulimit-n 512
    uid 999
    gid 999
    daemon
    pidfile /var/run/haproxy-spamd.pid

listen spamd
    bind 212.13.194.5:783
    mode tcp
    option tcplog
    log global
    balance roundrobin
    source 212.13.194.5:0
    clitimeout 15
    srvtimeout 15
    contimeout 3
    server corona  212.13.194.122:783 weight 5
    server curacao 212.13.194.71:783  weight 5
    server islay   212.13.194.96:783  weight 6

Unfortunately I seem to be intermittently getting connection
failures.  The haproxy log looks like this:

Apr 28 05:13:49 localhost haproxy[11683]: Proxy spamd started.
Apr 28 05:14:57 localhost haproxy[11684]: 212.13.194.70:55827 
[28/Apr/2007:05:14:57] spamd corona 0/0/148 765 -- 0/0/0 0/0
Apr 28 05:14:57 localhost haproxy[11684]: 212.13.194.70:55828 
[28/Apr/2007:05:14:57] spamd curacao 0/-1/1 0 CC 0/0/0 0/0
Apr 28 05:16:07 localhost haproxy[11684]: 212.13.194.70:55858 
[28/Apr/2007:05:16:07] spamd islay 0/-1/0 0 CC 0/0/0 0/0
Apr 28 05:16:08 localhost haproxy[11684]: 212.13.194.70:55859 
[28/Apr/2007:05:16:07] spamd corona 0/0/327 4369 -- 0/0/0 0/0
Apr 28 05:17:04 localhost haproxy[11684]: 212.13.194.70:55863 
[28/Apr/2007:05:17:02] spamd curacao 0/0/2419 839 -- 0/0/0 0/0
Apr 28 05:17:04 localhost haproxy[11684]: 212.13.194.70:55864 
[28/Apr/2007:05:17:04] spamd islay 0/-1/0 0 CC 0/0/0 0/0
Apr 28 05:25:38 localhost haproxy[11684]: 212.13.194.70:54248 
[28/Apr/2007:05:25:37] spamd corona 0/0/492 3930 -- 0/0/0 0/0
Apr 28 05:26:12 localhost haproxy[11684]: 212.13.194.70:54254 
[28/Apr/2007:05:26:12] spamd islay 0/-1/4 0 CC 0/0/0 0/0
Apr 28 05:26:12 localhost haproxy[11684]: 212.13.194.70:54255 
[28/Apr/2007:05:26:12] spamd curacao 0/-1/10 0 CC 0/0/0 0/0

According to http://haproxy.1wt.eu/download/1.2/doc/haproxy-en.txt
state CC means that the client aborted the connection before it
could be passed to any backend server.  As you can see above this
does not happen to every connection.
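
To quantify how often this happens per backend, a Python sketch that pulls
the server name, byte count and termination state out of log lines like the
ones above. The regular expression reflects my reading of these particular
lines rather than a full parser for the tcplog format, and it expects each
entry on a single unwrapped line.

```python
import re
from collections import Counter

LINE = re.compile(
    r"(?P<client>\S+:\d+) \[(?P<accepted>[^\]]+)\] (?P<listener>\S+) (?P<server>\S+) "
    r"(?P<timers>-?\d+/-?\d+/-?\d+) (?P<bytes>\d+) (?P<state>\S{2}) "
)


def parse_tcplog(line):
    """Return a dict of fields from one haproxy tcplog line, or None."""
    match = LINE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["bytes"] = int(record["bytes"])
    return record


SAMPLE = [
    "Apr 28 05:14:57 localhost haproxy[11684]: 212.13.194.70:55827 "
    "[28/Apr/2007:05:14:57] spamd corona 0/0/148 765 -- 0/0/0 0/0",
    "Apr 28 05:14:57 localhost haproxy[11684]: 212.13.194.70:55828 "
    "[28/Apr/2007:05:14:57] spamd curacao 0/-1/1 0 CC 0/0/0 0/0",
]

# Count termination states per backend server.
states = Counter((r["server"], r["state"]) for r in map(parse_tcplog, SAMPLE) if r)
print(states)
```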

Yet on the connections that aborted with status CC, the server did
actually receive them and deal with them:

Apr 28 05:26:12 islay spamd[861]: spamd: connection from 212.13.194.5 
[212.13.194.5] at port 48949
Apr 28 05:26:13 islay spamd[861]: spamd: processing message [EMAIL PROTECTED] 
aka [EMAIL PROTECTED] for Debian-exim:102
Apr 28 05:26:17 islay spamd[861]: spamd: clean message (-2.2/5.0) for 
Debian-exim:102 in 4.9 seconds, 4055 bytes.
Apr 28 05:26:17 islay spamd[861]: spamd: result: . -2 - 
AWL,BAYES_00,FORGED_RCVD_HELO 
scantime=4.9,size=4055,user=Debian-exim,uid=102,required_score=5.0,rhost=212.13.194.5,raddr=212.13.194.5,rport=48949,mid=[EMAIL
 PROTECTED],rmid=[EMAIL PROTECTED],bayes=0,autolearn=ham

in Exim this was reported as a protocol error though:

2007-04-28 05:26:12 1HhfRA-00065X-P0 spam acl condition: cannot parse spamd 
output
2007-04-28 05:26:12 1HhfRA-00065X-P0 <= [EMAIL PROTECTED] H=murphy.debian.org 
[70.103.162.31] P=esmtp S=4176 [EMAIL PROTECTED]
2007-04-28 05:26:14 1HhfRA-00065X-P0 => andy [EMAIL PROTECTED] R=procmail 
T=procmail_pipe
2007-04-28 05:26:14 1HhfRA-00065X-P0 Completed

Seems like Exim must have sent data to spamd, but then saw some
problem and aborted the connection.

I've tried telnetting to the listen address/port over and over and
never see anything other than what I expect.  If I give Exim the IPs
of the spamd servers directly then it works fine.  I'm using version
3.1.7-1~bpo.1 from Debian backports.

Does anyone have any ideas what I might be doing wrong here?  Any tips
for getting more info on what might be going wrong?

Alternatively, can anyone recommend some other open source software
load balancing solution?  Preferably one that will let me direct to
the least busy server or set a per-server concurrent connection
limit.

Cheers,
Andy




Re: rejectlog

2005-11-11 Thread Andy Smith
On Thu, Nov 10, 2005 at 04:08:56PM +0100, nick wrote:
> Rejecting the mail after DATA?
>
> Spamassassin runs behind my MTA, if the sender passes blacklist checks
> and any other obvious no-nos, it's then passed to spamassassin which
> NEVER discards email, but places them in a spam folder.
>
> Discarding emails based on a spam score is a bad idea. As you can see
> quite clearly, the reasons behind the discard/tagging aren't logged, so
> false positives can't be corrected.

It is a bad idea if you set it up so it doesn't log anything, yes.
Anything done badly is a bad idea.

It is, however, perfectly possible to set up Exim and sa-exim to use
spamassassin to reject mail after DATA, giving a full reason why in
both the log file and the reject message, while still keeping a copy
on disk.

A reject with a useful message combined with keeping the message on
disk for a reasonable period of time is in many cases BETTER than
accepting and silently filing away in a spam folder, because the
entity with the most desire to see the mail delivered -- the sender
-- is the one who gets notified via the usual SMTP mechanism that it
did not get delivered.

Having the spare time to look through my "spamassassin thinks this
is spam" folder for false positives is a thing of the past; I would
much rather reject as much as possible and only have to check the
borderline stuff.
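
The policy described above can be sketched as a small decision function. This is only an illustration and not sa-exim's actual logic: the two thresholds, the Verdict fields and the wording of the SMTP reply are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    action: str       # "accept", "tag" or "reject"
    smtp_reply: str   # reply given to the sending MTA after DATA
    keep_copy: bool   # True if the message should be archived on disk
    log_line: str     # reason recorded in the local log


def decide(score, rules, tag_at=5.0, reject_at=12.0):
    """Map a spam score onto accept / tag / reject.

    tag_at and reject_at are hypothetical thresholds, not recommendations.
    """
    reason = "score=%.1f rules=%s" % (score, ",".join(rules) or "none")
    if score >= reject_at:
        # The sender's MTA sees the reason; a copy stays on disk for review.
        reply = "550 5.7.1 Rejected as spam (%.1f points)" % score
        return Verdict("reject", reply, True, "rejected after DATA: " + reason)
    if score >= tag_at:
        # Borderline: accept, but file it where it will be checked by hand.
        return Verdict("tag", "250 OK", False, "accepted as borderline: " + reason)
    return Verdict("accept", "250 OK", False, "accepted: " + reason)
```

In every branch the reason ends up in the log line, so a false positive can be traced back to the rules that caused it.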

Andy




Re: Stopping Rules

2005-10-22 Thread Andy Smith
On Sat, Oct 22, 2005 at 11:05:07AM -0400, Chris L. Franklin wrote:
> For starters AWL, white lists and black lists in my option ar ethe worst
> things ever. I disable them from the start. If your going to whitelist
> some one, why would you want them to even go though SA. (I don't)

Because a source that regularly sends you legit email, e.g. a
mailing list, might send email that is borderline spammy and the
only thing that tips it back into legitimate territory is the
autowhitelist and bayes based on what YOUR users consider ham.

> if there blaklisted I don't want them even want the server
> accepting a email for me / the user if they are black listed.

There are lots of blacklists and DNSBLs that work best as
contributors to the overall score, not as absolute yes/no arbiters
of what should be accepted.
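
A small worked example of that scoring idea, with invented numbers: every rule name prefixed HYPO_ and every score below is hypothetical, and 5.0 is simply the usual default required score. It shows a list hit acting as one contributor among several, with negative-scoring rules tipping a borderline message back to ham.

```python
REQUIRED_SCORE = 5.0  # the usual default threshold

# Hypothetical rule scores, invented for this illustration only.
SCORES = {
    "HYPO_DNSBL_LISTED": 2.5,
    "HYPO_HTML_HEAVY": 1.8,
    "HYPO_SALESY_SUBJECT": 1.2,
    "HYPO_BAYES_HAM": -2.6,   # Bayes trained on what local users call ham
    "HYPO_AWL": -1.0,         # sender's history of sending legitimate mail
}


def total(hits):
    """Sum the scores of the rules that hit, rounded to one decimal place."""
    return round(sum(SCORES[name] for name in hits), 1)


def is_spam(hits):
    return total(hits) >= REQUIRED_SCORE


positive_only = ["HYPO_DNSBL_LISTED", "HYPO_HTML_HEAVY", "HYPO_SALESY_SUBJECT"]
with_negatives = positive_only + ["HYPO_BAYES_HAM", "HYPO_AWL"]

print(total(positive_only), is_spam(positive_only))
print(total(with_negatives), is_spam(with_negatives))
```

With the negative-scoring rules disabled the mailing list message is classed as spam; with them enabled it is not. Treating the DNSBL hit as an absolute block would have refused the message before any of this was weighed.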

> And again negative-scoring is useless if u need to write a negative
> score you problitly should rethink your positive scoring rules.

I don't understand why you are using SpamAssassin if you really
believe the above.

> All this taking into a account Removing AWL, and negative-scoring. There
> are no real problems.
>
> And as a side note about net rules, if your really into using these then
> you'll probabliy just want to tune the server not to accept email from
> non-RDNS or invaild dns lookups.

Masses of legitimate email come from hosts with no reverse DNS,
an incorrect HELO and other borderline or actual RFC violations.

I don't think you have thought this through and I believe that you
would do well to accept some of the wisdom of those that have.  If
not, well, try it, and report back as to how well that works out for
you, so that everyone else can see how wrong they are.





Individual timings of spamassassin rules?

2005-10-13 Thread Andy Smith
Hi,

On one of my machines I'm running v3.0.3 under spamd with a fairly default
config for Debian sarge.  This is a reasonable spec machine, a 3GHz P4 that is
not swapping, but I'm seeing that each message seems to take quite a while to
check, between 3.5 and 15 seconds each (I'd say averaging about 5 seconds per
message).

I have read:

http://wiki.apache.org/spamassassin/FasterPerformance

but I'm wondering if there is any easier way to get an overview of which rules
are taking the longest period of time to complete, other than removing the
rulesets one by one?

I'm sure there is some timeout somewhere that will be easy to fix.

Thanks,
Andy



Re: Individual timings of spamassassin rules?

2005-10-13 Thread Andy Smith
On Thu, Oct 13, 2005 at 05:17:49AM -0700, Loren Wilton wrote:
> > On one of my machines I'm running v3.0.3 under spamd with a
> > fairly default config for debian sarge.  This is a reasonable
> > spec machine, a 3GHz P4 that is not swapping, but I'm seeing
> > that each message seems to take quite a while to check, between
> > 3.5 and 15 seconds each (I'd say averaging at about 5 seconds
> > per message)
>
> You don't mention if you are CPU bound.

This doesn't appear to be the case: low load average, mostly idle
cpu, no significant iowait.

> If you aren't swapping, then generally slow processing for a given
> email is more likely related to the time it takes to get the
> responses for network checks than actual CPU time required to scan
> the mail.  Often the correct solution is a local caching name
> server.

I'm sorry, I didn't specify that there is a local caching name server
too, and network access is otherwise snappy.

I was hoping to be able to get a list of DNS/URIBLs and other
external checks (razor2, pyzor, dcc) along with their timings to see
where the problems lie.  Possibly there is something for me to fix,
or DNS lists I could locally host, etc.

But I realise the parallel nature of the checks makes this more
difficult.
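
One way around the parallelism is to time each external check separately, outside SpamAssassin. The sketch below is a generic harness and not a SpamAssassin feature: `time_checks` and the idea of wrapping each check in a callable are my own assumptions, and it measures only whatever callables it is given.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_checks(checks):
    """Run each named check in its own thread and time it individually.

    checks maps a label to a zero-argument callable.  Returns a list of
    (label, seconds) pairs, slowest first.
    """
    def timed(item):
        name, func = item
        start = time.perf_counter()
        try:
            func()
        except Exception:
            pass  # a failing check still has a duration worth knowing
        return name, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        results = list(pool.map(timed, checks.items()))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Stand-in checks; real ones would be DNS lookups or calls to external tools.
demo = {
    "fast": lambda: time.sleep(0.05),
    "slow": lambda: time.sleep(0.2),
}
for name, seconds in time_checks(demo):
    print("%-5s %.2fs" % (name, seconds))
```

A real entry might look like `"dnsbl": lambda: socket.gethostbyname("2.0.0.127.dnsbl.example")`, where `dnsbl.example` is a placeholder zone; because every check is timed inside its own thread, running them in parallel does not hide which one is slow.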

