qq.com rule false positives

2023-11-19 Thread Sean Greenslade
Hi, all. I received a mail from a qq.com user that went over the spam
threshold. From the rules that triggered, it looks like the dynamic rDNS
rules (RDNS_DYNAMIC, DYN_RDNS_AND_INLINE_IMAGE, HELO_DYNAMIC_IPADDR) fired
on the qq.com sending server and contributed around 4.2 points to this
message (which was not spam). Relevant headers:

X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-14) on snowy
X-Spam-Flag: YES
X-Spam-Level: *
X-Spam-Status: Yes, score=5.7 required=5.2 tests=BAYES_50,DKIM_SIGNED,
DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,DYN_RDNS_AND_INLINE_IMAGE,
FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,FROM_EXCESS_BASE64,
HELO_DYNAMIC_IPADDR,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,RDNS_DYNAMIC,
SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=disabled
version=4.0.0
X-Spam-Report:
* -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no
*  trust
*  [203.205.221.192 listed in list.dnswl.org]
* -0.2 SPF_PASS SPF: sender matches SPF record
*  0.1 SPF_HELO_NONE SPF: HELO does not publish an SPF Record
* -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's
*   domain
* -0.1 DKIM_VALID_EF Message has a valid DKIM or DK signature from
*  envelope-from domain
* -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature
*  0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily
*  valid
*  1.5 BAYES_50 BODY: Bayes spam probability is 40 to 60%
*  [score: 0.5000]
*  0.2 FREEMAIL_FROM Sender email is commonly abused enduser mail provider
*  [(at)qq.com]
*  0.2 FREEMAIL_ENVFROM_END_DIGIT Envelope-from freemail username ends in
*  digit
*  [(at)qq.com]
*  0.0 HTML_MESSAGE BODY: HTML included in message
*  1.0 RDNS_DYNAMIC Delivered to internal network by host with
*  dynamic-looking rDNS
* -0.0 T_SCC_BODY_TEXT_LINE No description available.
*  1.2 DYN_RDNS_AND_INLINE_IMAGE Contains image, and was sent by dynamic
*  rDNS
*  0.0 FROM_EXCESS_BASE64 From: base64 encoded unnecessarily
*  2.0 HELO_DYNAMIC_IPADDR Relay HELO'd using suspicious hostname (IP addr
*  1)
Received: from out203-205-221-192.mail.qq.com (out203-205-221-192.mail.qq.com [203.205.221.192])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(No client certificate requested)
by snowy.routify.me (Postfix) with ESMTPS id B8E0C23484
for ; Thu, 16 Nov 2023 09:09:32 + (UTC)

I can totally see why that sending rDNS looks dynamic, but perhaps there
should be a special case exception for mail.qq.com, since that seems to
be their template for all sending servers.
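
To make the pattern concrete, here is a minimal Python sketch of why a name like out203-205-221-192.mail.qq.com reads as "dynamic" and what a suffix-based exception could look like. This is not SpamAssassin's actual rule logic: the function names and the exemption list are hypothetical, and only the hostname and IP come from the headers above.

```python
import re

# Hypothetical exemption list; ".mail.qq.com" is the only entry taken from this thread.
EXEMPT_SUFFIXES = (".mail.qq.com",)


def embeds_ip(hostname: str, ip: str) -> bool:
    """Return True if the first label of hostname contains the IP's four octets in order."""
    first_label = hostname.split(".", 1)[0]
    octets = [re.escape(octet) for octet in ip.split(".")]
    # Octets must be separated by non-digits, e.g. "out203-205-221-192".
    pattern = r"\D*" + r"\D+".join(octets) + r"\D*"
    return re.fullmatch(pattern, first_label) is not None


def looks_dynamic(hostname: str, ip: str) -> bool:
    """Flag IP-embedding hostnames unless they fall under an exempt suffix."""
    if hostname.lower().endswith(EXEMPT_SUFFIXES):
        return False
    return embeds_ip(hostname, ip)
```

With the exemption in place, the qq.com relay is no longer flagged, while a host such as 203-205-221-192.dyn.example.net (a made-up name) still is.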

--Sean



Re: Really hard-to-filter spam

2023-08-05 Thread Sean Greenslade
On Fri, Aug 04, 2023 at 08:38:24AM -0500, Thomas Cameron wrote:
> It was a typo, sorry. I have a cron job that uses --spam against the spam
> folder, and --ham against the ham folder. I just copied and pasted poorly.
> This is the actual script for my account:
> 
> [thomas.cameron@mail-east ~]$ cat bin/spamcheck
> #!/bin/bash
> sa-learn --progress --spam --mbox /home/thomas.cameron/mail/INBOX/spam
> sa-learn --progress --ham --mbox /home/thomas.cameron/mail/INBOX/ham
> 
> Bayes tests for other messages, like the one you sent me, look like this:
> 
> --
> Return-Path: 
> X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
>   mail-east.camerontech.com
> X-Spam-Level:
> X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,
>   DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI,SPF_HELO_NONE,
>   SPF_PASS,T_SCC_BODY_TEXT_LINE shortcircuit=no autolearn=ham
>   autolearn_force=no version=3.4.6
> --
> 
> But messages flagged as spam look like this:
> 
> --
> Return-Path:
> 
> X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
>   mail-east.camerontech.com
> X-Spam-Flag: YES
> X-Spam-Level: 
> X-Spam-Status: Yes, score=36.8 required=5.0 tests=BAYES_99,BAYES_999,
>   DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FROM_FMBLA_NEWDOM,
>   FROM_SUSPICIOUS_NTLD,FROM_SUSPICIOUS_NTLD_FP,HTML_IMAGE_ONLY_32,
>   HTML_MESSAGE,PDS_OTHER_BAD_TLD,RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK,
>   RCVD_IN_DNSWL_HI,RDNS_NONE,SH_HELO_DBL,SH_HELO_ZRD_FRESH,
>   SH_ZRD_HEADERS_FRESH,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,
>   URIBL_ABUSE_SURBL,URIBL_BLACK,URIBL_ZRD shortcircuit=no autolearn=spam
>   autolearn_force=no version=3.4.6
> --
> 
> The previous email I copied headers from as an example was just a bad
> example. Usually Bayes is /pretty/ accurate on my system. I only used that
> one because it was a message which made it through SpamAssassin. I was
> trying to demonstrate that the checks were not failing, as suggested in an
> earlier comment.
> 
> Thanks for catching that, though. I have made silly mistakes like that so I
> appreciate you checking me.

In that case, I think I can only offer some general suggestions that I
personally follow.

I have the autolearn function completely disabled. In my experience, if
you have a decent training corpus of known ham and known spam, autolearn
doesn't really add anything.

Like yours, my bayes results are usually quite accurate. At this point,
I only train messages that are actually false positives or false
negatives. I can't say for sure how effective this is, but my intuition
is that by only training on "hard" messages (meaning ones that the
non-bayes SA rules couldn't take care of on their own), I'm keeping the
bayes engine focused on the most important messages to classify
correctly. Your above spample has such a high score that my mail server
would have rejected that message at SMTP time even if it had triggered
BAYES_00. I wouldn't bother training such a message; the rest of the
rules have it covered.
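
The "train only on mistakes" policy above can be sketched in a few lines of Python. This is only an illustration of the selection logic, assuming a message counts as flagged when its score reaches the required threshold; the function name is hypothetical.

```python
def needs_training(score: float, required: float, is_spam: bool) -> bool:
    """Return True when the filter's verdict disagrees with the human verdict."""
    flagged = score >= required
    # Train only false positives (flagged ham) and false negatives (missed spam).
    return flagged != is_spam
```

For the spample quoted above (score 36.8, required 5.0, really spam) this returns False, so it would not be fed to sa-learn.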

Another thing to note is that spam tends to change over time. Having
really old spams in your bayes DB could be diluting its effectiveness by
having it look for signs that the current crop of spams don't show. It
might be worth starting fresh with an empty bayes db and training just a
few hundred of your most recent hams and spams.

And finally, if there's something consistent about the messages, don't
be afraid to write a manual rule. I have a few special rules in my
configs that alter the bayes scoring based on other aspects of the
messages.

--Sean



Re: Really hard-to-filter spam

2023-08-04 Thread Sean Greenslade
On Wed, Aug 02, 2023 at 04:17:22PM -0500, Thomas Cameron via users wrote:
> On 8/2/23 15:52, David B Funk wrote:
>
> 
>
> I have the users move spam to an imap folder, and then run (via the user's
> cron job):
> 
> sa-learn --mbox --spam /home/[username]/mail/spam
> 
> If something is flagged as spam and it's not supposed to be, I have them
> copy it to the ham folder and I run (also via cron job):
> 
> sa-learn --mbox --ham /home/[username]/mail/spam

  
Hopefully this is just a typo in your email, but the above line trains
your spam folder as if it's ham. That could easily cause your screwed-up
bayes scores.
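
One way to make this kind of slip harder is to derive both commands from a single folder-to-flag table, so each folder name appears exactly once. A minimal Python sketch follows; the table and function name are hypothetical, and the "ham" folder name is an assumption since only the spam path is shown above. The sa-learn options themselves (--mbox, --spam, --ham) are the ones quoted in the thread.

```python
from pathlib import PurePosixPath

# Folder name -> sa-learn flag. Each folder is listed once, so the ham run
# cannot accidentally point at the spam folder.
FOLDER_FLAGS = {"spam": "--spam", "ham": "--ham"}


def learn_commands(mail_dir: str) -> list[list[str]]:
    """Build one sa-learn argument list per training folder under mail_dir."""
    base = PurePosixPath(mail_dir)
    return [
        ["sa-learn", "--mbox", flag, str(base / folder)]
        for folder, flag in FOLDER_FLAGS.items()
    ]
```

Each returned list can be handed to subprocess.run() from the cron-driven script.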

--Sean



Re: Memory requirement for SpamAssassin/Postfix/Roundcube/Dovecot stack

2022-05-27 Thread Sean Greenslade
On Thu, May 26, 2022 at 02:12:01PM -0600, Grant Taylor wrote:
> On 5/26/22 8:32 AM, Ian Evans wrote:
> > Is it safe to assume that a $5/mth 1gig memory account will laugh at the
> > resources needed to run a SpamAssassin/Postfix/Roundcube/Dovecot/Nginx
> > stack and not ever break a sweat?
> 
> Sadly, I found that I needed to quit tilting at the 1GB memory windmill and
> upgraded my tiny VPSs to 2GB for SpamAssassin + ClamAV + some other milters.
> 
> You /might/ be able to get SpamAssassin in 1GB, but I don't know what else
> will be on the system.

You can quite comfortably fit SA and a full SMTP + IMAP stack in less
than 1 GB. My (admittedly low volume) mail server is currently sitting
at 340 MB of used memory and is running:
- Postfix
- Dovecot
- SpamAssassin
- spamass-milter
- opendkim milter
- Various python mail sorting / organizing scripts
- openssh server
- BIND9 (master DNS server)
- Radicale (DAV server)
- Weave (Firefox sync server)
- Nginx (reverse proxy)

I haven't found the need for any sort of AV scanner. Some SA rules that
reject messages with executable attachments have been more than adequate
for me.
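
As an illustration of what an "executable attachment" check amounts to, here is a small Python sketch using the standard email package. Real SA rules are written in SpamAssassin's own configuration language, not Python; the extension list and function name here are assumptions chosen for the example.

```python
from email.message import EmailMessage

# Hypothetical extension list, for illustration only.
BLOCKED_EXTENSIONS = (".exe", ".scr", ".bat", ".com", ".js", ".vbs")


def executable_attachments(msg: EmailMessage) -> list[str]:
    """Return the filenames of MIME parts whose name ends in a blocked extension."""
    names = []
    for part in msg.walk():
        filename = part.get_filename()
        if filename and filename.lower().endswith(BLOCKED_EXTENSIONS):
            names.append(filename)
    return names
```

A non-empty result would be the signal to reject or heavily score the message.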

--Sean



Re: SPF check though external relay

2017-11-13 Thread Sean Greenslade
>On 11.11.17 20:06, Sean Greenslade wrote:
>>SPF checks the final server that transmits the mail. If you are using
>>a relay server, that server will need to be in the SPF records.
>
>no. Only outgoing mail servers really need to be in SPF records.

Sorry, I misread the original message and thought this was in reference to 
outgoing mail.

--Sean




Re: SPF check though external relay

2017-11-11 Thread Sean Greenslade
On November 11, 2017 5:31:08 PM PST, Stephan Herker wrote:
>I'm running SpamAssassin in its default configuration, which checks SPF
>records.  In my case I received an email and it checked whether the last
>relay was a valid sender for SPF.  The last relay was a server I have in
>the cloud, so it failed SPF even though the original sending server is in
>the sender's SPF record.  Should I disable SPF checks, or is there a
>configuration change I need to make?

SPF checks the final server that transmits the mail. If you are using a relay 
server, that server will need to be in the SPF records.
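
To show what "being in the SPF records" means mechanically, here is a deliberately simplified Python sketch that compares a connecting IP against the ip4: terms of a record. It ignores include:, a, mx and every other SPF mechanism, and the record and addresses in the usage note are documentation-range placeholders, not real data from this thread.

```python
import ipaddress


def spf_ip4_pass(record: str, connecting_ip: str) -> bool:
    """Return True if connecting_ip falls inside any ip4: term of the SPF record."""
    ip = ipaddress.ip_address(connecting_ip)
    for term in record.split():
        if term.startswith("ip4:"):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return True
    return False
```

With the record "v=spf1 ip4:192.0.2.0/24 -all", a connection from 192.0.2.25 passes, while the same mail arriving via a relay at 198.51.100.7 does not, which is the failure described in the question.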

--Sean



Re: How to undo ham-ing a message

2017-04-01 Thread Sean Greenslade
On March 31, 2017 2:36:41 PM PDT, David Niklas  wrote:
>Hello,
>I accidentally learned a single message as ham from the menu of my MUA
>claws-mail.
>I immediately re-learned it as spam, but I want to know if there is
>anything else I might want to do to reverse the ham-ing process.

Nope, that's all you need to do.

--Sean



Re: training the filter

2016-11-07 Thread Sean Greenslade
On November 7, 2016 9:26:29 AM PST, Eric Abrahamsen wrote:
>What a lot of people (including myself) do is have two IMAP folders
>learn/spam and learn/ham. When a message is incorrectly classified you
>put it in the right folder, then run sa-learn on a cron job, looking in
>the appropriate folder, then afterwards move the message to Junk or
>INBOX, depending.

I actually took this approach a little further. I have a script that monitors 
the learn-spam and learn-ham maildirs with inotify. As soon as a message is 
moved to those dirs, it gets learned and fed back to my sorting script. That 
way I don't have to do anything other than move to the learn dir.
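
A rough Python sketch of the idea follows. It is not the actual script: it scans the maildirs on demand instead of using inotify (which needs a third-party binding), and it stops at building the sa-learn command lines. The learn-spam / learn-ham names are the ones mentioned above; the function name and the use of the new/ and cur/ subdirectories are assumptions.

```python
import os

# Maildir name -> sa-learn flag.
LEARN_DIRS = {"learn-spam": "--spam", "learn-ham": "--ham"}


def pending_commands(mail_root: str) -> list[list[str]]:
    """Return one sa-learn command per message waiting in the learn maildirs."""
    commands = []
    for folder, flag in LEARN_DIRS.items():
        for sub in ("new", "cur"):
            directory = os.path.join(mail_root, folder, sub)
            if not os.path.isdir(directory):
                continue
            for name in sorted(os.listdir(directory)):
                commands.append(["sa-learn", flag, os.path.join(directory, name)])
    return commands
```

A watcher would call this whenever the directory changes, run each command, and then hand the message back to the sorting step.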

--Sean





Re: TxRep very slow

2016-11-03 Thread Sean Greenslade
On November 3, 2016 11:41:07 AM PDT, Birta Levente wrote:
>I do not use the SpamAssassin daemon. It's called by amavisd 2.10
>

You're probably better off asking on an amavis list in that case. I have no 
experience with amavis.

However, given that it seems to be a lock contention issue, you might see if 
there's any setting in amavis to prevent parallel tests.

--Sean




Re: TxRep very slow

2016-11-03 Thread Sean Greenslade
On October 13, 2016 5:39:50 AM PDT, Levente Birta  wrote:
>Hi
>
>I have postfix with amavisd as content_filter and spamassassin 3.4.2
>When I enable the TxRep plugin the mail stays very long in the SA check:
>
>
>Oct 13 15:28:40 wsrv amavis[24727]: (24727-01) SA dbg: locker: mode is
>384
>Oct 13 15:28:40 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: created 
>/var/spool/amavisd/.spamassassin/tx-reputation.lock.wsrv.benvenutionline.ro.24727
>Oct 13 15:28:40 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 0 retries
>Oct 13 15:28:40 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: link to /var/spool/amavisd/.spamassassin/tx-reputation.lock:
>
>link ok
>Oct 13 15:28:40 wsrv amavis[24727]: (24727-01) SA dbg: auto-whitelist: 
>tie-ing to DB file of type DB_File R/W in 
>/var/spool/amavisd/.spamassassin/tx-reputation
>Oct 13 15:28:40 wsrv amavis[24727]: (24727-01) SA dbg: auto-whitelist: 
>db-based 55ff016ac8b77d76f3c2bd742dd31c10becb6023@sa_generated|ip=none 
>scores 0/0
>Oct 13 15:28:40 wsrv amavis[24727]: (24727-01) SA dbg: check: tagrun - 
>tag TXREP_MSG_ID is now ready, value: 0.0
>Oct 13 15:28:40 wsrv amavis[24727]: (24727-01) SA dbg: check: tagrun - 
>tag TXREP_MSG_ID_COUNT is now ready, value: 0.0
>Oct 13 15:28:40 wsrv amavis[24727]: (24727-01) SA dbg: check: tagrun - 
>tag TXREP_MSG_ID_PRESCORE is now ready, value: -1.7
>Oct 13 15:28:40 wsrv amavis[24727]: (24727-01) SA dbg: TxRep: 
>reputation: none, count: 0, weight: 1.0, delta: 0.000, MSG_ID: 
>55ff016ac8b77d76f3c2bd742dd31c10becb6023@sa_generated
>Oct 13 15:28:40 wsrv amavis[24727]: (24727-01) SA dbg: TxRep: active, 
>55ff016ac8b77d76f3c2bd742dd31c10becb6023@sa_generated pre-score: -1.72,
>
>autolearn score: -1.719, IP: 81.196.63.17, address: x...@gmail.com 
>signed by gmail.com
>Oct 13 15:28:40 wsrv amavis[24727]: (24727-01) SA dbg: locker: mode is
>384
>Oct 13 15:28:40 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: created 
>/var/spool/amavisd/.spamassassin/tx-reputation.lock.wsrv.benvenutionline.ro.24727
>Oct 13 15:28:40 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 0 retries
>Oct 13 15:28:41 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 1 retries
>Oct 13 15:28:42 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 2 retries
>Oct 13 15:28:43 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 3 retries
>Oct 13 15:28:44 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 4 retries
>Oct 13 15:28:45 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 5 retries
>Oct 13 15:28:46 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 6 retries
>Oct 13 15:28:47 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 7 retries
>Oct 13 15:28:48 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 8 retries
>Oct 13 15:28:50 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 9 retries
>Oct 13 15:28:51 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 10 retries
>Oct 13 15:28:52 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 11 retries
>Oct 13 15:28:53 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 12 retries
>Oct 13 15:28:54 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 13 retries
>Oct 13 15:28:55 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 14 retries
>Oct 13 15:28:56 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 15 retries
>Oct 13 15:28:57 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get lock on 
>/var/spool/amavisd/.spamassassin/tx-reputation with 16 retries
>Oct 13 15:28:58 wsrv amavis[24727]: (24727-01) SA dbg: locker: 
>safe_lock: trying to get loc

Re: HTTPS_HTTP_MISMATCH and explanation

2016-09-25 Thread Sean Greenslade
On Sun, Sep 25, 2016 at 07:57:37PM -0400, Alex wrote:
> I think the rule still has a use, perhaps in a meta or something.

I believe (though don't quote me on this) that a zero-weight rule will
still be checked if it's used as part of a metarule.

--Sean



Re: HTTPS_HTTP_MISMATCH and explanation

2016-09-25 Thread Sean Greenslade
On Sun, Sep 25, 2016 at 04:51:20PM -0400, Alex wrote:
> On Sun, Sep 25, 2016 at 4:41 PM, Sean Greenslade
>  wrote:
> > On Sun, Sep 25, 2016 at 03:54:53PM -0400, Alex wrote:
> >> > If you want to see what that rule's code looks like, here's a link:
> >> >
> >> > https://fossies.org/dox/Mail-SpamAssassin-3.4.1/classMail_1_1SpamAssassin_1_1Plugin_1_1HTTPSMismatch.html
> >> >
> >> > It's possible there is a bug in that rule. If you send it through
> >> > SpamAssassin with debug enabled, the rule should print out the domain
> >> > pairs that trigger it. Maybe try that, and see if what it outputs makes
> >> > sense.
> >>
> >> I should have mentioned that I tried that - it doesn't.
> >
> > If you don't mind sending the entire email, I'm curious now.
> 
> http://pastebin.com/XcGzedNk
> 
> This one also hits URIBL_SBL and a local BAD_TLD rule, so it may have
> been marked as spam anyway, but those two rules have also since been
> adjusted.
> 
> Thanks

I had to assign this rule a weight to get it to trigger, so evidently
the latest ML-generated rule weights have disabled it.

Anyway, this is what triggered it:

>https://www.google.com/url?q=3Dhttps%=
> 3A%2F%2Fglobal.gotomeetinA.com%2Fjoin%2F726265509&sa=3DD&usd=3D2&am=
> p;usg=3DAFQjCNHlVtBtL2J4tx-l3Ej-YPjED9EKjA" target=3D"_blank">https://globa=
> l.gotomeetinB.com/join/726265509

(Note that I altered the domains so that I could tell which triggered
the rule.)

Now that I've gone through that code more thoroughly (I'm not a Perl
programmer by any means), I realize that what it's actually doing is
comparing the _domain_ of the link text to the domain of the anchor's
target. If the two don't match (e.g. the link text is for gotomeeting.com
but the link goes to google.com), the rule triggers. The http/https in
the rule name is misleading.

Unfortunately, it's all too common nowadays for emails to include a link
click-through redirect domain hidden in the anchor tag. I personally
hate this, but it can't really be considered a sign of spam anymore
since too many legitimate emails do it. 
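
The domain comparison described above can be sketched in Python as follows. This is an illustration of the idea only, not a port of the HTTPSMismatch plugin: it compares full hostnames rather than registered domains, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class AnchorCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def mismatched_links(html: str) -> list[tuple[str, str]]:
    """Return (target host, displayed host) pairs where the two differ."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    result = []
    for href, text in parser.links:
        href_host = urlsplit(href).hostname
        text_host = urlsplit(text).hostname
        # Only compare when the visible text itself looks like a URL.
        if href_host and text_host and href_host != text_host:
            result.append((href_host, text_host))
    return result
```

A Google click-through link whose text shows a gotomeeting.com URL comes back as ("www.google.com", "global.gotomeeting.com"), which matches the trigger seen in the pastebin sample.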

I would probably zero-weight this rule.

--Sean



Re: FROM_WORDY and score

2016-09-25 Thread Sean Greenslade
On Sun, Sep 25, 2016 at 04:46:28PM -0400, Alex wrote:
> Hi,
> 
> I have another rule with a questionable score that's hitting too much ham.
> 
> From: "Customer Support" 
> dbg: rules: ran header rule __FROM_WORDY ==> got hit: "Customer.Support@"
> 
> http://pastebin.com/3qw6jLZp
> 
> This rule involves a few others, including __KHOP_NO_FULL_NAME and
> __FROM_FULL_NAME, there doesn't look to be anything out of the
> ordinary in that address to me...

Generally speaking, everyone's spam is different. Part of maintaining a
SA install is tweaking the rules, weights, and thresholds for your
particular spam & ham stream.

The default score weights are based on a set of machine learning
algorithms that analyze a specific corpus of spam and ham. They are by
no means guaranteed to work perfectly for everyone.

Typically, if I find a rule seems to be misbehaving, I will reduce its
weight to [-]0.1 and let it run for a while, then do some statistics on
how many FPs / FNs happen. If there are too many mis-triggers, I'll
either zero-weight the rule, or keep it at a very low weight.
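
The "do some statistics" step can be as simple as counting how often the rule fires on ham versus spam. A minimal Python sketch, assuming you have already collected each message's list of triggered tests (e.g. from X-Spam-Status) along with a manual spam/ham verdict; the function name and input shape are hypothetical.

```python
from collections import Counter


def rule_hit_stats(messages, rule):
    """Return (spam hits, ham hits, fraction of hits that were ham) for one rule.

    messages is an iterable of (tests, is_spam) pairs, where tests is the
    collection of rule names that fired on that message.
    """
    counts = Counter()
    for tests, is_spam in messages:
        if rule in tests:
            counts["spam_hits" if is_spam else "ham_hits"] += 1
    total = counts["spam_hits"] + counts["ham_hits"]
    ham_fraction = counts["ham_hits"] / total if total else 0.0
    return counts["spam_hits"], counts["ham_hits"], ham_fraction
```

A high ham fraction after a trial period is the cue to zero-weight the rule or leave it at a token weight.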

For me, the bulk of the weights in most of my spam is from DNSBLs and
bayes results, so I don't need to do a huge amount of fiddling.

--Sean



Re: HTTPS_HTTP_MISMATCH and explanation

2016-09-25 Thread Sean Greenslade
On Sun, Sep 25, 2016 at 03:54:53PM -0400, Alex wrote:
> > If you want to see what that rule's code looks like, here's a link:
> >
> > https://fossies.org/dox/Mail-SpamAssassin-3.4.1/classMail_1_1SpamAssassin_1_1Plugin_1_1HTTPSMismatch.html
> >
> > It's possible there is a bug in that rule. If you send it through
> > SpamAssassin with debug enabled, the rule should print out the domain
> > pairs that trigger it. Maybe try that, and see if what it outputs makes
> > sense.
> 
> I should have mentioned that I tried that - it doesn't.

If you don't mind sending the entire email, I'm curious now.

--Sean



Re: HTTPS_HTTP_MISMATCH and explanation

2016-09-25 Thread Sean Greenslade
On Sun, Sep 25, 2016 at 03:39:20PM -0400, Alex wrote:
> I think it must be something more than that. I've included the HTML
> component of an FP I received, and I don't see any occurrences of an
> https link where the text component is just http, or even vice-versa.
> 
> http://pastebin.com/BNM9sLRL
> 
> The HTML is a bit hard to read. Let me know if you want the whole
> email (which is even harder, consider it's encoded, so you'd have to
> actually run it through SA).

If you want to see what that rule's code looks like, here's a link:

https://fossies.org/dox/Mail-SpamAssassin-3.4.1/classMail_1_1SpamAssassin_1_1Plugin_1_1HTTPSMismatch.html

It's possible there is a bug in that rule. If you send it through
SpamAssassin with debug enabled, the rule should print out the domain
pairs that trigger it. Maybe try that, and see if what it outputs makes
sense.

--Sean



Re: HTTPS_HTTP_MISMATCH and explanation

2016-09-25 Thread Sean Greenslade
On Sun, Sep 25, 2016 at 03:12:00PM -0400, Alex wrote:
> Hi, I'm seeing quite a few FPs with HTTPS_HTTP_MISMATCH and its score
> of 2.0. Isn't that kind of high for a rule that doesn't even have a
> description?
> 
> Can someone explain what the rule does, and consider whether its score
> should be adjusted?
> 
> Thanks,
> Alex

From my quick glance over the code, it looks like that rule is meant to
trigger when a link presents its text as an https://... link, however
the actual link is to an http://... URL. Like this:

<a href="http://spammersite.com/virus">https://www.email-service.com/login</a>

The only place I would imagine false positives arising from this rule
would be if an email sender uses some sort of automatic link replacement
(e.g. for click-through tracking) that doesn't support https. And I
personally am inclined to agree that an email that mis-represents
insecure links as secure should be considered suspicious.
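
The check as read from that quick glance (link text claims https, target is plain http) is easy to express in Python. This is a sketch of that reading only, with a hypothetical function name; it is not taken from the plugin source.

```python
from urllib.parse import urlsplit


def claims_https_but_is_http(href: str, link_text: str) -> bool:
    """Return True when the visible text is an https URL but the target is http."""
    shown_scheme = urlsplit(link_text.strip()).scheme
    target_scheme = urlsplit(href.strip()).scheme
    return shown_scheme == "https" and target_scheme == "http"
```

The example link above (http target, https text) returns True; link text that is not a URL at all never triggers.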

Contact the senders of the flagged emails and ask them to fix their
systems. Spam or not, that is a real problem.

--Sean



Re: Spam by IP-address? Spamassassin with geoiplookup?

2016-09-24 Thread Sean Greenslade
On September 24, 2016 6:12:10 AM EDT, Thomas Barth wrote:
>Instead of URIBL_BLOCKED=0.001 I see URIBL_ABUSE_SURBL=1.948,
>URIBL_BLACK=1.7
>
>It's still not ok, is it?

That means it is working as intended, and your message has triggered hits on 
two separate blacklists.

--Sean




Re: DNS Terminology

2016-09-23 Thread Sean Greenslade
On Fri, Sep 23, 2016 at 05:03:00PM +0100, RW wrote:
> I've been wondering whether recursive is actually the correct term.
> 
> As I understand it there are two types of DNS lookup:
> 
>   1. Iterative - where results are found by working down through
>   multiple servers from the root servers.
> 
>   2. Recursive - where a request is made to a single nameserver which
>   handles the whole look-up on behalf of a client.
> 
> What this turns on is whether a forwarding server is a distinct
> class of nameserver or a type of recursive server. I think the
> latter is most logical, since both provide a recursive interface.
> Definitions of the term "recursive server" that I've seen contrast it
> only with "authoritative server".
> 
> One thing is certain, what you want is a name server that does
> *iterative* lookups.

A forwarding server is a recursive server. The two are more or less
synonymous. Both iterative and recursive servers may or may not cache
their results to speed up future queries for the same information.

Authoritative servers are the original source of the record data for one
or more sections of the DNS hierarchy. If they receive a request for a
record they hold authority over, they return it directly. If they
receive a request for a record they _don't_ hold authority over, then it
depends on how the server is configured. It could recurse, it could
iterate, or it could reject the query. Most internet-facing authoritative
servers reject queries for parts of the domain hierarchy they don't hold
authority over.
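
A toy model may help keep the terms apart. The Python sketch below uses an in-memory table of made-up servers (no real DNS traffic; every name and address is hypothetical) to show an authoritative server answering, referring, or refusing, an iterative lookup walking the referrals, and a recursive interface hiding that walk from the client.

```python
# Toy name servers: each holds some records and some delegations (all hypothetical).
SERVERS = {
    "root": {"records": {}, "delegations": {"com.": "com-tld"}},
    "com-tld": {"records": {}, "delegations": {"example.com.": "example-auth"}},
    "example-auth": {"records": {"www.example.com.": "192.0.2.10"}, "delegations": {}},
}


def query(server, name):
    """Ask one server: it answers, refers to a child zone's server, or refuses."""
    data = SERVERS[server]
    if name in data["records"]:
        return ("answer", data["records"][name])
    for zone, child in data["delegations"].items():
        if name == zone or name.endswith("." + zone):
            return ("referral", child)
    return ("refused", None)


def iterative_lookup(name, start="root"):
    """Follow referrals down from the root, recording which servers were asked."""
    server, path = start, []
    while True:
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            return value, path
        if kind == "refused":
            return None, path
        server = value


def recursive_lookup(name):
    """What a client of a recursive server sees: one question, one final answer."""
    answer, _ = iterative_lookup(name)
    return answer
```

Here example-auth refuses anything outside example.com., mirroring how most internet-facing authoritative servers behave.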

--Sean



Re: Spam by IP-address? Spamassassin with geoiplookup?

2016-09-21 Thread Sean Greenslade
On Wed, Sep 21, 2016 at 05:23:46PM +0200, Thomas Barth wrote:
> I can't do that because I don't have spam mails. I don't do store & forward. I
> didn't think that I needed the spam uncompressed in a folder for
> autolearning; I thought it works while SA is analyzing the mail. My
> mail system checks mails in real time and blocks mail during the connection. If
> there is a false positive the sender gets an error and I get a call from the
> sender to check it (the last call was over a year ago :-). But I have a
> compressed copy in the quarantine folder so that I can check the reason
> anyway.
> 
> find /var/lib/amavis/virusmails/ -type f -name "spam-*.gz" -mmin -60 -exec ls -hal {} \;
> -rw-r- 1 amavis amavis 23K Sep 21 16:30 /var/lib/amavis/virusmails/n/spam-nH0HbPBqwMoV.gz
> -rw-r- 1 amavis amavis 23K Sep 21 17:00 /var/lib/amavis/virusmails/6/spam-6e2vFSpi_vsr.gz
> -rw-r- 1 amavis amavis 11K Sep 21 16:48 /var/lib/amavis/virusmails/O/spam-Ojbq0dV-TYc2.gz
> -rw-r- 1 amavis amavis 22K Sep 21 17:05 /var/lib/amavis/virusmails/O/spam-Owoyctlsyvzz.gz
> 
> so, no autolearning

You could write a script that decompresses the files and feeds them one
by one to sa-learn. Not too difficult, I would imagine.
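
Something along these lines would do it. This is a minimal Python sketch, not a tested production script: it assumes the quarantine layout shown above (spam-*.gz files in subdirectories), assumes sa-learn is on the PATH, and does not strip any extra envelope lines the quarantine files may carry. The function names are hypothetical.

```python
import gzip
import subprocess
import tempfile
from pathlib import Path


def sa_learn_spam(message: bytes) -> None:
    """Write one message to a temporary file and train it as spam."""
    with tempfile.NamedTemporaryFile(suffix=".eml") as tmp:
        tmp.write(message)
        tmp.flush()
        subprocess.run(["sa-learn", "--spam", tmp.name], check=True)


def learn_quarantine(root, learn=sa_learn_spam) -> int:
    """Decompress every spam-*.gz under root, pass it to learn, return the count."""
    count = 0
    for path in sorted(Path(root).rglob("spam-*.gz")):
        with gzip.open(path, "rb") as handle:
            learn(handle.read())
        count += 1
    return count
```

Calling learn_quarantine("/var/lib/amavis/virusmails") from cron would cover the quarantine directory from the listing; the learn parameter is there so the loop can be tried with a harmless callback first.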

As for your spam rejection paradigm, I can't possibly imagine that
working well unless you have a very close relationship with every single
person who emails you. If I send my resume to a job recruiter and they
get a bounce when they email me back, I highly doubt they're going to
bother to call me up and tell me my email system is broken. My resume's
going in the trash and they're moving on.

Just because you haven't received any calls doesn't mean there are no
problems...

--Sean