TxRep may increase false positives

2024-06-02 Thread Tomohiro Hosaka

Hello.

I am using TxRep with DBBasedAddrList.

Suppose we learn the following mail:
ham:  sender address user@host, with a DKIM signature
spam: sender address user@host, without a DKIM signature

Then the following reputation entries are used:
ham:  [EMAILIP: user@host, rep: xx, count: xx]
spam: [EMAIL: user@host, rep: xx, count: xx]

Because the same storage location is used, the data will be mixed up.
If more spam is learned, there is a risk of false positives for ham.
This is because the weight of EMAILIP is particularly high.

The reason is that $signedby is not used in 
DBBasedAddrList::get_addr_entry.


SQLBasedAddrList::get_addr_entry uses $signedby.
Here is a quote from the auto_welcomelist_distinguish_signed section of 
TxRep.pm.
Without this option, or for domains that do not use a DKIM signature, 
the reputation of legitimate email can get mixed with the reputation of 
forgeries.


Given the above statement, I assume the developer knows this.

If the current reputation logic is to be retained, shouldn't
DBBasedAddrList::get_addr_entry use $signedby, just as
SQLBasedAddrList::get_addr_entry does?
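
The mix-up described above can be illustrated with a minimal Python sketch. This is not TxRep's code: the key layout, the `make_key` helper, and the scores are assumptions made up for illustration. It only shows how a signed sender inherits the history of unsigned forgeries when the signing domain is left out of the storage key.

```python
# Illustrative only: a toy reputation store, not TxRep's actual storage format.
from collections import defaultdict


def make_key(addr, signedby=None, distinguish_signed=True):
    """Build a storage key; include the signing domain only when asked to."""
    if distinguish_signed and signedby:
        return f"{addr}|signedby={signedby}"
    return addr


def learn(store, key, score):
    """Accumulate total score and message count under one key."""
    store[key]["total"] += score
    store[key]["count"] += 1


def mean(store, key):
    """Average score recorded under one key."""
    entry = store[key]
    return entry["total"] / entry["count"]


def run(distinguish_signed):
    store = defaultdict(lambda: {"total": 0.0, "count": 0})
    # One signed ham message and three unsigned forgeries of the same address.
    learn(store, make_key("user@host", "host", distinguish_signed), -2.0)
    for _ in range(3):
        learn(store, make_key("user@host", None, distinguish_signed), 10.0)
    # Reputation seen by the next signed (legitimate) message.
    return mean(store, make_key("user@host", "host", distinguish_signed))


print(run(distinguish_signed=False))  # shared key: ham inherits the spam history
print(run(distinguish_signed=True))   # separate keys: ham keeps its own history
```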


Thanks.


qq.com rule false positives

2023-11-19 Thread Sean Greenslade
Hi, all. I received a mail from a qq.com user that went over the spam
threshold. From the rules that triggered, it looks like the dynamic rDNS
rules triggered on the qq.com sending server, which contributed around
4.2 points to this message (which was not spam). Relevant headers:

X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-14) on snowy
X-Spam-Flag: YES
X-Spam-Level: *
X-Spam-Status: Yes, score=5.7 required=5.2 tests=BAYES_50,DKIM_SIGNED,
DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,DYN_RDNS_AND_INLINE_IMAGE,
FREEMAIL_ENVFROM_END_DIGIT,FREEMAIL_FROM,FROM_EXCESS_BASE64,
HELO_DYNAMIC_IPADDR,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,RDNS_DYNAMIC,
SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=disabled
version=4.0.0
X-Spam-Report:
* -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no
*  trust
*  [203.205.221.192 listed in list.dnswl.org]
* -0.2 SPF_PASS SPF: sender matches SPF record
*  0.1 SPF_HELO_NONE SPF: HELO does not publish an SPF Record
* -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's
*   domain
* -0.1 DKIM_VALID_EF Message has a valid DKIM or DK signature from
*  envelope-from domain
* -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature
*  0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily
*  valid
*  1.5 BAYES_50 BODY: Bayes spam probability is 40 to 60%
*  [score: 0.5000]
*  0.2 FREEMAIL_FROM Sender email is commonly abused enduser mail provider
*  [(at)qq.com]
*  0.2 FREEMAIL_ENVFROM_END_DIGIT Envelope-from freemail username ends in
*  digit
*  [(at)qq.com]
*  0.0 HTML_MESSAGE BODY: HTML included in message
*  1.0 RDNS_DYNAMIC Delivered to internal network by host with
*  dynamic-looking rDNS
* -0.0 T_SCC_BODY_TEXT_LINE No description available.
*  1.2 DYN_RDNS_AND_INLINE_IMAGE Contains image, and was sent by dynamic
*  rDNS
*  0.0 FROM_EXCESS_BASE64 From: base64 encoded unnecessarily
*  2.0 HELO_DYNAMIC_IPADDR Relay HELO'd using suspicious hostname (IP addr
*  1)
Received: from out203-205-221-192.mail.qq.com (out203-205-221-192.mail.qq.com [203.205.221.192])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by snowy.routify.me (Postfix) with ESMTPS id B8E0C23484
    for ; Thu, 16 Nov 2023 09:09:32 +0000 (UTC)

I can totally see why that sending rDNS looks dynamic, but perhaps there
should be a special case exception for mail.qq.com, since that seems to
be their template for all sending servers.
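
As a rough illustration of the kind of exception meant here, a minimal Python sketch follows. The pattern and the suffix list are hypothetical; this is not the regex SpamAssassin uses for RDNS_DYNAMIC or HELO_DYNAMIC_IPADDR, only a stand-in that shows an IP-looking hostname being exempted by a known-mail-server suffix.

```python
import re

# Hypothetical pattern: four dash- or dot-separated numbers inside a hostname.
# This is NOT the regex used by the real SpamAssassin rules.
IP_IN_NAME = re.compile(r"\d{1,3}[-.]\d{1,3}[-.]\d{1,3}[-.]\d{1,3}")

# Hypothetical exception list of suffixes known to name real mail servers.
KNOWN_MAIL_SUFFIXES = (".mail.qq.com",)


def looks_dynamic(rdns):
    """True if the rDNS name embeds an IP-like pattern and is not exempted."""
    name = rdns.lower().rstrip(".")
    if name.endswith(KNOWN_MAIL_SUFFIXES):
        return False
    return bool(IP_IN_NAME.search(name))


print(looks_dynamic("out203-205-221-192.mail.qq.com"))    # exempted suffix
print(looks_dynamic("host-203-0-113-7.dyn.example.net"))  # made-up dynamic name
```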

--Sean



Re: DMARC Aggregate reports - false positives

2023-06-23 Thread Kenneth Porter
Mine don't get reported as spam. But I'm getting daily reports from 
mimecast.org that claim to be "Content-Type: application/gzip" but have 
file extension .zip. Examination finds that they're really PK zip files. 
So the script I use to process them tosses them as malformed. The source 
domain has no way to contact them to inform them of the error.
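
One way to tolerate a mislabelled attachment is to ignore the declared Content-Type and look at the first bytes instead. The following is a minimal Python sketch of that idea, not the script mentioned above; the function names are made up for illustration. It relies on the standard signatures: gzip data starts with 1f 8b, and a zip archive containing at least one file starts with "PK\x03\x04".

```python
import gzip
import io
import zipfile


def sniff_archive(data: bytes) -> str:
    """Guess the container type from its leading magic bytes."""
    if data[:2] == b"\x1f\x8b":
        return "gzip"
    if data[:4] == b"PK\x03\x04":
        return "zip"
    return "unknown"


def extract_reports(data: bytes) -> list:
    """Return the decompressed payload(s), whatever the MIME label claimed."""
    kind = sniff_archive(data)
    if kind == "gzip":
        return [gzip.decompress(data)]
    if kind == "zip":
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            return [zf.read(name) for name in zf.namelist()]
    raise ValueError("not a gzip or zip archive")
```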





Re: DMARC Aggregate reports - false positives

2023-06-22 Thread Jared Hall

On 6/22/2023 6:29 AM, Simon Wilson via users wrote:


> How do people work around this? I've trained Bayes, and that is
> applying a -ve offset as expected, but they still end up at over 7.
>
> X-Spam-Status: Yes, score=7.215 tagged_above=-999 required=6.2
> tests=[BASE64_LENGTH_78_79=0.1, BASE64_LENGTH_79_INF=1.502,
> BAYES_00=-1.9, DCC_CHECK=1.1, DIGEST_MULTIPLE=0.293,
> DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
> ENA_SUBJ_LONG_WORD=2.2, HTML_MESSAGE=0.001, LR_DMARC_PASS=-0.1,
> MIME_BASE64_TEXT=1.741, MIME_HTML_MOSTLY=0.1, MPART_ALT_DIFF=0.79,
> PYZOR_CHECK=1.392, RCVD_IN_DNSWL_NONE=-0.0001,
> RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001,
> T_SCC_BODY_TEXT_LINE=-0.01, T_TVD_MIME_NO_HEADERS=0.01]

My Spam threshold is higher, so not a real problem for me.  But...

1) You might plead your case to KAM off-list and see if he can bump up 
his regex length for ENA_SUBJ_LONG_WORD to something longer than 30, 
like 33.

2) Lower the score for ENA_SUBJ_LONG_WORD
3) I don't run Pyzor; maybe lower the score a little bit also?
4) Create an off-setting rule; like:

    meta    DMARC_OFFSET    (ENA_SUBJ_LONG_WORD && DKIM_VALID)
    score   DMARC_OFFSET    -2.2

Yes, for sure, ALL Microsoft DMARC messages hit ENA_SUBJ_LONG_WORD. 
dokomo.ne.jp also hits (32 chars).  In the near-miss category, mail.ru 
comes in OK at 29 characters.
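
To check which report subjects would trip a length-based rule, a minimal Python sketch like the following can help. It assumes the rule fires when any whitespace-delimited token in the Subject has at least `threshold` characters; that is an assumption inferred from the numbers above, not the actual ENA_SUBJ_LONG_WORD regex, and the sample subject is made up.

```python
def longest_word(subject: str) -> int:
    """Length of the longest whitespace-delimited token in the subject."""
    return max((len(word) for word in subject.split()), default=0)


def hits_long_word(subject: str, threshold: int = 30) -> bool:
    # Assumption: the rule fires when a token has at least `threshold` characters.
    return longest_word(subject) >= threshold


# Made-up subject carrying a 32-character report ID.
sample = "Report Domain: example.com Report-ID: " + "a" * 32
print(hits_long_word(sample))                # fires at the assumed threshold of 30
print(hits_long_word(sample, threshold=33))  # would pass if the limit were 33
```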



-- Jared Hall






Re: DMARC Aggregate reports - false positives

2023-06-22 Thread Simon Wilson via users


On Thursday, June 22, 2023 23:05 AEST, Bill Cole wrote:
> On 2023-06-22 at 06:29:53 UTC-0400 (Thu, 22 Jun 2023 20:29:53 +1000)
> Simon Wilson via users
> is rumored to have said:
>
>> I find most DMARC reports I receive are flagged as spam by SA.
>>
>> How do people work around this? I've trained Bayes, and that is
>> applying a -ve offset as expected, but they still end up at over 7.
>
> The best solution for robot-generated mail to and from predictable
> addresses are the welcomelist feature(s). You can use more_spam_to or
> all_spam_to for reporting addresses, or welcomelist_auth for senders.
>
> Also, if you get a lot of robotic mail I would recommend that you not
> use Pyzor, Razor, or DCC. All of those are engines for detecting
> similarities in mail and they do that very well with regularly formatted
> mail that looks much the same across many recipients.
>
>>  X-Spam-Status: Yes, score=7.215 tagged_above=-999 required=6.2
>> tests=[BASE64_LENGTH_78_79=0.1, BASE64_LENGTH_79_INF=1.502,
>> BAYES_00=-1.9, DCC_CHECK=1.1, DIGEST_MULTIPLE=0.293,
>> DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
>> ENA_SUBJ_LONG_WORD=2.2, HTML_MESSAGE=0.001, LR_DMARC_PASS=-0.1,
>> MIME_BASE64_TEXT=1.741, MIME_HTML_MOSTLY=0.1, MPART_ALT_DIFF=0.79,
>> PYZOR_CHECK=1.392, RCVD_IN_DNSWL_NONE=-0.0001,
>> RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001,
>> T_SCC_BODY_TEXT_LINE=-0.01, T_TVD_MIME_NO_HEADERS=0.01]
>>
>> --
>> Simon Wilson
>> M: 0400 121 116
>
> --
> Bill Cole
> b...@scconsult.com or billc...@apache.org
> (AKA @grumpybozo and many *@billmail.scconsult.com addresses)
> Not Currently Available For Hire

Thanks Bill. I am using 3.4.6 on RHEL8, so will need to use the legacy terms
instead of welcomelist_auth I assume.

I'll start there.

Simon

-- 
Simon Wilson
M: 0400 121 116


Re: DMARC Aggregate reports - false positives

2023-06-22 Thread Bill Cole

On 2023-06-22 at 06:29:53 UTC-0400 (Thu, 22 Jun 2023 20:29:53 +1000)
Simon Wilson via users 
is rumored to have said:


> I find most DMARC reports I receive are flagged as spam by SA.
>
> How do people work around this? I've trained Bayes, and that is
> applying a -ve offset as expected, but they still end up at over 7.


The best solution for robot-generated mail to and from predictable 
addresses are the welcomelist feature(s). You can use more_spam_to or 
all_spam_to for reporting addresses, or welcomelist_auth for senders.


Also, if you get a lot of robotic mail I would recommend that you not 
use Pyzor, Razor, or DCC. All of those are engines for detecting 
similarities in mail and they do that very well with regularly formatted 
mail that looks much the same across many recipients.



> X-Spam-Status: Yes, score=7.215 tagged_above=-999 required=6.2
> tests=[BASE64_LENGTH_78_79=0.1, BASE64_LENGTH_79_INF=1.502,
> BAYES_00=-1.9, DCC_CHECK=1.1, DIGEST_MULTIPLE=0.293,
> DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
> ENA_SUBJ_LONG_WORD=2.2, HTML_MESSAGE=0.001, LR_DMARC_PASS=-0.1,
> MIME_BASE64_TEXT=1.741, MIME_HTML_MOSTLY=0.1, MPART_ALT_DIFF=0.79,
> PYZOR_CHECK=1.392, RCVD_IN_DNSWL_NONE=-0.0001,
> RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001,
> T_SCC_BODY_TEXT_LINE=-0.01, T_TVD_MIME_NO_HEADERS=0.01]
>
> --
> Simon Wilson
> M: 0400 121 116



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: DMARC Aggregate reports - false positives

2023-06-22 Thread Damian

>> Which submitters? I looked at a bunch of my reports and they are all
>> MIME_GOOD.
>
> That one was from microsoft.

Ok, I see.

It seems to me that BASE64_LENGTH_79_INF is wrong. It is probably 
motivated by RFC5322's "SHOULD be no more than 78 characters, excluding 
the CRLF". My Microsoft reports trigger 79_INF even though they only 
have 78 characters, excluding CRLF.
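
A quick way to check a raw body against the 78-character figure is to measure lines with and without their terminator. This is a minimal Python sketch under the assumption that an off-by-one or off-by-two could come from counting the CR and LF; it does not reproduce how BASE64_LENGTH_79_INF is actually evaluated, which is not verified here.

```python
def longest_line(raw_body: bytes) -> int:
    """Longest line, not counting the CRLF (or bare LF) terminator."""
    return max(len(line.rstrip(b"\r")) for line in raw_body.split(b"\n"))


def longest_line_with_terminator(raw_body: bytes) -> int:
    """Longest line when the line terminator is counted as well."""
    return max((len(line) for line in raw_body.splitlines(keepends=True)), default=0)


# Made-up body: one 78-character line followed by a short one.
body = b"A" * 78 + b"\r\n" + b"B" * 10 + b"\r\n"
print(longest_line(body))                  # length excluding CRLF
print(longest_line_with_terminator(body))  # length including CRLF
```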


Personally, I have lower PYZOR_CHECK and DKIMWL scores.


Re: DMARC Aggregate reports - false positives

2023-06-22 Thread Simon Wilson via users


On Thursday, June 22, 2023 20:37 AEST, Damian wrote:
>> I find most DMARC reports I receive are flagged as spam by SA.
> Which submitters? I looked at a bunch of my reports and they are all
> MIME_GOOD.

That one was from microsoft.

-- 
Simon Wilson
M: 0400 121 116


Re: DMARC Aggregate reports - false positives

2023-06-22 Thread Damian

> I find most DMARC reports I receive are flagged as spam by SA.

Which submitters? I looked at a bunch of my reports and they are all
MIME_GOOD.

DMARC Aggregate reports - false positives

2023-06-22 Thread Simon Wilson via users

I find most DMARC reports I receive are flagged as spam by SA. 

How do people work around this? I've trained Bayes, and that is applying a -ve 
offset as expected, but they still end up at over 7.
 X-Spam-Status: Yes, score=7.215 tagged_above=-999 required=6.2
tests=[BASE64_LENGTH_78_79=0.1, BASE64_LENGTH_79_INF=1.502,
BAYES_00=-1.9, DCC_CHECK=1.1, DIGEST_MULTIPLE=0.293,
DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1,
ENA_SUBJ_LONG_WORD=2.2, HTML_MESSAGE=0.001, LR_DMARC_PASS=-0.1,
MIME_BASE64_TEXT=1.741, MIME_HTML_MOSTLY=0.1, MPART_ALT_DIFF=0.79,
PYZOR_CHECK=1.392, RCVD_IN_DNSWL_NONE=-0.0001,
RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001,
T_SCC_BODY_TEXT_LINE=-0.01, T_TVD_MIME_NO_HEADERS=0.01]

-- 
Simon Wilson
M: 0400 121 116


Re: Questions to False Positives

2022-08-25 Thread Bill Cole

On 2022-08-25 at 10:34:17 UTC-0400 (Thu, 25 Aug 2022 14:34:17 +0000)
Kerstin Thomaßen 
is rumored to have said:


> Is this issue known? Why is the color white in text marked as spam?


It's not. Your message apparently hits the rule HTML_FONT_LOW_CONTRAST 
which currently has a very low score (0.001: an informational 
placeholder) when used in a default configuration with network tests 
enabled. If you're using a testing tool that doesn't require a full 
delivered message, it may be using the 'ruleset 0' score (currently 
0.713) which is more than a placeholder but far less than an absolute 
'spam' classification.


SA rules express real-world partial correlations. Hitting 
HTML_FONT_LOW_CONTRAST correlates weakly with a message being spam. 
Alone, it WILL NOT cause your email to be marked as spam.



> I am happy for any tips or solutions beside changing the color.


Don't expect to get ANY message to never hit any negative SA rules at 
all. If you insist on engaging in the public nuisance of HTML mail, you 
*will* match some rules, likely totalling around 2. The default 
threshold for SA to classify a message as spam is 5.0. If your total 
score is below 5, most SA sites will deliver it unimpeded. If your total 
score is between 3 and 5, SOME more strict sites may reject, quarantine, 
or tag your mail. If your total score is below 3 and you're still 
worried about it: take a break, step outside, touch grass.
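
The score bands described above can be written down as a small Python sketch. The band boundaries (3 and the default threshold of 5.0) come from this message; the function name and the wording of the returned strings are only illustrative, and individual sites may of course use other thresholds.

```python
def likely_handling(score: float, threshold: float = 5.0) -> str:
    """Map a total SpamAssassin score to the rough outcome described above."""
    if score >= threshold:
        return "classified as spam at the default threshold"
    if score >= 3.0:
        return "delivered by most sites; stricter sites may reject, quarantine, or tag"
    return "delivered unimpeded; nothing to worry about"


print(likely_handling(0.713))  # a single weak rule hit
print(likely_handling(4.0))
print(likely_handling(5.7))
```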


One way to avoid that specific rule is to render problematic sections as 
images. I don't advise that, as it is a path towards a complex of rules 
that look at image/text ratio which are much riskier than 
HTML_FONT_LOW_CONTRAST.



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: RCVD_IN_DNSWL_HI false positives

2021-05-14 Thread Bill Cole

On 2021-05-14 at 11:35:05 UTC-0400 (Fri, 14 May 2021 18:35:05 +0300)
Henrik K 
is rumored to have said:


> On Fri, May 14, 2021 at 11:14:50AM -0400, Bill Cole wrote:
>>
>> True, but forwarding to any of the free public resolvers is not suitable for
>> systems intentionally exposed to connections from unknown random outsiders.
>> It is especially important for systems doing MX or spam filtering service to
>> use a resolver under common administrative control and on the same LAN if
>> not the same machine. Beyond the issues of answer legitimacy, excess latency
>> or UDP packet loss can cause performance and reliability problems,
>> particularly on high-traffic MTAs.
>
> Why would there be issues of legitimacy or latency if my local resolver
> forwards everything to 1.1.1.1 or 8.8.8.8?


Legitimacy:

Both Google (8.8.8.8) & CloudFlare (1.1.1.1) claim not to filter DNS 
results, although Google hedges that a bit in the FAQ at 
https://developers.google.com/speed/public-dns/faq. If you're going to 
forward anything to a public resolver, those 2 would be the best 
choices.


Most other public resolvers openly tout their filtering of results as 
"protection" and some have at times experimented with typo-fixing (e.g. 
correcting 'yaho' to 'yahoo') and typo-squatting: replacing all NXDOMAIN 
A/AAAA answers with the addresses of their own ad-laden machines.



> Never had any problems


Anecdotal.

I *have* seen a client's machines crippled for about an hour by Google's 
public DNS going bad for them locally. Whether it was something wrong 
with their anycasting, networking breakage at the link provider, or just 
the most-local nodes failing, I and they will never know. When it 
happens, there is no documented way to report a problem or get any idea 
about when (if ever) it will be resolved.


I doubt that there is any way for anyone outside of Google to know how 
rare such failures are. I doubt they would ever share such information 
in any useful way. I suspect that their availability in a global 
aggregate sense is good enough that they could call it 'five nines' or 
whatever, but that's not satisfying for people who are the 0.05% of 
Google's users affected for 0.05% of a year by a mystery outage.



> and I'm
> pretty sure the large caches answer most generic lookups faster.


An answer from a host-local cache will always be much faster than an 
answer from a cache that is on the other side of a router hop and WAN 
link. An answer from a LAN-local cache will be as well, as long as it 
isn't overloaded. DNS queries by a busy MTA which are rooted in routine 
'ham' handling have a very high cache hit rate on a mature local cache, 
while deep misses (i.e. where the resolver must go to the root zone or a 
TLD nameserver to get glue) are largely a product of spam. To the extent 
that DNS slowness affects mail, it disproportionately affects spam, if 
your cache is well-populated with names you see a lot of.


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: RCVD_IN_DNSWL_HI false positives

2021-05-14 Thread Henrik K
On Fri, May 14, 2021 at 11:14:50AM -0400, Bill Cole wrote:
> 
> True, but forwarding to any of the free public resolvers is not suitable for
> systems intentionally exposed to connections from unknown random outsiders.
> It is especially important for systems doing MX or spam filtering service to
> use a resolver under common administrative control and on the same LAN if
> not the same machine. Beyond the issues of answer legitimacy, excess latency
> or UDP packet loss can cause performance and reliability problems,
> particularly on high-traffic MTAs.

Why would there be issues of legitimacy or latency if my local resolver
forwards everything to 1.1.1.1 or 8.8.8.8?  Never had any problems and I'm
pretty sure the large caches answer most generic lookups faster.  Obviously
specific DNSBLs are configured to bypass forwarding.



Re: RCVD_IN_DNSWL_HI false positives

2021-05-14 Thread Bill Cole

On 2021-05-14 at 10:10:31 UTC-0400 (Fri, 14 May 2021 15:10:31 +0100)
RW 
is rumored to have said:


> Forwarding is right for almost all cases,


True, but forwarding to any of the free public resolvers is not suitable 
for systems intentionally exposed to connections from unknown random 
outsiders. It is especially important for systems doing MX or spam 
filtering service to use a resolver under common administrative control 
and on the same LAN if not the same machine. Beyond the issues of answer 
legitimacy, excess latency or UDP packet loss can cause performance and 
reliability problems, particularly on high-traffic MTAs.


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: RCVD_IN_DNSWL_HI false positives

2021-05-14 Thread RW
On Thu, 13 May 2021 09:41:25 -0400
Daniel J. Luke wrote:

> On May 13, 2021, at 12:14 AM, Michael B Allen 
> wrote:
> > It is not completely trivial setup a caching name server. I
> > literally have two accounts so it's at least a serious nuisance.  
> 
> It's pretty simple to install unbound and set it up on most systems.

Actually I found "local-unbound" quite confusing on FreeBSD.

It turned out that all you need to do is set:

  local_unbound_enable=YES
  local_unbound_forwarders=none

but the second line was undocumented. I had to read through 2 shell
scripts to work it out. If you miss it out unbound will get
automatically configured for forwarding on first run. 

What made it doubly confusing is that it got set up with resolvconf
which I'd never given any thought to before.

Forwarding is right for almost all cases, so it wouldn't surprise me if
some Linux distributions have a similar pitfall.


Re: RCVD_IN_DNSWL_HI false positives

2021-05-14 Thread Henrik K
On Thu, May 13, 2021 at 05:04:47PM +0200, Matus UHLAR - fantomas wrote:
> > > Maybe they could just be blocked in the firewall.
> 
> On 13.05.21 16:44, Matthias Leisi wrote:
> > This would multiply the traffic due to retries.
> 
> I agree. URIBL provides special result URIBL_BLOCKED, maybe that would be a
> way (with high TTL)

We already have RCVD_IN_DNSWL_BLOCKED.

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=6728



Re: RCVD_IN_DNSWL_HI false positives

2021-05-13 Thread Henrik K
On Thu, May 13, 2021 at 02:52:38PM -0700, John Hardin wrote:
> On Thu, 13 May 2021, Henrik K wrote:
> 
> > On Thu, May 13, 2021 at 01:34:37PM -0400, Greg Troxel wrote:
> > > 
> > > I wonder if it would be sensible for spamassassin to have a
> > > configuration option for all default-on dnsrbls (one option, applying to
> > > all):
> > > 
> > >   disabled
> > >   auto
> > >   enabled
> > > 
> > > where the default is auto, and auto means "enabled if resolver is
> > > 127.0.0.1, ::1 or localhost, else disabled".
> > 
> > No.  Local resolver could be configured to forward everything to Google.
> 
> True, but that would be a conscious configuration.
> 
> > Or all servers could have one central nameserver in the local network.
> 
> So add "on local network".

Please describe to me how randomly looking at "/etc/resolv.conf" or
"dns_server" has any relevance on what route the queries will take in the
_end_, and why only very specific values found would imply some "conscious
configuration"?

Might as well create a SpamAssassin Training Academy website and not allow
SA to start without --i_have_read_manuals_and_tutorials_and_best_practises. 
Well, the first part could help, since our wiki (and official docs) are pretty
scattered and outdated.



Re: RCVD_IN_DNSWL_HI false positives

2021-05-13 Thread John Hardin

On Thu, 13 May 2021, Henrik K wrote:


> On Thu, May 13, 2021 at 01:34:37PM -0400, Greg Troxel wrote:
>>
>> I wonder if it would be sensible for spamassassin to have a
>> configuration option for all default-on dnsrbls (one option, applying to
>> all):
>>
>>   disabled
>>   auto
>>   enabled
>>
>> where the default is auto, and auto means "enabled if resolver is
>> 127.0.0.1, ::1 or localhost, else disabled".
>
> No.  Local resolver could be configured to forward everything to Google.

True, but that would be a conscious configuration.

> Or all servers could have one central nameserver in the local network.

So add "on local network".

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79


Re: RCVD_IN_DNSWL_HI false positives

2021-05-13 Thread Greg Troxel

Henrik K  writes:

> On Thu, May 13, 2021 at 01:34:37PM -0400, Greg Troxel wrote:
>> 
>> I wonder if it would be sensible for spamassassin to have a
>> configuration option for all default-on dnsrbls (one option, applying to
>> all):
>> 
>>   disabled
>>   auto
>>   enabled
>> 
>> where the default is auto, and auto means "enabled if resolver is
>> 127.0.0.1, ::1 or localhost, else disabled".
>
> No.  Local resolver could be configured to forward everything to Google.  Or
> all servers could have one central nameserver in the local network.

Why does the existence of that possibility mean "no"?

As it is, we have

  it's on by default

which leads to

  if the resolver SA is using is just for that instance of SA and
  somehow local, things are ok

  if the resolver chains to something big, it's not ok and you have to
  disable dnsbl queries

What I proposed merely moves the default for non-local resolver
addresses, which means relative to the above:

  people with non-local resolver addresses that can be used have to
  enable dnsbls

  people with non-local resolver addresses that shouldn't be used, used
  to have a duty to disable and now it will be taken care of

It doesn't change anything for anybody else.
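
The proposed "auto" behaviour could be sketched roughly as follows in Python. This is not SpamAssassin code; the function names and the resolv.conf parsing are assumptions for illustration. As objected elsewhere in this thread, a loopback address in resolv.conf says nothing about where that resolver forwards its queries, so this only captures the default-selection logic as proposed.

```python
# Addresses that count as "local" under the proposal.
LOCAL_RESOLVERS = {"127.0.0.1", "::1", "localhost"}


def parse_nameservers(resolv_conf_text):
    """Collect the addresses from 'nameserver' lines, ignoring '#' comments."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def dnsbl_enabled(mode, resolv_conf_text):
    """mode is 'disabled', 'auto' or 'enabled'; auto checks the resolver list."""
    if mode in ("enabled", "disabled"):
        return mode == "enabled"
    servers = parse_nameservers(resolv_conf_text)
    # auto: enabled only if every configured resolver is a local address.
    return bool(servers) and all(s in LOCAL_RESOLVERS for s in servers)


print(dnsbl_enabled("auto", "nameserver 127.0.0.1\n"))
print(dnsbl_enabled("auto", "nameserver 8.8.8.8\n"))
```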




Re: RCVD_IN_DNSWL_HI false positives

2021-05-13 Thread Henrik K
On Thu, May 13, 2021 at 01:34:37PM -0400, Greg Troxel wrote:
> 
> I wonder if it would be sensible for spamassassin to have a
> configuration option for all default-on dnsrbls (one option, applying to
> all):
> 
>   disabled
>   auto
>   enabled
> 
> where the default is auto, and auto means "enabled if resolver is
> 127.0.0.1, ::1 or localhost, else disabled".

No.  Local resolver could be configured to forward everything to Google.  Or
all servers could have one central nameserver in the local network.



Re: RCVD_IN_DNSWL_HI false positives

2021-05-13 Thread Benny Pedersen

On 2021-05-13 19:34, Greg Troxel wrote:

> I wonder if it would be sensible for spamassassin to have a
> configuration option for all default-on dnsrbls (one option, applying to
> all):
>
>   disabled
>   auto
>   enabled
>
> where the default is auto, and auto means "enabled if resolver is
> 127.0.0.1, ::1 or localhost, else disabled".


rfc 1700 enforcement ?

if this is global it does no good, but if it is added per rule, it could
be useful, but i fear it will delay releases of spamassassin 4.0.0


i see still some mx have 127.0.0.1, it would be nice to see more dns 
servers supported nullMX :/


would be nice spamassassin have default rules for this


Re: RCVD_IN_DNSWL_HI false positives

2021-05-13 Thread Greg Troxel

I wonder if it would be sensible for spamassassin to have a
configuration option for all default-on dnsrbls (one option, applying to
all):

  disabled
  auto
  enabled

where the default is auto, and auto means "enabled if resolver is
127.0.0.1, ::1 or localhost, else disabled".





Re: RCVD_IN_DNSWL_HI false positives

2021-05-13 Thread Matus UHLAR - fantomas

>> Maybe they could just be blocked in the firewall.

On 13.05.21 16:44, Matthias Leisi wrote:
> This would multiply the traffic due to retries.


I agree. URIBL provides special result URIBL_BLOCKED, maybe that would be a
way (with high TTL)

But I'm afraid top abusers could still need to get false positives in order
to stop.
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Quantum mechanics: The dreams stuff is made of.


Re: RCVD_IN_DNSWL_HI false positives

2021-05-13 Thread Matthias Leisi


> Maybe they could just be blocked in the firewall. 

This would multiply the traffic due to retries.


Re: RCVD_IN_DNSWL_HI false positives

2021-05-13 Thread Daniel J. Luke
On May 13, 2021, at 12:14 AM, Michael B Allen  wrote:
> It is not completely trivial setup a caching name server. I literally
> have two accounts so it's at least a serious nuisance.

It's pretty simple to install unbound and set it up on most systems.

> Sending false positives that allows SPAM though is a bad way to enforce 
> policy.

It sounds like they've tried other options but didn't get a response from 
abusive users so this is the 'last resort' option.

-- 
Daniel J. Luke



Re: RCVD_IN_DNSWL_HI false positives

2021-05-13 Thread RW
On Thu, 13 May 2021 00:11:52 +0200
Matthias Leisi wrote:


> We do follow RFCs, and have a number of methods (not returning an
> answer, returning REFUSED etc). But you’d be surprised how long some
> admins do not act… In these cases (ie consistent query volumes way
> above the limits, and prolonged times of inaction), returning a „hi“
> result is the last option. 

But, presumably, you can't tell the difference between that case and a
new user connecting to shared cache.

Maybe they could just be blocked in the firewall. 


Re: RCVD_IN_DNSWL_HI false positives

2021-05-12 Thread Michael B Allen
On Wed, May 12, 2021 at 10:26 PM Arne Jensen  wrote:
> Den 13-05-2021 kl. 02:19 skrev Michael B Allen:
> > On Wed, May 12, 2021 at 6:10 PM Matthias Leisi  wrote:
> >>> That is unfortunate. It's not entirely crystal clear to me that
> >>> deliberately returning false positives that allow potentially
> >>> destructive SPAM to get through filters is a good way to enforce usage
> >>> policy.
> >> We use the „return hi“ in cases where long times of using other methods 
> >> does not reduce the query load on the free nameservers.
> > I don't understand the technical details of all of this but what about
> > sending an error response just under the typical retry interval? If
> > you want to annoy someone, make it the one DNS server operator and not
> > the hundreds of SA endpoints using it. A lot of smaller companies like
> > me (I'm just me!) just use their hosting company DNS (linode for me)
> > and are completely oblivious as to what dnswl even is.
>
> See:
> https://www.mail-archive.com/users@spamassassin.apache.org/msg107949.html
>
> And then try to understand how DNS works:

I understand how DNS works at least as well as most.

I do not understand why the default SA configuration uses dnswl but
then when someone does not read every minutia of documentation about
every possible option, SPAM is then used as a stick to get people to
change or pay for the service but not before being browbeaten about
not knowing how this convoluted mess works.

It is not completely trivial to set up a caching name server. I literally
have two accounts so it's at least a serious nuisance.

> In the past, I saw Spamhaus being criticized, apparently for something
> that sounded like dropping queries with a firewall, which would lead to
> long timeouts, causing the originating mail server to give up before the
> responses were received, essentially leading to mails being deferred and
> (sometimes) lost.
>
> Such query dropping does (unfortunately) also means the queries often
> will be magnified, as e.g. Linode's resolver in your case, will just try
> another authoritative server for the zone.

Then like I suggested, instead of dropping entirely, maybe a delay
just under the retry interval would make all the difference.
Presumably dnswl is custom code? You could have a large array of
structs with ip and stats pre populated with pass entries for the paid
folk. When a request comes in, you hash the addr to get the right
bucket. If they're paid they pass. If not, you update the stats and if
they're over whatever limit everything from that server goes into a
500ms delay queue. You respond with success to keep the offensive DNS
server at arms length but passivated but the SA endpoint gets an
answer of "blocked".
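
To make the suggestion concrete, here is a minimal Python sketch of such a table. It is purely hypothetical and says nothing about how dnswl.org is actually implemented; the class name, the "blocked" answer, and the absence of a daily counter reset are simplifications. The 100,000 figure is the free-use limit per 24 hours quoted elsewhere in this thread.

```python
from collections import defaultdict

DAILY_LIMIT = 100_000   # free-use limit per 24h mentioned elsewhere in this thread
DELAY_SECONDS = 0.5     # the 500 ms delay suggested above


class QuotaTable:
    """Per-source-IP query counter; a dict does the 'hash the addr' step."""

    def __init__(self, paid_ips, limit=DAILY_LIMIT):
        self.paid = set(paid_ips)
        self.limit = limit
        self.counts = defaultdict(int)  # no 24h reset in this sketch

    def decide(self, source_ip):
        """Return (answer, delay_seconds) for one query from source_ip."""
        if source_ip in self.paid:
            return ("answer", 0.0)
        self.counts[source_ip] += 1
        if self.counts[source_ip] > self.limit:
            # Over the limit: answer slowly, and with "blocked" instead of data.
            return ("blocked", DELAY_SECONDS)
        return ("answer", 0.0)


table = QuotaTable(paid_ips={"192.0.2.1"}, limit=2)
for _ in range(3):
    print(table.decide("198.51.100.7"))
```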

Sending false positives that allows SPAM though is a bad way to enforce policy.

Mike


Re: RCVD_IN_DNSWL_HI false positives

2021-05-12 Thread Arne Jensen


Den 13-05-2021 kl. 02:19 skrev Michael B Allen:
> On Wed, May 12, 2021 at 6:10 PM Matthias Leisi  wrote:
>>> That is unfortunate. It's not entirely crystal clear to me that
>>> deliberately returning false positives that allow potentially
>>> destructive SPAM to get through filters is a good way to enforce usage
>>> policy.
>> We use the „return hi“ in cases where long times of using other methods does 
>> not reduce the query load on the free nameservers.
> I don't understand the technical details of all of this but what about
> sending an error response just under the typical retry interval? If
> you want to annoy someone, make it the one DNS server operator and not
> the hundreds of SA endpoints using it. A lot of smaller companies like
> me (I'm just me!) just use their hosting company DNS (linode for me)
> and are completely oblivious as to what dnswl even is.

See:
https://www.mail-archive.com/users@spamassassin.apache.org/msg107949.html

And then try to understand how DNS works:

You're sending a query, for example to Quad9's resolver, on its IP
address 9.9.9.9, and in this example, from a location in Denmark.

$ dig TXT o-o.myaddr.l.google.com @9.9.9.9
o-o.myaddr.l.google.com. 47 IN  TXT "188.122.68.219"

This reveals that Google sees the DNS request as coming from 188.122.68.219.

188.122.68.219 is owned/operated by i3d.net, and according to the public
information, that IP address seems to be a part of their
network/infrastructure in Germany.


So here, you're technically hiding behind someone else's identity (IP
address) to perform the query towards the final network, as they are
literally your middleman here.

No DNSBL/WL can see through your "middleman" and detect your
personal(/organisational) quota independently, when you're all hiding
behind the same "middleman".

The only thing being seen here, is the IP address of that particular
"middleman", and as such, all queries behind that middleman are all
being aggregated together, towards the total limit of 100'000 queries/24h.
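
A small Python sketch of that aggregation, with made-up client names, addresses and volumes: each site stays well below the limit on its own, but the list operator only sees the shared resolver's address and therefore one combined total.

```python
from collections import Counter

LIMIT_PER_24H = 100_000


def seen_by_list(queries):
    """queries: iterable of (client, resolver_ip, count); only resolver_ip is visible."""
    totals = Counter()
    for _client, resolver_ip, count in queries:
        totals[resolver_ip] += count
    return totals


def over_limit(queries, limit=LIMIT_PER_24H):
    """Resolver addresses whose aggregated volume exceeds the limit."""
    return sorted(ip for ip, total in seen_by_list(queries).items() if total > limit)


# Hypothetical day of traffic: three sites share one public resolver,
# a fourth runs its own local resolver.
day = [
    ("site-a", "203.0.113.53", 40_000),
    ("site-b", "203.0.113.53", 40_000),
    ("site-c", "203.0.113.53", 40_000),
    ("site-d", "198.51.100.9", 40_000),
]
print(over_limit(day))
```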


Things such as e.g. trying to reach out to i3d.net's abuse desk, from
the example shown above, where the originating IP belongs to their
organisation, doesn't work, at all.

See:

-> https://www.dnswl.org/?p=120
-> https://www.dnswl.org/?p=118
-> https://www.dnswl.org/?p=183


>
> Maybe you would prefer that SA disable dnswl lookups in the default
> config? Folks who are fluent in such things and have their own DNS
> server will know how to flip it on.

May I ask if you are actually reading and following the documentation
about the things you run?

-> https://cwiki.apache.org/confluence/display/SPAMASSASSIN/CachingNameserver

Using a local resolver is not just best practice; it is as
close as it can be to "mandatory" with many block/welcome lists, such as
URIBL, Spamhaus, and several others.

At the link above, you want to make sure you catch the "NOTE:"-line.


In the past, I saw Spamhaus being criticized, apparently for something
that sounded like dropping queries with a firewall, which would lead to
long timeouts, causing the originating mail server to give up before the
responses were received, essentially leading to mails being deferred and
(sometimes) lost.

Such query dropping (unfortunately) also means that the queries will
often be magnified, as e.g. Linode's resolver, in your case, will just try
another authoritative server for the zone.


That "returnhi" option is only used for a minority, and only in the very
extreme cases where other measures have been tried without success
for a long while, which is also mentioned in the link at the top of
this post.

In the end, there is no perfect solution that simply works for everyone,
and everything, at once.

It would of course be nice, ... if there was...

-- 
Med venlig hilsen / Kind regards,
Arne Jensen




Re: RCVD_IN_DNSWL_HI false positives

2021-05-12 Thread Michael B Allen
On Wed, May 12, 2021 at 6:10 PM Matthias Leisi  wrote:
>
> > That is unfortunate. It's not entirely crystal clear to me that
> > deliberately returning false positives that allow potentially
> > destructive SPAM to get through filters is a good way to enforce usage
> > policy.
>
> We use the „return hi“ in cases where long times of using other methods does 
> not reduce the query load on the free nameservers.

I don't understand the technical details of all of this but what about
sending an error response just under the typical retry interval? If
you want to annoy someone, make it the one DNS server operator and not
the hundreds of SA endpoints using it. A lot of smaller companies like
me (I'm just me!) just use their hosting company DNS (linode for me)
and are completely oblivious as to what dnswl even is.

Maybe you would prefer that SA disable dnswl lookups in the default
config? Folks who are fluent in such things and have their own DNS
server will know how to flip it on.

Mike


Re: RCVD_IN_DNSWL_HI false positives

2021-05-12 Thread Raymond Dijkxhoorn
Hello Matthias, 

As an operator of an RBL I am way too familiar with this. We also see people still 
looking up domains on zones we phased out over 10 years ago. So not surprised 
at all… 

But if people are seeing things listed they should not I am reluctant saying 
it’s wrong to do that. 

Thanks,
Raymond Dijkxhoorn

> On 13 May 2021, at 00:12, Matthias Leisi wrote:
> 
> 
>> 
>> I would suggest to follow rfc’s. So return 127.0.0.1 for example. Or don’t 
>> answer at all. Deliberate giving ‘yes to any request’ is something I can 
>> understand you would do but it’s plain wrong. 
> 
> We do follow RFCs, and have a number of methods (not returning an answer, 
> returning REFUSED etc). But you’d be surprised how long some admins do not 
> act… In these cases (ie consistent query volumes way above the limits, and 
> prolonged times of inaction), returning a „hi“ result is the last option. This 
> has been the case for maybe 10 or so years.
> 
> — Matthias
> 


Re: RCVD_IN_DNSWL_HI false positives

2021-05-12 Thread Raymond Dijkxhoorn
Hi Benny,

The operator of the specific RBL is doing this on purpose. Can’t make it any 
clearer than that.

Dnssec would not add anything here. 

Thanks,
Raymond Dijkxhoorn

> On 13 May 2021, at 00:01, Benny Pedersen wrote:
> 
> On 2021-05-12 23:30, Raymond Dijkxhoorn wrote:
> 
>> It’s the authoritative nameserver giving that answer. With likely a view
>> or acl response. So adding dnssec would not make much of a difference
>> here.
> 
> so dnssec is broken ?
> 
> auth dns servers or not, the problem is when other dns servers cache positive 
> results imho, and continue to keep them, while negative ones expire fast, but dns 
> servers should really expire on soa changes even if the ttl has not expired
> 
> i am still no expert, just trying to understand the problem
> 
> i hate to see qname minimization in bind9 turned on by default, while there is 
> no fix for this in rbldnsd
> 
> would rbldnsd update dlz in bind9 redis in some way, i know it could dump dns 
> data as a bind9 zone, but it would be nice to see it update the dlz zone database, 
> to at least make the qname problem go away


Re: RCVD_IN_DNSWL_HI false positives

2021-05-12 Thread Matthias Leisi
> I would suggest to follow rfc’s. So return 127.0.0.1 for example. Or don’t 
> answer at all. Deliberate giving ‘yes to any request’ is something I can 
> understand you would do but it’s plain wrong. 

We do follow RFCs, and have a number of methods (not returning an answer, 
returning REFUSED etc). But you’d be surprised how long some admins do not act… 
In these cases (ie consistent query volumes way above the limits, and prolonged 
times of inaction), returning a „hi“ result is the last option. This has been 
the case for maybe 10 or so years.

— Matthias



Re: RCVD_IN_DNSWL_HI false positives

2021-05-12 Thread Matthias Leisi
> That is unfortunate. It's not entirely crystal clear to me that
> deliberately returning false positives that allow potentially
> destructive SPAM to get through filters is a good way to enforce usage
> policy.

We use the „return hi“ in cases where long times of using other methods does 
not reduce the query load on the free nameservers.

— Matthias



Re: RCVD_IN_DNSWL_HI false positives

2021-05-12 Thread Benny Pedersen

On 2021-05-12 23:30, Raymond Dijkxhoorn wrote:


It’s the authoritative nameserver giving that answer. With likely a view
or acl response. So adding dnssec would not make much of a difference
here.


so dnssec is broken ?

auth dns servers or not, the problem is when other dns servers cache 
positive results imho, and continue to keep them, while negative ones 
expire fast, but dns servers should really expire on soa changes even if 
the ttl has not expired


i am still no expert, just trying to understand the problem

i hate to see qname minimization in bind9 turned on by default, while 
there is no fix for this in rbldnsd


would rbldnsd update dlz in bind9 redis in some way, i know it could dump 
dns data as a bind9 zone, but it would be nice to see it update the dlz 
zone database, to at least make the qname problem go away


Re: RCVD_IN_DNSWL_HI false positives

2021-05-12 Thread Raymond Dijkxhoorn
Hi Benny,

It’s the authoritative nameserver giving that answer. With likely a view or acl 
response. So adding dnssec would not make much of a difference here. 

Thanks,
Raymond Dijkxhoorn

> On 12 May 2021, at 23:24, Benny Pedersen wrote:
> 
> On 2021-05-12 23:01, Matthias Leisi wrote:
>>> On 12 May 2021, at 21:02, Michael B Allen wrote:
>>> X-Spam-Report:
>>> * -5.0 RCVD_IN_DNSWL_HI RBL: Sender listed at https://www.dnswl.org/, high
>>> *  trust
>>> *  [173.82.162.98 listed in list.dnswl.org]
>> 173.82.162.98 is not in the dnswl.org database.
>> It’s likely you’re using one of the nameservers who are not only
>> blocked from using dnswl.org free nameserver infrastructure, but where
>> we needed to use additional methods to make them stop (ab)using our
>> nameservers (namely, returning a „_HI“ result in the hope that whoever
>> is responsible will finally notice).
> 
> would it not make sense to enable dnssec on dnswl.org nameservers ?
> 
> sorry for asking, i dont know much about dns servers


Re: RCVD_IN_DNSWL_HI false positives

2021-05-12 Thread Raymond Dijkxhoorn
Hi!

I would suggest following the RFCs. So return 127.0.0.1, for example. Or don’t 
answer at all. Deliberately giving ‘yes to any request’ is something I can 
understand you would do, but it’s plain wrong. 

Thanks,
Raymond Dijkxhoorn

> On 12 May 2021, at 23:17, Michael B Allen wrote:
> 
> On Wed, May 12, 2021 at 5:01 PM Matthias Leisi  wrote:
>>>> On 12 May 2021, at 21:02, Michael B Allen wrote:
>>> 
>>> X-Spam-Report:
>>> * -5.0 RCVD_IN_DNSWL_HI RBL: Sender listed at https://www.dnswl.org/, high
>>> *  trust
>>> *  [173.82.162.98 listed in list.dnswl.org]
>> 
>> 173.82.162.98 is not in the dnswl.org database.
>> 
>> It’s likely you’re using one of the nameservers who are not only blocked 
>> from using dnswl.org free nameserver infrastructure, but where we needed to 
>> use additional methods to make them stop (ab)using our nameservers (namely, 
>> returning a „_HI“ result in the hope that whoever is responsible will 
>> finally notice).
> 
> Hi Matthias,
> 
> That is unfortunate. It's not entirely crystal clear to me that
> deliberately returning false positives that allow potentially
> destructive SPAM to get through filters is a good way to enforce usage
> policy.
> 
> Mike


Re: RCVD_IN_DNSWL_HI false positives

2021-05-12 Thread Benny Pedersen

On 2021-05-12 23:01, Matthias Leisi wrote:

On 12 May 2021, at 21:02, Michael B Allen wrote:

X-Spam-Report:
* -5.0 RCVD_IN_DNSWL_HI RBL: Sender listed at https://www.dnswl.org/, high
*  trust
*  [173.82.162.98 listed in list.dnswl.org]


173.82.162.98 is not in the dnswl.org database.

It’s likely you’re using one of the nameservers who are not only
blocked from using dnswl.org free nameserver infrastructure, but where
we needed to use additional methods to make them stop (ab)using our
nameservers (namely, returning a „_HI“ result in the hope that whoever
is responsible will finally notice).


would it not make sense to enable dnssec on dnswl.org nameservers ?

sorry for asking, i dont know much about dns servers


Re: RCVD_IN_DNSWL_HI false positives

2021-05-12 Thread Michael B Allen
On Wed, May 12, 2021 at 5:01 PM Matthias Leisi  wrote:
> > On 12 May 2021, at 21:02, Michael B Allen wrote:
>
> > X-Spam-Report:
> > * -5.0 RCVD_IN_DNSWL_HI RBL: Sender listed at https://www.dnswl.org/, high
> > *  trust
> > *  [173.82.162.98 listed in list.dnswl.org]
>
> 173.82.162.98 is not in the dnswl.org database.
>
> It’s likely you’re using one of the nameservers who are not only blocked from 
> using dnswl.org free nameserver infrastructure, but where we needed to use 
> additional methods to make them stop (ab)using our nameservers (namely, 
> returning a „_HI“ result in the hope that whoever is responsible will finally 
> notice).

Hi Matthias,

That is unfortunate. It's not entirely crystal clear to me that
deliberately returning false positives that allow potentially
destructive SPAM to get through filters is a good way to enforce usage
policy.

Mike


Re: RCVD_IN_DNSWL_HI false positives

2021-05-12 Thread Matthias Leisi


> On 12 May 2021, at 21:02, Michael B Allen wrote:

> X-Spam-Report:
> * -5.0 RCVD_IN_DNSWL_HI RBL: Sender listed at https://www.dnswl.org/, high
> *  trust
> *  [173.82.162.98 listed in list.dnswl.org]

173.82.162.98 is not in the dnswl.org database. 

It’s likely you’re using one of the nameservers who are not only blocked from 
using dnswl.org free nameserver infrastructure, but where we needed to use 
additional methods to make them stop (ab)using our nameservers (namely, 
returning a „_HI“ result in the hope that whoever is responsible will finally 
notice).

— Matthias

-- 
Matthias Leisi
Katzenrütistrasse 68, 8153 Rümlang
Mobile +41 79 377 04 43
matth...@leisi.net
Skype matthias.leisi



RCVD_IN_DNSWL_HI false positives

2021-05-12 Thread Michael B Allen
Hi all,

Because of RCVD_IN_DNSWL_HI a bunch of SEO type stuff is getting
through. Strangely, the domains are not listed at www.dnswl.org. For
example, for "fixtheweberrors.online" (below) the lookup says:

  IP address 173.82.162.98 is not whitelisted at dnswl.org.

Most of it is .online stuff. What am I missing? How can
RCVD_IN_DNSWL_HI be added if it's not in dnswl.org? Or maybe it was
and has since been removed? I find that hard to believe since
dnswl.org looks like it only has records for bigger sites.

Thanks,
Mike

Received: from [96.47.229.26] (unknown [96.47.229.26])
by shenzi.fixtheweberrors.online (Postfix) with ESMTPA id 3651DA77B;
Tue, 11 May 2021 22:17:27 -0400 (EDT)
Received: from shenzi.fixtheweberrors.online
(shenzi.fixtheweberrors.online [173.82.162.98])
by mail.ioplex.com (Postfix) with ESMTPS id 5090B11B9
for ; Wed, 12 May 2021 01:25:07 -0400 (EDT)

X-Spam-Report:
* -5.0 RCVD_IN_DNSWL_HI RBL: Sender listed at https://www.dnswl.org/, high
*  trust
*  [173.82.162.98 listed in list.dnswl.org]
*  3.0 BAYES_95 BODY: Bayes spam probability is 95 to 99%
*  [score: 0.9888]
* -0.0 SPF_PASS SPF: sender matches SPF record
*  0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record
*  0.2 FREEMAIL_REPLYTO_END_DIGIT Reply-To freemail username ends in digit
*  (catherinebrooke321[at]gmail.com)
*  0.0 RCVD_IN_MSPIKE_L5 RBL: Very bad reputation (-5)
*  [173.82.162.98 listed in bl.mailspike.net]
*  0.5 MISSING_MID Missing Message-Id: header
*  0.0 RCVD_IN_MSPIKE_BL Mailspike blacklisted
*  2.1 FREEMAIL_FORGED_REPLYTO Freemail in Reply-To, but not From


Re: Script or command for testing new rules to ensure new rules don't generate false positives/negatives?

2021-04-24 Thread John Hardin

On Sat, 24 Apr 2021, Steve Dondley wrote:





And if you want to test your rules against a corpus rather than
testing against a few one-off spamples, then look into setting up a
local masscheck instance. You don't need to upload the results to SA,
but it will give you a good overview of how a rule behaves against
multiple messages.


I'm not sure what you mean by "Local masscheck instance".


https://cwiki.apache.org/confluence/display/SPAMASSASSIN/MassCheck

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Making good people helpless does not make bad people harmless.
---
 329 days since the first private commercial manned orbital mission (SpaceX)


Re: Script or command for testing new rules to ensure new rules don't generate false positives/negatives?

2021-04-24 Thread Steve Dondley





And if you want to test your rules against a corpus rather than
testing against a few one-off spamples, then look into setting up a
local masscheck instance. You don't need to upload the results to SA,
but it will give you a good overview of how a rule behaves against
multiple messages.


I'm not sure what you mean by "Local masscheck instance". But I plan to 
do the following:


1) set up SA in a docker container which has a volume containing my 
spam/ham folders
2) run a script that syncs ham/spam with the live server
3) set up a script that compares scores before a rule is implemented 
with scores after it is implemented
4) have the script output a report that tells me the results and 
whether a spam/ham email is "flipped"
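
The compare-and-report part of the plan above could look roughly like
this in Python. It is only a sketch: it assumes each test run leaves an
X-Spam-Status header of the usual "score=... required=..." form, the
function names are made up, and the two sample headers (including the
rule name MY_NEW_RULE) are invented for illustration.

```python
import re

STATUS_RE = re.compile(r"score=(-?\d+(?:\.\d+)?)\s+required=(-?\d+(?:\.\d+)?)")


def parse_status(header):
    """Return (score, required) from an X-Spam-Status header value."""
    m = STATUS_RE.search(header)
    if m is None:
        raise ValueError("no score/required found in header")
    return float(m.group(1)), float(m.group(2))


def is_flipped(before, after):
    """True if the spam/ham verdict differs between the two headers."""
    score1, required1 = parse_status(before)
    score2, required2 = parse_status(after)
    return (score1 >= required1) != (score2 >= required2)


# Invented sample headers: same message, without and with a new rule
before = "No, score=4.1 required=5.0 tests=BAYES_50,HTML_MESSAGE"
after = "Yes, score=5.6 required=5.0 tests=BAYES_50,HTML_MESSAGE,MY_NEW_RULE"
print(is_flipped(before, after))  # True
```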


Re: Script or command for testing new rules to ensure new rules don't generate false positives/negatives?

2021-04-24 Thread John Hardin

On Sat, 24 Apr 2021, Steve Dondley wrote:


On 2021-04-23 05:41 PM, Martin Gregorie wrote:

On Fri, 2021-04-23 at 16:28 -0400, Steve Dondley wrote:

I'm experimenting with writing a library of my own SA rules and
scores.


Treat this like any other code development project: use a rule
development SA installation like I describe so you never develop rules
using the live mail stream. This way your rules will be better written
and tested and you'll cause fewer false positives in your live mail
stream.


Sounds like the best plan. Thanks for the advice.



And if you want to test your rules against a corpus rather than testing 
against a few one-off spamples, then look into setting up a local 
masscheck instance. You don't need to upload the results to SA, but it 
will give you a good overview of how a rule behaves against multiple 
messages.



--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Human beings are born with different capacities.
  If they are free, they are not equal. And if they are equal,
  they are not free.-- Aleksandr Solzhenitsyn
---
 329 days since the first private commercial manned orbital mission (SpaceX)


Re: Script or command for testing new rules to ensure new rules don't generate false positives/negatives?

2021-04-24 Thread Steve Dondley

On 2021-04-23 05:41 PM, Martin Gregorie wrote:

On Fri, 2021-04-23 at 16:28 -0400, Steve Dondley wrote:

I'm experimenting with writing a library of my own SA rules and
scores.


I do this on a separate computer, which has Spamassassin installed but
not linked into anything else. It also has a copy of all the live SA
configuration files. Alongside this I have a directory filled with
examples of spam to function as testing input.

Along with this I have a bash script or two which is used to do things like:

1) start SA in debug mode to check the testing config for errors. 
   No messages are processed - it's just looking for configuration
   errors.

2) run SA against a spam sample and only display the list of spam hits

3) run SA against a spam sample and display the entire output message
   using less so it can be scrolled through

4) run SA against the complete spam collection and only display
   references to messages which are not scored as spam

5) replace the live SA configuration with the current testing
   configuration, i.e. make the most recent set of changes live.

In practice (1) through (3) are easy to combine into a single script
with an option to select the required action while (4) and (5) are best
kept separate.

It helps a lot to name the items in the spam collection to relate
each set of similar spam to the local rule that's intended to trap this
spam type.


I'd like to be sure that the rules I write don't turn ham into spam
and vice versa.


It won't if you test the rules against related spam and give some
thought to the score you apply to each rule.


I imagine a utility like this must exist, so I figured I'd ask here
before re-inventing the wheel and writing my own (probably buggy)
script.


The sort of scripts I use are fairly short and simple.


The script would need to check against all email files in .INBOX.* and
.Spam directory in a user's IMAP directory.


No. Treat this like any other code development project: use a rule
development SA installation like I describe so you never develop rules
using the live mail stream. This way your rules will be better written
and tested and you'll cause fewer false positives in your live mail
stream.

Martin


Sounds like the best plan. Thanks for the advice.


Re: Script or command for testing new rules to ensure new rules don't generate false positives/negatives?

2021-04-23 Thread Martin Gregorie
On Fri, 2021-04-23 at 16:28 -0400, Steve Dondley wrote:
> I'm experimenting with writing a library of my own SA rules and
> scores.
>
I do this on a separate computer, which has Spamassassin installed but
not linked into anything else. It also has a copy of all the live SA
configuration files. Alongside this I have a directory filled with
examples of spam to function as testing input.

Along with this I have a bash script or two which is used to do things like:

1) start SA in debug mode to check the testing config for errors. 
   No messages are processed - it's just looking for configuration
   errors.

2) run SA against a spam sample and only display the list of spam hits

3) run SA against a spam sample and display the entire output message
   using less so it can be scrolled through

4) run SA against the complete spam collection and only display
   references to messages which are not scored as spam

5) replace the live SA configuration with the current testing
   configuration, i.e. make the most recent set of changes live.

In practice (1) through (3) are easy to combine into a single script
with an option to select the required action while (4) and (5) are best
kept separate.

It helps a lot to name the items in the spam collection to relate
each set of similar spam to the local rule that's intended to trap this
spam type.
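
As a rough idea of how actions (1) to (4) map onto spamassassin
invocations, here is a sketch in Python rather than bash. It is not the
script described above: the function names and action labels are made
up, and it assumes the standard spamassassin options --lint, -D
(debug), -t (test mode) and -e (exit code reflects the spam verdict).

```python
import subprocess


def build_command(action):
    """Map a test action to a spamassassin argument list."""
    commands = {
        # (1) check the configuration with debug output; no mail is processed
        "lint": ["spamassassin", "-D", "--lint"],
        # (2)/(3) test mode: message plus a report of the rules that hit
        "report": ["spamassassin", "-t"],
        # (4) exit code is non-zero when the message is scored as spam
        "is-spam": ["spamassassin", "-e"],
    }
    return commands[action]


def run_on_message(action, path):
    """Run the chosen action with the message file on standard input."""
    with open(path, "rb") as msg:
        return subprocess.run(build_command(action), stdin=msg,
                              capture_output=True)
```

For (4), looping `run_on_message("is-spam", path)` over the collection
and printing the paths whose return code is 0 would list the samples
that were not scored as spam.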
 
> I'd like to be sure that the rules I write don't turn ham into spam
> and vice versa.
>
It won't if you test the rules against related spam and give some
thought to the score you apply to each rule.
 
> I imagine a utility like this must exist, so I figured I'd ask here
> before re-inventing the wheel and writing my own (probably buggy)
> script.
>
The sort of scripts I use are fairly short and simple. 
> 
> The script would need to check against all email files in .INBOX.* and
> .Spam directory in a user's IMAP directory.
>
No. Treat this like any other code development project: use a rule
development SA installation like I describe so you never develop rules
using the live mail stream. This way your rules will be better written
and tested and you'll cause fewer false positives in your live mail
stream.

Martin
 





Script or command for testing new rules to ensure new rules don't generate false positives/negatives?

2021-04-23 Thread Steve Dondley
I'm experimenting with writing a library of my own SA rules and scores. 
I'd like to be sure that the rules I write don't turn ham into spam and 
vice versa. I figured the best way to do this would be to run SA against 
an existing collection of ham and spam to make sure emails are still 
scored accurately with the new rules.


I imagine a utility like this must exist, so I figured I'd ask here before 
re-inventing the wheel and writing my own (probably buggy) script.


The script would need to check against all email files in .INBOX.* and 
.Spam directory in a user's IMAP directory.


Thanks again, everyone.


KAM_LABEL2 false positives

2020-09-01 Thread Anthony Cartmell
Just a quick note:

the KAM_LABEL2 rule hits false positives, thanks to it looking for "PPE"
in subject and text case-insensitively and without boundary specifications.

This means that it hits "happening", so mail asking "what's happening
this week" in Subject and Body triggers the rule.
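
A quick way to see the effect, sketched in Python. The real KAM_LABEL2
pattern is not reproduced here; the two expressions below are
simplified stand-ins that differ only in the word-boundary anchors.

```python
import re

# Stand-in patterns, not the actual rule text
unanchored = re.compile(r"PPE", re.IGNORECASE)
anchored = re.compile(r"\bPPE\b", re.IGNORECASE)

subject = "What's happening this week"
print(bool(unanchored.search(subject)))       # True: "haPPEning" contains "ppe"
print(bool(anchored.search(subject)))         # False
print(bool(anchored.search("PPE in stock")))  # True
```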

Anthony
-- 
www.fonant.com - Quality web sites
Tel. 01903 867 810
Fonant Ltd is registered in England and Wales, company No. 7006596
Registered office: Amelia House, Crescent Road, Worthing, West Sussex,
BN11 1QR


Re: DNS Blacklist wildcard query: distinguish IP v4/v6 to avoid false positives

2020-08-08 Thread RW
On Sat, 8 Aug 2020 16:21:24 +0100
RW wrote:

> On Fri, 7 Aug 2020 11:56:45 +0200
> Benoit Panizzon wrote:
> 
> 
> 
> > Well, but now I need to tell SpamAssassin to only query IPv4
> > addresses on the first zone and only query IPv6 addresses on the
> > ip6 one.
> > 
> > I was not able to find a way to achieve this. Did I overlook
> > something?
> >   
> 
> It can almost be done with AskDNS, which has distinct A and AAAA
> lookups. It looks like all that's needed is a reversed version of
> _LASTEXTERNALIP_. 

Sorry, that's nonsense.


Re: DNS Blacklist wildcard query: distinguish IP v4/v6 to avoid false positives

2020-08-08 Thread RW
On Fri, 7 Aug 2020 11:56:45 +0200
Benoit Panizzon wrote:



> Well, but now I need to tell SpamAssassin to only query IPv4 addresses
> on the first zone and only query IPv6 addresses on the ip6 one.
> 
> I was not able to find a way to achieve this. Did I overlook
> something?
> 

It can almost be done with AskDNS, which has distinct A and AAAA
lookups. It looks like all that's needed is a reversed version of
_LASTEXTERNALIP_. 






Re: DNS Blacklist wildcard query: distinguish IP v4/v6 to avoid false positives

2020-08-07 Thread Benoit Panizzon
Hi Bill

> Easy fix: do not use wildcards in IPv4 listings.

I agree, for the purpose of a 'listed yes/no' blacklist this is the
way to go.

> Both rbldnsd and BIND have other mechanisms for compactly generating 
> records that cover an IPv4 /24 network without also generating records 
> for all of an IPv6 /24 network. I would expect and hope that any other 
> authoritative nameserver would have similar mechanisms.
 
How about reputation databases which might cover the whole IPv4 range
and use more or less specific ranges with different reputation weights?

You would need quite a big DNS server to cover all 4G of IPv4 space.

And what about operators of blacklists which do use wildcards, because
they are not aware that SpamAssassin will also look up IPv6 addresses
against them and potentially cause false hits?

So having a way to tell SpamAssassin to restrict lookups on certain
blacklists to IP addresses from only one protocol version could
still be beneficial.

Kind regards

-Benoît Panizzon-
-- 
I m p r o W a r e   A G    -    Leiter Commerce Kunden
__

Zurlindenstrasse 29             Tel  +41 61 826 93 00
CH-4133 Pratteln                Fax  +41 61 826 93 01
Schweiz                         Web  http://www.imp.ch
__


Re: DNS Blacklist wildcard query: distinguish IP v4/v6 to avoid false positives

2020-08-07 Thread Bill Cole

On 7 Aug 2020, at 5:56, Benoit Panizzon wrote:


Hi Gang

I am part of the SWINOG Anti-Spam Blacklists team; the blacklists are
used by a handful of Swiss ISPs.

Very early, we also started adding IPv6 addresses to the blacklist but
soon noticed that there is a potential problem with IPv6 and wildcard
entries.


Easy fix: do not use wildcards in IPv4 listings.

Both rbldnsd and BIND have other mechanisms for compactly generating 
records that cover an IPv4 /24 network without also generating records 
for all of an IPv6 /24 network. I would expect and hope that any other 
authoritative nameserver would have similar mechanisms.


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not For Hire (currently)


Re: DNS Blacklist wildcard query: distinguish IP v4/v6 to avoid false positives

2020-08-07 Thread Raymond Dijkxhoorn

Hi!


> I don't believe that use-case has been considered before.
>
> What does the rule you are using look like and I will double check?

Not even sure why you want to add that with the asterisks there.

>> Let's assume 2.0.0.0/24 is full of abusers and you decide to throw their
>> whole /24 in the Blacklist:
>>
>> *.0.0.2.dnsbl.example.org 300 in TXT "Bunch of abusers, /24 listed"

Isn't the issue the way you load up your rbldnsd zone?

:127.0.0.2:https://www.wellknownblocklist.org/query/ip/$
1.0.20.0/24 
1.0.128.0/17 
!1.0.180.136

You should not use asterisks but the netmask?
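
The difference is that netmask entries are matched on the address
itself, not on DNS labels. A small Python sketch of that kind of
matching, using the three example entries above (this imitates the
idea only, it is not rbldnsd code, and `is_listed` is a made-up name):

```python
import ipaddress

# The example entries from the zone above
LISTED = [ipaddress.ip_network("1.0.20.0/24"),
          ipaddress.ip_network("1.0.128.0/17")]
EXCLUDED = {ipaddress.ip_address("1.0.180.136")}


def is_listed(ip):
    """True if ip falls in a listed IPv4 range and is not excluded."""
    addr = ipaddress.ip_address(ip)
    if addr in EXCLUDED:
        return False
    return any(addr.version == net.version and addr in net for net in LISTED)


print(is_listed("1.0.20.7"))     # True
print(is_listed("1.0.180.136"))  # False (explicitly excluded)
print(is_listed("2000::1"))      # False (IPv6 never matches an IPv4 range)
```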

Bye, Raymond


Re: DNS Blacklist wildcard query: distinguish IP v4/v6 to avoid false positives

2020-08-07 Thread Benny Pedersen

Benoit Panizzon wrote on 2020-08-07 11:56:


Well, but now I need to tell SpamAssassin to only query IPv4 addresses
on the first zone and only query IPv6 addresses on the ip6 one.


a single zone with different result codes for ipv4 and ipv6 ranges; the text 
records would need to overlap between ipv4 and ipv6, but they can be separated 
by result code



I was not able to find a way to achieve this. Did I overlook something?


if it's possible, it's good to check the default rules :=)

check tflags, i have forgotten whether this can separate ipv4 and ipv6 here, and 
google is not my friend


Re: DNS Blacklist wildcard query: distinguish IP v4/v6 to avoid false positives

2020-08-07 Thread Kevin A. McGrail
I don't believe that use-case has been considered before.

What does the rule you are using look like and I will double check?

On Fri, Aug 7, 2020, 05:56 Benoit Panizzon  wrote:

> Hi Gang
>
> I am part of the SWINOG Anti-Spam Blacklists team; the blacklists are
> used by a handful of Swiss ISPs.
>
> Very early, we also started adding IPv6 addresses to the blacklist but
> soon noticed that there is a potential problem with IPv6 and wildcard
> entries.
>
> Let's assume 2.0.0.0/24 is full of abusers and you decide to throw their
> whole /24 in the Blacklist:
>
> *.0.0.2.dnsbl.example.org 300 in TXT "Bunch of abusers, /24 listed"
>
> This would wrongfully block an awful lot of IPv6 addresses!
>
> To avoid this issue, we use two different dns zones:
>
> *.0.0.2.dnsbl.example.org 300 in TXT "Bunch of abusers, /24 listed"
>
> *.0.0.0.0.0.0.0.0.8.b.d.0.1.0.0.2.ip6.dnsbl.example.org in TXT
> "Spamer /64 listed"
>
> Well, but now I need to tell SpamAssassin to only query IPv4 addresses
> on the first zone and only query IPv6 addresses on the ip6 one.
>
> I was not able to find a way to achieve this. Did I overlook something?
>
> Kind regards
>
> -Benoît Panizzon-
> --
> I m p r o W a r e   A G    -    Leiter Commerce Kunden
> __
>
> Zurlindenstrasse 29             Tel  +41 61 826 93 00
> CH-4133 Pratteln                Fax  +41 61 826 93 01
> Schweiz                         Web  http://www.imp.ch
> __
>


DNS Blacklist wildcard query: distinguish IP v4/v6 to avoid false positives

2020-08-07 Thread Benoit Panizzon
Hi Gang

I am part of the SWINOG Anti-Spam Blacklists team; the blacklists are
used by a handful of Swiss ISPs.

Very early, we also started adding IPv6 addresses to the blacklist but
soon noticed that there is a potential problem with IPv6 and wildcard
entries.

Let's assume 2.0.0.0/24 is full of abusers and you decide to throw their
whole /24 in the Blacklist:

*.0.0.2.dnsbl.example.org 300 in TXT "Bunch of abusers, /24 listed"

This would wrongfully block an awful lot of IPv6 addresses!
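
To make the collision concrete, here is a small Python sketch. It
assumes the usual query-name layout (reversed octets for IPv4, reversed
nibbles for IPv6) and treats the wildcard as a plain suffix match,
which is a simplification of real DNS wildcard handling. The helper
name and the sample addresses are chosen for illustration only.

```python
import ipaddress

ZONE = "dnsbl.example.org"
WILDCARD_SUFFIX = ".0.0.2." + ZONE  # from *.0.0.2.dnsbl.example.org


def query_name(ip):
    """Reversed octets (IPv4) or reversed nibbles (IPv6), then the zone."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        labels = str(addr).split(".")
    else:
        labels = list(addr.exploded.replace(":", ""))
    return ".".join(reversed(labels)) + "." + ZONE


for ip in ("2.0.0.25", "2001:db8::1", "2a00:1450::1"):
    print(ip, query_name(ip).endswith(WILDCARD_SUFFIX))
```

With this layout every IPv6 address whose first three nibbles are 2, 0,
0 produces a name under 0.0.2.dnsbl.example.org, so 2001:db8::1 is
caught by the IPv4 wildcard while 2a00:1450::1 is not.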

To avoid this issue, we use two different dns zones:

*.0.0.2.dnsbl.example.org 300 in TXT "Bunch of abusers, /24 listed"

*.0.0.0.0.0.0.0.0.8.b.d.0.1.0.0.2.ip6.dnsbl.example.org in TXT
"Spamer /64 listed"

Well, but now I need to tell SpamAssassin to only query IPv4 addresses
on the first zone and only query IPv6 addresses on the ip6 one.

I was not able to find a way to achieve this. Did I overlook something?

Kind regards

-Benoît Panizzon-
-- 
I m p r o W a r e   A G    -    Leiter Commerce Kunden
__

Zurlindenstrasse 29             Tel  +41 61 826 93 00
CH-4133 Pratteln                Fax  +41 61 826 93 01
Schweiz                         Web  http://www.imp.ch
__


Re: False positives due to __BITCOIN_ID

2019-12-04 Thread Giovanni Bechis
On Wed, Dec 04, 2019 at 08:59:42AM +0100, Benny Pedersen wrote:
> On 2019-12-03 20:15, RW wrote:
> > On Tue, 3 Dec 2019 14:05:10 -0500
> > Mark London wrote:
> > 
> >> It seems to me that the rule for detecting a BITCOIN in an email, is
> >> incorrect.   See below:
> >> 
> >> body __BITCOIN_ID /\b(?
> >> 
> >> Why is there a \s in this rule? I didn't think that a BITCOIN id
> >> has a space.
> > 
> > It doesn't, but spammers have started splitting them up to evade
> > detections.
> 
> if clients begin to pay to split btc addresses it works :=)
> 
> i noticed every btc spam has a unique btc address, so maybe it's not meant for 
> payment but only hidden tracking
Unfortunately it is meant for payment; here is a spample:
https://pastebin.com/uBzPeXcX

 Giovanni




Re: False positives due to __BITCOIN_ID

2019-12-04 Thread Benny Pedersen

On 2019-12-03 20:15, RW wrote:

On Tue, 3 Dec 2019 14:05:10 -0500
Mark London wrote:


It seems to me that the rule for detecting a BITCOIN in an email, is
incorrect.   See below:

body __BITCOIN_ID /\b(?

It doesn't, but spammers have started splitting them up to evade
detections.


if clients begin to pay to split btc addresses it works :=)

i noticed every btc spam has a unique btc address, so maybe it's not meant for 
payment but only hidden tracking


Re: False positives due to __BITCOIN_ID

2019-12-03 Thread RW
On Tue, 3 Dec 2019 11:27:11 -0800 (PST)
John Hardin wrote:

> On Tue, 3 Dec 2019, Mark London wrote:
> 
> > It seems to me that the rule for detecting a BITCOIN in an email,
> > is incorrect.   See below:
> >
> > body __BITCOIN_ID /\b(?
> >
> > Why is there a \s in this rule? I didn't think that a BITCOIN id
> > has a space.  
> 
> Recent obfuscation seen in RL extortion spams.
> 
> > This rule is triggered, on a simple line like this, because of the
> > fact that the line has a "1" in it:
> >
> >For sure figure 1 is convincing that nqR is a good organising  
> 
> Ugh.
> 
> > Maybe this rule needs tweaking?   Thanks.  
> 
> I'm not sure we'd be able to detect obfuscation and not have FPs.
> 
> I'm open to suggestions. Reverting for now.
> 

You could try this:

/\b(?

Re: False positives due to __BITCOIN_ID

2019-12-03 Thread John Hardin

On Tue, 3 Dec 2019, Mark London wrote:

It seems to me that the rule for detecting a BITCOIN in an email, is 
incorrect.   See below:


body __BITCOIN_ID /\b(?

Why is there a \s in this rule?  I didn't think that a BITCOIN id has a 
space.


Recent obfuscation seen in RL extortion spams.

This rule is triggered, on a simple line like this, because of the fact that 
the line has a "1" in it:


   For sure figure 1 is convincing that nqR is a good organising


Ugh.


Maybe this rule needs tweaking?   Thanks.


I'm not sure we'd be able to detect obfuscation and not have FPs.

I'm open to suggestions. Reverting for now.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Collectivism: forever just one more execution away from Paradise.
---
 4 days until The 78th anniversary of Pearl Harbor


Re: False positives due to __BITCOIN_ID

2019-12-03 Thread RW
On Tue, 3 Dec 2019 14:05:10 -0500
Mark London wrote:

> It seems to me that the rule for detecting a BITCOIN in an email, is 
> incorrect.   See below:
> 
> body __BITCOIN_ID /\b(?
> 
> Why is there a \s in this rule?  I didn't think that a BITCOIN id
> has a space.

It doesn't, but spammers have started splitting them up to evade
detections.


False positives due to __BITCOIN_ID

2019-12-03 Thread Mark London
It seems to me that the rule for detecting a BITCOIN in an email, is 
incorrect.   See below:


body __BITCOIN_ID /\b(?

Why is there a \s in this rule?  I didn't think that a BITCOIN id has 
a space.


This rule is triggered, on a simple line like this, because of the fact 
that the line has a "1" in it:


For sure figure 1 is convincing that nqR is a good organising

Maybe this rule needs tweaking?   Thanks.

- Mark
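
To make the reported behaviour concrete, here is a minimal Python sketch. The two patterns are hypothetical stand-ins written for illustration; they are NOT the real __BITCOIN_ID rule, which is truncated in this archive. The sketch only shows why allowing optional whitespace between address characters can let ordinary prose that follows a lone "1" look like an address.

```python
import re

# Hypothetical patterns for illustration only; NOT the real __BITCOIN_ID rule.
BASE58 = "a-km-zA-HJ-NP-Z1-9"  # Base58 alphabet: no 0, O, I or l

# Strict: leading 1 or 3, then 25-34 Base58 characters with no gaps.
STRICT = re.compile(r"\b[13][%s]{25,34}\b" % BASE58)

# Loose: same shape, but each character may be preceded by one whitespace.
LOOSE = re.compile(r"\b[13](?:\s?[%s]){25,34}\b" % BASE58)

prose = "For sure figure 1 is convincing that nqR is a good organising"
address_like = "1" + "A1b2C3d4E5" * 3  # made-up, address-shaped string

strict_hits_prose = STRICT.search(prose) is not None
loose_hits_prose = LOOSE.search(prose) is not None
strict_hits_address = STRICT.search(address_like) is not None

print(strict_hits_prose, loose_hits_prose, strict_hits_address)
```

Under these assumed patterns the strict form ignores the sentence while the loose form matches it, which is the kind of trade-off between catching obfuscation and avoiding FPs that this thread is about.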




Re: Mysterious false positives in inbox

2018-05-09 Thread Eggert Ehmke
Perhaps this is a misunderstanding.  By "same" I mean "this server". The mail 
was originally received by my server via TLS, processed  by mailman and then 
delivered with the ***SPAM*** subject line to the recipients of the mailing 
list, but not to the Quarantine. One of the recipients is my own mailbox. So 
in any case, the poorly configured server is my own.

On Wednesday, 9 May 2018, 10:35:34 CEST, Ian Zimmerman wrote:
> On 2018-05-09 13:08, Eggert Ehmke wrote:
> > > Wild stab - maybe they're entering the system already with
> > > ***SPAM*** in the subject?
> > 
> > The mail also originated from the same server.
> 
> All the more reason to suspect the "wild stab" is correct.
> 
> In my experience this is quite common on some poorly configured mailing
> list servers.




Re: Mysterious false positives in inbox

2018-05-09 Thread Ian Zimmerman
On 2018-05-09 13:08, Eggert Ehmke wrote:

> > Wild stab - maybe they're entering the system already with
> > ***SPAM*** in the subject?

> The mail also originated from the same server.

All the more reason to suspect the "wild stab" is correct.

In my experience this is quite common on some poorly configured mailing
list servers.
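
One way to check the "wild stab" on a saved copy of such a message is to compare the Subject tag with the X-Spam-Flag header. The following is only a sketch using Python's standard email module; the sample message, its Subject text and the function name are made up for illustration.

```python
from email import message_from_string

# Hypothetical sample message; only the X-Spam-* values mirror the thread.
RAW = """\
Subject: ***SPAM*** [list] Meeting notes
X-Spam-Flag: NO
X-Spam-Status: No, score=-1 tagged_above=-999 required=3
 tests=[ALL_TRUSTED=-1, SHORTCIRCUIT=-0.0001] autolearn=disabled

body
"""

def tagged_but_not_flagged(raw, tag="***SPAM***"):
    """Return True if the Subject carries the tag although this scan said NO."""
    msg = message_from_string(raw)
    subject = msg.get("Subject", "")
    flag = msg.get("X-Spam-Flag", "").strip().upper()
    return tag in subject and flag != "YES"

print(tagged_but_not_flagged(RAW))
```

A True result suggests the tag was added on an earlier pass (for example before the list software re-sent the mail) rather than by the scan that wrote these headers.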

-- 
Please don't Cc: me privately on mailing lists and Usenet,
if you also post the followup to the list or newsgroup.
To reply privately _only_ on Usenet and on broken lists
which rewrite From, fetch the TXT record for no-use.mooo.com.


Re: Mysterious false positives in inbox

2018-05-09 Thread Eggert Ehmke
The mail also originated from the same server.
OK, I'll look into the amavisd config.

Thanks,
Eggert

On Wednesday, 9 May 2018, 14:06:08 CEST, Reio Remma wrote:
> Wild stab - maybe they're entering the system already with ***SPAM*** in
> the subject?
> 
> With amavisd-new it's amavisd that modifies the subject, local.cf
> shouldn't have an effect on that.
> 
> Good luck,
> Reio
> 
> On 09.05.18 14:02, Eggert Ehmke wrote:
> > Hello,
> > 
> > I have spamassassin 3.4.1 / amavisd / postfix / dovecot installed on
> > my Debian 9.4 server. I also run a mailman mailing list. Most of the
> > time, all runs very well, but occasionally I get mails marked
> > ***SPAM*** in my inbox. These are indeed not spam, but valid mails
> > forwarded by mailman. Training seems to have no effect.
> > 
> > The mails in question have those header entries:
> > 
> > X-Virus-Scanned: Debian amavisd-new at 
> > 
> > X-Spam-Flag: NO
> > 
> > X-Spam-Score: -1
> > 
> > X-Spam-Level:
> > 
> > X-Spam-Status: No, score=-1 tagged_above=-999 required=3
> > tests=[ALL_TRUSTED=-1, SHORTCIRCUIT=-0.0001] autolearn=disabled
> > 
> > With those entries, why is the ***SPAM*** put into the subject line??
> > 
> > In /etc/spamassassin/local.cf are these entries:
> > 
> > rewrite_header Subject ***SPAM***
> > report_safe 0
> > trusted_networks <... of my other server>
> > required_score 2.0
> > use_bayes 1
> > bayes_auto_learn 1
> > ifplugin Mail::SpamAssassin::Plugin::Shortcircuit
> > shortcircuit ALL_TRUSTED on
> > endif # Mail::SpamAssassin::Plugin::Shortcircuit
> > 
> > Any idea?
> > 
> > Eggert




Re: Mysterious false positives in inbox

2018-05-09 Thread Reio Remma
Wild stab - maybe they're entering the system already with ***SPAM*** in 
the subject?


With amavisd-new it's amavisd that modifies the subject, local.cf 
shouldn't have an effect on that.


Good luck,
Reio

On 09.05.18 14:02, Eggert Ehmke wrote:


Hello,

I have spamassassin 3.4.1 / amavisd / postfix / dovecot installed on 
my Debian 9.4 server. I also run a mailman mailing list. Most of the 
time, all runs very well, but occasionally I get mails marked 
***SPAM*** in my inbox. These are indeed not spam, but valid mails 
forwarded by mailman. Training seems to have no effect.


The mails in question have those header entries:

X-Virus-Scanned: Debian amavisd-new at 

X-Spam-Flag: NO

X-Spam-Score: -1

X-Spam-Level:

X-Spam-Status: No, score=-1 tagged_above=-999 required=3 
tests=[ALL_TRUSTED=-1, SHORTCIRCUIT=-0.0001] autolearn=disabled


With those entries, why is the ***SPAM*** put into the subject line??

In /etc/spamassassin/local.cf are these entries:

rewrite_header Subject ***SPAM***
report_safe 0
trusted_networks <... of my other server>
required_score 2.0
use_bayes 1
bayes_auto_learn 1
ifplugin Mail::SpamAssassin::Plugin::Shortcircuit
shortcircuit ALL_TRUSTED on
endif # Mail::SpamAssassin::Plugin::Shortcircuit

Any idea?


Eggert





Mysterious false positives in inbox

2018-05-09 Thread Eggert Ehmke
Hello, 

I have spamassassin 3.4.1 / amavisd / postfix / dovecot installed on my Debian 
9.4 server. I also run a mailman mailing list. Most of the time, all runs very 
well, but occasionally I get mails marked ***SPAM*** in my inbox. These are 
indeed not spam, but valid mails forwarded by mailman. Training seems to have 
no effect.

The mails in question have those header entries:
X-Virus-Scanned: Debian amavisd-new at 
X-Spam-Flag: NO
X-Spam-Score: -1
X-Spam-Level: 
X-Spam-Status: No, score=-1 tagged_above=-999 required=3 tests=[ALL_TRUSTED=-1, 
SHORTCIRCUIT=-0.0001] autolearn=disabled

With those entries, why is the ***SPAM*** put into the subject line??

In /etc/spamassassin/local.cf are these entries:

rewrite_header Subject ***SPAM*** 

Eggert


Re: T_DKIM_INVALID false positives with Gmail

2018-03-20 Thread RW
On Mon, 19 Mar 2018 11:53:19 -0400
Bill Cole wrote:

> On 19 Mar 2018, at 11:29, Sebastian Arcus wrote:
> 
> > I've been seeing a number of false positives recently from 
> > T_DKIM_INVALID with Gmail emails. Are some Gmail servers 
> > misconfigured,

> There are LOTS of ways to break a DKIM signature. 

Including signing non-existent List-* headers and then posting to a
mailing list.


DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
 d=open-t.co.uk; s=20170820; h=Content-Transfer-Encoding:Content-Type:
 MIME-Version:Date:Message-ID:Subject:From:To:Sender:Reply-To:Cc:Content-ID:
 Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc
 :Resent-Message-ID:In-Reply-To:References:List-Id:List-Help:List-Unsubscribe:
 List-Subscribe:List-Post:List-Owner:List-Archive;...
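
As a rough illustration of the point above: under RFC 6376, naming a header field in h= that is not present in the message signs its absence, so a list manager that later adds that field invalidates the signature. The Python sketch below only lists the List-* names found in the h= tag quoted above; it does not verify anything, and the function name is made up.

```python
# h= tag copied from the DKIM-Signature quoted above. The rest of that header
# is elided in the archive, so nothing here verifies a signature.
H_TAG = ("Content-Transfer-Encoding:Content-Type:MIME-Version:Date:Message-ID:"
         "Subject:From:To:Sender:Reply-To:Cc:Content-ID:Content-Description:"
         "Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:"
         "Resent-Message-ID:In-Reply-To:References:List-Id:List-Help:"
         "List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive")

def signed_list_headers(h_tag):
    """Return the List-* header names that appear in a DKIM h= tag."""
    names = [n.strip() for n in h_tag.split(":") if n.strip()]
    return [n for n in names if n.lower().startswith("list-")]

print(signed_list_headers(H_TAG))
```

Any name this returns is a header the signer has covered, so a mailing list that adds it in transit would be expected to break verification.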


Re: T_DKIM_INVALID false positives with Gmail

2018-03-19 Thread Kevin A. McGrail
No, because DKIM is verifying the unmodified header/body (more complicated
than that).

--
Kevin A. McGrail
Asst. Treasurer & VP Fundraising, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171

On Mon, Mar 19, 2018 at 11:55 AM, Sebastian Arcus <s.ar...@open-t.co.uk>
wrote:

> On 19/03/18 15:53, Bill Cole wrote:
>
>> On 19 Mar 2018, at 11:29, Sebastian Arcus wrote:
>>
>> I've been seeing a number of false positives recently from T_DKIM_INVALID
>>> with Gmail emails. Are some Gmail servers misconfigured, or could something
>>> be going on at my end? The DKIM record which is flagged as invalid is below:
>>>
>>> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com;
>>> s=20161025; h=mime-version:from:date:message-id:subject:to;bh=8wlgvdpEOm
>>> UO2ugslPxRkFYA/ZThwu2bWy5VmlR76ug=;
>>> b=gRcnOIzmENqS8a91mSdETdXvyH6df7u0tSwsadk6CMD0KtAbzuM3ojHW+kPEo7AB1i
>>>  vnbCDc/vsR6H7pP0k3hZmF7z/dAaeZWD4RVzqM+Fv70oHy4af64j+fGSekOCM9o4ShRQ
>>> Vk3KyF+69sKTK3rRWEnfrcgi/pN2DJWDvrIBRjmFOZYKNVN+8elaVM9DOO7tEMLYuw7T
>>> +sVaUMNt8MuPxRhrskJYOIxK8zzkcJHYV+1TuWJuqZAHRVwgnDWX7q3Wx0GwrX+3lKpm
>>> 3A1+F5dBVjH4dXvdfIESm5XpV8b9uBn9daGWrUgkR+PB23XsL9QkxEqCRXdgII3FRxtQ
>>> Ps6A==
>>>
>>
>> There are LOTS of ways to break a DKIM signature. Whether that one is
>> broken can't be checked and how it might have been broken can't be guessed
>> at without the full *unmodified* headers and body of the message.
>>
>
> I use Exim to pass stuff directly to SA. Could I attach the DKIM header in
> a text file and send it to the list?
>


Re: T_DKIM_INVALID false positives with Gmail

2018-03-19 Thread Sebastian Arcus

On 19/03/18 15:53, Bill Cole wrote:

On 19 Mar 2018, at 11:29, Sebastian Arcus wrote:

I've been seeing a number of false positives recently from 
T_DKIM_INVALID with Gmail emails. Are some Gmail servers 
misconfigured, or could something be going on at my end? The DKIM 
record which is flagged as invalid is below:


DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=googlemail.com; s=20161025; 
h=mime-version:from:date:message-id:subject:to;bh=8wlgvdpEOmUO2ugslPxRkFYA/ZThwu2bWy5VmlR76ug=; 

b=gRcnOIzmENqS8a91mSdETdXvyH6df7u0tSwsadk6CMD0KtAbzuM3ojHW+kPEo7AB1i 
 vnbCDc/vsR6H7pP0k3hZmF7z/dAaeZWD4RVzqM+Fv70oHy4af64j+fGSekOCM9o4ShRQ
Vk3KyF+69sKTK3rRWEnfrcgi/pN2DJWDvrIBRjmFOZYKNVN+8elaVM9DOO7tEMLYuw7T 
+sVaUMNt8MuPxRhrskJYOIxK8zzkcJHYV+1TuWJuqZAHRVwgnDWX7q3Wx0GwrX+3lKpm 
   3A1+F5dBVjH4dXvdfIESm5XpV8b9uBn9daGWrUgkR+PB23XsL9QkxEqCRXdgII3FRxtQ

Ps6A==


There are LOTS of ways to break a DKIM signature. Whether that one is 
broken can't be checked and how it might have been broken can't be 
guessed at without the full *unmodified* headers and body of the message.


I use Exim to pass stuff directly to SA. Could I attach the DKIM header 
in a text file and send it to the list?


Re: T_DKIM_INVALID false positives with Gmail

2018-03-19 Thread Bill Cole

On 19 Mar 2018, at 11:29, Sebastian Arcus wrote:

I've been seeing a number of false positives recently from 
T_DKIM_INVALID with Gmail emails. Are some Gmail servers 
misconfigured, or could something be going on at my end? The DKIM 
record which is flagged as invalid is below:


DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=googlemail.com; s=20161025; 
h=mime-version:from:date:message-id:subject:to;bh=8wlgvdpEOmUO2ugslPxRkFYA/ZThwu2bWy5VmlR76ug=;
b=gRcnOIzmENqS8a91mSdETdXvyH6df7u0tSwsadk6CMD0KtAbzuM3ojHW+kPEo7AB1i   
 vnbCDc/vsR6H7pP0k3hZmF7z/dAaeZWD4RVzqM+Fv70oHy4af64j+fGSekOCM9o4ShRQ
Vk3KyF+69sKTK3rRWEnfrcgi/pN2DJWDvrIBRjmFOZYKNVN+8elaVM9DOO7tEMLYuw7T   
+sVaUMNt8MuPxRhrskJYOIxK8zzkcJHYV+1TuWJuqZAHRVwgnDWX7q3Wx0GwrX+3lKpm   
   3A1+F5dBVjH4dXvdfIESm5XpV8b9uBn9daGWrUgkR+PB23XsL9QkxEqCRXdgII3FRxtQ

Ps6A==


There are LOTS of ways to break a DKIM signature. Whether that one is 
broken can't be checked and how it might have been broken can't be 
guessed at without the full *unmodified* headers and body of the 
message.


Re: T_DKIM_INVALID false positives with Gmail

2018-03-19 Thread Kevin A. McGrail
What glue are you using for SA?

DKIM is pretty fragile depending on the signature and implementation.  For
example, a single \r\n changed to \n, which some SMTP transports will do, can
cause a failure.

I pretty much consider DKIM a "100% if it works and generally worthless if it
fails" technology right now, BUT it should get better as people realize they
can't muck with things mid-transport.

Regards,
KAM

--
Kevin A. McGrail
Asst. Treasurer & VP Fundraising, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171

On Mon, Mar 19, 2018 at 11:29 AM, Sebastian Arcus <s.ar...@open-t.co.uk>
wrote:

> I've been seeing a number of false positives recently from T_DKIM_INVALID
> with Gmail emails. Are some Gmail servers misconfigured, or could something
> be going on at my end? The DKIM record which is flagged as invalid is below:
>
> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com;
> s=20161025; h=mime-version:from:date:message-id:subject:to;bh=8wlgvdpEOm
> UO2ugslPxRkFYA/ZThwu2bWy5VmlR76ug=;
> b=gRcnOIzmENqS8a91mSdETdXvyH6df7u0tSwsadk6CMD0KtAbzuM3ojHW+kPEo7AB1i
> vnbCDc/vsR6H7pP0k3hZmF7z/dAaeZWD4RVzqM+Fv70oHy4af64j+fGSekOCM9o4ShRQ
> Vk3KyF+69sKTK3rRWEnfrcgi/pN2DJWDvrIBRjmFOZYKNVN+8elaVM9DOO7tEMLYuw7T
>  +sVaUMNt8MuPxRhrskJYOIxK8zzkcJHYV+1TuWJuqZAHRVwgnDWX7q3Wx0GwrX+3lKpm
>   3A1+F5dBVjH4dXvdfIESm5XpV8b9uBn9daGWrUgkR+PB23XsL9QkxEqCRXdgII3FRxtQ
> Ps6A==
>


T_DKIM_INVALID false positives with Gmail

2018-03-19 Thread Sebastian Arcus
I've been seeing a number of false positives recently from 
T_DKIM_INVALID with Gmail emails. Are some Gmail servers misconfigured, 
or could something be going on at my end? The DKIM record which is 
flagged as invalid is below:


DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; 
s=20161025; 
h=mime-version:from:date:message-id:subject:to;bh=8wlgvdpEOmUO2ugslPxRkFYA/ZThwu2bWy5VmlR76ug=; 

b=gRcnOIzmENqS8a91mSdETdXvyH6df7u0tSwsadk6CMD0KtAbzuM3ojHW+kPEo7AB1i 
   vnbCDc/vsR6H7pP0k3hZmF7z/dAaeZWD4RVzqM+Fv70oHy4af64j+fGSekOCM9o4ShRQ 

Vk3KyF+69sKTK3rRWEnfrcgi/pN2DJWDvrIBRjmFOZYKNVN+8elaVM9DOO7tEMLYuw7T 
  +sVaUMNt8MuPxRhrskJYOIxK8zzkcJHYV+1TuWJuqZAHRVwgnDWX7q3Wx0GwrX+3lKpm 
 3A1+F5dBVjH4dXvdfIESm5XpV8b9uBn9daGWrUgkR+PB23XsL9QkxEqCRXdgII3FRxtQ

Ps6A==


Re: SURBL false positives ratio

2018-01-04 Thread David Jones

On 01/04/2018 02:12 PM, Pedro David Marco wrote:


Out of curiosity...  how is SURBL in terms of false positives? Is it a 
worthwhile IOC database?



Thanks.

---
PedroD


My mail filtering volume is high enough that I would have to pay for a 
feed subscription.  I tried out a trial feed about a year ago for a few 
days and it was horrible for my US-based mail flow.  I saw way too many 
FPs against solid RBLs.  I don't even think I would want to use it even 
if it was free for my mail flow.


Disclaimer: Each mail flow will see much different spam so it may work 
for others.


I get good value out of the Invaluement RBL.  Combine it with Spamhaus 
ZEN and that will block the majority of junk.


--
David Jones


SURBL false positives ratio

2018-01-04 Thread Pedro David Marco

Out of curiosity...  how is SURBL in terms of false positives? Is it a 
worthwhile IOC database?

Thanks.
---PedroD

Re: FORGED_YAHOO_RCVD still causing false positives

2017-09-18 Thread Dan Malm
On 09/15/2017 02:26 PM, RW wrote:
> On Fri, 15 Sep 2017 11:50:25 +0100
> Sebastian Arcus wrote:
> 
>> I see this has come up again and again. Since FORGED_YAHOO_RCVD seems
>> to work by checking the address of the Yahoo smtp server in the
>> headers against a predefined list of Yahoo servers in SA, and Yahoo
>> seems to add new servers all the time - which causes false positives,
> 
> It's based on Yahoo received header formats, but they are liable to
> change.
> 
>> is there much point to this check?
> 
> The rule was created and scored when spoofing Yahoo was very common,
> but it isn't any more. I don't think it's worth keeping as it is - high
> maintenance and error prone.
> 

Since yahoo has DMARC with p=reject, just validating DMARC and rejecting
when it tells you to should make the FORGED_YAHOO_RCVD rule redundant.
I've had the score for that rule set to 0 for quite some time.





Re: FORGED_YAHOO_RCVD still causing false positives

2017-09-15 Thread Alex
Hi,

On Fri, Sep 15, 2017 at 9:34 AM, Kevin A. McGrail
 wrote:
> On 9/15/2017 8:26 AM, RW wrote:
>>
>> The rule was created and scored when spoofing Yahoo was very common,
>> but it isn't any more. I don't think it's worth keeping as it is - high
>> maintenance and error prone.
>
>
> Agreed.  Score FORGED_YAHOO_RCVD to zero locally and will get a bug open to
> deprecate it.

This then invalidates KAM_GRABBAG5 and KAM_UAH_YAHOOGROUP_SENDER from KAM.cf.


Re: FORGED_YAHOO_RCVD still causing false positives

2017-09-15 Thread Sebastian Arcus


On 15/09/17 14:34, Kevin A. McGrail wrote:

On 9/15/2017 8:26 AM, RW wrote:

The rule was created and scored when spoofing Yahoo was very common,
but it isn't any more. I don't think it's worth keeping as it is - high
maintenance and error prone.


Agreed.  Score FORGED_YAHOO_RCVD to zero locally and will get a bug open 
to deprecate it.


Regards,

KAM


Much appreciated - thank you both!


Re: FORGED_YAHOO_RCVD still causing false positives

2017-09-15 Thread Kevin A. McGrail

On 9/15/2017 8:26 AM, RW wrote:

The rule was created and scored when spoofing Yahoo was very common,
but it isn't any more. I don't think it's worth keeping as it is - high
maintenance and error prone.


Agreed.  Score FORGED_YAHOO_RCVD to zero locally and will get a bug open 
to deprecate it.


Regards,

KAM



Re: FORGED_YAHOO_RCVD still causing false positives

2017-09-15 Thread RW
On Fri, 15 Sep 2017 11:50:25 +0100
Sebastian Arcus wrote:

> I see this has come up again and again. Since FORGED_YAHOO_RCVD seems
> to work by checking the address of the Yahoo smtp server in the
> headers against a predefined list of Yahoo servers in SA, and Yahoo
> seems to add new servers all the time - which causes false positives,

It's based on Yahoo received header formats, but they are liable to
change.

> is there much point to this check?

The rule was created and scored when spoofing Yahoo was very common,
but it isn't any more. I don't think it's worth keeping as it is - high
maintenance and error prone.



FORGED_YAHOO_RCVD still causing false positives

2017-09-15 Thread Sebastian Arcus
I see this has come up again and again. Since FORGED_YAHOO_RCVD seems to 
work by checking the address of the Yahoo smtp server in the headers 
against a predefined list of Yahoo servers in SA, and Yahoo seems to add 
new servers all the time - which causes false positives, is there much 
point to this check?


If not, maybe the default score should be lowered at least to something 
like 0.2 or 0.3 (I think is at 1.5 at the moment).


Re: False Positives from yahoo due to FORGED_MUA_MOZILLA

2017-04-21 Thread RW
On Thu, 20 Apr 2017 10:41:21 -0400
Lyle Evans wrote:

> I have been getting false positives from Yahoo due to
> FORGED_MUA_MOZILLA hitting on a new X-Mailer line added by Yahoo
> about 3/31/17

I've been looking into this and IMO Yahoo have exposed a problem with
the rule: 

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7411


Re: False Positives from yahoo due to FORGED_MUA_MOZILLA

2017-04-21 Thread Merijn van den Kroonenberg
> On Thu, 20 Apr 2017, Lyle Evans wrote:
>
>> At 01:00 PM 4/20/2017, John Hardin wrote:
>>> On Thu, 20 Apr 2017, Merijn van den Kroonenberg wrote:
>>>
>>> > > On Thu, 20 Apr 2017 10:41:21 -0400
>>> > > Lyle Evans wrote:
>>> > >
>>> > > > I have been getting false positives from Yahoo due to
>>> > > > FORGED_MUA_MOZILLA hitting on a new X-Mailer line added by Yahoo
>>> > > > about 3/31/17
>>> > > >
>>> > > > The X-Mailer line reads:
>>> > > >
>>> > > > X-Mailer: WebService/1.1.9272 YahooMailNeo Mozilla/5.0 (Windows NT
>>> > > > 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)
>>> > > > Chrome/56.0.2924.87 Safari/537.36
>>> > > /DCE\)/
>>> > >
>>> > > My guess is that they are including the http user-agent header of the
>>> > > browser that connected to their webmail server.
>>> >
>>> > Correct, I also noticed this a few days ago. Maybe the rule could be
>>> > changed to exclude yahoo...but maybe other webmail applications do this
>>> > too, not sure.
>>>
>>> Excluding when verified from Yahoo would be the proper approach.
>>
>> I added && !__FROM_YAHOO_COM (from 20_headers.cf) to FORGED_MUA_MOZILLA
>> giving
>>
>> FORGED_MUA_MOZILLA (__MOZILLA_MUA && !__UNUSABLE_MSGID &&
>> !__MOZILLA_MSGID && !__FROM_YAHOO_COM )
>>
>> I am testing that now,
>> any comments or suggestions for improvement are welcome.
>
> My concern would be how easy it might be to spoof __FROM_YAHOO_COM (which
> I'm not at the moment going to evaluate...) If it's a basic "From header
> includes 'yahoo.com'" rule (which is what the name suggests), you might
> want to create a meta of __FROM_YAHOO_COM && (__SPF_PASS || __DKIM_PASS)
> (rule names from memory, that's only to suggest the approach) and then use
> that instead of the bare __FROM_YAHOO_COM.
>

I think in this case the ability to spoof/bypass the FORGED_MUA_MOZILLA is
not a huge issue.

Yahoo does DKIM sign the mail:
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com;
s=s2048; t=1492004654; bh=u/RrXL8wELnsl6uuALJnwAC/TQxfVkCBCHQc7pZDY/A=;
h=Date:From:Reply-To:To:In-Reply-To:References:Subject:From:Subject;
b=P5zjzMsC0OoZ7c

But to make it waterproof we would need to verify if the mail was DKIM
signed for d=yahoo.com (and not for a spammer controlled domain). Is it
possible to do this somehow?

I assume checking for DKIM_VALID_AU is not good enough if users can use a
different mail identity in yahoo (I don't know if it's possible).

SPF_PASS would work but you would need to check if the EnvelopeFrom is
from yahoo.com

But I think Lyle's rule is already better than nothing and might be good
enough, even if it can be spoofed.
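
For illustration, the signing domain can be read from the d= tag of the DKIM-Signature header. The Python sketch below only parses the tag list quoted above; it does not verify the signature (within SpamAssassin that is the job of the DKIM plugin), and the helper names are made up.

```python
def parse_tag_list(value):
    """Split a DKIM tag=value list into a dict, dropping folding whitespace."""
    tags = {}
    for part in value.split(";"):
        if "=" in part:
            name, _, val = part.partition("=")
            tags[name.strip()] = "".join(val.split())
    return tags

# Header value as quoted in this message (b= is cut short in the original).
SIG = ("v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s2048; "
       "t=1492004654; bh=u/RrXL8wELnsl6uuALJnwAC/TQxfVkCBCHQc7pZDY/A=; "
       "h=Date:From:Reply-To:To:In-Reply-To:References:Subject:From:Subject; "
       "b=P5zjzMsC0OoZ7c")

def signing_domain(sig_value):
    """Return the lower-cased d= domain, or an empty string if it is absent."""
    return parse_tag_list(sig_value).get("d", "").lower()

print(signing_domain(SIG))
```

A d= value alone proves nothing, since anyone can write that tag; it only matters once the signature itself has been verified.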




Re: False Positives from yahoo due to FORGED_MUA_MOZILLA

2017-04-20 Thread Lyle Evans

At 01:00 PM 4/20/2017, John Hardin wrote:

On Thu, 20 Apr 2017, Merijn van den Kroonenberg wrote:


On Thu, 20 Apr 2017 10:41:21 -0400
Lyle Evans wrote:


I have been getting false positives from Yahoo due to
FORGED_MUA_MOZILLA hitting on a new X-Mailer line added by Yahoo
about 3/31/17

The X-Mailer line reads:

X-Mailer: WebService/1.1.9272 YahooMailNeo Mozilla/5.0 (Windows NT
10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)
Chrome/56.0.2924.87 Safari/537.36

/DCE\)/

My guess is that they are including the http user-agent header of the
browser that connected to their webmail server.


Correct, I also noticed this a few days ago. Maybe the rule could be
changed to exclude yahoo...but maybe other webmail applications do this
too, not sure.


Excluding when verified from Yahoo would be the proper approach.


I added && !__FROM_YAHOO_COM (from 20_headers.cf) to FORGED_MUA_MOZILLA
giving

FORGED_MUA_MOZILLA (__MOZILLA_MUA && !__UNUSABLE_MSGID && 
!__MOZILLA_MSGID && !__FROM_YAHOO_COM )


I am testing that now,
any comments or suggestions for improvement are welcome.

Lyle Evans


Unfortunately masscheck is down for migration so any global fix 
won't go out anytime soon...



--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79



---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus



Re: False Positives from yahoo due to FORGED_MUA_MOZILLA

2017-04-20 Thread John Hardin

On Thu, 20 Apr 2017, Lyle Evans wrote:


At 01:00 PM 4/20/2017, John Hardin wrote:

On Thu, 20 Apr 2017, Merijn van den Kroonenberg wrote:

> > On Thu, 20 Apr 2017 10:41:21 -0400
> > Lyle Evans wrote:
> > 
> > > I have been getting false positives from Yahoo due to
> > > FORGED_MUA_MOZILLA hitting on a new X-Mailer line added by Yahoo
> > > about 3/31/17
> > > 
> > > The X-Mailer line reads:
> > > 
> > > X-Mailer: WebService/1.1.9272 YahooMailNeo Mozilla/5.0 (Windows NT
> > > 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)
> > > Chrome/56.0.2924.87 Safari/537.36
> > /DCE\)/
> > 
> > My guess is that they are including the http user-agent header of the
> > browser that connected to their webmail server.
> 
> Correct, I also noticed this a few days ago. Maybe the rule could be
> changed to exclude yahoo...but maybe other webmail applications do this
> too, not sure.

Excluding when verified from Yahoo would be the proper approach.


I added && !__FROM_YAHOO_COM (from 20_headers.cf) to FORGED_MUA_MOZILLA
giving

FORGED_MUA_MOZILLA (__MOZILLA_MUA && !__UNUSABLE_MSGID && 
!__MOZILLA_MSGID && !__FROM_YAHOO_COM )


I am testing that now,
any comments or suggestions for improvement are welcome.


My concern would be how easy it might be to spoof __FROM_YAHOO_COM (which 
I'm not at the moment going to evaluate...) If it's a basic "From header 
includes 'yahoo.com'" rule (which is what the name suggests), you might 
want to create a meta of __FROM_YAHOO_COM && (__SPF_PASS || __DKIM_PASS) 
(rule names from memory, that's only to suggest the approach) and then use 
that instead of the bare __FROM_YAHOO_COM.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Campuses today are a theatrical mashup of
  1984 and Lord of the Flies, performed by people
  who don't understand these references.   -- David Burge
---
 3 days until Max Planck's 159th birthday


Re: False Positives from yahoo due to FORGED_MUA_MOZILLA

2017-04-20 Thread John Hardin

On Thu, 20 Apr 2017, Merijn van den Kroonenberg wrote:


On Thu, 20 Apr 2017 10:41:21 -0400
Lyle Evans wrote:


I have been getting false positives from Yahoo due to
FORGED_MUA_MOZILLA hitting on a new X-Mailer line added by Yahoo
about 3/31/17

The X-Mailer line reads:

X-Mailer: WebService/1.1.9272 YahooMailNeo Mozilla/5.0 (Windows NT
10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)
Chrome/56.0.2924.87 Safari/537.36

/DCE\)/

My guess is that they are including the http user-agent header of the
browser that connected to their webmail server.



Correct, I also noticed this a few days ago. Maybe the rule could be
changed to exclude yahoo...but maybe other webmail applications do this
too, not sure.


Excluding when verified from Yahoo would be the proper approach.

Unfortunately masscheck is down for migration so any global fix won't go 
out anytime soon...



--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  It is criminal to teach a man not to defend himself when he is the
  constant victim of brutal attacks.  -- Malcolm X (1964)
---
 3 days until Max Planck's 159th birthday


Re: False Positives from yahoo due to FORGED_MUA_MOZILLA

2017-04-20 Thread RW
On Thu, 20 Apr 2017 17:02:57 +0200
Merijn van den Kroonenberg wrote:


> > My guess is that they are including the http user-agent header of
> > the browser that connected to their webmail server.
> >  
> 
> Correct, I also noticed this a few days ago. Maybe the rule could be
> changed to exclude yahoo...but maybe other webmail applications do
> this too, not sure.

I don't get much yahoo mail, is this the norm now? 



Re: False Positives from yahoo due to FORGED_MUA_MOZILLA

2017-04-20 Thread Merijn van den Kroonenberg
> On Thu, 20 Apr 2017 10:41:21 -0400
> Lyle Evans wrote:
>
>> I have been getting false positives from Yahoo due to
>> FORGED_MUA_MOZILLA hitting on a new X-Mailer line added by Yahoo
>> about 3/31/17
>>
>> The X-Mailer line reads:
>>
>> X-Mailer: WebService/1.1.9272 YahooMailNeo Mozilla/5.0 (Windows NT
>> 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)
>> Chrome/56.0.2924.87 Safari/537.36
> /DCE\)/
>
> My guess is that they are including the http user-agent header of the
> browser that connected to their webmail server.
>

Correct, I also noticed this a few days ago. Maybe the rule could be
changed to exclude yahoo...but maybe other webmail applications do this
too, not sure.





Re: False Positives from yahoo due to FORGED_MUA_MOZILLA

2017-04-20 Thread RW
On Thu, 20 Apr 2017 10:41:21 -0400
Lyle Evans wrote:

> I have been getting false positives from Yahoo due to
> FORGED_MUA_MOZILLA hitting on a new X-Mailer line added by Yahoo
> about 3/31/17
> 
> The X-Mailer line reads:
> 
> X-Mailer: WebService/1.1.9272 YahooMailNeo Mozilla/5.0 (Windows NT 
> 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) 
> Chrome/56.0.2924.87 Safari/537.36
/DCE\)/

My guess is that they are including the http user-agent header of the
browser that connected to their webmail server.


False Positives from yahoo due to FORGED_MUA_MOZILLA

2017-04-20 Thread Lyle Evans

I have been getting false positives from Yahoo due to FORGED_MUA_MOZILLA
hitting on a new X-Mailer line added by Yahoo
about 3/31/17

The X-Mailer line reads:

X-Mailer: WebService/1.1.9272 YahooMailNeo Mozilla/5.0 (Windows NT 
10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) 
Chrome/56.0.2924.87 Safari/537.36


and the Message-ID reads:

Message-ID: <909353831.1397505.1490989414...@mail.yahoo.com>


It is triggering the rule FORGED_MUA_MOZILLA from 20_meta_tests.cf


header __MOZILLA_MUA   X-Mailer =~ /\bMozilla\b/
header __MOZILLA_MSGID MESSAGEID =~ /^<[A-F\d]{8}\.[A-F1-9][A-F\d]{0,7}\@\S+>$/m
meta   FORGED_MUA_MOZILLA (__MOZILLA_MUA && !__UNUSABLE_MSGID && !__MOZILLA_MSGID)

describe FORGED_MUA_MOZILLA Forged mail pretending to be from Mozilla

50_scores.cf: score FORGED_MUA_MOZILLA 2.399 1.596 2.399 2.309

I realize that it's just 2.309 points, but throw in a few other
miscellaneous hits and you get a false positive. (I'll make another post
about one of the miscellaneous hits.)

Where __UNUSABLE_MSGID is defined in 20_ratware.cf:

# first define situations where servers rewrite message id so we can't use message id to detect forgeries

header __HOTMAIL_BAYDAV_MSGID MESSAGEID =~ /^<[A-Z]{3}\d+-(?:DAV|SMTP)\d+[A-Z0-9]{25}\@phx\.gbl>$/m

header __IPLANET_MESSAGING_SERVER Received =~ /iPlanet Messaging Server/

header __LYRIS_EZLM_REMAILER List-Unsubscribe =~ /<mailto:(?:leave-\S+|\S+-unsubscribe)\@\S+>$/

header __SYMPATICO_MSGID MESSAGEID =~ /^<BAYC\d+-PASMTP\d+[A-Z0-9]{25}\@CEZ\.ICE>$/m

header __WACKY_SENDMAIL_VERSION Received =~ /\/CWT\/DCE\)/

meta __UNUSABLE_MSGID (__LYRIS_EZLM_REMAILER || __GATED_THROUGH_RCVD_REMOVER || __WACKY_SENDMAIL_VERSION || __IPLANET_MESSAGING_SERVER || __HOTMAIL_BAYDAV_MSGID || __SYMPATICO_MSGID)
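
To see why the meta rule fires on this message, the sub-rules can be checked by hand. What follows is a minimal Python sketch, not SpamAssassin's own evaluation code. The two regexes are transcribed from the rules quoted above, and the header values are the ones quoted in this post. The Message-ID is used exactly as the list archive shows it (the archive truncated it with "..."). Treating __UNUSABLE_MSGID as false is an assumption: none of the listed server signatures is reported for this mail.

```python
import re

# Header values as quoted in this post. The archive truncated the Message-ID
# ("..."); that does not change the outcome, because the pattern already
# fails on the first dot-separated field.
x_mailer = (
    "WebService/1.1.9272 YahooMailNeo Mozilla/5.0 (Windows NT 10.0; WOW64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36"
)
message_id = "<909353831.1397505.1490989414...@mail.yahoo.com>"

# Python transcriptions of __MOZILLA_MUA and __MOZILLA_MSGID.
MOZILLA_MUA = re.compile(r"\bMozilla\b")
MOZILLA_MSGID = re.compile(r"^<[A-F\d]{8}\.[A-F1-9][A-F\d]{0,7}@\S+>$", re.MULTILINE)

mozilla_mua = bool(MOZILLA_MUA.search(x_mailer))
mozilla_msgid = bool(MOZILLA_MSGID.search(message_id))
unusable_msgid = False  # assumption: no rewriting-server signature matched

# Same boolean expression as the FORGED_MUA_MOZILLA meta rule.
forged_mua_mozilla = mozilla_mua and not unusable_msgid and not mozilla_msgid

print(mozilla_mua, mozilla_msgid, forged_mua_mozilla)  # True False True
```

The X-Mailer contains the word "Mozilla", but the Message-ID starts with nine digits before the first dot, where the pattern wants exactly eight hex characters. The rule therefore concludes that a Mozilla mail client was forged.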


My questions are:
Is anybody else seeing this?
Why the @#$%! is Yahoo doing this?
What is the best fix?
I have temporarily removed the rule.

Thanks
Lyle Evans






MIME header false positives (was Rule to score word documents)

2016-04-06 Thread Cedric Knight
On 30/03/16 21:11, @lbutlr wrote:
> On Wed Mar 30 2016 13:34:23 Alex   said:
>>
>> /^(Content-(Type|Disposition)\:|[[:space:]]+).*(file)?name="?.*\.doc"?;?$/
>> REJECT
> 
> /^\s*Content-(Disposition|Type).*name\s*=\s*"?(.*\.(ade|adp|bas|bat|chm|cmd|com|cpl|crt|dll|exe|hlp|hta|inf|ins|isp|js|jse|lnk|mdb|mde|mdt|mdw|msc|msi|msp|mst|nws|ops|pcd|pif|prf|reg|scf|scr\??|sct|shb|shs|shm|swf|vb[esx]?|vxd|wsc|wsf|wsh))(\?=)?"?\s*(;|$)/x
> REJECT Attachment name "$2" may not end with ".$3"

I'd like to take the opportunity to warn that regexes like this (and the
version in the Postfix documentation as "man header_checks") have
started blocking email from iPhones.

This is because some Apple email client adds a parameter to Content-Type
that may end in ".com".  The ".*\." can span between those parameters.
If you block extensions in Postfix, check your logs for
"x-apple-part-url" and you may see something like:

server postfix/cleanup[1234]: 123412341234: reject: header Content-Type:
 application/vnd.ms-publisher;??name="redacted
redacted.pub";??x-apple-part-url="abcd1234-1234-5678--123412341...@yahoo.com"

("??" is the CRLF line break.)

For Postfix, the rule can be rewritten to pin down the parameter value and
so avoid this type of false positive:

/^Content-(Disposition|Type).*name\s*=\s*
("(?:[^"]|\\")*|[^();:,\/<>\@\"?=<>\[\]\ ]*)
((?:\.|=2E)(
ade|adp|asp|bas|bat|chm|cmd|com|cpl|crt|dll|exe|
hlp|ht[at]|
inf|ins|isp|jse?|lnk|md[betw]|ms[cipt]|nws|
\{[[:xdigit:]]{8}(?:-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}\}|
ops|pcd|pif|prf|reg|sc[frt]|sh[bsm]|swf|
vb[esx]?|vxd|ws[cfh])(\?=)?"?)\s*(;|$)/x
REJECT Attachment name $2$3 may not end with ".$4"
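
The difference between the two patterns can be tried outside Postfix. The following is a rough Python sketch under stated assumptions: the extension list is abbreviated, the character class is simplified from the rule above, the header values are hypothetical (the real x-apple-part-url value is truncated in the log), and Python's `re` only approximates how Postfix applies the table. The header is shown unfolded on one line.

```python
import re

# Abbreviated extension list (assumption: a small subset of the quoted list).
EXTS = r"bat|cmd|com|exe|pif|scr|vb[esx]?"

# Loose style: ".*" after name= is free to run across later parameters.
LOOSE = re.compile(
    r'^\s*Content-(Disposition|Type).*name\s*=\s*"?(.*\.(' + EXTS + r'))(\?=)?"?\s*(;|$)',
    re.IGNORECASE,
)

# Stricter style: the value must be one quoted string or one bare token.
STRICT = re.compile(
    r'^Content-(Disposition|Type).*name\s*=\s*'
    r'("(?:[^"]|\\")*|[^();:,/<>@"?=\[\] ]*)'
    r'((?:\.|=2E)(' + EXTS + r'))(\?=)?"?\s*(;|$)',
    re.IGNORECASE,
)

# Hypothetical headers for illustration only.
apple_header = (
    'Content-Type: application/vnd.ms-publisher; name="report.pub"; '
    'x-apple-part-url="ABCD1234@yahoo.com"'
)
exe_header = 'Content-Disposition: attachment; filename="invoice.exe"'

loose_hit = LOOSE.search(apple_header)
strict_hit = STRICT.search(apple_header)
strict_exe = STRICT.search(exe_header)

print(loose_hit.group(3))       # extension the loose rule blames
print(strict_hit is None)       # stricter rule leaves the .pub attachment alone
print(strict_exe.group(4))      # stricter rule still catches a real .exe
```

In the loose pattern the greedy `.*` starts inside the `name` value and ends at the `.com` of the x-apple-part-url parameter. The stricter pattern stops at the closing quote of the `name` value, so only the real filename extension is tested.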

As far as I can see, no standard SpamAssassin rule checks for .com, so
none should cause this particular false positive. However, some rules that
are intended to check only filename extensions, and that might hit other
parts of the header, include OBFU_TEXT_ATTACH, T_OBFU_DOC_ATTACH and
__TVD_MIME_ATT_AOPDF.

> Just add the MS Office file extensions to that.
> 
> Then, when your users revolt and are banging on your door with pitchforks and 
> torches, take them out again.

:) or staff the machicolations because you know best.

Some that I seriously would add are .mso, .xl, .ocx and .jar.

CK



Re: False positives with Razor2

2015-12-06 Thread RW
On Sun, 06 Dec 2015 09:28:08 +0100
Torsten Bronger wrote:

> Hallöchen!
> 
> Bill Cole writes:
> 
> > [...]
> >
> > Indicates that someone has sabotaged your SA scores. Those are
> > entirely insane scores for those tests. If the default values were
> > used, that message would not have been misclassified.  
> 
> I myself set those values, almost 10 years ago.  They have served
> very well through those times with 15.000 spams/year.  And in the
> first two years, I even inspected all spam mails and had not a
> single false positive.

Then you've been very lucky. I find that that combination of razor
rules hits about 50% of my spam; it would be astonishing if that
didn't come with some FPs. The frequency of FPs can be very erratic
without a lot of users to average over.

> > And don't trust whoever set your BAYES and RAZOR scores to have
> > anything to do with your spam control.  
> 
> Well, I don't trust Razor anymore!  If there is such a thing as "the
> opposite of spam", then these mails.  Besides, I personally see no
> point in a crowdsourcing tool with scores on the level of
> "HTML_IMAGE_ONLY".


That's a gross exaggeration of the problem. With your scores you could
drop the combined scores of the razor rules to 9 points and avoid the
FPs - that's still over twice your threshold. 

If you really feel the need to score these rules such that they can't be
saved by BAYES_00, you might have a Bayes database that needs
retraining. 


However, the cause of the Razor FPs is the link that starts:

   http://bronger-jmp...

it seems that appspotDOTcom, or any sub-domain on it, causes those razor
rules to fire. Simply removing that dead-link from your signature will
prevent those FPs.
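
The arithmetic behind the "9 points" suggestion can be written out. This is only a sketch: it uses the four scores listed in the report quoted elsewhere in this thread, it ignores any other rules the message may have hit, and the poster's actual threshold is not stated in the thread.

```python
# Scores as listed in the quoted report (custom values, not the defaults).
scores = {
    "BAYES_00": -5.3,
    "RAZOR2_CHECK": 3.0,
    "RAZOR2_CF_RANGE_51_100": 3.0,
    "RAZOR2_CF_RANGE_E8_51_100": 6.0,
}

# Combined contribution of the three Razor rules.
razor_total = sum(v for k, v in scores.items() if k.startswith("RAZOR2_"))

# Net score of these four rules with the current custom values.
net_current = round(scores["BAYES_00"] + razor_total, 1)

# Net score if the Razor rules together were capped at 9 points, as suggested.
RAZOR_CAP = 9.0
net_capped = round(scores["BAYES_00"] + RAZOR_CAP, 1)

print(razor_total, net_current, net_capped)  # 12.0 6.7 3.7
```

With the custom scores, the Razor rules outweigh BAYES_00 by 6.7 points. With a combined cap of 9 the margin drops to 3.7, which by the reasoning above falls under the threshold for a message that Bayes considers clean.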


Re: False positives with Razor2

2015-12-06 Thread Torsten Bronger
Hallöchen!

Bill Cole writes:

> [...]
>
> Indicates that someone has sabotaged your SA scores. Those are
> entirely insane scores for those tests. If the default values were
> used, that message would not have been misclassified.

I myself set those values, almost 10 years ago.  They have served
very well through those times with 15.000 spams/year.  And in the
first two years, I even inspected all spam mails and had not a
single false positive.

> [...] Razor (like Cloudmark Authority, its commercial cousin) does
> poorly with low-occurrence URLs. That's why razor-whitelist
> exists. Use it.

I maintain whitelists for spam as a whole, but I don't want to
additionally maintain whitelists for subsystems of it.

> And don't trust whoever set your BAYES and RAZOR scores to have
> anything to do with your spam control.

Well, I don't trust Razor anymore!  If there is such a thing as "the
opposite of spam", then these mails.  Besides, I personally see no
point in a crowdsourcing tool with scores on the level of
"HTML_IMAGE_ONLY".

Anyway, thank you very much for the clarification and explanations!

Regards,
Torsten.

-- 
Torsten Bronger    Jabber ID: torsten.bron...@jabber.rwth-aachen.de



Re: False positives with Razor2

2015-12-06 Thread Reindl Harald



On 06.12.2015 at 09:28, Torsten Bronger wrote:

>> And don't trust whoever set your BAYES and RAZOR scores to have
>> anything to do with your spam control.
>
> Well, I don't trust Razor anymore!  If there is such a thing as "the
> opposite of spam", then these mails.


Nonsense. This is a scoring system, and you ruined it with zero
understanding that the scores you raised are even adaptive, while "6.0
RAZOR2_CF_RANGE_E8_51_100" is nothing but idiotic. Sorry, there is
no nicer word for scoring a remote hash system that is not under your
own control that high.


> They have served very well through those times with
> 15.000 spams/year

Well, below are the Razor hits on a machine that has 6000 spamass-milter
rejects each month with no false positives:


RAZOR: 2165
RAZOR: 1600
RAZOR: 1516
RAZOR: 1394
RAZOR: 1480
RAZOR: 1489


> Besides, I personally see no
> point in a crowdsourcing tool with scores on the level of
> "HTML_IMAGE_ONLY"


Really?

Then you don't know much about mail users. Crowdsourcing always has
false positives caused by idiots marking their newsletters as spam
instead of clicking the unsubscribe link. Frankly, there are enough of
them who confuse "delete" and "mark as spam" all the time.






Re: False positives with Razor2

2015-12-05 Thread Bill Cole
On 5 Dec 2015, at 4:42, Torsten Bronger wrote:

> Hallöchen!
>
> In http://wilson.bronger.org/37196


Nope:

*   Trying 176.199.175.106...
* Connected to wilson.bronger.org (176.199.175.106) port 80 (#0)
> GET /37196 HTTP/1.1
> Host: wilson.bronger.org
> User-Agent: curl/7.45.0
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< Server: nginx/1.4.6 (Ubuntu)
< Date: Sat, 05 Dec 2015 16:30:56 GMT
< Content-Type: text/html; charset=iso-8859-1
< Content-Length: 290
< Connection: keep-alive
< 


403 Forbidden

Forbidden
You don't have permission to access /37196
on this server.

Apache/2.4.7 (Ubuntu) Server at wilson.bronger.org Port 80



Re: False positives with Razor2

2015-12-05 Thread Torsten Bronger
Hallöchen!

Bill Cole writes:

> On 5 Dec 2015, at 4:42, Torsten Bronger wrote:
>
>> In http://wilson.bronger.org/37196
>
> Nope:

Sorry, works now.

Tschö,
Torsten.

-- 
Torsten Bronger    Jabber ID: torsten.bron...@jabber.rwth-aachen.de



False positives with Razor2

2015-12-05 Thread Torsten Bronger
Hallöchen!

In http://wilson.bronger.org/37196 you see a mail from myself to
myself which was marked by Razor2.  This is hilarious since I don't
report anything to Razor and such messages are only seen by me.

Is Razor still being maintained?  The webpage doesn't look like
this.  Should I just set the Razor scores to zero?

Tschö,
Torsten.

-- 
Torsten Bronger    Jabber ID: torsten.bron...@jabber.rwth-aachen.de



Re: False positives with Razor2

2015-12-05 Thread Bill Cole

On 5 Dec 2015, at 14:46, Torsten Bronger wrote:


> Hallöchen!
>
> Bill Cole writes:
>
>> On 5 Dec 2015, at 4:42, Torsten Bronger wrote:
>>
>>> In http://wilson.bronger.org/37196
>>
>> Nope:
>
> Sorry, works now.



This:


-5.3 BAYES_00   BODY: Bayes spam probability is 0 to 1%
  [score: 0.]
3.0 RAZOR2_CHECK   Listed in Razor2 (http://razor.sf.net/)
3.0 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50%
  [cf: 100]
6.0 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level
  above 50%
  [cf: 100]



Indicates that someone has sabotaged your SA scores. Those are entirely 
insane scores for those tests. If the default values were used, that 
message would not have been misclassified.


Note that while the Razor client package has not been updated recently, 
it is not something that needs substantial ongoing development: the 
critical component of Razor is in the fingerprint data on Cloudmark's 
servers. What this particular false positive probably means is that 
someone reported a message with a URL similar to one in that message as 
spam. Razor (like Cloudmark Authority, its commercial cousin) does 
poorly with low-occurrence URLs. That's why razor-whitelist exists. Use 
it. And don't trust whoever set your BAYES and RAZOR scores to have 
anything to do with your spam control.


Re: New SA install, configuring for retraining on false positives

2015-11-05 Thread Axb

On 11/05/2015 12:52 PM, David Mehler wrote:

> It's looking like I have several options, MailScanner which hooks in
> to SA, Amavisd-new ditto, or SA as a milter called directly from my
> MTA. Comments on these or other methods?


MailScanner with Postfix is a hack. It works, BUT
Amavisd-new is good and very well supported.
Both are written in Perl.

If you want something lightweight without the Perl dependency party:

http://fuglu.org

Axb




New SA install, configuring for retraining on false positives

2015-11-05 Thread David Mehler
Hello,

I've got a Postfix email server running with a MySQL database backend on
FreeBSD 10.2. I now want to add SpamAssassin to the picture and
am wondering about current best practices. It's been a number of years
since I last did it, and back then effectiveness wasn't so good. I'm not
sure whether that was because I was following old information or because
I didn't have things configured correctly.

It's looking like I have several options, MailScanner which hooks in
to SA, Amavisd-new ditto, or SA as a milter called directly from my
MTA. Comments on these or other methods?

I'm also wanting to get the latest antispam rules, are those from SA
or are there third party rules I should look into?

Finally, one of the things I'm going to implement in addition to SA is
Sieve, done with my MDA Dovecot, in which mail flagged with a spam
header is automatically moved into a dedicated spam folder. I then
want to set up a system to tell SA when it has misclassified a
message as a false positive. What are people using in that environment?

Any other user feedback appreciated.

Thanks.
Dave.
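
One way to picture the retraining step asked about above is a small wrapper around `sa-learn`, which accepts `--ham` or `--spam` followed by a file or directory of messages. The sketch below is illustrative only, not a recommended setup. The helper names are hypothetical, and it assumes users move misclassified mail into a dedicated folder and that a periodic job feeds that folder to `sa-learn`.

```python
import subprocess
from pathlib import Path


def sa_learn_command(kind, folder):
    """Build the sa-learn command line for a folder of messages.

    kind is "ham" (for false positives) or "spam" (for missed spam).
    """
    if kind not in ("ham", "spam"):
        raise ValueError("kind must be 'ham' or 'spam'")
    return ["sa-learn", f"--{kind}", str(folder)]


def retrain(kind, folder):
    """Run sa-learn on the folder and return its exit status."""
    cmd = sa_learn_command(kind, Path(folder))
    return subprocess.run(cmd, check=False).returncode


# Show the command that would be run for a hypothetical folder path.
print(sa_learn_command("ham", Path("/var/vmail/example/NotSpam/cur")))
```

A cron job could then call `retrain("ham", ...)` on the folder holding false positives and `retrain("spam", ...)` on the folder holding missed spam. Both folder layouts are assumptions here and depend on the Dovecot mailbox configuration.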

