Re: Spam rule for HTTP/HTTPS request to sender's root domain

2019-03-01 Thread Rupert Gallagher
A future-proof list that complies with GDPR would automatically rewrite the To 
header, leaving the list address only. Any other recipient will still receive 
it from the original sender.

On Thu, Feb 28, 2019 at 20:29, Mike Marynowski  wrote:

> Unfortunately I don't see a reply-to header on your messages. What do
> you have it set to? I thought mailing lists see who is in the "to"
> section of a reply so that 2 copies aren't sent out. The "mailing list
> ethics" guide I read said to always use "reply all" and the mailing list
> system takes care of not sending duplicate replies.
>
> I removed your direct email from this reply and only kept the mailing
> list address, but for the record I don't see any reply-to headers.
>
> On 2/28/2019 2:21 PM, Bill Cole wrote:
>> Please respect my consciously set Reply-To header. I don't ever need 2
>> copies of a message posted to a mailing list, and ignoring that header
>> is rude.
>>
>> On 28 Feb 2019, at 13:28, Mike Marynowski wrote:
>>
>>> On 2/28/2019 12:41 PM, Bill Cole wrote:
>>>> You should probably put the envelope sender (i.e. the SA
>>>> "EnvelopeFrom" pseudo-header) into that list, maybe even first. That
>>>> will make many messages sent via discussion mailing lists (such as
>>>> this one) pass your test where a test of real header domains would
>>>> fail, while it is more likely to cause commercial bulk mail to
>>>> fail where it would usually pass based on real standard headers.
>>>> (That's based on a hunch, not testing.)
>>>
>>> Hmmm. I'll have to give some more thought into the exact headers it
>>> decides to test. I'm not sure if my MTA puts in envelope info into
>>> the SA request or not. For sake of simplicity right now I might just
>>> ignore mailing lists, I don't know. What I do know is that in the
>>> spam messages I'm reviewing right now, the reply-to / from headers
>>> set often don't have websites at those domains and none of them are
>>> masquerading as mailing lists. I haven't thought through the
>>> situation with mailing lists yet.
>>>
>>> I'm new to this whole SA plugin dev process - can you suggest the
>>> best way to log the full requests that SA receives so I can see what
>>> info it is getting and what I have to work with?
>>
>> The best way to see far too much information about what SA is doing is
>> to add a "-D all" to the invocation of the spamassassin script. You
>> can also add that to the flags used by spamd, if you want to punish
>> your logging subsystem
>>

Re: Spam rule for HTTP/HTTPS request to sender's root domain

2019-03-01 Thread Mike Marynowski
Changing up the algorithm a bit. Once a domain has been added to the 
cache, the DNS service will perform HTTP checks in the background 
automatically, on a much more aggressive schedule for invalid domains. 
That way temporary website problems are much less of an issue, and 
invalid domains don't delay mail delivery threads for up to 15s after 
each TTL expiration during the initial test period with progressively 
increasing TTLs. Queries can always return instantly after the first 
one, as long as the domain has been queried in the last 30 days and is 
still in cache.


Domains deemed to have "invalid" websites will be rechecked much more 
aggressively in the background to ensure newly queried domains with 
temporary website issues stop tripping this filter as soon as possible. 
There will be a "sliding window" of a few days where temporary website 
issues during the window won't cause the filter to trip; the domain just 
needs to provide a valid response sometime during the sliding window to stay 
in good standing.
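
For illustration only, the sliding-window behaviour described above could be sketched in Python roughly as follows. This is not the actual service code: the class and method names, the in-memory dict, and the three-day window are assumptions made for the example. Only the 30-day retention figure is taken from the description.

```python
from dataclasses import dataclass
from typing import Dict, Optional

DAY = 24 * 60 * 60
WINDOW_SECONDS = 3 * DAY             # assumption: "a few days"
CACHE_RETENTION_SECONDS = 30 * DAY   # from the post: queried in the last 30 days


@dataclass
class DomainEntry:
    last_queried: float
    last_valid: Optional[float] = None  # time of the most recent valid HTTP response


class WebsiteCache:
    """Hypothetical cache; background checks call record_check()."""

    def __init__(self) -> None:
        self.entries: Dict[str, DomainEntry] = {}

    def record_check(self, domain: str, valid: bool, now: float) -> None:
        """Store the outcome of one HTTP check for the domain."""
        entry = self.entries.setdefault(domain, DomainEntry(last_queried=now))
        if valid:
            entry.last_valid = now

    def query(self, domain: str, now: float) -> Optional[bool]:
        """Return the cached verdict, or None if a fresh check is needed first."""
        entry = self.entries.get(domain)
        if entry is None or now - entry.last_queried > CACHE_RETENTION_SECONDS:
            self.entries.pop(domain, None)
            return None
        entry.last_queried = now
        # In good standing if any valid response was seen inside the window.
        return entry.last_valid is not None and now - entry.last_valid <= WINDOW_SECONDS
```

With this shape, a failed check inside the window does not flip the verdict; only the absence of any valid response for the whole window does.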




Re: Spam rule for HTTP/HTTPS request to sender's root domain

2019-03-01 Thread RW
On Wed, 27 Feb 2019 12:16:20 -0500
Mike Marynowski wrote:
> Almost all of the spam emails that are
> coming through do not have a working website at the root domain of
> the sender. 

Did you establish what fraction of this spam could be caught just by
looking for an A record? 


Re: whitelist_from_rcvd hits only sometimes

2019-03-01 Thread RW
On Thu, 28 Feb 2019 12:44:16 +0100
Helmut Schneider wrote:

> Hi,
> 
> I'm trying to find out why a message sometimes hits
> whitelist_from_rcvd and sometimes does not. I checked the headers
> again and again but cannot see the difference.

I couldn't reproduce this with the email labelled as 'miss'.  It may be
that there was a difference in the headers at the time of scanning.


Re: TxRep increases sa-learn processing time exponentially

2019-03-01 Thread David Gessel
Nix,


That's probably a reasonable path for now. I'm using TxRep with the diff I 
posted, but not on a large mail server. Thanks for the insight.


-David



On 27/02/2019 17.27, Nix wrote:
> On 27 Feb 2019, David Gessel told this:
>
>> check https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7164
>>
>> My amateur analysis was summarized in this message 
>> https://mail-archives.apache.org/mod_mbox/spamassassin-users/201711.mbox/browser
> btw, that's not a message, that's a whole mailbox. :)
>
> One thread in that mailbox talks about sa-learn taking 90 seconds per
> token. 90 seconds is 3x the flock timeout for the txrep database, which
> is consistent with four lock takeouts, three of them blocking on its own
> locks because it doesn't bother to release the locks (perhaps the author
> wrongly assumes they nest.)
>
> (90s/message is precisely what I saw until I hacked up the ugly
> blocks-on-its-own-locks fix I cited earlier. Honestly, I suspect TxRep's
> lock handling and state handling in general is so much of a tangled mess
> that the thing cannot be considered a suitable replacement for the AWL
> until it's entirely rewritten. It blocks on its own locks, it is clearly
> doing something similar with redis, it reuses other users' configuration
> unless you force it to throw away all its cached state for every message
> and reconnect to all its dbs again (!)... this is not production-quality
> code, sorry. I keep meaning to switch back to the AWL, which might be
> less effective but at least doesn't have giant bugs suggestive of
> software that is just not fully baked scattered all through it.)
>
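
The "blocks on its own locks" behaviour Nix describes can be reproduced in isolation. The sketch below is Python rather than TxRep's Perl, and it assumes the locks in question are flock(2)-style locks on a local Unix filesystem; the function name and constants are made up for the example. It shows that a second handle on the same file, opened by the same process, cannot take an exclusive lock while the first handle still holds one, so locks taken this way do not nest.

```python
import fcntl
import tempfile

# Figures from the quoted analysis: 90 s total, three blocked acquisitions,
# which implies a 30 s timeout per blocked attempt.
FLOCK_TIMEOUT_SECONDS = 30
BLOCKED_ACQUISITIONS = 3


def second_handle_is_blocked() -> bool:
    """Return True if a second handle in the same process cannot get the lock."""
    with tempfile.NamedTemporaryFile() as tmp:
        first = open(tmp.name, "r+b")
        second = open(tmp.name, "r+b")
        try:
            fcntl.flock(first, fcntl.LOCK_EX)  # held and never released
            try:
                # Non-blocking attempt so the example returns immediately
                # instead of waiting for a timeout.
                fcntl.flock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except BlockingIOError:
                return True
            return False
        finally:
            second.close()
            first.close()


if __name__ == "__main__":
    print("second handle blocked:", second_handle_is_blocked())
    print("total delay (s):", BLOCKED_ACQUISITIONS * FLOCK_TIMEOUT_SECONDS)
```

A blocking attempt with a 30-second timeout in place of `LOCK_NB` would stall for the full timeout each time, which matches the 3 x 30 s arithmetic in the quoted text.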


Re: Spam rule for HTTP/HTTPS request to sender's root domain

2019-03-01 Thread Mike Marynowski
Looking for an A record on what - just the email address domain or the 
chain of parent domains as well? If the latter, well a lack of A record 
will cause this to fail so it's kind of embedded in.


Quick sampling of 10 emails: 8 of them have valid A records on the email 
domain. I presumed SpamAssassin was already doing simple checks like that.


On 3/1/2019 10:23 AM, RW wrote:

> On Wed, 27 Feb 2019 12:16:20 -0500
> Mike Marynowski wrote:
>
>> Almost all of the spam emails that are
>> coming through do not have a working website at the root domain of
>> the sender.
>
> Did you establish what fraction of this spam could be caught just by
> looking for an A record?





Re: whitelist_from_rcvd hits only sometimes

2019-03-01 Thread Matus UHLAR - fantomas

On 28.02.19 12:44, Helmut Schneider wrote:

>I'm trying to find out why a message sometimes hits whitelist_from_rcvd
>and sometimes does not. I checked the headers again and again but
>cannot see the difference.
>
>whitelist_from_rcvd quarant...@eu.quarantine.symantec.com messagelabs.com
>whitelist_from_rcvd quarant...@eu.quarantine.symantec.com messagelabs.net

>Miss:


this looks like the "mydomain Content Filter" has modified the message
headers so spamassassin didn't parse them properly.
Do you have the original file?


X-Spam-Score: 19.767
X-Spam-Level: ***
X-Spam-Status: Yes, score=19.767 tagged_above=- required=6.3
 tests=[BAYES_99=6.5, BAYES_999=6.5, HELO_MISC_IP=0.25,
 HTML_MESSAGE=0.001, INTERNETX_UCE_NOT_REG=5, MIME_HTML_ONLY=0.723,
 RCVD_IN_DNSWL_NONE=-0.0001, RDNS_NONE=0.793]
 autolearn=no autolearn_force=no
Received: from deaugmail02.mydomain.com ([127.0.0.1])
 by localhost (deaugmail02.mydomain.com [127.0.0.1]) 
(amavisd-new,port 10024)

 with ESMTP id TbYATLBnkUKk for ;
 Tue, 26 Feb 2019 01:19:03 +0100 (CET)
MIME-Version: 1.0
Subject: [mydomain Content Filter] [EXT] Email Quarantine: You have 2 new
 emails
Received: from deaugmail01-in.mydomain.com (mailin.desog.mydomain.com
[172.20.16.23])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by deaugmail02.mydomain.com (Postfix) with ESMTPS
 for ; Tue, 26 Feb 2019 01:19:03 +0100 (CET)
Received: from mail6.bemta25.messagelabs.com
(mail6.bemta25.messagelabs.com [195.245.230.106])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256bits))
 (No client certificate requested)
 by deaugmail01-in.mydomain.com (Postfix) with ESMTPS id CC521D3AD2F
 for ; Tue, 26 Feb 2019 01:19:03 +0100 (CET)
Received: from [46.226.52.194] (using TLSv1.2 with cipher
DHE-RSA-AES256-GCM-SHA384 (256 bits))
 by server-2.bemta.az-b.eu-west-1.aws.symcld.net id
45/A1-14990-7F5847C5; Tue, 26 Feb 2019 00:19:03 +
Received: (qmail 17246 invoked from network); 26 Feb 2019 00:19:02 -
Received: from mail-css2-1.ld1.messagelabs.net (HELO
inbound.prqfe006003.mgmt.messagelabs.net) (95.131.104.177)
by server-22.tower-282.messagelabs.com with DHE-RSA-AES256-GCM-SHA384
encrypted SMTP; 26 Feb 2019 00:19:02 -
Received: from [127.0.0.1] ([127.0.0.1:38688]
helo=prqfe006003.mgmt.messagelabs.net)
 by prqfe006003.mgmt.messagelabs.net (envelope-from
)
 (ecelerity 4.2.28.58446 r(Core:4.2.28.1)) with
ESMTPS(cipher=AES256-SHA256)
 id DB/F9-02397-6F5847C5; Tue, 26 Feb 2019 00:19:02 +
To: intern...@mydomain.com
Date: Tue, 26 Feb 2019 00:19:02 +
Message-Id:
<20190226001902.43540a5f10d008b5d2c8...@quarantine.messagelabs.com>
From: Email Quarantine 


--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
You have the right to remain silent. Anything you say will be misquoted,
then used against you. 


Re: whitelist_from_rcvd hits only sometimes

2019-03-01 Thread RW
On Fri, 1 Mar 2019 17:46:55 +0100
Matus UHLAR - fantomas wrote:

> On 28.02.19 12:44, Helmut Schneider wrote:
> >I'm trying to find out why a message sometimes hits
> >whitelist_from_rcvd and sometimes does not. I checked the headers
> >again and again but cannot see the difference.
> >
> >whitelist_from_rcvd quarant...@eu.quarantine.symantec.com messagelabs.com
> >whitelist_from_rcvd quarant...@eu.quarantine.symantec.com messagelabs.net
> 
> 
> >Miss:  
> 
> this looks like the "mydomain Content Filter" has modified the message
> headers so spamassassin didn't parse them properly.
> Do you have the original file?

I removed the SpamAssassin lines and  fixed the wrapping before
testing. There was no problem in parsing it.


Re: whitelist_from_rcvd hits only sometimes

2019-03-01 Thread Matus UHLAR - fantomas



On 28.02.19 12:44, Helmut Schneider wrote:
>I'm trying to find out why a message sometimes hits
>whitelist_from_rcvd and sometimes does not. I checked the headers
>again and again but cannot see the difference.
>
>whitelist_from_rcvd quarant...@eu.quarantine.symantec.com messagelabs.com
>whitelist_from_rcvd quarant...@eu.quarantine.symantec.com messagelabs.net


>Miss:



>>On Fri, 1 Mar 2019 17:46:55 +0100
>>Matus UHLAR - fantomas wrote:
>>
>>this looks like the "mydomain Content Filter" has modified the message
>>headers so spamassassin didn't parse them properly.
>>Do you have the original file?

On 01.03.19 17:41, RW wrote:
>I removed the SpamAssassin lines and fixed the wrapping before
>testing. There was no problem in parsing it.


maybe the original mail was broken in a way SA could not parse it.
hard to decide with only pasted content.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
I wonder how much deeper the ocean would be without sponges. 


Re: Spam rule for HTTP/HTTPS request to sender's root domain

2019-03-01 Thread RW
On Fri, 1 Mar 2019 11:37:18 -0500
Mike Marynowski wrote:

> Looking for an A record on what - just the email address domain or
> the chain of parent domains as well? If the latter, well a lack of A
> record will cause this to fail so it's kind of embedded in.

Sure, but had it turned-out that most of these domains didn't have the A
record necessary for your HTTP test, it wouldn't have been worth doing
anything more complicated. 

> Quick sampling of 10 emails: 8 of them have valid A records on the
> email domain. I presumed SpamAssassin was already doing simple checks
> like that.

You don't need an A record for email. The last time I looked it just
tests that there's enough DNS for a bounce to be received, so an A or
MX for the sender domain.
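
To make the distinction concrete: mail only needs an MX or an A record on the sender domain, whereas the website test needs an address record to connect to. A minimal Python sketch follows. The `lookup` callable is a stand-in for whatever DNS client is in use (the standard library has no MX lookup), and both function names are hypothetical.

```python
from typing import Callable, Iterable

# lookup(domain, record_type) returns the matching records, empty if none.
Lookup = Callable[[str, str], Iterable[str]]


def has_records(domain: str, record_type: str, lookup: Lookup) -> bool:
    """True if at least one record of the given type exists."""
    return len(list(lookup(domain, record_type))) > 0


def can_receive_bounce(domain: str, lookup: Lookup) -> bool:
    """Enough DNS for a bounce: an MX record, or failing that an A record."""
    return has_records(domain, "MX", lookup) or has_records(domain, "A", lookup)


def could_host_website(domain: str, lookup: Lookup) -> bool:
    """An HTTP test on the domain itself needs an A record to connect to."""
    return has_records(domain, "A", lookup)
```

A domain with only an MX record passes the first check and fails the second, which is why the two tests are not interchangeable.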


Re: Spam rule for HTTP/HTTPS request to sender's root domain

2019-03-01 Thread Antony Stone
On Friday 01 March 2019 at 17:37:18, Mike Marynowski wrote:

> Quick sampling of 10 emails: 8 of them have valid A records on the email
> domain. I presumed SpamAssassin was already doing simple checks like that.

That doesn't sound like a good idea to me (presuming, I mean).


Antony.

-- 
"The future is already here.   It's just not evenly distributed yet."

 - William Gibson

   Please reply to the list;
 please *don't* CC me.


Re: Spam rule for HTTP/HTTPS request to sender's root domain

2019-03-01 Thread Mike Marynowski
Sorry, I meant I thought it was doing those checks because I know I was 
playing with checking A records before and figured the rules would have 
it enabled by default...I tried to find the rules after I sent that 
message and realized that was related to sender domain A record checks 
done in my MTA.


On 3/1/2019 2:26 PM, Antony Stone wrote:

> On Friday 01 March 2019 at 17:37:18, Mike Marynowski wrote:
>
>> Quick sampling of 10 emails: 8 of them have valid A records on the email
>> domain. I presumed SpamAssassin was already doing simple checks like that.
>
> That doesn't sound like a good idea to me (presuming, I mean).
>
> Antony.





Re: Spam rule for HTTP/HTTPS request to sender's root domain

2019-03-01 Thread Mike Marynowski

On 3/1/2019 1:07 PM, RW wrote:

> Sure, but had it turned-out that most of these domains didn't have the A
> record necessary for your HTTP test, it wouldn't have been worth doing
> anything more complicated.


I've noticed a lot of the spam domains appear to point to actual web 
servers but throw 403 or 503 errors, which A records wouldn't help with 
and has been taken into account here. As for being "more complicated" - 
it's basically done and running in my test environment for final 
tweaking haha, so bit late now :P It was only a day's work to put 
everything together including the DNS service and caching layer, so meh. 
Unless you mean complicated in the sense that it's more technically 
complicated as opposed to effort wise.



> You don't need an A record for email. The last time I looked it just
> tests that there's enough DNS for a bounce to be received, so an A or
> MX for the sender domain.


I'm confusing different tests here, you can disregard my previous message.



Re: Spam rule for HTTP/HTTPS request to sender's root domain

2019-03-01 Thread Grant Taylor

On 02/28/2019 09:39 PM, Mike Marynowski wrote:
> I modified it so it checks the root domain and all subdomains up to the 
> email domain.


:-)

> As for your question - if afraid.org has a website then you are correct, 
> all subdomains of afraid.org will not flag this rule, but if lots of 
> afraid.org subdomains are sending spam then I imagine other spam 
> detection methods will have a good chance of catching it.


ACK

afraid.org is much like DynDNS in that one entity (afraid.org themselves 
or DynDNS) provides DNS services for other entities.


I don't see a good way to differentiate between the sets of entities.

> I'm not sure what you mean by "working up the tree" - if afraid.org has 
> a website and I work my way up the tree then either way eventually I'll 
> hit afraid.org and get a valid website, no?


True.

I wonder if there is any value in detecting zone boundaries via not 
going any higher up the tree past the zone that's containing the email 
domain(s).


Perhaps something like that would enable differentiation between Afraid 
& DynDNS and the entities that they are hosting DNS services for. 
(Assuming that there are separate zones.)


> My current implementation fires off concurrent HTTP requests to the root 
> domain and all subdomains up to the email domain and waits for a valid 
> answer from any of them.


ACK

s/up to/down to/
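
For readers following along, the "root domain down to the email domain" walk with concurrent checks might look roughly like this in Python. It is a sketch, not the implementation discussed in the thread: the function names are invented, `has_website` is a caller-supplied check, and the root domain is naively taken to be the last two labels (a real version would need something like the Public Suffix List for names such as example.co.uk).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, List


def candidate_domains(email_domain: str, root_labels: int = 2) -> List[str]:
    """List the root domain first, then each subdomain down to email_domain."""
    labels = email_domain.lower().rstrip(".").split(".")
    if len(labels) <= root_labels:
        return [".".join(labels)]
    return [".".join(labels[-n:]) for n in range(root_labels, len(labels) + 1)]


def any_domain_has_website(
    email_domain: str,
    has_website: Callable[[str], bool],
    root_labels: int = 2,
) -> bool:
    """Run has_website on every candidate concurrently; True if any succeeds."""
    domains = candidate_domains(email_domain, root_labels)
    with ThreadPoolExecutor(max_workers=len(domains)) as pool:
        futures = [pool.submit(has_website, d) for d in domains]
        for future in as_completed(futures):
            if future.result():
                # The first valid answer decides the result; the pool still
                # waits for the remaining checks to finish when it exits.
                return True
    return False
```

For example, `candidate_domains("mail.a.example.com")` gives `example.com`, `a.example.com`, `mail.a.example.com`, which also shows why any subdomain of afraid.org passes as soon as afraid.org itself has a website.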

I don't grok the value of doing this as well as you do.  But I think 
your use case is enough different than mine such that I can't make an 
objective value estimate.


That being said, I do find the idea technically interesting, even if I 
think I'll not utilize it.




--
Grant. . . .
unix || die





Re: Spam rule for HTTP/HTTPS request to sender's root domain

2019-03-01 Thread Grant Taylor

On 03/01/2019 01:25 AM, Rupert Gallagher wrote:
> A future-proof list that complies with GDPR would automatically rewrite 
> the To header, leaving the list address only.


Doesn't GDPR also include things like signatures?  Thus if the mailing 
list is only modifying the email metadata and not the message body (thus 
signature), then it's still subject to GDPR.


I also feel like it is a disservice to the mailing list to hide who the 
message is from.  But I have no idea of the legalities of (not) doing such.



> Any other recipient will still receive it from the original sender.


I presume you're talking about (B)CC and additional To recipients.

I never did hear, how does GDPR play out in such a scenario.  Does the 
sender need to make a request to all To / (B)CC recipients for them to 
forget the sender?  Also, does the mailing list operator have any 
responsibility to pass the request on to all subscribers to purge the 
requester from their personal archives?  I feel like there's a LOT of 
unaddressed issues here, and that singling out the mailing list is 
somewhat unfair.  But life's unfair.  So … ¯\_(ツ)_/¯




--
Grant. . . .
unix || die





Re: Spam rule for HTTP/HTTPS request to sender's root domain

2019-03-01 Thread Mike Marynowski



On 3/1/2019 4:31 PM, Grant Taylor wrote:
> afraid.org is much like DynDNS in that one entity (afraid.org 
> themselves or DynDNS) provides DNS services for other entities.
>
> I don't see a good way to differentiate between the sets of entities.


I haven't come across any notable amount of spam that's punched through 
all the other detection methods in place with a reply-to/from email 
address subdomain on a service like that. I'm sure it happens though and 
in that case this filter simply won't add any value.




Re: Spam rule for HTTP/HTTPS request to sender's root domain

2019-03-01 Thread Rupert Gallagher
Case study:

example.com bans any e-mail sent from its third levels up, and does it by SPF.

spf-banned.example.com sent mail, and my SA at server.com adds a big fat 
penalty, high enough to bounce it.

Suppose I do not bounce it, and use your filter to check for its websites. It 
turns out that both example.com and spf-banned.example.com have a website. Was 
it worth it to spend cycles on it? I guess not. SPF is an accepted RFC and 
it should have priority. So, I recommend the website test to first read the 
result of the SPF test, quit when positive, continue otherwise.
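
The control flow being recommended can be sketched in a few lines of Python. This only illustrates the ordering and is not SpamAssassin configuration. The function name and the `penalty` parameter are invented, and "quit when positive" is read here as "skip the website test when the SPF test has already flagged the message".

```python
from typing import Callable


def website_test_penalty(
    spf_failed: bool,
    check_website: Callable[[], bool],
    penalty: float = 1.0,
) -> float:
    """Score contribution of the website test, gated on the SPF result."""
    if spf_failed:
        # SPF already flagged the message: do not spend cycles on HTTP checks.
        return 0.0
    # Only now perform the (comparatively expensive) website lookup.
    return 0.0 if check_website() else penalty
```

Passing the check as a callable means the lookup is never started when SPF has already decided the outcome.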

--- ruga

On Fri, Mar 1, 2019 at 22:31, Grant Taylor  wrote:

> On 02/28/2019 09:39 PM, Mike Marynowski wrote:
>> I modified it so it checks the root domain and all subdomains up to the
>> email domain.
>
> :-)
>
>> As for your question - if afraid.org has a website then you are correct,
>> all subdomains of afraid.org will not flag this rule, but if lots of
>> afraid.org subdomains are sending spam then I imagine other spam
>> detection methods will have a good chance of catching it.
>
> ACK
>
> afraid.org is much like DynDNS in that one entity (afraid.org themselves
> or DynDNS) provides DNS services for other entities.
>
> I don't see a good way to differentiate between the sets of entities.
>
>> I'm not sure what you mean by "working up the tree" - if afraid.org has
>> a website and I work my way up the tree then either way eventually I'll
>> hit afraid.org and get a valid website, no?
>
> True.
>
> I wonder if there is any value in detecting zone boundaries via not
> going any higher up the tree past the zone that's containing the email
> domain(s).
>
> Perhaps something like that would enable differentiation between Afraid
> & DynDNS and the entities that they are hosting DNS services for.
> (Assuming that there are separate zones.)
>
>> My current implementation fires off concurrent HTTP requests to the root
>> domain and all subdomains up to the email domain and waits for a valid
>> answer from any of them.
>
> ACK
>
> s/up to/down to/
>
> I don't grok the value of doing this as well as you do. But I think
> your use case is enough different than mine such that I can't make an
> objective value estimate.
>
> That being said, I do find the idea technically interesting, even if I
> think I'll not utilize it.
>
> --
> Grant. . . .
> unix || die

Re: Spam rule for HTTP/HTTPS request to sender's root domain

2019-03-01 Thread Mike Marynowski
Does SpamAssassin even have facilities to do that? Don't all rules run 
all the time? SpamAssassin still needs to run all the rules because MTAs 
might have different spam mark / spam delete /etc thresholds than the 
one set in SA.


The number of cycles you're talking about is the same as an RBL lookup 
so I really don't see it as being significant. The DNS service does all 
the heavy lifting and I'm planning to make it public.


On 3/1/2019 5:09 PM, Rupert Gallagher wrote:

> Case study:
>
> example.com bans any e-mail sent from its third levels up, and does it 
> by spf.
>
> spf-banned.example.com sent mail, and my SA at server.com adds a big 
> fat penalty, high enough to bounce it.
>
> Suppose I do not bounce it, and use your filter to check for its 
> websites. It turns out that both example.com and 
> spf-banned.example.com have a website. Was it worth it to spend cycles 
> on it? I guess not. The spf is an accepted rfc and it should have 
> priority. So, I recommend the website test to first read the result of 
> the SPF test, quit when positive, continue otherwise.
>
> --- ruga



On Fri, Mar 1, 2019 at 22:31, Grant Taylor wrote:

On 02/28/2019 09:39 PM, Mike Marynowski wrote:
> I modified it so it checks the root domain and all subdomains up to the
> email domain.

:-)

> As for your question - if afraid.org has a website then you are correct,
> all subdomains of afraid.org will not flag this rule, but if lots of
> afraid.org subdomains are sending spam then I imagine other spam
> detection methods will have a good chance of catching it.

ACK

afraid.org is much like DynDNS in that one entity (afraid.org themselves
or DynDNS) provides DNS services for other entities.

I don't see a good way to differentiate between the sets of entities.

> I'm not sure what you mean by "working up the tree" - if afraid.org has
> a website and I work my way up the tree then either way eventually I'll
> hit afraid.org and get a valid website, no?

True.

I wonder if there is any value in detecting zone boundaries via not
going any higher up the tree past the zone that's containing the email
domain(s).

Perhaps something like that would enable differentiation between Afraid
& DynDNS and the entities that they are hosting DNS services for.
(Assuming that there are separate zones.)

> My current implementation fires off concurrent HTTP requests to the root
> domain and all subdomains up to the email domain and waits for a valid
> answer from any of them.

ACK

s/up to/down to/

I don't grok the value of doing this as well as you do. But I think
your use case is enough different than mine such that I can't make an
objective value estimate.

That being said, I do find the idea technically interesting, even if I
think I'll not utilize it.



--
Grant. . . .
unix || die








Re: Spam rule for HTTP/HTTPS request to sender's root domain

2019-03-01 Thread Rupert Gallagher
The focus was on the To header for mailing lists, complaints on MUAs and 
people's choices. If you do not want to appear in the To header of a list, you 
are exercising a legal right under the GDPR. So, to cut through all those 
problems and enforce a sound solution, I suggest list majordomos do the 
compliance heavy lifting by forcing a sane To header. That's all. If you want 
to talk more in general about GDPR, I do it every day, so leave me alone on 
weekends, will you? :-)

On Fri, Mar 1, 2019 at 22:41, Grant Taylor  wrote:

> On 03/01/2019 01:25 AM, Rupert Gallagher wrote:
>> A future-proof list that complies with GDPR would automatically rewrite
>> the To header, leaving the list address only.
>
> Doesn't GDPR also include things like signatures? Thus if the mailing
> list is only modifying the email metadata and not the message body (thus
> signature), then it's still subject to GDPR.
>
> I also feel like it is a disservice to the mailing list to hide who the
> message is from. But I have no idea of the legalities of (not) doing such.
>
>> Any other recipient will still receive it from the original sender.
>
> I presume you're talking about (B)CC and additional To recipients.
>
> I never did hear, how does GDPR play out in such a scenario. Does the
> sender need to make a request to all To / (B)CC recipients for them to
> forget the sender? Also, does the mailing list operator have any
> responsibility to pass the request on to all subscribers to purge the
> requester from their personal archives? I feel like there's a LOT of
> unaddressed issues here, and that singling out the mailing list is
> somewhat unfair. But life's unfair. So … ¯\_(ツ)_/¯
>
> --
> Grant. . . .
> unix || die

Re: Spam rule for HTTP/HTTPS request to sender's root domain

2019-03-01 Thread Rupert Gallagher
On Fri, Mar 1, 2019 at 23:14, Mike Marynowski  wrote:

> Does SpamAssassin even have facilities to do that?

Yes, if spf runs at priority 1, you can define your test at priority 2, so SA 
executes them in the given order.

> Don't all rules run all the time?

They run when relevant, in the given order, and they do what they say, so if 
you say that webtest stops if spf test succeeds, then SA does it.

> SpamAssassin still needs to run all the rules because MTAs might have 
> different spam mark / spam delete /etc thresholds than the one set in SA.
>
> The number of cycles you're talking about is the same as an RBL lookup so I 
> really don't see it as being significant. The DNS service does all the heavy 
> lifting and I'm planning to make it public.

It is significant if you have many emails to process. It is even more 
significant if you run the test locally.
>
> On 3/1/2019 5:09 PM, Rupert Gallagher wrote:
>
>> Case study:
>>
>> example.com bans any e-mail sent from its third levels up, and does it by 
>> spf.
>>
>> spf-banned.example.com sent mail, and my SA at server.com adds a big fat 
>> penalty, high enough to bounce it.
>>
>> Suppose I do not bounce it, and use your filter to check for its websites. 
>> It turns out that both example.com and spf-banned.example.com have a 
>> website. Was it worth it to spend cycles on it? I guess not. The spf is an 
>> accepted rfc and it should have priority. So, I recommend the website test 
>> to first read the result of the SPF test, quit when positive, continue 
>> otherwise.
>>
>> --- ruga
>>
>> On Fri, Mar 1, 2019 at 22:31, Grant Taylor wrote:
>>
>>> On 02/28/2019 09:39 PM, Mike Marynowski wrote:
>>>> I modified it so it checks the root domain and all subdomains up to the
>>>> email domain.
>>>
>>> :-)
>>>
>>>> As for your question - if afraid.org has a website then you are correct,
>>>> all subdomains of afraid.org will not flag this rule, but if lots of
>>>> afraid.org subdomains are sending spam then I imagine other spam
>>>> detection methods will have a good chance of catching it.
>>>
>>> ACK
>>>
>>> afraid.org is much like DynDNS in that one entity (afraid.org themselves
>>> or DynDNS) provides DNS services for other entities.
>>>
>>> I don't see a good way to differentiate between the sets of entities.
>>>
>>>> I'm not sure what you mean by "working up the tree" - if afraid.org has
>>>> a website and I work my way up the tree then either way eventually I'll
>>>> hit afraid.org and get a valid website, no?
>>>
>>> True.
>>>
>>> I wonder if there is any value in detecting zone boundaries via not
>>> going any higher up the tree past the zone that's containing the email
>>> domain(s).
>>>
>>> Perhaps something like that would enable differentiation between Afraid
>>> & DynDNS and the entities that they are hosting DNS services for.
>>> (Assuming that there are separate zones.)
>>>
>>>> My current implementation fires off concurrent HTTP requests to the root
>>>> domain and all subdomains up to the email domain and waits for a valid
>>>> answer from any of them.
>>>
>>> ACK
>>>
>>> s/up to/down to/
>>>
>>> I don't grok the value of doing this as well as you do. But I think
>>> your use case is enough different than mine such that I can't make an
>>> objective value estimate.
>>>
>>> That being said, I do find the idea technically interesting, even if I
>>> think I'll not utilize it.
>>>
>>> --
>>> Grant. . . .
>>> unix || die