On Tuesday, February 14, 2023 4:16:00 PM EST Evan Burke wrote:
> On Tue, Feb 14, 2023 at 11:44 AM Michael Thomas <m...@mtcc.com> wrote:
> > On Tue, Feb 14, 2023 at 11:18 AM Michael Thomas <m...@mtcc.com> wrote:
> >> Have you considered something like rate limiting on the receiver side for
> >> things with duplicate msg-id's? Aka, a tar pit, iirc?
> 
> I believe Yahoo does currently use some sort of count-based approach to
> detect replay, though I'm not clear on the details.
> 
> > As I recall that technique is sometimes not suggested because (a) we can't
> > come up with good advice about how long you need to cache message IDs to
> > watch for duplicates, and (b) the longer that cache needs to live, the
> > larger of a resource burden the technique imposes, and small operators
> > might not be able to do it well.
> > 
> > At maximum, isn't it just the x= value? It seems to me that if you don't
> > specify an x= value, or it's essentially infinite, they are saying they
> > don't care about "replays". Which is fine in most cases and you can just
> > ignore it. Something that really throttles down x= should be a tractable
> > problem, right?
> > 
> > But even at scale it seems like a pretty small database in comparison to
> > the overall volume. It would be easy for a receiver to just prune it
> > after a day or so, say.
> 
> I think count-based approaches can be made even simpler than that, in fact.
> I'm halfway inclined to submit a draft using that approach, as time permits.

I suppose if the thresholds are high enough, it won't hit much in the way of
legitimate mail (as an example, I anticipate this message will hit at least
hundreds of mailboxes at Gmail, but not millions), but of course letting the
first X copies through isn't ideal.
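
To make the threshold trade-off concrete, here is a rough Python sketch of a
count-based check. It is only an illustration under assumptions of my own: the
DupeCounter class, the choice of key (for example a hash of the signature's b=
value), the threshold, and the one-day retention are all hypothetical, not
anything Yahoo or a draft specifies.

```python
import time
from collections import OrderedDict


class DupeCounter:
    """Counts sightings of a signature key and flags it past a threshold.

    Hypothetical sketch: sig_key could be a hash of the DKIM b= value.
    """

    def __init__(self, threshold, max_age_seconds):
        self.threshold = threshold      # copies allowed before flagging
        self.max_age = max_age_seconds  # how long a key is remembered
        self._seen = OrderedDict()      # sig_key -> (first_seen, count)

    def observe(self, sig_key, now=None):
        """Record one sighting; return True once the count exceeds threshold."""
        now = time.time() if now is None else now
        self._prune(now)
        first_seen, count = self._seen.get(sig_key, (now, 0))
        count += 1
        # Updating an existing key keeps its original position, so the
        # dict stays ordered by first_seen.
        self._seen[sig_key] = (first_seen, count)
        return count > self.threshold

    def _prune(self, now):
        """Drop keys first seen more than max_age seconds ago."""
        while self._seen:
            key, (first_seen, _) = next(iter(self._seen.items()))
            if now - first_seen <= self.max_age:
                break
            del self._seen[key]
```

With threshold=3, the first three copies of a signature pass and only the
fourth is flagged, which is exactly the "first X get through" weakness.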

If I had access to a database of numerically scored IP reputation values (I
don't currently, but I have in the past, so I can at least imagine this), I
think I'd be more inclined to compare two things: the reputation of the
domain as a whole (something like the average score of messages with an SPF
validated Mail From, a DKIM validated d=, or a DMARC pass for that domain)
and the reputation of the IP a given message from that domain arrived from.
If there was sufficient statistical confidence that the IP's reputation was
"bad" compared to the domain's, I would infer the message was likely being
replayed and ignore the signature.
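
As a sketch of what "sufficient statistical confidence" might look like, the
Python below compares the mean score of messages seen from one IP against the
mean for the domain as a whole, using a simple one-sided z-test. Everything
here is an assumption for illustration: the looks_replayed function, the
convention that higher scores are better, the minimum sample size, and the
cutoff of 2.33 (roughly 99% one-sided confidence) are hypothetical choices,
not a description of any real reputation system.

```python
import math
from statistics import mean, stdev


def looks_replayed(domain_scores, ip_scores, z_cutoff=2.33, min_samples=30):
    """Return True if the IP's mean score is significantly below the domain's.

    domain_scores: scores of authenticated messages for the domain overall.
    ip_scores: scores of messages for that domain arriving from one IP.
    Assumes higher scores mean better reputation.
    """
    if len(domain_scores) < min_samples or len(ip_scores) < min_samples:
        return False  # too little data to claim any confidence

    d_mean, i_mean = mean(domain_scores), mean(ip_scores)

    # Standard error of the difference between the two sample means.
    se = math.sqrt(
        stdev(domain_scores) ** 2 / len(domain_scores)
        + stdev(ip_scores) ** 2 / len(ip_scores)
    )
    if se == 0:
        return i_mean < d_mean  # no variance at all: compare means directly

    z = (d_mean - i_mean) / se
    return z > z_cutoff
```

A receiver that got True here would treat the signature as if it were absent
for that message, rather than counting copies.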

I think that approaches the same effect as a "too many dupes" approach without 
the threshold problem.  It does require reputation data, but I assume any 
entity of a non-trivial size either has access to their own or can buy it from 
someone else.

Scott K


_______________________________________________
Ietf-dkim mailing list
Ietf-dkim@ietf.org
https://www.ietf.org/mailman/listinfo/ietf-dkim
