On Tuesday, February 14, 2023 4:16:00 PM EST Evan Burke wrote:
> On Tue, Feb 14, 2023 at 11:44 AM Michael Thomas <m...@mtcc.com> wrote:
> > On Tue, Feb 14, 2023 at 11:18 AM Michael Thomas <m...@mtcc.com> wrote:
> >> Have you considered something like rate limiting on the receiver side for
> >> things with duplicate msg-id's? Aka, a tar pit, iirc?
>
> I believe Yahoo does currently use some sort of count-based approach to
> detect replay, though I'm not clear on the details.
>
> > As I recall that technique is sometimes not suggested because (a) we can't
> > come up with good advice about how long you need to cache message IDs to
> > watch for duplicates, and (b) the longer that cache needs to live, the
> > larger of a resource burden the technique imposes, and small operators
> > might not be able to do it well.
> >
> > At maximum, isn't it just the x= value? It seems to me that if you don't
> > specify an x= value, or it's essentially infinite, they are saying they
> > don't care about "replays". Which is fine in most cases and you can just
> > ignore it. Something that really throttles down x= should be a tractable
> > problem, right?
> >
> > But even at scale it seems like a pretty small database in comparison to
> > the overall volume. It's would be easy for a receiver to just prune it
> > after a day or so, say.
>
> I think count-based approaches can be made even simpler than that, in fact.
> I'm halfway inclined to submit a draft using that approach, as time permits.
I suppose if the thresholds are high enough, it won't hit much in the way of legitimate mail (as an example, I anticipate this message will hit at least hundreds of mailboxes at Gmail, but not millions), but of course letting the first X through isn't ideal.

If I had access to a database of numerically scored IP reputation values (I don't currently, but I have in the past, so I can at least imagine this), I think I'd be more inclined to compare two things: the reputation of the domain as a whole (something like the average score of messages from an SPF-validated Mail From, a DKIM-validated d=, or a DMARC-pass domain) and the reputation of the IP for a message from that domain. If there was sufficient statistical confidence that the reputation of the IP was "bad" compared to the domain's reputation, I would infer the message was likely being replayed and ignore the signature.

I think that approaches the same effect as a "too many dupes" approach without the threshold problem. It does require reputation data, but I assume any entity of a non-trivial size either has access to its own or can buy it from someone else.

Scott K
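
To make the reputation comparison above a bit more concrete, here is a rough sketch of one way it could be done. Everything in it is an assumption for illustration: the function name, the idea that scores are per-message numbers where higher means better, the sample-size floor, and the use of a one-sided Welch-style z statistic as the "sufficient statistical confidence" test. It is not taken from any existing receiver implementation.

```python
import math
from statistics import mean, variance


def likely_replayed(domain_scores, ip_scores, z_threshold=3.0, min_samples=30):
    """Guess whether mail from one IP looks like a replay of a domain's signature.

    domain_scores: hypothetical per-message reputation scores for the domain
                   as a whole (higher = better; this scale is an assumption).
    ip_scores:     hypothetical scores for messages carrying that domain's
                   signature that arrived from the IP in question.
    Returns True only when the IP's mean score is below the domain's mean by
    at least z_threshold standard errors.
    """
    # Without enough samples there is no statistical confidence either way,
    # so the signature is left alone.
    if len(domain_scores) < min_samples or len(ip_scores) < min_samples:
        return False

    diff = mean(domain_scores) - mean(ip_scores)

    # Standard error of the difference of two means (unequal variances).
    se = math.sqrt(
        variance(domain_scores) / len(domain_scores)
        + variance(ip_scores) / len(ip_scores)
    )
    if se == 0:
        # Both samples are constant; any positive gap is unambiguous.
        return diff > 0

    return diff / se >= z_threshold
```

A receiver would call something like this per (domain, IP) pair and, on True, treat the DKIM signature as if it were absent. Unlike a duplicate counter, nothing here depends on how many copies have already been let through.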