Re: [dmarc-discuss] DMARC Reporting De-duplication
In article <1675430.NNnUSil6oV@kitterma-e6430> you write:
>As an example, I have been able to find four messages I sent to
>lists.debian.org email lists on April 30th. The volume reported for that
>source for that day from various feedback reporters was 2,436. This makes it
>a little hard to consume the feedback.

My feedback goes into a database where I do occasional summary queries. I don't recall any particular problems doing the analysis, and it is kind of fun to extract numbers like how many NANOG subscribers get their mail at Gmail.

If a future DMARC 1.01 had deduped reports, some would be deduped, some wouldn't, and it'd be if anything harder to find the signal in the noise.

R's,
John
___
dmarc-discuss mailing list
dmarc-discuss@dmarc.org
http://www.dmarc.org/mailman/listinfo/dmarc-discuss
NOTE: Participating in this list means you agree to the DMARC Note Well terms (http://www.dmarc.org/note_well.html)
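As an illustration of the kind of summary query described in the message above, here is a minimal sketch using Python's built-in sqlite3 module. The table layout (`report_rows`), the column names, the reporter names, and all the numbers are assumptions made up for this example; they are not the schema or data of anyone on this list.

```python
import sqlite3

# In-memory database; a real setup would load rows parsed from aggregate reports.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE report_rows (
        reporter   TEXT,     -- organization that sent the aggregate report
        report_day TEXT,     -- day covered by the report (ISO date)
        source_ip  TEXT,     -- sending IP address named in the row
        msg_count  INTEGER   -- value of the row's count field
    )
    """
)

# Hypothetical sample rows; names and volumes are placeholders.
conn.executemany(
    "INSERT INTO report_rows VALUES (?, ?, ?, ?)",
    [
        ("reporter-a.example", "2018-04-30", "192.0.2.1", 1200),
        ("reporter-a.example", "2018-04-30", "192.0.2.2", 30),
        ("reporter-b.example", "2018-04-30", "192.0.2.1", 400),
    ],
)

# Reported volume per source and reporter for one day, largest first.
summary = conn.execute(
    """
    SELECT source_ip, reporter, SUM(msg_count) AS total
    FROM report_rows
    WHERE report_day = ?
    GROUP BY source_ip, reporter
    ORDER BY total DESC
    """,
    ("2018-04-30",),
).fetchall()

for source_ip, reporter, total in summary:
    print(source_ip, reporter, total)
```

Grouping by reporter as well as source keeps it visible which mailbox provider contributed which share of the volume, which is the sort of number the message mentions extracting.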
Re: [dmarc-discuss] DMARC Reporting De-duplication
On Fri 04/May/2018 21:37:35 +0200 Scott Kitterman via dmarc-discuss wrote:
>
> Shouldn't it be possible to de-duplicate these based on message ID *before*
> sending aggregate reports back? Can/should this be added to DMARC the next
> time the specification is updated?

[my emphasis]

The "before" I emphasized above suggests that the message-id won't make it into the final report; including it would seriously inflate the report itself. However, in order to envisage the effect, let's suppose message-id is part of a draft report at the sender, for example right after the count field in a tabular view[*]:

Source IP   Count  Message-id    Disposition  SPF     ...
192.0.2.1   12     blob1@domain  none         ✗ Fail  ...
192.0.2.2   1      blob1@domain  none         ✓ Pass  ...
192.0.2.3   1      blob2@domain  none         ✓ Pass  ...
...

Assuming message-ids are reliable, that table shows two messages, one of which was received 13 times. The second message was received just once, but that doesn't mean it had a single recipient, does it? So, if the multi-destination delivery of a single message results from an expansion (a.k.a. explosion) performed by external relays, the count is going to be higher than for expansions performed internally.

Your proposal is to substitute "12" with "1" in the draft report, and then cut the message-id, group by source IP, From: domain (not shown), and results while counting just the rows, correct?

That technique wouldn't fully eliminate the inconsistency, because equivalent copies of a message may come from different sources. Thinking of the tricks MTAs deploy to break long recipient lists into multiple messages with shorter lists, possibly relayed by different mailouts, tells me that the count field cannot be precise in any case. It is a rough estimate of the results' impact. In that sense, "12" tells you that the SPF failures exemplified above are more important than the two passes, in case you were thinking about hardening your policy.

I'd keep it as is.
jm2c Ale -- [*] https://en.wikipedia.org/wiki/DMARC#Aggregate_reports
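To make the grouping described in the message above concrete, here is a minimal Python sketch that applies it to the three draft rows from the example. The tuple layout and the function names (`current_counts`, `deduplicated_counts`) are hypothetical, and the From: domain is left out just as it is in the example table. This is only one reading of the proposal, not a description of how any receiver builds its reports.

```python
from collections import Counter

# Hypothetical draft rows mirroring the example table:
# (source_ip, count, message_id, disposition, spf_result)
draft_rows = [
    ("192.0.2.1", 12, "blob1@domain", "none", "fail"),
    ("192.0.2.2", 1, "blob1@domain", "none", "pass"),
    ("192.0.2.3", 1, "blob2@domain", "none", "pass"),
]

def current_counts(rows):
    """Sum the per-row counts, as aggregate reports do today."""
    totals = Counter()
    for source_ip, count, _msg_id, disposition, spf in rows:
        totals[(source_ip, disposition, spf)] += count
    return totals

def deduplicated_counts(rows):
    """Count each (group, message-id) pair once, then drop the message-id."""
    seen = set()
    totals = Counter()
    for source_ip, _count, msg_id, disposition, spf in rows:
        key = (source_ip, disposition, spf)
        if (key, msg_id) not in seen:
            seen.add((key, msg_id))
            totals[key] += 1
    return totals

before = current_counts(draft_rows)
after = deduplicated_counts(draft_rows)
distinct = {row[2] for row in draft_rows}

print(sum(before.values()))  # 14 deliveries reported today
print(sum(after.values()))   # 3 after per-source de-duplication
print(len(distinct))         # 2 distinct message-ids
```

The de-duplicated total is 3 although only 2 distinct messages were sent, because blob1@domain arrived through two different sources; that is the residual inconsistency the message points out.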
Re: [dmarc-discuss] DMARC Reporting De-duplication
On 05/04/2018 12:37 PM, Scott Kitterman via dmarc-discuss wrote:
> I participate in a lot of mailing lists, many of which have a large number
> of subscribers. ... Shouldn't it be possible to de-duplicate these based on
> message ID before sending aggregate reports back? Can/should this be added
> to DMARC the next time the specification is updated?

There may be interesting anti-abuse cases that justify storing this kind of information in a readily accessible form for e.g. de-duplication, versus a static log file. But even if receivers / mailbox providers are already doing that, where's their incentive for the reporting change you describe? What would be the resulting improvement in the quality of mailstreams sent using a given domain? The reduction in customer support, or increase in customer satisfaction, of the kind they purportedly see when it's easier to detect fraudulent messages?

I can understand how the reporting change you suggest *might* be useful to the individual sender, where the sender and the domain operator total 1 or 2. Can you help us understand what's in it for the other parties involved? And how does it help in the more typical case where there are between dozens and thousands of users of the domain?

--S.
Re: [dmarc-discuss] DMARC Reporting De-duplication
Would this really help? You haven't explained what you mean by "a little hard to consume". On the face of it, it's just an integer; a 1 is no easier to perform arithmetic on than a 2,436.

If what you mean is that it's difficult to make a meaningful comparison between the number that you send and the number reported received, then that's true of course. But that would remain the case, as you'd have no way to work out which of the counts in the individual receiver reports included messages that had been processed by mailing lists; you'd replace your 2,436 with an indeterminate number between 1 and, say, 100.

- Roland

On 05/05/18 03:37, Scott Kitterman via dmarc-discuss wrote:
> I participate in a lot of mailing lists, many of which have a large number
> of subscribers. As a result, when I send a single message to a mailing
> list, many copies of the same message get sent to users at large mail
> providers. These get counted as individual messages in aggregate reporting.
>
> As an example, I have been able to find four messages I sent to
> lists.debian.org email lists on April 30th. The volume reported for that
> source for that day from various feedback reporters was 2,436. This makes
> it a little hard to consume the feedback.
>
> Shouldn't it be possible to de-duplicate these based on message ID before
> sending aggregate reports back? Can/should this be added to DMARC the next
> time the specification is updated?
>
> Scott K
[dmarc-discuss] DMARC Reporting De-duplication
I participate in a lot of mailing lists, many of which have a large number of subscribers. As a result, when I send a single message to a mailing list, many copies of the same message get sent to users at large mail providers. These get counted as individual messages in aggregate reporting.

As an example, I have been able to find four messages I sent to lists.debian.org email lists on April 30th. The volume reported for that source for that day from various feedback reporters was 2,436. This makes it a little hard to consume the feedback.

Shouldn't it be possible to de-duplicate these based on message ID before sending aggregate reports back? Can/should this be added to DMARC the next time the specification is updated?

Scott K