Re: [dmarc-ietf] Does Aggregate Reporting meet "Internet Scale" test?

Mark Alley Thu, 08 Dec 2022 04:56:59 -0800

Adding clarification since I forgot to specify: this would be per-sender, per-source, not a set percentage of all mail received from a source. That obviously would not work as intended.


On 12/8/2022 6:52 AM, Mark Alley wrote:

This may have been thought of before, so forgive the potentially duplicate idea. I was musing earlier about feedback reporting based on a percentage of the overall mail per source. I'm thinking of something similar in concept to the pct= tag for published policy.
This would reduce the overhead required to report from particular sources... But as I'm typing this idea out, it seems less than feasible due to the other considerations that come to mind. If a receiver designed to report on only 10% of mail received from a source was sent 100 emails from said source, and 80 of those emails were forwards, the feedback would be overwhelmingly biased towards forwarding data, and the sender would miss out on reports from direct senders and therefore fully compliant (and arguably more useful) reports.
Evolving on this thought, if a receiver reported subset percentages of all different types of compliant/non-compliant email per source (SPF fails/DKIM passes, SPF passes/DKIM fails, etc.), this might provide the data needed while still keeping the reporting volume manageable for less internet-scale receivers.
Though, it goes without saying, this type of reporting would be woefully inadequate in terms of data availability, and only gives an idea of traffic types seen, not the all-encompassing volumetric data that could normally be derived from feedback reporters that process all emails.
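
The per-category sampling idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the message fields (`source`, `spf`, `dkim`), the 10% rate, the "keep at least one per category" rule, and the guess that forwards show up as SPF fail / DKIM pass are all hypothetical and not taken from any DMARC specification.

```python
import random
from collections import defaultdict

def stratified_sample(messages, rate=0.10, seed=0):
    """Sample `rate` of the messages inside each (source, spf, dkim) group.

    Each group keeps at least one message, so a rare outcome is never
    dropped entirely (an assumption of this sketch, not a DMARC rule).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for msg in messages:
        strata[(msg["source"], msg["spf"], msg["dkim"])].append(msg)

    sampled = []
    for group in strata.values():
        k = max(1, round(len(group) * rate))
        sampled.extend(rng.sample(group, k))
    return sampled

# 100 messages from one source: 80 forwards (assumed SPF fail, DKIM pass)
# and 20 direct messages (both pass).
messages = (
    [{"source": "192.0.2.1", "spf": "fail", "dkim": "pass"}] * 80
    + [{"source": "192.0.2.1", "spf": "pass", "dkim": "pass"}] * 20
)
sample = stratified_sample(messages)
direct = [m for m in sample if m["spf"] == "pass"]
print(len(sample), len(direct))
```

Sampling within each category guarantees that the direct, fully compliant traffic is represented, whereas a flat 10% sample of the same 100 messages could by chance contain few or no direct messages. The volumetric totals are still lost unless the per-category counts are reported alongside the sample.
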
On 12/8/2022 12:58 AM, Douglas Foster wrote:
1) DMARC was a successful 2-company experiment, which was turned into a widely implemented informational RFC. We are now writing the standards-track version of that concept. We hope that Standards Track will provide the basis for significantly increased adoption. This seems the appropriate time to ask whether the design can be optimized for efficiency. If you were designing from scratch, would this reporting design be the result? What alternatives have we considered and ruled out?
2) The burden of reporting is not experienced equally by all report senders. If I send a batch of messages from 1 source domain to:
- 10 target domains at Google, I will get 1 report, because Google consolidates across target domains.
- 10 target domains at Yahoo, I will get 10 reports, because Yahoo chooses to disaggregate by target domain.
- 10 target domains to Ironport clients, I will get 20 or 30 reports. These are client-specific appliances, many clients have multiple appliances configured in parallel for load balancing, and each appliance produces its own report.
Google presumably can dedicate servers to the reporting function, while the Ironport servers seem to generate reports in parallel with message processing. Altogether, I conclude that Google can absorb an increase in workload much more easily than an appliance.
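
The report counts in point 2 follow from how each receiver groups its data. A minimal sketch, assuming a hypothetical helper `reports_per_period` and assuming that "20 or 30 reports" corresponds to 2 or 3 appliances per client domain (an inference from the text, not a stated figure):

```python
def reports_per_period(target_domains, reporters_per_domain=1, consolidate=False):
    """Aggregate reports one source domain receives per reporting period.

    consolidate=True models a receiver that merges all of its target
    domains into a single report; otherwise each reporting unit in each
    target domain sends its own report.
    """
    if consolidate:
        return 1
    return target_domains * reporters_per_domain

google = reports_per_period(10, consolidate=True)
yahoo = reports_per_period(10)
ironport_low = reports_per_period(10, reporters_per_domain=2)
ironport_high = reports_per_period(10, reporters_per_domain=3)
print(google, yahoo, ironport_low, ironport_high)
```
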
3) The burden of reporting is not shared equally at present. Substantially all of my reporting comes from the three sources just stated: Google, Yahoo, and Ironport appliances. Since these organizations have not been actively participating, perhaps you are right and they are happy with the present design. On the other hand, perhaps someone with connections should ask them whether they want to see optimizations.
4) As DMARC participation grows, the growth curve is not really linear. Currently, 40% of my mailstream is covered by DMARC reporting because more than 30% of my outbound mail goes to Google servers. Altogether, the number of reporting domains, from all sources, is somewhere around 40. To move reporting from 40% of messages to 40% of domains, the volume of reports will grow by orders of magnitude.
5) Which then raises the question of, "Who do we expect to do reporting?" Several participants in this group have expressed the conviction that everyone who benefits from DMARC should also contribute to DMARC by doing reporting. This seems fair, but it is probably not necessary. Reporting from Google alone is probably sufficient for domain owners to know whether or not their servers are properly configured. But as long as we want everyone to participate, we cannot assume that everyone will have Google's resources to contribute to the reporting task.
All of which says to me that we should be looking to optimize the reporting function to minimize the cost of participation.
Doug Foster


On Tue, Dec 6, 2022 at 10:15 PM Seth Blank <s...@sethblank.com> wrote:

    I'm super unclear what you're talking about.

    https://dmarc.org/2022/03/dmarc-policies-up-84-for-2021/

    Aggregate reporting is used by the largest volume senders on
    earth, and the vast majority of mail received by mailbox
    providers comes with a dmarc record and reporting address attached.

    This is umpteen billions of messages a day that get aggregated
    into reports.

    What are you getting at? That seems pretty internet scale to me...

    Seth

    On Mon, Dec 5, 2022 at 2:01 PM Douglas Foster
    <dougfoster.emailstanda...@gmail.com> wrote:

        I began wondering if Aggregate Reporting works only because
        DMARC has been embraced by a small portion of domain owners.

        1) Is Aggregate Reporting a significant portion of all mail? 
        In some cases, Yes.

        My organization's data:
        Inbound volume is 11 times greater than my outbound volume.
        Inbound mail has 1 new domain for every 5 messages

        Net result:   If I were to do reporting, and reporting became
        requested for most or all domains, my outbound mail volume
        would triple, because my outbound report volume would be
        twice as large as my outbound business mail volume.
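
The tripling estimate above can be reproduced with simple arithmetic. This sketch assumes that "1 new domain for every 5 messages" means one distinct sender domain per 5 inbound messages within a reporting period, and that each such domain would receive exactly one report per period; the absolute volume of 1000 is an arbitrary placeholder.

```python
outbound_business = 1000                      # arbitrary unit of outbound business mail
inbound = 11 * outbound_business              # inbound is 11 times outbound
sender_domains = inbound // 5                 # 1 new domain per 5 inbound messages
reports = sender_domains                      # assumed: one report per domain per period

report_ratio = reports / outbound_business                       # reports vs. business mail
total_ratio = (outbound_business + reports) / outbound_business  # new total vs. old total
print(report_ratio, total_ratio)
```

Under these assumptions the report volume is about 2.2 times the business mail volume, and total outbound volume is about 3.2 times what it was, which matches the "twice as large" and "triple" figures in the message.
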

        2) Is Aggregate Reporting efficient?   Restating previous
        concerns:

        "All Signature" reporting means:
        We keep evaluating even after successful authentication has
        been established,
        so that we can capture and store data of little actual value,
        even though it causes reduced aggregation and longer reports.

        "No Problems found, No changes found" reporting means:
        We send redundant reports day after day.

        "All Requesters" reporting means:
        We send reports even to domain owners that were blocked
        because of domain reputation.

        A good place to start would be to extend the reporting
        interval for no-problem-found reports.
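
One way to read the suggestion above is as a per-domain scheduling rule. The following is a rough sketch only: the function name, the notion of "changed", and the 1-day and 7-day intervals are hypothetical choices for illustration and do not come from the DMARC drafts.

```python
from datetime import date, timedelta

def should_send_report(today, last_sent, has_problems, changed,
                       normal_interval=timedelta(days=1),
                       quiet_interval=timedelta(days=7)):
    """Decide whether to send an aggregate report to a domain today.

    Domains with authentication problems or changed results keep the
    normal interval; "no problems, no changes" domains are reported on
    the longer quiet interval instead.
    """
    if last_sent is None:
        return True  # never reported before: always send the first report
    interval = normal_interval if (has_problems or changed) else quiet_interval
    return today - last_sent >= interval

today = date(2022, 12, 8)
print(should_send_report(today, date(2022, 12, 7), has_problems=True, changed=False))
print(should_send_report(today, date(2022, 12, 7), has_problems=False, changed=False))
print(should_send_report(today, date(2022, 12, 1), has_problems=False, changed=False))
```
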

        Doug Foster


        _______________________________________________
        dmarc mailing list
        dmarc@ietf.org
        https://www.ietf.org/mailman/listinfo/dmarc


