It is not so much how to sort the data as how to store it during the day.  If 
the spec meant to track policy changes, it had better put policy_published in 
each row, no?
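
To illustrate the point, here is a minimal sketch in Python of storing
per-message data with the published policy as part of each row's key.  It is
not taken from any actual implementation: RowKey, tally and split_by_policy
are hypothetical names, and the field set is an assumption made only for the
example.

```python
from collections import Counter
from typing import NamedTuple


class RowKey(NamedTuple):
    """Hypothetical unique key for one aggregate-report row."""
    source_ip: str
    disposition: str
    override_reason: str   # "" when no override was applied
    policy_published: str  # policy record as seen at evaluation time


def tally(events):
    """Count messages per unique key, as they would be logged during the day."""
    counts = Counter()
    for ev in events:
        counts[RowKey(*ev)] += 1
    return counts


def split_by_policy(counts):
    """Group the counted rows into one bucket per published policy."""
    reports = {}
    for key, n in counts.items():
        reports.setdefault(key.policy_published, []).append((key, n))
    return reports


# Made-up sample data: the policy changes from p=none to p=quarantine.
events = [
    ("192.0.2.1", "none", "", "v=DMARC1; p=none"),
    ("192.0.2.1", "none", "", "v=DMARC1; p=none"),
    ("192.0.2.1", "quarantine", "", "v=DMARC1; p=quarantine"),
    ("198.51.100.7", "none", "mailing_list", "v=DMARC1; p=quarantine"),
]
counts = tally(events)
reports = split_by_policy(counts)
```

With the policy in the key, a policy change during the day simply yields more
rows; whether those rows then go into one report or several is a separate
decision.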

Frankly, I never thought of working around the size limit by sending multiple 
reports.  Cute.  It obviously assumes that the size limit is an advisory for 
the maximum message size, as possibly also advertised by the SMTP SIZE 
extension.  However, discovering the limit at sending time would result in a 
bounce.  The other reason for setting a size constraint is to prevent a full 
mailbox, which would also result in a bounce.  Perhaps dmarcbis should clarify 
this point (and this is the wrong list to note that).
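
The multiple-reports workaround could be sketched roughly as follows, in
Python.  This is only an assumption about how such splitting might work, not a
description of Google's pipeline: split_by_size and its parameters are
hypothetical, rows are taken to be already-serialized strings, and compression
and the XML envelope are ignored apart from a fixed overhead.

```python
def split_by_size(rows, limit, overhead=0):
    """Greedily pack serialized rows into chunks of at most `limit` bytes.

    `overhead` is a fixed per-report cost (hypothetical stand-in for the
    report_metadata and policy_published sections).
    """
    chunks, current, size = [], [], overhead
    for row in rows:
        n = len(row.encode("utf-8"))
        if overhead + n > limit:
            raise ValueError("a single row exceeds the size limit")
        if current and size + n > limit:
            # The current report is full: close it and start a new one.
            chunks.append(current)
            current, size = [], overhead
        current.append(row)
        size += n
    if current:
        chunks.append(current)
    return chunks


# Five made-up rows of 18 bytes each, packed under a 40-byte limit.
rows = ["<record>%d</record>" % i for i in range(5)]
chunks = split_by_size(rows, limit=40)
```

Each chunk would then be sent as a separate report.  Note that this only helps
if the limit is a per-message cap; it does nothing against a full mailbox.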

Out of curiosity, do you often hit 10m size limits?

Best
Ale

On Fri 31/Jan/2020 19:47:28 +0100 Brandon Long wrote:
> I mean, we're google, so our reporting is done via map reduce (well,
> probably a flume now), and so we just build up a list of unique key
> counters and then group those by the things that are unique by email.  We
> already generate multiple reports to address because of size constraints.
> 
> Brandon
> 
> On Fri, Jan 31, 2020 at 4:38 AM Alessandro Vesely via dmarc-discuss <
> dmarc-discuss@dmarc.org> wrote:
> 
>> Hi,
>>
>> I'm not clear on what Brandon actually said.  I agree that the
>> schema doesn't make that clear:
>>
>> policy_published, like report_metadata, occurs once in a report.  I
>> don't think one should send multiple reports in the face of policy
>> changes.  It would complicate report sending quite a bit.
>>
>> Complication stems from the reporting period.  In practice, I have
>> a cron job which runs after midnight UTC.  Hence, I apologize to
>> those who have ri != 86400[*], but I override the original ri and
>> send reports daily.  If some day I notice that daily is not enough,
>> for example because I would often exceed the size limit, I may
>> decide to run "zaggregate"[†] more often.  The spec says:
>>
>>                                                   DMARC implementations
>>       MUST be able to provide daily reports and SHOULD be able to
>>       provide hourly reports when requested.
>>
>> Now, DMARC policy is looked up when mail is received.  In some cases, it is
>> looked up again when the aggregate report is composed.  So, the reported
>> policy_published is the one which was retrieved last.  Normally it is the
>> one applied last (except in some cases...)
>>
>> I don't think that sending multiple records (to possibly different
>> rua's) in an attempt to track policy changes is worth the
>> complication.  Section 10.2, DNS Load and Caching, doesn't specify
>> a TTL.  In fact, policies don't change so often.  Splitting reports
>> on the odd day would disrupt both generators and consumers to no
>> avail.
>>
>> For PolicyOverride, I fully agree it is part of the key, along
>> with IPAddress.  I added a <reason> to the example record at the
>> bottom of Freddie's page[‡].
>>
>> Best
>> Ale
>> --
>>
>> [*] Most domains have an "original_ri" of 0, i.e. not specified.
>> An excerpt from my tiny MTA's db:
>>
>> MariaDB [mail]> select original_ri, count(*), MIN(domain) from domain
>> group by original_ri;
>> +-------------+----------+---------------------+
>> | original_ri | count(*) | MIN(domain)         |
>> +-------------+----------+---------------------+
>> |           0 |   102730 | "BancaMarche"       |
>> |           8 |        1 | zaspy.com           |
>> |         300 |        2 | fcotten.nl          |
>> |         600 |        1 | 0086.info           |
>> |         900 |        1 | shopeemobile.com    |
>> |        3600 |       79 | acbc.wa.edu.au      |
>> |        6400 |        2 | ml.mkccvo.info      |
>> |        7200 |        1 | anandbus.net        |
>> |        8200 |        1 | 123-reg.co.uk       |
>> |       14400 |       15 | a1mailserver.com    |
>> |       21600 |        4 | iijmio-mail.jp      |
>> |       43200 |        1 | f00f.org            |
>> |       44200 |        1 | freecycle.org       |
>> |       84600 |        2 | resoundnetworks.com |
>> |       86400 |     1085 | 06d01.mspz3.gob.ec  |
>> |      342000 |        1 | fmp.com             |
>> |      604800 |        3 | cgates.lt           |
>> +-------------+----------+---------------------+
>> 17 rows in set (0.11 sec)
>>
>>
>> [†] zaggregate is the name of my reporting program.
>> http://www.tana.it/sw/zdkimfilter/zaggregate.html
>>
>> [‡] http://bit.ly/dmarc-rpt-schema
>>
>>
>>
>> On Wed 29/Jan/2020 12:18:20 +0100 Dotzero via dmarc-discuss wrote:
>>> +1 to what Brandon wrote.
>>>
>>> On Tue, Jan 28, 2020 at 8:23 PM Brandon Long via dmarc-discuss <
>>> dmarc-discuss@dmarc.org> wrote:
>>>
>>>> Isn't the override in the RowType?  So you can just have multiple
>>>> RecordTypes, each with different RowTypes?
>>>>
>>>> Ultimately, it seems like the report is a bunch of fields with a count,
>>>> and so the composition is to make sure that the set of rows is a "unique"
>>>> key.  Theoretically you should log even the published policy at eval time,
>>>> so you can report different counts even if the policy changes over that
>>>> period... even if you'd have to send separate reports.
>>>>
>>>> The schema doesn't really make that clear, to my mind, I wouldn't have
>>>> buried the count.
>>>>
>>>> Brandon
>>>>
>>>> On Tue, Jan 28, 2020 at 5:26 AM Brotman, Alex via dmarc-discuss <
>>>> dmarc-discuss@dmarc.org> wrote:
>>>>
>>>>> What is to be done if only a portion of the messages from the reporting
>>>>> period receive a policy override?  Perhaps this is done based on IP, or
>>>>> only applied part way through the day.  It seems like in the 
>>>>> specification,
>>>>> the reporting definition assumes the entire set of reported messages has
>>>>> the override.
>>>>>
>>>>> --
>>>>> Alex Brotman
>>>>> Sr. Engineer, Anti-Abuse & Messaging Policy
>>>>> Comcast
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> dmarc-discuss mailing list
>>>>> dmarc-discuss@dmarc.org
>>>>> http://www.dmarc.org/mailman/listinfo/dmarc-discuss
>>>>>
>>>>> NOTE: Participating in this list means you agree to the DMARC Note Well
>>>>> terms (http://www.dmarc.org/note_well.html)
>>>>>
>>>
>>>
> 
