Re: [dmarc-discuss] PolicyPublished, was PolicyOverride in Reporting
On Sat, Feb 1, 2020 at 5:16 AM Alessandro Vesely wrote:

> It is not much how to sort the data as the way to store it during the
> day. If the spec meant to track policy changes, it had better put
> policy_published in each row, no?
>
> Frankly, I never thought to work around the size limit by sending multiple
> reports. Cute. It obviously assumes that the size limit is an advisory
> for maximum message size, as possibly advertised also by the SIZE= SMTP
> extension. However, discovering the limit at sending time would result in
> a bounce. The other reason for setting a size constraint is to prevent
> mailbox full, which also would result in a bounce. Perhaps, dmarcbis
> should clarify this point (and this is the wrong list to note that.)
>
> Out of curiosity, do you often hit 10m size limits?

Honestly, I don't have the stats available. It has happened, but keeping
reports to reasonable sizes was considered in the initial design.

Brandon

Re: [dmarc-discuss] PolicyPublished, was PolicyOverride in Reporting
It is not so much how to sort the data as how to store it during the day.
If the spec meant to track policy changes, it had better put
policy_published in each row, no?

Frankly, I never thought to work around the size limit by sending multiple
reports. Cute. It obviously assumes that the size limit is an advisory for
maximum message size, as possibly also advertised by the SIZE= SMTP
extension. However, discovering the limit at sending time would result in
a bounce. The other reason for setting a size constraint is to prevent a
full mailbox, which would also result in a bounce. Perhaps dmarcbis should
clarify this point (and this is the wrong list to note that).

Out of curiosity, do you often hit 10m size limits?

Best
Ale

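The multiple-reports workaround mentioned above can be illustrated with a minimal sketch. It is not how any of the generators in this thread actually work; the function names, the greedy packing, and the idea of counting uncompressed UTF-8 bytes are all assumptions made for illustration. The unit suffixes follow the rua size-limit syntax (e.g. "!10m") as I read RFC 7489, where units are powers of two.

```python
# Hypothetical helpers; names and the byte-counting approach are assumptions.

# Unit suffixes for the size limit in a rua URI such as
# "mailto:reports@example.org!10m" (powers of two, per RFC 7489).
UNITS = {"k": 2**10, "m": 2**20, "g": 2**30, "t": 2**40}


def parse_size_limit(text):
    """Convert a limit such as '10m' or '4096' into a number of bytes."""
    text = text.strip().lower()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def split_records(records, limit, overhead=0):
    """Greedily pack serialized records into batches of at most `limit` bytes.

    `overhead` stands for the per-report envelope (report_metadata,
    policy_published). A single record larger than the limit still gets
    a batch of its own, so the caller has to handle that case separately.
    """
    batches, current, size = [], [], overhead
    for record in records:
        record_size = len(record.encode("utf-8"))
        # Start a new report when adding this record would exceed the limit.
        if current and size + record_size > limit:
            batches.append(current)
            current, size = [], overhead
        current.append(record)
        size += record_size
    if current:
        batches.append(current)
    return batches


# Five 40-byte records with a 100-byte limit are packed two per report.
batches = split_records(["r" * 40] * 5, limit=100)
print([len(b) for b in batches])
```

Reports are normally compressed before sending, so measuring the uncompressed size as done here would be a conservative estimate rather than the size actually seen by the receiving server.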
Re: [dmarc-discuss] PolicyPublished, was PolicyOverride in Reporting
I mean, we're Google, so our reporting is done via MapReduce (well,
probably a Flume now), and so we just build up a list of unique key
counters and then group those by the things that are unique by email. We
already generate multiple reports to an address because of size constraints.

Brandon

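The "unique key counters" idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Google's pipeline: the event fields, the choice of row key (source IP, disposition, override reason), and the choice of per-report key (domain plus the policy seen at evaluation time) are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical per-message evaluation log; field names are illustrative only.
events = [
    {"domain": "example.org", "published_p": "none",
     "source_ip": "192.0.2.1", "disposition": "none", "override": None},
    {"domain": "example.org", "published_p": "none",
     "source_ip": "192.0.2.1", "disposition": "none", "override": None},
    {"domain": "example.org", "published_p": "none",
     "source_ip": "192.0.2.1", "disposition": "none", "override": "forwarded"},
    {"domain": "example.org", "published_p": "quarantine",
     "source_ip": "192.0.2.1", "disposition": "quarantine", "override": None},
]


def row_key(event):
    """Fields that vary per row; the override reason is part of the key."""
    return (event["source_ip"], event["disposition"], event["override"])


def aggregate(events):
    """Count messages per unique row key, grouped by what is unique per report."""
    reports = defaultdict(Counter)
    for event in events:
        # Things that occur once per report: the domain and the policy
        # that was in effect when the message was evaluated.
        report_key = (event["domain"], event["published_p"])
        reports[report_key][row_key(event)] += 1
    return reports


reports = aggregate(events)
for report_key, rows in reports.items():
    print(report_key, dict(rows))
```

With this grouping, a policy change during the period yields two report keys and hence two separate reports, while a partial override only adds another row to the same report.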
[dmarc-discuss] PolicyPublished, was PolicyOverride in Reporting
Hi,

I'm not clear on what Brandon actually said. I agree that the schema
doesn't make that clear:

policy_published, like report_metadata, occurs once in a report. I don't
think one should send multiple reports in the face of policy changes. It
would complicate report sending quite a bit.

The complication stems from the reporting period. In practice, I have a
cron job which runs after midnight UTC. Hence, I apologize to those who
have ri != 86400[*], but I override the original ri and send reports
daily. If some day I notice that daily is not enough, for example because
I would often exceed the size limit, I may decide to run "zaggregate"[†]
more often. The spec says:

   DMARC implementations
   MUST be able to provide daily reports and SHOULD be able to
   provide hourly reports when requested.

Now, DMARC policy is looked up when mail is received. In some cases, it is
looked up again when the aggregate report is composed. So, the reported
policy_published is the one which was retrieved last. Normally it is the
one applied last (except in some cases...)

I don't think that sending multiple records (to possibly different rua's)
in an attempt to track policy changes is worth the complication. Section
10.2, DNS Load and Caching, doesn't specify a TTL. In fact, policies don't
change so often. Splitting reports on the odd day would disrupt both
generators and consumers to no avail.

For PolicyOverride, I fully agree it is part of the key, along with
IPAddress. I added a to the example record at the bottom of Freddie's
page[‡]

Best
Ale
--

[*] Most domains have an "original_ri" of 0, i.e. not specified.
An excerpt from my tiny MTA's db:

MariaDB [mail]> select original_ri, count(*), MIN(domain) from domain
group by original_ri;
+-------------+----------+---------------------+
| original_ri | count(*) | MIN(domain)         |
+-------------+----------+---------------------+
|           0 |   102730 | "BancaMarche"       |
|           8 |        1 | zaspy.com           |
|         300 |        2 | fcotten.nl          |
|         600 |        1 | 0086.info           |
|         900 |        1 | shopeemobile.com    |
|        3600 |       79 | acbc.wa.edu.au      |
|        6400 |        2 | ml.mkccvo.info      |
|        7200 |        1 | anandbus.net        |
|        8200 |        1 | 123-reg.co.uk       |
|       14400 |       15 | a1mailserver.com    |
|       21600 |        4 | iijmio-mail.jp      |
|       43200 |        1 | f00f.org            |
|       44200 |        1 | freecycle.org       |
|       84600 |        2 | resoundnetworks.com |
|       86400 |     1085 | 06d01.mspz3.gob.ec  |
|      342000 |        1 | fmp.com             |
|      604800 |        3 | cgates.lt           |
+-------------+----------+---------------------+
17 rows in set (0.11 sec)

[†] zaggregate is the name of my reporting program.
http://www.tana.it/sw/zdkimfilter/zaggregate.html

[‡] http://bit.ly/dmarc-rpt-schema

On Wed 29/Jan/2020 12:18:20 +0100 Dotzero via dmarc-discuss wrote:
> +1 to what Brandon wrote.
>
> On Tue, Jan 28, 2020 at 8:23 PM Brandon Long via dmarc-discuss <
> dmarc-discuss@dmarc.org> wrote:
>
>> Isn't the override in the RowType? So you can just have multiple
>> RecordTypes, each with different RowTypes?
>>
>> Ultimately, it seems like the report is a bunch of fields with a count,
>> and so the composition is to make sure that the set of rows is a "unique"
>> key. Theoretically you should log even the published policy at eval time,
>> so you can report different counts even if the policy changes over that
>> period... even if you'd have to send separate reports.
>>
>> The schema doesn't really make that clear, to my mind, I wouldn't have
>> buried the count.
>>
>> Brandon
>>
>> On Tue, Jan 28, 2020 at 5:26 AM Brotman, Alex via dmarc-discuss <
>> dmarc-discuss@dmarc.org> wrote:
>>
>>> What is to be done if only a portion of the messages from the reporting
>>> period receive a policy override? Perhaps this is done based on IP, or
>>> only applied part way through the day. It seems like in the specification,
>>> the reporting definition assumes the entire set of reported messages has
>>> the override.
>>>
>>> --
>>> Alex Brotman
>>> Sr. Engineer, Anti-Abuse & Messaging Policy
>>> Comcast
>>>
>>> ___
>>> dmarc-discuss mailing list
>>> dmarc-discuss@dmarc.org
>>> http://www.dmarc.org/mailman/listinfo/dmarc-discuss
>>>
>>> NOTE: Participating in this list means you agree to the DMARC Note Well
>>> terms (http://www.dmarc.org/note_well.html)