Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70, #51)
On Fri 14/May/2021 20:21:59 +0200 Brotman, Alex wrote:
> We "can avoid", but *must* we? There are a number of tickets this impacts.

Yes. Matt mentioned ticket #51 (added in the subject), for example. That change might break consumers who meticulously check the values, but those who just report whatever values they find might be unaffected. In this case, I'd reckon that backward compatibility concerns amount to only a minor reason to hesitate over the change. I'd say 2 on a 0-10 scale.

Perhaps we should rate each ticket. Or we could produce some dummy reports with the new format and check how many parsers still work on them.

Best
Ale
--

___
dmarc mailing list
dmarc@ietf.org
https://www.ietf.org/mailman/listinfo/dmarc
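The "dummy reports" idea could be tried along the following lines. This is only a minimal sketch: the two sample reports are invented, `<new_field>` is a made-up addition, and `parse_by_name` is a hypothetical stand-in for a real consumer that looks up fields by name. The element names (`report_metadata`, `org_name`, `record`, `row`, `source_ip`, `count`) are assumed to follow the RFC 7489 layout.

```python
import xml.etree.ElementTree as ET

# Invented sample in the current layout (no <version>, no extra fields).
OLD_REPORT = """<feedback>
  <report_metadata><org_name>example.org</org_name></report_metadata>
  <record><row><source_ip>192.0.2.1</source_ip><count>3</count></row></record>
</feedback>"""

# Invented sample in a possible new layout: <version> and <new_field> added.
NEW_REPORT = """<feedback>
  <version>1.0</version>
  <report_metadata><org_name>example.org</org_name></report_metadata>
  <record><row><source_ip>192.0.2.1</source_ip><count>3</count>
    <new_field>x</new_field></row></record>
</feedback>"""


def parse_by_name(xml_text):
    """Toy consumer: read known fields by name and ignore everything else."""
    root = ET.fromstring(xml_text)
    return {
        "org_name": root.findtext("report_metadata/org_name"),
        "rows": [
            (row.findtext("source_ip"), int(row.findtext("count")))
            for row in root.iter("row")
        ],
    }


def still_works(parser, old_xml, new_xml):
    """True if the parser extracts the same data from both report variants."""
    try:
        return parser(old_xml) == parser(new_xml)
    except Exception:
        # Any exception counts as "this parser broke on the new format".
        return False
```

With these samples, added elements leave the toy consumer's output unchanged, whereas a renamed field (for example `source_ip` replaced by some other name) makes `still_works` return False. Real parsers would of course have to be exercised through their own entry points.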
Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)
Alessandro Vesely wrote on 2021-05-14 20:12:
> In my tiny MX I have a cache of 631 aggregate reports received
> recently. 121 reports from 31 unique org_names have a /feedback/version
> element, 510 from 37 organizations don't. The latter group includes
> google.com, Yahoo! Inc., Verizon Media, Mail.Ru, ...

In my data, 84% (193/229) of reporters announce 1.0. 16% of reporters omit the version and seem to use the pre-IETF draft schema.

Regards,
Matt
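Statistics like the ones above could be gathered with a few lines of Python. The sketch below is an assumption about how one might do it, not what either poster actually ran: `report_version` and `tally_versions` are hypothetical helpers, and the "draft" label for a missing version simply mirrors the behaviour described for parsedmarc elsewhere in this thread.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip a '{namespace}' prefix, if any, from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def report_version(xml_text):
    """Return the text of /feedback/version, or None if the element is absent."""
    root = ET.fromstring(xml_text)
    for child in root:
        if local_name(child.tag) == "version":
            return (child.text or "").strip()
    return None


def tally_versions(reports):
    """Count reports per announced version; missing or empty counts as 'draft'."""
    counts = Counter()
    for xml_text in reports:
        counts[report_version(xml_text) or "draft"] += 1
    return counts
```

Counting per reporter rather than per report, as in the 193/229 figure, would additionally require grouping by `org_name` before tallying.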
Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)
We "can avoid", but *must* we? There are a number of tickets this impacts.

--
Alex Brotman
Sr. Engineer, Anti-Abuse & Messaging Policy
Comcast

> -----Original Message-----
> From: dmarc On Behalf Of Alessandro Vesely
> Sent: Friday, May 14, 2021 2:13 PM
> To: dmarc@ietf.org
> Subject: Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)
>
> Report consumers use XML libraries to recover the value of named fields.
> We can safely add fields. Renaming fields or change existing semantics
> would break backward compatibility, which I think we can avoid.
Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)
On Fri 14/May/2021 15:42:56 +0200 Brotman, Alex wrote:
> There are a few tickets that may break report ingestion systems due to structure and/or value changes. Should we decide that's an implementation issue, or that we truly can't change the format of the reports? I'm sure most ingestion systems are rather flexible given the number of reports that appear to not match what 7489 states/suggests.

Report consumers use XML libraries to recover the value of named fields. We can safely add fields. Renaming fields or changing existing semantics would break backward compatibility, which I think we can avoid.

> If we are going to allow changes to the structure, and there is some concern about which version the receiver supports (or prefers?), should we put a flag into the DMARC record? And of course, that may depend on the receiver, if multiple are listed, so that would have to belong to each individual receiving address.

Overkill IMHO.

>> From: dmarc On Behalf Of Matthäus Wander
>>
>> Regarding the existing top-level <version> below <feedback>: Even if parsers don't require the version to function, it remains useful for measuring the adoption of the different DMARC specifications (as requested in #70). In fact, one implementation I looked at (parsedmarc) uses it for only this purpose. A missing <version> is logged as "draft" schema version.

In my tiny MX I have a cache of 631 aggregate reports received recently. 121 reports from 31 unique org_names have a /feedback/version element, 510 from 37 organizations don't. The latter group includes google.com, Yahoo! Inc., Verizon Media, Mail.Ru, ...

Perhaps someone with larger mail flows can bring better statistics.

>> Regarding the XML namespace declaration:
>> The XML schema serves not only as a specification for developers, but can also be used for automatic syntax checks of reports -- provided that the namespace declaration is fixed. XSD validation is an immensely useful tool for testing the output of report generators. It helped me to discover two nasty bugs in an implementation, which appeared in 2 out of ~10k reports and would have gone unnoticed otherwise.

Very much agreed. Validating the report before sending is very safe. Building online aggregate report checking utilities would also benefit from this possibility.

Does the IETF provide URLs for hosting XSDs?

>> A version number within the schema is not necessary for this use case.

Or we can stick to a static 1.0, similar to v=DMARC1, MIME-Version, and the like, if useful.

>> A different matter is whether automatic XSD validation on the report consumer side is a supported use case. There is some value in it: two lines of code suffice to perform input validation. However, the validation is strict and does not allow for being liberal in what you accept (might be handy for protocol police, though). Achieving upward compatibility is not trivial, because there is no general "ignore all unknown elements" statement in XSD. It is possible to define a placeholder in the schema, but this element must be inserted explicitly into each place where extensibility is desired. This would require careful foresight in the schema design.

Designing an abstract extension for ARC is going to be particularly challenging.

Best
Ale
--
Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)
There are a few tickets that may break report ingestion systems due to structure and/or value changes. Should we decide that's an implementation issue, or that we truly can't change the format of the reports? I'm sure most ingestion systems are rather flexible given the number of reports that appear to not match what 7489 states/suggests.

If we are going to allow changes to the structure, and there is some concern about which version the receiver supports (or prefers?), should we put a flag into the DMARC record? And of course, that may depend on the receiver, if multiple are listed, so that would have to belong to each individual receiving address.

--
Alex Brotman
Sr. Engineer, Anti-Abuse & Messaging Policy
Comcast

> -----Original Message-----
> From: dmarc On Behalf Of Matthäus Wander
> Sent: Thursday, May 13, 2021 5:29 PM
> To: dmarc@ietf.org
> Subject: Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)
>
> Alright, introducing incompatibilities is off the table and backward
> compatibility is a must. This brings #51 into question, which may affect
> backward compatibility.
> https://trac.ietf.org/trac/dmarc/ticket/51
Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)
Alessandro Vesely wrote on 2021-05-10 18:29:
> On Mon 10/May/2021 17:28:20 +0200 Dave Crocker wrote:
>> If a new spec merely /adds/ to a previous spec, then the presence of the new constructs is self-declaring. The only requirement is to have the base specification declare that unrecognized constructs are to be ignored. So, versioning adds the illusion of utility, but really only adds unnecessary complexity.
>
> I think the format we'll end up with will be pretty compatible with the existing practice, meaning that existing report consumers that use a proper XML parser and ignore unknown tags can work unchanged. I don't think any consumer parses reports "by hands".

Alright, introducing incompatibilities is off the table and backward compatibility is a must. This brings #51 into question, which may affect backward compatibility.
https://trac.ietf.org/trac/dmarc/ticket/51

Regarding the existing top-level <version> below <feedback>:
Even if parsers don't require the version to function, it remains useful for measuring the adoption of the different DMARC specifications (as requested in #70). In fact, one implementation I looked at (parsedmarc) uses it for only this purpose. A missing <version> is logged as "draft" schema version.

Regarding the XML namespace declaration:
The XML schema serves not only as a specification for developers, but can also be used for automatic syntax checks of reports -- provided that the namespace declaration is fixed. XSD validation is an immensely useful tool for testing the output of report generators. It helped me to discover two nasty bugs in an implementation, which appeared in 2 out of ~10k reports and would have gone unnoticed otherwise.
A version number within the schema is not necessary for this use case.

A different matter is whether automatic XSD validation on the report consumer side is a supported use case. There is some value in it: two lines of code suffice to perform input validation. However, the validation is strict and does not allow for being liberal in what you accept (might be handy for protocol police, though). Achieving upward compatibility is not trivial, because there is no general "ignore all unknown elements" statement in XSD. It is possible to define a placeholder in the schema, but this element must be inserted explicitly into each place where extensibility is desired. This would require careful foresight in the schema design.

Regards,
Matt
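For contrast with strict XSD validation, a consumer that wants to stay liberal in what it accepts could run a hand-written check instead. The following is a rough sketch under stated assumptions: `check_top_level` is a hypothetical helper, and the sets of required and known top-level children (`report_metadata`, `policy_published`, `record`, `version`) are assumed from the RFC 7489 schema rather than taken from any existing parser.

```python
import xml.etree.ElementTree as ET

# Assumed top-level children of <feedback>; adjust to the schema in use.
REQUIRED = {"report_metadata", "policy_published", "record"}
KNOWN = REQUIRED | {"version"}


def check_top_level(xml_text):
    """Return (missing, unknown) top-level element names of a report.

    Missing required elements are worth rejecting; unknown elements are
    only listed, so a lenient consumer can log and ignore them.
    """
    root = ET.fromstring(xml_text)
    # Drop any '{namespace}' prefix so namespaced reports are handled too.
    names = {child.tag.rsplit("}", 1)[-1] for child in root}
    missing = sorted(REQUIRED - names)
    unknown = sorted(names - KNOWN)
    return missing, unknown
```

A strict schema validator would reject a report containing an unexpected element outright, unless the schema provides an explicit wildcard at that position. Here the same report yields an empty `missing` list and the extra element shows up only under `unknown`.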
Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)
On Mon 10/May/2021 17:28:20 +0200 Dave Crocker wrote:
> On 5/10/2021 7:10 AM, Matthäus Wander wrote:
>> I support the use of the namespace declaration. A report with namespace declaration allows for automatic syntax checks with XML Schema Validation.
>
> Version numbers, and the like, tend to be a lot less useful than intuition leads one to expect.

Automatic syntax checks are a different beast, though.

> The distinction to make is 'increments' versus 'incompatibilities'.
>
> If a new spec merely /adds/ to a previous spec, then the presence of the new constructs is self-declaring. The only requirement is to have the base specification declare that unrecognized constructs are to be ignored. So, versioning adds the illusion of utility, but really only adds unnecessary complexity.

I think the format we'll end up with will be pretty compatible with the existing practice, meaning that existing report consumers that use a proper XML parser and ignore unknown tags can work unchanged. I don't think any consumer parses reports "by hands".

The added complexity of using proper XML constructs to define the format, as well as properly formatting each instance, enables advanced use of XML parsing tools. Did you notice no site offers aggregate report validation services?

> Incompatibilities, where new constructs conflict with previous ones, mean that the new specification is not a new version. It is an independent specification. It needs to be labeled accordingly.

This is not our case. Even if we find a better TLD for the targetNamespace URL, the format is going to be the first official version of DMARC aggregate report format, following the one(s) in use since 2012.

Best
Ale
--
Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)
John Levine wrote on 2021-05-10 17:21:
> It appears that Matthäus Wander said:
>> 1) #33 suggests to add a versioned XML namespace declaration in the root element.
>> https://trac.ietf.org/trac/dmarc/ticket/33
>>
>> I support the use of the namespace declaration.
>
>> 4) How does the report generator know which format version the consumer supports?
>
> It doesn't. If we change the schema, a lot of report parsers will break.
> What actual real world problem does this change solve?

The schema is broken already. See:
https://trac.ietf.org/trac/dmarc/ticket/44
https://trac.ietf.org/trac/dmarc/ticket/45
https://www.uriports.com/blog/dmarc-reports-ietf-rfc-compliance/

The point is to fix the schema.

> I haven't seen a lot of ill-formed reports.

You obviously haven't tried XSD validation.

Regards,
Matt
Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)
On 5/10/2021 7:10 AM, Matthäus Wander wrote:
> I support the use of the namespace declaration. A report with namespace declaration allows for automatic syntax checks with XML Schema Validation.

Version numbers, and the like, tend to be a lot less useful than intuition leads one to expect.

The distinction to make is 'increments' versus 'incompatibilities'.

If a new spec merely /adds/ to a previous spec, then the presence of the new constructs is self-declaring. The only requirement is to have the base specification declare that unrecognized constructs are to be ignored. So, versioning adds the illusion of utility, but really only adds unnecessary complexity.

Incompatibilities, where new constructs conflict with previous ones, mean that the new specification is not a new version. It is an independent specification. It needs to be labeled accordingly.

d/

--
Dave Crocker
dcroc...@gmail.com
408.329.0791

Volunteer, Silicon Valley Chapter
Information & Planning Coordinator
American Red Cross
dave.crock...@redcross.org
Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)
It appears that Matthäus Wander said:
>1) #33 suggests to add a versioned XML namespace declaration in the root element.
>https://trac.ietf.org/trac/dmarc/ticket/33
>
>I support the use of the namespace declaration.

>4) How does the report generator know which format version the consumer supports?

It doesn't. If we change the schema, a lot of report parsers will break. What actual real world problem does this change solve?

I haven't seen a lot of ill-formed reports.

R's,
John