Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70, #51)

2021-05-16 Thread Alessandro Vesely

On Fri 14/May/2021 20:21:59 +0200 Brotman, Alex wrote:

> We "can avoid", but *must* we?  There are a number of tickets this impacts.



Yes.  Matt mentioned ticket #51 (added in the subject), for example.

That change might break consumers who meticulously check the values, but those
who just report whatever values they find would probably be unaffected.  In this
case, I'd reckon backward compatibility is only a minor argument against the
change.  I'd say 2 on a 0-10 scale.


Perhaps we should rate each ticket.  Or we could produce some dummy reports 
with the new format and check how many parsers still work on them.
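
A rough sketch of such a check, in Python with only the standard library. It assumes the "new format" differs by carrying a namespace declaration on the root element (as ticket #33 suggests); the namespace URI and the two consumer functions are invented for illustration and do not model any particular parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used only for this illustration.
NS = "urn:example:dmarc-aggregate"

LEGACY = (
    "<feedback><version>1.0</version>"
    "<report_metadata><org_name>example.org</org_name></report_metadata>"
    "</feedback>"
)
# Same dummy report, but with a default namespace declared on the root.
NAMESPACED = LEGACY.replace("<feedback>", f'<feedback xmlns="{NS}">', 1)


def naive_org_name(xml_text):
    """Consumer that looks elements up by their bare names."""
    root = ET.fromstring(xml_text)
    return root.findtext("report_metadata/org_name")


def tolerant_org_name(xml_text):
    """Consumer that drops any namespace prefix before matching names."""
    root = ET.fromstring(xml_text)
    for el in root.iter():
        # ElementTree stores qualified names as "{uri}local".
        el.tag = el.tag.rsplit("}", 1)[-1]
    return root.findtext("report_metadata/org_name")


for label, report in (("legacy", LEGACY), ("namespaced", NAMESPACED)):
    print(label, naive_org_name(report), tolerant_org_name(report))
```

With ElementTree, the bare-name lookup finds nothing in the namespaced report, while the tolerant variant reads both. Running real parsers over a set of dummy reports would show how common each behaviour is.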



Best
Ale
___
dmarc mailing list
dmarc@ietf.org
https://www.ietf.org/mailman/listinfo/dmarc


Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)

2021-05-15 Thread Matthäus Wander
Alessandro Vesely wrote on 2021-05-14 20:12:
> In my tiny MX I have a cache of 631 aggregate reports received
> recently.  121 reports from 31 unique org_names have a /feedback/version
> element, 510 from 37 organizations don't.  The latter group includes
> google.com, Yahoo! Inc., Verizon Media, Mail.Ru, ...

In my data, 84% (193/229) of reporters announce 1.0.
16% of reporters omit the version and seem to use the pre-IETF draft schema.
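
Counts of this kind can be gathered with a few lines of Python. The following is only a sketch: the sample reports are invented, and it assumes reports without a namespace declaration, so that /feedback/version can be looked up by its bare name.

```python
import xml.etree.ElementTree as ET

# Invented sample data: one report with <version>, two without.
SAMPLE = [
    "<feedback><version>1.0</version><report_metadata>"
    "<org_name>a.example</org_name></report_metadata></feedback>",
    "<feedback><report_metadata>"
    "<org_name>b.example</org_name></report_metadata></feedback>",
    "<feedback><report_metadata>"
    "<org_name>b.example</org_name></report_metadata></feedback>",
]


def version_stats(reports):
    """Count reports and unique org_names with and without /feedback/version."""
    with_version, without_version = [], []
    for xml_text in reports:
        root = ET.fromstring(xml_text)
        org = root.findtext("report_metadata/org_name", default="(unknown)")
        has_version = root.find("version") is not None
        (with_version if has_version else without_version).append(org)
    return {
        "with": (len(with_version), len(set(with_version))),
        "without": (len(without_version), len(set(without_version))),
    }


print(version_stats(SAMPLE))
```

Each tuple holds the number of reports followed by the number of unique org_names, which matches the way the figures are quoted in this thread.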

Regards,
Matt



Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)

2021-05-14 Thread Brotman, Alex
We "can avoid", but *must* we?  There are a number of tickets this impacts.

--
Alex Brotman
Sr. Engineer, Anti-Abuse & Messaging Policy
Comcast



Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)

2021-05-14 Thread Alessandro Vesely

On Fri 14/May/2021 15:42:56 +0200 Brotman, Alex wrote:

> There are a few tickets that may break report ingestion systems due to
> structure and/or value changes.  Should we decide that's an implementation
> issue, or that we truly can't change the format of the reports?  I'm sure most
> ingestion systems are rather flexible given the number of reports that appear
> to not match what 7489 states/suggests.



Report consumers use XML libraries to recover the value of named fields.  We
can safely add fields.  Renaming fields or changing existing semantics would
break backward compatibility, which I think we can avoid.
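
A minimal Python sketch of this point, assuming a consumer that reads only the row fields it knows by name. The record/row/source_ip/count layout follows RFC 7489; the added element extra_field and the rename to message_count are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Fields this hypothetical consumer knows about.
KNOWN_FIELDS = ("source_ip", "count")

OLD = (
    "<feedback><record><row>"
    "<source_ip>192.0.2.1</source_ip><count>3</count>"
    "</row></record></feedback>"
)
# Same report with an extra, unknown element added to the row.
ADDED = OLD.replace("</row>", "<extra_field>x</extra_field></row>")
# Same report with an existing field renamed.
RENAMED = OLD.replace("count", "message_count")


def read_rows(xml_text):
    """Return one dict per row, holding only the known fields."""
    root = ET.fromstring(xml_text)
    return [
        {name: row.findtext(name) for name in KNOWN_FIELDS}
        for row in root.iterfind("record/row")
    ]


print(read_rows(ADDED))    # the unknown element is simply not looked at
print(read_rows(RENAMED))  # the renamed field comes back as None
```

A consumer written this way never notices the added element, but loses the value of the renamed one.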




> If we are going to allow changes to the structure, and there is some concern
> about which version the receiver supports (or prefers?), should we put a flag
> into the DMARC record?  And of course, that may depend on the receiver, if
> multiple are listed, so that would have to belong to each individual receiving
> address.



Overkill IMHO.



>> From: dmarc  On Behalf Of Matthäus Wander
>>
>> Regarding the existing top-level <version> below <feedback>: Even if
>> parsers don't require the version to function, it remains useful for
>> measuring the adoption of the different DMARC specifications (as
>> requested in #70). In fact, one implementation I looked at (parsedmarc)
>> uses it for only this purpose. A missing <version> is logged as "draft"
>> schema version.


In my tiny MX I have a cache of 631 aggregate reports received recently.  121 
reports from 31 unique org_names have a /feedback/version element, 510 from 37 
organizations don't.  The latter group includes google.com, Yahoo! Inc., 
Verizon Media, Mail.Ru, ...


Perhaps, someone with larger mail flows can bring better statistics.



>> Regarding the XML namespace declaration:
>> The XML schema serves not only as specification for developers, but can be
>> also used for automatic syntax checks of reports -- provided that the
>> namespace declaration is fixed. XSD validation is an immensely useful tool for
>> testing the output of report generators. It helped me to discover two nasty
>> bugs in an implementation, which appeared in 2 out of ~10k reports and
>> would have gone unnoticed otherwise.



Very much agreed.  Validating the report before sending is very safe.  Also 
building online aggregate report checking utilities would benefit from this 
possibility.


Does the IETF provide URLs for hosting XSDs?



>> A version number within the schema is not necessary for this use case.



Or we can stick to a static 1.0, similar to v=DMARC1, 
MIME-Version, and the like, if useful.




>> A different matter is whether automatic XSD validation on the report
>> consumer side is a supported use case. There is some value in it: two lines of
>> code suffice to perform input validation. However, the validation is strict and
>> does not allow for being liberal in what you accept (might be handy for
>> protocol police, though). Achieving upward compatibility is not trivial,
>> because there is no general "ignore all unknown elements" statement in
>> XSD. It is possible to define an <any> placeholder in the schema, but this
>> element must be inserted explicitly into each place where extensibility is
>> desired. This would require careful foresight in the schema design.



Designing an abstract extension for ARC is going to be particularly challenging.


Best
Ale


Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)

2021-05-14 Thread Brotman, Alex
There are a few tickets that may break report ingestion systems due to 
structure and/or value changes.  Should we decide that's an implementation 
issue, or that we truly can't change the format of the reports?  I'm sure most 
ingestion systems are rather flexible given the number of reports that appear 
to not match what 7489 states/suggests.

If we are going to allow changes to the structure, and there is some concern 
about which version the receiver supports (or prefers?), should we put a flag 
into the DMARC record?  And of course, that may depend on the receiver, if
multiple are listed, so that would have to belong to each individual receiving 
address.

--
Alex Brotman
Sr. Engineer, Anti-Abuse & Messaging Policy
Comcast



Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)

2021-05-13 Thread Matthäus Wander
Alessandro Vesely wrote on 2021-05-10 18:29:
> On Mon 10/May/2021 17:28:20 +0200 Dave Crocker wrote:
>> If a new spec merely /adds/ to a previous spec, then the presence of
>> the new constructs is self-declaring.  The only requirement is to have
>> the base specification declare that unrecognized constructs are to be
>> ignored.  So, versioning adds the illusion of utility, but really only
>> adds unnecessary complexity.
> 
> 
> I think the format we'll end up with will be pretty compatible with the
> existing practice, meaning that existing report consumers that use a
> proper XML parser and ignore unknown tags can work unchanged.  I don't
> think any consumer parses reports "by hand".

Alright, introducing incompatibilities is off the table and backward
compatibility is a must. This brings #51 into question, which may affect
backward compatibility.
https://trac.ietf.org/trac/dmarc/ticket/51


Regarding the existing top-level <version> below <feedback>:
Even if parsers don't require the version to function, it remains useful
for measuring the adoption of the different DMARC specifications (as
requested in #70). In fact, one implementation I looked at (parsedmarc)
uses it for only this purpose. A missing <version> is logged as "draft"
schema version.
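
The behaviour described here can be sketched in a few lines of Python. This is an illustration of the idea only, not parsedmarc's actual code; the function name is invented, and the report is assumed to carry no namespace declaration.

```python
import xml.etree.ElementTree as ET


def schema_version(xml_text):
    """Return the /feedback/version text, or "draft" when it is absent or empty."""
    root = ET.fromstring(xml_text)
    version = root.findtext("version")
    if version is None or not version.strip():
        return "draft"
    return version.strip()


print(schema_version("<feedback><version>1.0</version></feedback>"))
print(schema_version("<feedback><report_metadata/></feedback>"))
```

The value is used only for bookkeeping, so a consumer built like this keeps working whether or not the reporter includes the element.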


Regarding the XML namespace declaration:
The XML schema serves not only as specification for developers, but can
be also used for automatic syntax checks of reports -- provided that the
namespace declaration is fixed. XSD validation is an immensely useful
tool for testing the output of report generators. It helped me to
discover two nasty bugs in an implementation, which appeared in 2 out of
~10k reports and would have gone unnoticed otherwise.
A version number within the schema is not necessary for this use case.

A different matter is whether automatic XSD validation on the report
consumer side is a supported use case. There is some value in it: two
lines of code suffice to perform input validation. However, the
validation is strict and does not allow for being liberal in what you
accept (might be handy for protocol police, though). Achieving upward
compatibility is not trivial, because there is no general "ignore all
unknown elements" statement in XSD. It is possible to define an <any>
placeholder in the schema, but this element must be inserted explicitly
into each place where extensibility is desired. This would require
careful foresight in the schema design.

Regards,
Matt



Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)

2021-05-10 Thread Alessandro Vesely

On Mon 10/May/2021 17:28:20 +0200 Dave Crocker wrote:

> On 5/10/2021 7:10 AM, Matthäus Wander wrote:
>> I support the use of the namespace declaration. A report with namespace
>> declaration allows for automatic syntax checks with XML Schema
>> Validation.
>
> Version numbers, and the like, tend to be a lot less useful than intuition
> leads one to expect.



Automatic syntax checks are a different beast, though.



> The distinction to make is 'increments' versus 'incompatibilities'.
>
> If a new spec merely /adds/ to a previous spec, then the presence of the new
> constructs is self-declaring.  The only requirement is to have the base
> specification declare that unrecognized constructs are to be ignored.  So,
> versioning adds the illusion of utility, but really only adds unnecessary
> complexity.



I think the format we'll end up with will be pretty compatible with the 
existing practice, meaning that existing report consumers that use a proper XML 
parser and ignore unknown tags can work unchanged.  I don't think any consumer 
parses reports "by hand".


The added complexity of using proper XML constructs to define the format, as 
well as properly formatting each instance, enables advanced use of XML parsing 
tools.  Did you notice no site offers aggregate report validation services?



> Incompatibilities, where new constructs conflict with previous ones, mean that
> the new specification is not a new version.  It is an independent
> specification.  It needs to be labeled accordingly.



This is not our case.  Even if we find a better TLD for the targetNamespace 
URL, the format is going to be the first official version of DMARC aggregate 
report format, following the one(s) in use since 2012.



Best
Ale


Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)

2021-05-10 Thread Matthäus Wander
John Levine wrote on 2021-05-10 17:21:
> It appears that Matthäus Wander  said:
>> 1) #33 suggests to add a versioned XML namespace declaration in the root
>>  element.
>> https://trac.ietf.org/trac/dmarc/ticket/33
>>
>> I support the use of the namespace declaration. 
> 
> 
>> 4) How does the report generator know which format version the consumer
>> supports?
> 
> It doesn't.  If we change the schema, a lot of report parsers will break.
> What actual real world problem does this change solve?

The schema is broken already. See:
https://trac.ietf.org/trac/dmarc/ticket/44
https://trac.ietf.org/trac/dmarc/ticket/45
https://www.uriports.com/blog/dmarc-reports-ietf-rfc-compliance/

The point is to fix the schema.

> I haven't seen a lot of ill-formed reports.

You obviously haven't tried XSD validation.

Regards,
Matt



Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)

2021-05-10 Thread Dave Crocker

On 5/10/2021 7:10 AM, Matthäus Wander wrote:

> I support the use of the namespace declaration. A report with namespace
> declaration allows for automatic syntax checks with XML Schema
> Validation.


Version numbers, and the like, tend to be a lot less useful than 
intuition leads one to expect.


The distinction to make is 'increments' versus 'incompatibilities'.

If a new spec merely /adds/ to a previous spec, then the presence of
the new constructs is self-declaring.  The only requirement is to have 
the base specification declare that unrecognized constructs are to be 
ignored.  So, versioning adds the illusion of utility, but really only 
adds unnecessary complexity.


Incompatibilities, where new constructs conflict with previous ones, 
mean that the new specification is not a new version.  It is an 
independent specification.  It needs to be labeled accordingly.



d/

--
Dave Crocker
dcroc...@gmail.com
408.329.0791

Volunteer, Silicon Valley Chapter
Information & Planning Coordinator
American Red Cross
dave.crock...@redcross.org



Re: [dmarc-ietf] Versioning and XML namespaces in aggregate reports (#33, #70)

2021-05-10 Thread John Levine
It appears that Matthäus Wander  said:
>1) #33 suggests to add a versioned XML namespace declaration in the root
> element.
>https://trac.ietf.org/trac/dmarc/ticket/33
>
>I support the use of the namespace declaration. 


>4) How does the report generator know which format version the consumer
>supports?

It doesn't.  If we change the schema, a lot of report parsers will break.
What actual real world problem does this change solve?  I haven't seen a lot
of ill-formed reports.

R's,
John
