Re: [dmarc-ietf] Formal specification, URI

2015-03-18 Thread Murray S. Kucherawy
On Tue, Mar 17, 2015 at 8:23 AM, Alessandro Vesely ves...@tana.it wrote:

  Right, which is why things like semi-colon don't need to be
  percent-encoded; they're already special characters in the context of a
 URL.

 So are comma and exclamation.  What puzzles me is that the DMARC spec
 treats them differently while RFC3986 does not.  Comma and semicolon seem
 to behave the same; e.g.:


Ah, that's true.  I was looking specifically at one and not all three.

At this point, since RFC7489 has just been published, I suggest you choose
(perhaps with direction from the co-chairs) how to record this discrepancy
for handling when the base draft gets updated.  You could open an erratum
report, or you could add it to the WG's tracker, or maybe they have another
suggestion.


  They aren't formally imported, and I'm not sure that's necessary here.
 The
  ABNF we have should be comprehensive over DMARC tag-value sets.  The
 prose
  you cited is merely meant to convey that they follow the same style.

 Right.  The question is if implementations can reuse DKIM parsers.


Without studying the ABNF again, I believe so.  DKIM parsers separate
tag-value entities at unencoded semicolons, after which the tag name and
tag value are separated at unencoded equal signs.  DMARC records are the
same up to that point, and it's below there for ruf and rua in the
DMARC case that things get interesting.  Just like in the case of b= for
DKIM, those two have special rules for value interpretation: make up a list
of URIs using an unencoded comma as the separator.
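
A minimal Python sketch of that two-level split, for illustration only.  The
record, the addresses, and the helper names parse_tag_list and report_uris
are hypothetical; this is not taken from any DKIM or DMARC implementation,
and it does no ABNF validation of tag names or values.

```python
def parse_tag_list(record):
    """Split a tag-value record at semicolons, then at the first '='."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        name, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {part!r}")
        tags[name.strip()] = value.strip()
    return tags


def report_uris(tags, tag):
    """Extra rule for rua/ruf: the value is a comma-separated URI list."""
    value = tags.get(tag, "")
    return [u.strip() for u in value.split(",") if u.strip()]


# Hypothetical record, shaped like the two-URI rua form seen in the wild.
record = "v=DMARC1; p=none; rua=mailto:a@example.com, mailto:b@example.com;"
tags = parse_tag_list(record)
print(tags["p"])
print(report_uris(tags, "rua"))
```

The first function is the part a DKIM parser could plausibly share; only the
second is specific to rua and ruf.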


  Your question is Are they equivalent?  I believe they are.  Although it
  might be ideal to have a specification so tight that there's exactly one
  way to do something, in the end I don't think it's harmful to have two
 ways
  to say the same thing.  It's more of a concern if there's to ways to
  interpret a single thing; that's when we arguably have something to fix.

 I tried the NOT RECOMMENDED syntax quoted above.  Dmarcian[1] doesn't
 raise a brow, and RFC3986-compliant uriparser[2] ingests it smoothly.
 However, although I sent a test message to a gmail account, I received no
 report.  I guess Google's implementation doesn't deploy a proper URI
 parser, but just looks for "mailto:" followed by a plain path consisting
 of a single[3] addr-spec (as defined in RFC6068, i.e. w/o comments) with
 no query nor fragment --that's what I'd do myself, but I find no arguments
 in the spec that help prove that that record is bad.


I think we've wandered into implementation comparisons rather than getting
the ABNF right in the specification.  Or maybe a better way to say that is:
I don't think fixing the discrepancy you've raised would resolve the
disparity you're observing.


 The spec says a report is normally sent to each.  How can a publisher
 express
 that two URIs are meant to be either-or alternatives to each other?


Is that a capability you require?  I don't think that's a use case I've
ever encountered.


 It may also be worth requiring the domain in addr-spec to be an A-label, as
 that simplifies verification and improves dn_ compression.  Such an idea
 apparently conflicts with the example at the end of Section 6.3 of RFC6068,
 where the IDN is percent-encoded instead.


That is a completely different topic, something that should be taken up
when we do a standards track version.

-MSK
___
dmarc mailing list
dmarc@ietf.org
https://www.ietf.org/mailman/listinfo/dmarc


Re: [dmarc-ietf] Formal specification, URI

2015-03-17 Thread Alessandro Vesely
On Mon 16/Mar/2015 20:22:31 +0100 Murray S. Kucherawy wrote: 
 On Mon, Mar 16, 2015 at 3:51 AM, Alessandro Vesely ves...@tana.it wrote:
 
 Section 2.2 of RFC3986 lists semi-colon as a reserved character that has to
 be percent-encoded in these URLs.  We don't need to repeat it here, I think.

 If the spec is going to be read by ignorants like me, it's better to repeat
 than to omit.  RFC3986 has a very wide scope, and uses phrases like may
 (or may not) be defined as delimiters.  It says:

If data for a URI component would conflict with a reserved
character's purpose as a delimiter, then the conflicting data must be
percent-encoded before the URI is formed.
 
 Right, which is why things like semi-colon don't need to be
 percent-encoded; they're already special characters in the context of a URL.

So are comma and exclamation.  What puzzles me is that the DMARC spec treats
them differently while RFC3986 does not.  Comma and semicolon seem to behave
the same; e.g.:

http://www.tana.it/comma,comma.txt
http://www.tana.it/comma%2ccomma.txt
http://www.tana.it/comma%25%32%63comma.txt
http://www.tana.it/semicolon;semicolon.txt
http://www.tana.it/semicolon%3bsemicolon.txt
http://www.tana.it/semicolon%25%33%62semicolon.txt
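
To make the three spellings concrete, here is a small Python sketch using
the standard urllib.parse.unquote.  It shows only what one pass of
percent-decoding yields for the paths listed above; it makes no claim about
how the web server at that host maps them to files.

```python
from urllib.parse import unquote

# Path components of the six URLs listed above.
paths = [
    "/comma,comma.txt",
    "/comma%2ccomma.txt",
    "/comma%25%32%63comma.txt",
    "/semicolon;semicolon.txt",
    "/semicolon%3bsemicolon.txt",
    "/semicolon%25%33%62semicolon.txt",
]

for p in paths:
    # unquote() performs a single decoding pass, so the doubly encoded
    # forms come out as the singly encoded ones, not as the literal.
    print(p, "->", unquote(p))
```

A single pass turns the second form into the first, and the third form into
the second; the comma and the semicolon behave identically in this respect.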

 Comma and exclamation (which are sub-delims like semicolon) are apparently
 used in dmarc-uri's rule.  The preceding DMARC section says:

DMARC records follow the extensible tag-value syntax for DNS-based
key records defined in DKIM [DKIM].

 However, DKIM production rules don't seem to be formally imported.  If
 they are
 imported, semicolon exclusion is implied by the definition:

VALCHAR   =  %x21-3A / %x3C-7E
  ; EXCLAMATION to TILDE except SEMICOLON
 
 They aren't formally imported, and I'm not sure that's necessary here.  The
 ABNF we have should be comprehensive over DMARC tag-value sets.  The prose
 you cited is merely meant to convey that they follow the same style.

Right.  The question is if implementations can reuse DKIM parsers.

 How about the other two questions?  I surveyed only a few DMARC
 records, but RFC6068 gives the following example:

Also note that it is syntactically valid to specify both to and an
hfname whose value is to.  That is,

mailto:addr1@an.example,addr2@an.example

is equivalent to

mailto:?to=addr1@an.example,addr2@an.example

is equivalent to

mailto:addr1@an.example?to=addr2@an.example

However, the latter form is NOT RECOMMENDED because different user
agents handle this case differently.  In particular, some existing
clients ignore to hfvalues.

 Yahoo instead uses 1st level syntax:

rua=mailto:dmarc-yahoo-...@yahoo-inc.com, mailto:dmarc_y_...@yahoo.com;
 
 Your question is Are they equivalent?  I believe they are.  Although it
 might be ideal to have a specification so tight that there's exactly one
 way to do something, in the end I don't think it's harmful to have two ways
 to say the same thing.  It's more of a concern if there's to ways to
 interpret a single thing; that's when we arguably have something to fix.

I tried the NOT RECOMMENDED syntax quoted above.  Dmarcian[1] doesn't raise a
brow, and RFC3986-compliant uriparser[2] ingests it smoothly.  However,
although I sent a test message to a gmail account, I received no report.  I
guess Google's implementation doesn't deploy a proper URI parser, but just
looks for "mailto:" followed by a plain path consisting of a single[3]
addr-spec (as defined in RFC6068, i.e. w/o comments) with no query nor fragment
--that's what I'd do myself, but I find no arguments in the spec that help
prove that that record is bad.

[1] https://dmarcian.com/dmarc-inspector/torreinpietra.it
[2] http://uriparser.sourceforge.net/doc/html/
[3] haven't yet tried two %2c-separated addr-specs.

 The goal in allowing a comma-separated list of URLs is that you might
 conceivably want to put an http and a mailto URL in there, in the try A
 first, then try B sense.  We need to allow for that possibility.  We also
 need to account for the possibility of a comma that is inside of a URL;
 those are the ones that need to be encoded.  Outside of a URL, they're
 delimiters.

The spec says a report is normally sent to each.  How can a publisher express
that two URIs are meant to be either-or alternatives to each other?

 Unless I'm missing something, the ABNF for DMARC allows all three of the
 cited examples, as well as Yahoo's use, and all four of them mean the same
 thing.  That doesn't strike me as a bug.

Recall that it took years and an RFC revision to have mailto: URIs treated with
reasonable uniformity by web browsers.  Here we are specifying an entirely new
kind of client, which avails of its specific URI-transmission protocol.  IMHO,
if we want %2c to be interpreted as addr-spec separator or otherwise, we ought
to spell it loud and clear.

It may also be worth requiring the domain in addr-spec to be an A-label, as that
simplifies verification and 

Re: [dmarc-ietf] Formal specification, URI

2015-03-16 Thread Stephen J. Turnbull
Alessandro Vesely writes:

  If the spec is going to be read by ignorants like me, it's better
  to repeat than to omit.

-1.  It's good that you read the spec, but that's not the primary
purpose of the spec.  It's a bad idea to repeat definitions clearly
stated in another document (even in informal comments on the formal
spec) when you refer to the original document; you're just asking for
new ambiguity.



Re: [dmarc-ietf] Formal specification, URI

2015-03-16 Thread ned+dmarc
 Alessandro Vesely writes:

   If the spec is going to be read by ignorants like me, it's better
   to repeat than to omit.

 -1.  It's good that you read the spec, but that's not the primary
 purpose of the spec.  It's a bad idea to repeat definitions clearly
 stated in another document (even in informal comments on the formal
 spec) when you refer to the original document; you're just asking for
 new ambiguity.

+1. Repeating stuff like this is in the long term a surefire source of silly states,
where one repetition gets updated but not another.

The running code that comes to mind is the MIME specification, which
originally had a bunch of repeated and overlapping syntax definitions. In this
case it only took one revision for it to get out of sync and cause confusion.

Ned



Re: [dmarc-ietf] Formal specification, URI

2015-03-16 Thread Murray S. Kucherawy
On Mon, Mar 16, 2015 at 3:51 AM, Alessandro Vesely ves...@tana.it wrote:

  Section 2.2 of RFC3986 lists semi-colon as a reserved character that has
 to
  be percent-encoded in these URLs.  We don't need to repeat it here, I
 think.

 If the spec is going to be read by ignorants like me, it's better to repeat
 than to omit.  RFC3986 has a very wide scope, and uses phrases like may
 (or
 may not) be defined as delimiters.  It says:

If data for a URI component would conflict with a reserved
character's purpose as a delimiter, then the conflicting data must be
percent-encoded before the URI is formed.


Right, which is why things like semi-colon don't need to be
percent-encoded; they're already special characters in the context of a URL.


 Comma and exclamation (which are sub-delims like semicolon) are apparently
 used in dmarc-uri's rule.  The preceding DMARC section says:

DMARC records follow the extensible tag-value syntax for DNS-based
key records defined in DKIM [DKIM].

 However, DKIM production rules don't seem to be formally imported.  If
 they are
 imported, semicolon exclusion is implied by the definition:

VALCHAR   =  %x21-3A / %x3C-7E
  ; EXCLAMATION to TILDE except SEMICOLON


They aren't formally imported, and I'm not sure that's necessary here.  The
ABNF we have should be comprehensive over DMARC tag-value sets.  The prose
you cited is merely meant to convey that they follow the same style.


 How about the other two questions?  I surveyed only a few DMARC
 records, but RFC6068 gives the following example:

Also note that it is syntactically valid to specify both to and an
hfname whose value is to.  That is,

mailto:addr1@an.example,addr2@an.example

is equivalent to

mailto:?to=addr1@an.example,addr2@an.example

is equivalent to

mailto:addr1@an.example?to=addr2@an.example

However, the latter form is NOT RECOMMENDED because different user
agents handle this case differently.  In particular, some existing
clients ignore to hfvalues.

 Yahoo instead uses 1st level syntax:

rua=mailto:dmarc-yahoo-...@yahoo-inc.com, mailto:dmarc_y_...@yahoo.com;


Your question is "Are they equivalent?"  I believe they are.  Although it
might be ideal to have a specification so tight that there's exactly one
way to do something, in the end I don't think it's harmful to have two ways
to say the same thing.  It's more of a concern if there's to ways to
interpret a single thing; that's when we arguably have something to fix.

The goal in allowing a comma-separated list of URLs is that you might
conceivably want to put an http and a mailto URL in there, in the "try A
first, then try B" sense.  We need to allow for that possibility.  We also
need to account for the possibility of a comma that is inside of a URL;
those are the ones that need to be encoded.  Outside of a URL, they're
delimiters.
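
As a rough Python illustration of that rule from the publisher's side: encode
the commas (and exclamation points) that occur inside each URI, then join the
URIs with bare commas.  The helper names encode_for_dmarc and build_rua and
the second address are hypothetical; the sketch assumes each input is already
a valid URI and does not validate it.

```python
def encode_for_dmarc(uri):
    """Percent-encode the characters the dmarc-uri comment singles out."""
    return uri.replace(",", "%2C").replace("!", "%21")


def build_rua(uris):
    """Join already-valid URIs; only the separating commas stay bare."""
    return ",".join(encode_for_dmarc(u) for u in uris)


uris = [
    # A comma inside the URI (in the subject) must be encoded.
    "mailto:dmarc@ietf.org?subject=Formal%20specification,%20URI",
    "mailto:reports@example.com",  # hypothetical second destination
]
value = build_rua(uris)
print("rua=" + value)
```

After encoding, splitting the value at commas yields exactly the two URIs
again, which is the property the delimiter rule is after.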

Unless I'm missing something, the ABNF for DMARC allows all three of the
cited examples, as well as Yahoo's use, and all four of them mean the same
thing.  That doesn't strike me as a bug.

-MSK


Re: [dmarc-ietf] Formal specification, URI

2015-03-16 Thread Alessandro Vesely
On Mon 16/Mar/2015 05:17:37 +0100 Murray S. Kucherawy wrote: 
 On Sun, Mar 15, 2015 at 11:53 AM, Alessandro Vesely ves...@tana.it wrote:
 
 This seems to be a bug:

 OLD:
  dmarc-uri   = URI [ "!" 1*DIGIT [ "k" / "m" / "g" / "t" ] ]
                ; "URI" is imported from [URI]; commas (ASCII
                ; 0x2c) and exclamation points (ASCII 0x21)
                ; MUST be encoded; the numeric portion MUST fit
                ; within an unsigned 64-bit integer
 NEW:
  dmarc-uri   = URI [ "!" 1*DIGIT [ "k" / "m" / "g" / "t" ] ]
                ; "URI" is imported from [URI]; commas (ASCII
                ; 0x2c), exclamation points (ASCII 0x21), and
                ; semicolons (ASCII 0x3b) MUST be percent-encoded;
                ; the numeric portion MUST fit within an unsigned
                ; 64-bit integer
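
For readers who prefer code to ABNF, a minimal Python sketch of how a
consumer might take a dmarc-uri apart.  The function name parse_dmarc_uri and
the example address are hypothetical.  The sketch does not validate the URI
part against RFC3986; it only relies on the rule that bare commas and
exclamation points cannot appear inside it, and it does not interpret the
unit letter.

```python
import re

# URI part: anything without a bare "!" or ","; then the optional
# "!" 1*DIGIT [ unit ] suffix.  ABNF literals are case-insensitive.
_DMARC_URI = re.compile(r"([^!,]+)(?:!([0-9]+)([kmgt])?)?", re.IGNORECASE)


def parse_dmarc_uri(text):
    """Return (uri, limit, unit); limit and unit are None when absent."""
    m = _DMARC_URI.fullmatch(text)
    if m is None:
        raise ValueError(f"not a dmarc-uri: {text!r}")
    uri, digits, unit = m.groups()
    limit = None
    if digits is not None:
        limit = int(digits)
        # "the numeric portion MUST fit within an unsigned 64-bit integer"
        if limit >= 2**64:
            raise ValueError("size limit exceeds 64 bits")
    return uri, limit, unit.lower() if unit else None


print(parse_dmarc_uri("mailto:a@example.com!10m"))
print(parse_dmarc_uri("mailto:a@example.com"))
```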

 Is it equivalent to have, say, rua=mailto:a...@example.com%...@example.com
 and  rua=mail...@example.com, mailto:b...@example.com?

 Is the following meant to be allowed?
mailto:dmarc@ietf.org?subject=Formal%20specification%2c%20URI
 
 Section 2.2 of RFC3986 lists semi-colon as a reserved character that has to
 be percent-encoded in these URLs.  We don't need to repeat it here, I think.

If the spec is going to be read by ignorants like me, it's better to repeat
than to omit.  RFC3986 has a very wide scope, and uses phrases like may (or
may not) be defined as delimiters.  It says:

   If data for a URI component would conflict with a reserved
   character's purpose as a delimiter, then the conflicting data must be
   percent-encoded before the URI is formed.

Comma and exclamation (which are sub-delims like semicolon) are apparently
used in dmarc-uri's rule.  The preceding DMARC section says:

   DMARC records follow the extensible tag-value syntax for DNS-based
   key records defined in DKIM [DKIM].

However, DKIM production rules don't seem to be formally imported.  If they are
imported, semicolon exclusion is implied by the definition:

   VALCHAR   =  %x21-3A / %x3C-7E
 ; EXCLAMATION to TILDE except SEMICOLON
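
A short Python sketch of what that definition admits, assuming a value is
simply a non-empty run of VALCHAR (the helper name is_valchar_run is
hypothetical, and DKIM's folding whitespace between runs is deliberately
ignored here).

```python
import re

# VALCHAR = %x21-3A / %x3C-7E : printable ASCII except SP and ";"
_VALCHAR_RUN = re.compile(r"[\x21-\x3A\x3C-\x7E]+")


def is_valchar_run(text):
    """True if text is a non-empty run of VALCHAR characters."""
    return _VALCHAR_RUN.fullmatch(text) is not None


print(is_valchar_run("mailto:a@example.com"))  # comma-free, semicolon-free
print(is_valchar_run("a,b!c"))                 # comma and "!" are VALCHARs
print(is_valchar_run("a;b"))                   # semicolon is excluded
```

Under this rule the semicolon is the only sub-delim that cannot appear bare
in a value, while comma and exclamation pass through untouched.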

Anyway, I'd add the "percent-" word, lest anyone tries &#44;...

How about the other two questions?  I surveyed only a few DMARC records,
but RFC6068 gives the following example:

   Also note that it is syntactically valid to specify both to and an
   hfname whose value is to.  That is,

   mailto:addr1@an.example,addr2@an.example

   is equivalent to

   mailto:?to=addr1@an.example,addr2@an.example

   is equivalent to

   mailto:addr1@an.example?to=addr2@an.example

   However, the latter form is NOT RECOMMENDED because different user
   agents handle this case differently.  In particular, some existing
   clients ignore to hfvalues.
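
A minimal Python sketch of a reader that treats the three forms alike.  The
function name mailto_recipients is hypothetical and this is not how any
particular report generator is known to behave; it only illustrates the
equivalence RFC6068 describes.  Literal commas separate addresses, and
percent-decoding happens after the split, so a %2c would stay inside an
address rather than act as a separator.

```python
from urllib.parse import unquote


def mailto_recipients(uri):
    """Collect addresses from the path and from any 'to' header field."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "mailto":
        raise ValueError(f"not a mailto URI: {uri!r}")
    rest = rest.split("#", 1)[0]           # drop any fragment
    path, _, query = rest.partition("?")
    chunks = [path] if path else []
    for field in query.split("&"):
        name, _, value = field.partition("=")
        if unquote(name).lower() == "to" and value:
            chunks.append(value)
    recipients = []
    for chunk in chunks:
        for addr in chunk.split(","):      # split first, decode second
            addr = unquote(addr).strip()
            if addr:
                recipients.append(addr)
    return recipients


for form in (
    "mailto:addr1@an.example,addr2@an.example",
    "mailto:?to=addr1@an.example,addr2@an.example",
    "mailto:addr1@an.example?to=addr2@an.example",
):
    print(mailto_recipients(form))
```

A consumer that only looks at the path, by contrast, would see two
recipients, none, and one for the same three inputs.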

Yahoo instead uses 1st level syntax:

   rua=mailto:dmarc-yahoo-...@yahoo-inc.com, mailto:dmarc_y_...@yahoo.com;

Ale



Re: [dmarc-ietf] Formal specification, URI

2015-03-16 Thread Steven M Jones
On 03/16/2015 12:22 PM, Murray S. Kucherawy wrote:
 [...]

 The goal in allowing a comma-separated list of URLs is that you might
 conceivably want to put an http and a mailto URL in there, in the try
 A first, then try B sense.  We need to allow for that possibility. 
 We also need to account for the possibility of a comma that is inside
 of a URL; those are the ones that need to be encoded.  Outside of a
 URL, they're delimiters.

Just to be explicit, it also allows for multiple mailto: URIs -
something that is seen in the wild, though perhaps not if one looks up
a half dozen DMARC records at random. But at the end of January multiple
mailto: URIs could be seen in ten of the Alexa Top 100 domains, in both
rua and ruf tags.

--S.



Re: [dmarc-ietf] Formal specification, URI

2015-03-16 Thread Murray S. Kucherawy
On Mon, Mar 16, 2015 at 12:57 PM, Steven M Jones s...@crash.com wrote:

 Just to be explicit, it also allows for multiple mailto: URIs -
 something that is seen in the wild, though perhaps not if one looks up
 a half dozen DMARC records at random. But at the end of January multiple
 mailto: URIs could be seen in ten of the Alexa Top 100 domains, in both
 rua and ruf tags.



Right, and there might be some operational reason why you want to send one
message to A and B, versus a message to A and then a message to B.  (Fewer
calls to fork()/exec(), perhaps.)

-MSK


Re: [dmarc-ietf] Formal specification, URI

2015-03-16 Thread Murray S. Kucherawy
On Mon, Mar 16, 2015 at 12:22 PM, Murray S. Kucherawy superu...@gmail.com
wrote:

 Your question is Are they equivalent?  I believe they are.  Although it
 might be ideal to have a specification so tight that there's exactly one
 way to do something, in the end I don't think it's harmful to have two ways
 to say the same thing.  It's more of a concern if there's to ways to
 interpret a single thing; that's when we arguably have something to fix.


Sigh.  s/to ways/two ways/

-MSK


Re: [dmarc-ietf] Formal specification, URI

2015-03-15 Thread Murray S. Kucherawy
On Sun, Mar 15, 2015 at 11:53 AM, Alessandro Vesely ves...@tana.it wrote:

 This seems to be a bug:

 OLD:
  dmarc-uri   = URI [ "!" 1*DIGIT [ "k" / "m" / "g" / "t" ] ]
                ; "URI" is imported from [URI]; commas (ASCII
                ; 0x2c) and exclamation points (ASCII 0x21)
                ; MUST be encoded; the numeric portion MUST fit
                ; within an unsigned 64-bit integer
 NEW:
  dmarc-uri   = URI [ "!" 1*DIGIT [ "k" / "m" / "g" / "t" ] ]
                ; "URI" is imported from [URI]; commas (ASCII
                ; 0x2c), exclamation points (ASCII 0x21), and
                ; semicolons (ASCII 0x3b) MUST be percent-encoded;
                ; the numeric portion MUST fit within an unsigned
                ; 64-bit integer

 Is it equivalent to have, say, rua=mailto:a...@example.com%...@example.com
 and
 rua=mail...@example.com, mailto:b...@example.com?

 Is the following meant to be allowed?
mailto:dmarc@ietf.org?subject=Formal%20specification%2c%20URI


Section 2.2 of RFC3986 lists semi-colon as a reserved character that has to
be percent-encoded in these URLs.  We don't need to repeat it here, I think.

-MSK