Valery, Thanks. In skimming back through this, I noticed several typos and evidence of editorial carelessness. The ones that might be unclear are:
* In (3) "other local deviations and (claimed) translation strategies" should have been "other local deviations and (claimed) transition strategies"
* In (6) "IPvN, N > 6, out there, basing..." should have been "IPvN, N > 6, out there, making..." or "IPvN, N > 6, out there, defining...".

In addition, my note assumed, I hope correctly, that WG participants who are working with this document understand that an arbitrary string of Unicode code points does not qualify as a U-label. The latter requires all of the processing and validity checking specified in the IDNA2008 documents. If anyone did not understand that, it was the reason for the "unrestricted Unicode strings, or even U-labels" distinction in (8)... and you do now.

Good luck with this important specification.

john

--On Tuesday, December 27, 2022 15:31 +0300 Valery Smyslov <val...@smyslov.net> wrote:

> Hi,
>
> below is an unofficial I18NDIR review of the draft, performed by John Klensin (forwarded here with his permission). Thanks to John for doing this review.
>
> Regards,
> Valery.
>
> -----Original Message-----
> From: John C Klensin [mailto:john-i...@jck.com]
> Sent: Monday, December 26, 2022 9:53 PM
> To: Valery Smyslov
> Cc: uta-cha...@ietf.org; ; art-...@ietf.org; Barry Leiba
> Subject: Re: [I18ndir] I18NDIR active?
>
> (1) Given the importance of anyone intending to use a certificate being absolutely certain that the certificate that should apply is actually the certificate in hand, it seems to me desirable that Section 4.1 more carefully examine anything identified as "SHOULD" and comment on the circumstances in which following that rule would be inappropriate and/or the possible consequences of not applying it.
>
> (2) The document makes several references to URIs, but only RFC 3986 appears to be referenced.
> In the real world in which certificates are established and used and in which differences in specifications and practices often provide opportunities for exploitation by would-be evildoers, there are at least two, probably three, URI specifications (IETF/RFC 3986, WHATWG, and maybe W3C). Each is treated as authoritative by some Internet actors and they are not consistent with each other. That situation and its implications should be pointed out, at least as a Security Consideration.
>
> (3) Similarly, there are, in practice, at least two different specifications for IDNs. While the IETF considers it obsolete, IDNA2003 is still referenced periodically and might constitute another. One of those, obviously, is as specified by RFC 5890ff. Another is specified, with the claim that it is a transition strategy but that has shown no signs in recent years of being used that way rather than as an alternate spec, by the Unicode Consortium as UTS#46 [4]. These specifications, and other local deviations and (claimed) translation strategies, are in wide use in different communities. In particular, while ICANN --and hence what is nominally permitted to be registered in the DNS near the root of the tree-- at least nominally conforms to IDNA2008 (but has been unable to prevent some TLDs from registering emoji as second-level domains), WHATWG (and hence most or all browser vendors and implementers) have a specification written in terms of UTS#46. The same considerations as in (2) above apply, only the incompatibilities among the specs are much greater, with emoji in domain names being a striking difference although there are many more subtle cases. And, as in (2), this issue should at least be a Security Consideration.
>
> (4) Section 6.3 strongly implies that there are two types of domain names: the traditional, all-ASCII, variety known as the "preferred name syntax" in RFC 1034 Section 3.5 and IDNs.
> But it does not say that; instead, it points to RFC 1034 as a whole. RFC 1034 imposes no restrictions on what can be in the octets that make up a label (see, e.g., Section 11 of RFC 2181). If you mean that the labels of the domain names you are considering must be IDNs, all-ASCII, or "preferred syntax" (the last two are different), figure out a way to say that explicitly.
>
> (5) AFAICT, the third paragraph of Section 6.3, describing IDN matching, is correct. You might want to say "before checking the domain name or comparing it with others" but that is a nit.
>
> (6) Under 6.4, the draft says "The iPAddress field does not include the IP version, so IPv4 addresses are distinguish from IPv6 addresses only by their length (4 as opposed to 16 bytes)". I don't know if that can be fixed or if this document would be the appropriate place to fix it, but, as long as there are people, companies, governments, or other entities out there with bright ideas about IPvN, N > 6, out there, basing operations on this field conditional on a heuristic that depends on length does not seem like a good idea. And, btw, the preferred term is usually "octets" rather than "bytes".
>
> (7) Editorial nit: the first sentence of the penultimate paragraph of 6.4 isn't one.
>
> (8) If all of your processing (not just comparisons) and what you allow to be stored in certificates is based on A-labels, then I'm not sure what Section 7.2 means. If you allow unrestricted Unicode strings, or even U-labels, in labels in certificates, then visual confusion by users is only one of many problems you invite. And, even then, note, e.g., that
>     trøll ( \u0074\u0072\u00F8\u006C\u006C )
> and the identical-appearing
>     \u0074\u0072\u006F\u0078\u006C\u006C
> generate different A-labels and are not a "visual confusion" problem as described in the cited portions of RFC 5890.
> UTR#36 (which you reference) and UTS#39 [5], especially Section 4 (which the document does not reference and probably should) are better in some ways, but basically point to the problems and possible approaches. In particular, implementing even the "whole script" algorithm of UTS#39 Section 4.1 (and then 5.1) requires fairly deep understanding of whatever scripts might appear in the characters of any particular DNS label. That does not quite rise to "impossible" but is certainly well in the "infeasible" range, even for single-script labels, when all possible IDN labels are considered.
>
> Again, the above is based on a very superficial reading of the document. It is not an official review, is not a substitute for one, and is almost certainly not complete.
>
> john

_______________________________________________
Uta mailing list
Uta@ietf.org
https://www.ietf.org/mailman/listinfo/uta
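
To make point (6) of the quoted review concrete, here is a minimal Python sketch of the length-based dispatch that the draft's wording implies. It assumes the raw octets of the iPAddress entry have already been extracted from the certificate; the function name `parse_ip_address_san` is hypothetical and does not come from the draft or any library. It is an illustration under those assumptions, not a recommended implementation.

```python
import ipaddress


def parse_ip_address_san(octets: bytes):
    """Interpret raw iPAddress octets using only their length.

    Hypothetical helper: the field carries no IP version, so the
    length is the only available signal (4 octets vs. 16 octets).
    """
    if len(octets) == 4:
        return ipaddress.IPv4Address(octets)
    if len(octets) == 16:
        return ipaddress.IPv6Address(octets)
    # Refuse to guess: any other length is neither IPv4 nor IPv6.
    raise ValueError(f"unsupported iPAddress length: {len(octets)} octets")


print(parse_ip_address_san(bytes([192, 0, 2, 1])))
print(parse_ip_address_san(bytes(15) + b"\x01"))
```

Rejecting every other length at least keeps an unknown address family from being silently misread as IPv4 or IPv6. It does not remove the underlying concern in (6) that the length is a heuristic and not a version indicator.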