I don't understand why there is so much fuss about the various foldings that Nameprep performs.
The Unicode standard says that two canonically equivalent strings should be treated the same (either one should work in any place that the other works). IDNA is simply honoring that recommendation. If it didn't normalize strings before encoding or comparing them, then canonically equivalent strings wouldn't get treated the same. Canonical equivalence deals with things like "a with grave" versus "a" followed by "combining grave".

The Unicode standard also defines compatibility equivalence, and says that compatibility characters are characters that didn't really deserve their own code points, but were reluctantly given them for the sole purpose of preserving round-trip conversions to legacy charsets. Nameprep respects this reluctance by using NFKC rather than NFC, so that the compatibility characters are folded into their equivalents.

The Unicode standard also defines a locale-independent case folding algorithm explicitly intended for doing case-insensitive comparisons, which is exactly what IDNA needs it for. German sharp s is folded to ss because a case-insensitive comparison must match German sharp s with SS (the latter is the normal uppercase form of the former), and must match SS with ss (the latter is the normal lowercase form of the former), and so by transitivity it must match German sharp s with ss.

John C Klensin <[EMAIL PROTECTED]> wrote:

> For example, one could say, and I think we essentially have, that the
> WG is solving the problem of getting things into and out of the DNS
> given that the Unicode coding form is accurately known.

Yes (but not just DNS, all existing protocols that use domain names).

> if the WG's position and recommendations are based on that model,
> we should be obligated to write it down and make it explicit in our
> documents before they go onto the standards track: we owe that much
> to those who think we are solving any of a number of more general
> internationalization problems.

Suggested text addressing that concern was posted to the list a few weeks ago; see message <[EMAIL PROTECTED]>.

AMC
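
For anyone who wants to see the three foldings described above side by side, here is a minimal sketch using only Python's standard library. It is an illustration, not Nameprep: the real profile uses its own mapping tables, a fixed Unicode version, and prohibited-character checks, none of which appear here. The function name fold_label is made up for this example.

```python
import unicodedata


def fold_label(label: str) -> str:
    """Rough approximation of the foldings discussed above.

    Assumption: NFKC plus Unicode full case folding is close enough to
    illustrate the idea. This is NOT the Nameprep profile itself.
    """
    # NFKC handles both canonical and compatibility equivalence.
    normalized = unicodedata.normalize("NFKC", label)
    # str.casefold() applies locale-independent full case folding.
    folded = normalized.casefold()
    # Case folding can produce unnormalized output, so normalize again.
    return unicodedata.normalize("NFKC", folded)


# Canonical equivalence: precomposed "a with grave" vs. "a" + combining grave.
same_canonical = fold_label("\u00e0") == fold_label("a\u0300")

# Compatibility equivalence: the "fi" ligature folds to the two letters.
same_compat = fold_label("\ufb01") == fold_label("fi")

# Case folding: sharp s, SS, and ss all end up as "ss".
same_case = fold_label("stra\u00dfe") == fold_label("STRASSE") == "strasse"

print(same_canonical, same_compat, same_case)
```

Each of the three comparisons corresponds to one paragraph of the argument: canonical equivalence, compatibility equivalence, and case-insensitive matching by transitivity.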
