Consider a domain name containing a slash-homograph (a non-ASCII
character that looks like "/").

As it stands, IDNA section 3.1 requirement 3 tells applications that
they "SHOULD" display the non-ACE form.  The security considerations
section, much later, "suggests" that applications provide visual
indications of various anomalies (from which one could extrapolate that
the slash-homograph would benefit from a visual indication).

I think we've seen that these security concerns need to be less buried,
that "visual indications" are too burdensome on implementations, and
that in some cases (like this one) the recommendation to display the
non-ACE form ought to be withdrawn, or even reversed (that is, recommend
the ASCII form).

Therefore I propose a technical change to IDNA section 3.1 requirement 3.
For reference, here it is as it stands now in RFC 3490 (with one typo
corrected):

   3) ACE labels obtained from domain name slots SHOULD be hidden from
      users when it is known that the environment can handle the non-ACE
      form, except when the ACE form is explicitly requested.  When
      it is not known whether or not the environment can handle the
      non-ACE form, the application MAY use the non-ACE form (which
      might fail, such as by not being displayed properly), or it MAY
      use the ACE form (which will look unintelligible to the user).
      Given an internationalized domain name, an equivalent domain name
      containing no ACE labels can be obtained by applying the ToUnicode
      operation (see section 4) to each label.  When requirements 2 and
      3 both apply, requirement 2 takes precedence.

Here is my proposed replacement:

--begin--

   3) When a domain label occupying or obtained from a domain name
      slot is to be shown to a user, it SHOULD NOT simply be shown in
      whatever form it was found in; before being shown it SHOULD be
      forced into either ASCII form (which can be obtained by applying
      ToASCII) or non-ACE form (which can be obtained by applying
      ToUnicode, see section 4), according to the first applicable of
      the following rules:

      a) If requirements 2 and 3 both apply, requirement 2 takes
         precedence, and the ASCII form MUST be used.

      b) When the user has explicitly requested to see one form or the
         other, that form SHOULD be shown.

      c) When it is known that the environment cannot handle the non-ACE
         form, the ASCII form SHOULD be shown.

      d) If the non-ACE form contains any character outside Unicode
         categories L (letter), N (number), and M (mark), other than
         U+002D hyphen-minus, the ASCII form SHOULD be shown.

      e) If the application determines that showing the non-ACE form
         would pose too great a risk of misleading the user, the ASCII
         form MAY be shown.  Applications MAY use complex heuristics to
         estimate this risk, but SHOULD try to minimize the negative
         impact on legitimate usage of internationalized domain names.

      f) When it is not known whether the environment can handle the
         non-ACE form, the application MAY show the non-ACE form (which
         might fail, such as by not being displayed properly), or it MAY
         show the ASCII form (which will look unintelligible to the user
         if it is an ACE).

      g) In general, when rules a-f do not apply, the non-ACE form
         SHOULD be shown.

      Rules c, d, and e above apply tests to "the" non-ACE form, but
      in fact there can be many non-ACE forms that differ only in
      capitalization and/or normalization.  If a given non-ACE label
      fails some test, it MAY be converted to an equivalent non-ACE
      label by applying the map and/or normalize steps of [NAMEPREP] (or
      all the steps), and then given another chance to pass the test.

--end--
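
To make the ordering of rules a-g concrete, here is a rough sketch in
Python. It is only an illustration, not part of the proposed text. The
function names, the parameter names, and the "ascii"/"non-ace" return
values are all made up for this example. It assumes the caller already
knows whether requirement 2 applies, what the user asked for, and what the
environment can handle, and it leaves the rule e heuristics to the caller
(the too_risky flag). For the "another chance" paragraph it applies all the
[NAMEPREP] steps, using the nameprep function from Python's standard
encodings.idna module. Note that unicodedata.category consults whatever
Unicode version ships with the interpreter, which may differ from the
version [NAMEPREP] is defined against.

```python
import unicodedata
from encodings.idna import nameprep  # stdlib implementation of the nameprep profile


def passes_rule_d(label: str) -> bool:
    """True if every character is a letter, number, mark, or U+002D."""
    return all(
        ch == "\u002d" or unicodedata.category(ch)[0] in "LNM"
        for ch in label
    )


def passes_rule_d_with_retry(label: str) -> bool:
    """Apply the rule d test, giving the label a second chance after nameprep."""
    if passes_rule_d(label):
        return True
    try:
        prepared = nameprep(label)  # all nameprep steps, not just map/normalize
    except UnicodeError:
        # nameprep rejected the label (prohibited character or bidi failure).
        return False
    return passes_rule_d(prepared)


def choose_form(label, *, requirement_2_applies=False, user_choice=None,
                env_handles_non_ace=None, too_risky=False):
    """Return "ascii" or "non-ace" according to the first applicable rule.

    env_handles_non_ace is True, False, or None (None meaning "not known").
    """
    if requirement_2_applies:                  # rule a
        return "ascii"
    if user_choice in ("ascii", "non-ace"):    # rule b
        return user_choice
    if env_handles_non_ace is False:           # rule c
        return "ascii"
    if not passes_rule_d_with_retry(label):    # rule d
        return "ascii"
    if too_risky:                              # rule e (heuristics left to caller)
        return "ascii"
    if env_handles_non_ace is None:            # rule f: either form is allowed;
        return "ascii"                         # this sketch arbitrarily picks ASCII
    return "non-ace"                           # rule g


# A label containing U+2044 FRACTION SLASH is forced to ASCII by rule d,
# unless the user has explicitly asked for the non-ACE form (rule b).
print(choose_form("a\u2044b", env_handles_non_ace=True))
print(choose_form("a\u2044b", user_choice="non-ace", env_handles_non_ace=True))
```

Because the rules are tried in order, an explicit user request (rule b)
still wins over the character-category test (rule d), which is what the
proposed wording intends.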

Thoughts?

AMC
