Re: [idn] draft about Tradition and Simplified Chinese Conversion[version02]

Adam M. Costello Mon, 19 Nov 2001 14:34:10 -0800

[Sorry if some of you get this twice.]

xiang deng <[EMAIL PROTECTED]> wrote:


> This is a discussional TSconv draft in CDNC and JET-MEMBER.
>
> I'm happy to hear your suggestions about the solution.

Sorry it took me so long.  The internet draft
(draft-ietf-idn-tsconv-02.txt) is basically the same as the discussion
draft; I think the only significant change was to the HSE table itself.

If we assume that TC/SC character pairs must match when domain names are
compared, then this looks like a good way to achieve that.

However, it's not clear whether we should accept that assumption.  Here
are my concerns:

Having TC/SC pairs match is not necessary for the protection of
intellectual property rights.  Registrars can impose (or not impose)
policies that disallow registration of names that are very close
to existing names.  Registrants can consider these policies when
choosing a registrar.  A model policy could even be standardized, and
each registrar could simply refer to it.  This policy could be more
sophisticated than the simplistic 1-1 TC/SC matching.  I don't see
how the proposal makes protection of intellectual property rights any
better or easier.  What it does is provide a convenience to users by
allowing them to type variant forms of domain names into any IDN-aware
application.

The only way to have TC/SC pairs match is to include the TC/SC table in
all IDNA applications, which is indeed what this proposal requires.

1-1 TC/SC matching does not solve the full TC/SC problem.  From what
I've heard, there are many cases where multiple traditional characters
match the same simplified character, and other cases even more complex.
If this means that a full solution is very likely to emerge at a higher
layer, then this proposed low-layer partial solution will end up being
redundant and unnecessary.
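
To make the many-to-one case concrete, here is a small sketch in Python.
The four characters are well-known examples chosen for illustration; they
are not taken from the draft's HSE table, and the function names are
made up.  It shows that a strict 1-1 table can pair each simplified
character with only one traditional character, leaving the others
unmatched.

```python
# Illustrative mapping only; NOT the draft's HSE table.
TC_TO_SC = {
    "發": "发",  # "to emit"
    "髮": "发",  # "hair"
    "乾": "干",  # "dry"
    "幹": "干",  # "to do; trunk"
}

def invert(mapping):
    """Group traditional characters by the simplified character they fold to."""
    groups = {}
    for tc, sc in mapping.items():
        groups.setdefault(sc, []).append(tc)
    return groups

def one_to_one_pairs(mapping):
    """Keep only the first TC seen for each SC, as a strict 1-1 table must."""
    pairs = {}
    for tc, sc in mapping.items():
        pairs.setdefault(sc, tc)
    return pairs

SC_TO_TCS = invert(TC_TO_SC)
PAIRS = one_to_one_pairs(TC_TO_SC)

# Traditional characters that a 1-1 table cannot match to their SC form.
UNPAIRED = [tc for tc, sc in TC_TO_SC.items() if PAIRS[sc] != tc]
```

Which traditional character gets the pairing is arbitrary here (first
one seen); a real table would have to make that choice deliberately.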

Using case to represent TC/SC is an overloading of semantics.  Existing
applications that convert names to all-uppercase or all-lowercase think
they know what kind of information they are throwing away, but under
this proposal they might be throwing away an entirely different kind of
information.
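
A toy illustration of the concern, in Python.  The annotation scheme
below is hypothetical (one letter per character, uppercase meaning
"traditional") and is not the draft's actual encoding; it only shows
that a routine case fold silently erases whatever the case was carrying.

```python
def annotate(letters, is_traditional):
    """Hypothetical scheme: uppercase the i-th letter if the i-th flag is set."""
    return "".join(
        ch.upper() if flag else ch.lower()
        for ch, flag in zip(letters, is_traditional)
    )

def read_annotation(letters):
    """Recover the flags from the letter case."""
    return [ch.isupper() for ch in letters]

flags = [True, False, True, False]
label = annotate("abcd", flags)

# An application that "harmlessly" lowercases the name loses every flag.
flags_after_fold = read_annotation(label.lower())
```

Before the fold, read_annotation(label) gives back the original flags;
after it, every flag reads as false.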

Now, regarding a few details of the proposal...

The idea of doing TC/SC folding and reordering with a single table is
clever.  Two for the price of one.

The parity bit in the ACE prefix looks more complex than necessary.  Why
not simply use bQ-- always?  That would always detect conversion to
all-uppercase and conversion to all-lowercase.  It wouldn't detect other
case conversions, but they are uncommon, and the parity bit would only
detect them 50% of the time anyway.  (By the way, the draft claims that
the basic code points in the encoded string are always lowercase, but
mixed-case annotation works for both the basic code points and non-basic
code points.)
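
A minimal sketch of the fixed-prefix check I have in mind, in Python.
The function name and return values are made up for illustration.
Because the prefix contains both a lowercase and an uppercase letter,
neither an all-uppercase nor an all-lowercase conversion can leave it
unchanged.

```python
PREFIX = "bQ--"

def check_prefix(label):
    """Classify a label by the state of its (assumed fixed) ACE prefix."""
    head = label[:len(PREFIX)]
    if head == PREFIX:
        return "intact"        # prefix case is as emitted
    if head.lower() == PREFIX.lower():
        return "case-altered"  # e.g. folded to all-upper or all-lower
    return "not-ace"

original = "bQ--abCd"
```

For the sample label, check_prefix(original) reports "intact", while
check_prefix(original.upper()) and check_prefix(original.lower()) both
report "case-altered".  A conversion that alters only the letters after
the prefix goes undetected, which is the uncommon case mentioned above.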

AMC
