[Sorry if some of you get this twice.]

xiang deng <[EMAIL PROTECTED]> wrote:

> This is a discussional TSconv draft in CDNC and JET-MEMBER.
>
> I'm happy to hear your suggestions about the solution.

Sorry it took me so long. The internet draft (draft-ietf-idn-tsconv-02.txt) is basically the same as the discussional draft; I think the only significant change was to the HSE table itself.

If we assume that TC/SC character pairs must match when domain names are compared, then this looks like a good way to achieve that. However, it's not clear whether we should accept that assumption. Here are my concerns:

Having TC/SC pairs match is not necessary for the protection of intellectual property rights. Registrars can impose (or not impose) policies that disallow registration of names that are very close to existing names. Registrants can consider these policies when choosing a registrar. A model policy could even be standardized, and each registrar could simply refer to it. This policy could be more sophisticated than the simplistic 1-1 TC/SC matching.

I don't see how the proposal makes protection of intellectual property rights any better or easier. What it does is provide a convenience to users by allowing them to type variant forms of domain names into any IDN-aware application. The only way to have SC/TC pairs match is to include the TC/SC table in all IDNA applications, which is indeed what this proposal requires.

1-1 TC/SC matching does not solve the full TC/SC problem. From what I've heard, there are many cases where multiple traditional characters match the same simplified character, and other cases even more complex. If this means that a full solution is very likely to emerge at a higher layer, then this proposed low-layer partial solution will end up being redundant and unnecessary.

Using case to represent TC/SC is an overloading of semantics.
Existing applications that convert names to all-uppercase or all-lowercase think they know what kind of information they are throwing away, but under this proposal they might be throwing away an entirely different kind of information.

Now, regarding a few details of the proposal...

The idea of doing TC/SC folding and reordering with a single table is clever. Two for the price of one.

The parity bit in the ACE prefix looks more complex than necessary. Why not simply use bQ-- always? That would always detect conversion to all-uppercase and conversion to all-lowercase. It wouldn't detect other case conversions, but they are uncommon, and the parity bit would only detect them 50% of the time anyway.

(By the way, the draft claims that the basic code points in the encoded string are always lowercase, but mixed-case annotation works for both the basic code points and non-basic code points.)

AMC
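
P.S. To make the bQ-- point concrete, here is a rough Python sketch of the check I have in mind. The function name case_damage and the sample label are made up for illustration; only the fixed bQ-- prefix comes from the suggestion above, and the part after the prefix is not a real TSconv encoding.

```python
PREFIX = "bQ--"  # fixed mixed-case ACE prefix, as suggested above


def case_damage(label: str) -> str:
    """Classify what has happened to the case of the ACE prefix.

    Hypothetical helper: it looks only at the prefix, so it can detect
    blanket case conversions but says nothing about the rest of the label.
    """
    head = label[:len(PREFIX)]
    if head.lower() != PREFIX.lower():
        raise ValueError("label does not carry the ACE prefix")
    if head == PREFIX:
        return "intact"
    if head == PREFIX.upper():   # "BQ--": label was probably uppercased
        return "uppercased"
    if head == PREFIX.lower():   # "bq--": label was probably lowercased
        return "lowercased"
    return "other"               # e.g. "Bq--": some other case change


label = "bQ--abCde"  # made-up label; case in the payload carries TC/SC info
print(case_damage(label))          # intact
print(case_damage(label.upper()))  # uppercased
print(case_damage(label.lower()))  # lowercased
```

Because the prefix contains one lowercase and one uppercase letter, any all-uppercase or all-lowercase conversion must change it, so those two cases are always caught without computing a parity bit. An intact prefix does not prove the rest of the label is untouched; that is the same limitation I mentioned for the uncommon case conversions.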
