Kent Karlsson <kent dot karlsson14 at telia dot com> wrote:

> I see absolutely no point in reencoding the digits 0-9 even though 9 is (strangely) used to denote the value that is usually denoted 10. That is just a (very strange) usage, not different characters from the ordinary 0-9.

>> I suggested encoding all of them because U+0030 through U+0039 have the Nd nature.

> That does not prevent anyone from using them for (ordinary) hexadecimal, octal, etc.

So I thought about this some more, and decided that what made the difference for me was that—unlike the Basic Latin digits—the Tonal digits can *only* be used for non-decimal purposes, so none of them should be Nd.

But then I thought about the properties I assigned to the Ewellic digits [1], where the first 10 (which can be either decimal or hex) are Nd and the last 6 (which can only be hex) are No. So not all members of that set have the same properties either.
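
For anyone who wants to check the standard side of this, here is a minimal sketch in Python, assuming the stock unicodedata module. It only looks at what the UCD says about the Basic Latin digits and about a private-use code point. U+E000 is an arbitrary PUA example chosen for illustration, not an actual CSUR assignment for Tonal or Ewellic, and nothing here reflects the CSUR-side properties, which exist only by private agreement.

```python
import unicodedata

# Basic Latin digits U+0030..U+0039 all carry General_Category Nd.
ascii_digits = [chr(cp) for cp in range(0x30, 0x3A)]
ascii_categories = {unicodedata.category(ch) for ch in ascii_digits}

# Being Nd does not stop the same characters from serving in hex or octal.
hex_value = int("19", 16)
octal_value = int("17", 8)

# A private-use code point has no standard numeric properties: the UCD
# reports Co, so any Nd/No distinction for it is a private convention.
# U+E000 is an arbitrary example, not a real Tonal or Ewellic code point.
pua_category = unicodedata.category("\ue000")

print(ascii_categories, hex_value, octal_value, pua_category)
```

So whichever way the Nd/No question is settled for a PUA digit set, it is settled in the private agreement's own property data, not in anything a stock library will report.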

What makes this troublesome for me is that, on the one hand, there are the perfectly ordinary-looking 0 through 8, and on the other hand there are the invented digits for 9 and 11 through 15, and then in the middle there's this bizarre use of an ordinary 9-glyph to mean decimal 10. That's what messes it up for me and makes me think the '9' isn't really a 9, and what the heck, maybe none of the "ordinary" digits are what they appear to be, so let's CSUR-encode all of them.

This is why the whole business of encoding something like Tonal, which I otherwise wouldn't care about, in CSUR is interesting to me: because it really does involve some of the same issues of properties and glyph identity and unification that "real" encoding does. I'd like some opinions on these "real" Unicode questions from some of the experts who normally stay away from PUA issues, and especially from the CSUR.

--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ is dot gd slash 2kf0s

[1] http://www.ewellic.org/alphabet/properties.html

