Re: Digit/letter variants in the "same" unified script (was: stability policy on numeric type = decimal)

karl williamson Thu, 29 Jul 2010 15:15:39 -0700

Mark Davis ☕ wrote:


Mark

/— Il meglio è l’inimico del bene —/

On Thu, Jul 29, 2010 at 05:57, Philippe Verdy <verd...@wanadoo.fr> wrote:


    "Martin J. Dürst" <due...@it.aoyama.ac.jp> wrote:
     >
     > On 2010/07/29 13:33, karl williamson wrote:
     > > Asmus Freytag wrote:
     > >> On 7/25/2010 6:05 PM, Martin J. Dürst wrote:
     >
     > >>> Well, there actually is such a script, namely Han. The digits
     > >>> (一、二、三、四、五、六、七、八、九、〇) are used both as letters
     > >>> and as decimal place-value digits, and they are scattered widely,
     > >>> and of course there is a lot of modern living practice.
     >
     > >> The situation is worse than you indicate, because the same
     > >> characters are also used as elements in a system that doesn't use
     > >> place-value, but uses special characters to show powers of 10.
     >
     > No. Sequences of numeric Kanji are also used in names and word-plays,
     > and as sequences of individual small numbers.

     (1) Existing exception :

    There's one example of a digit which has a numeric type = decimal, AND
    is encoded in a "scattered" way:

    19DA;6618;᧚;New Tai Lue Tham Digit One;Nd;0;L;...;1;1;1;N

    The other decimal nine digits for the Tham variant of the New Tai Lue
    digits are borrowed from another sequence of decimal digits, starting
    at U+19D0 (for digit zero) with the exception of U+19D1 which is
    replaced (for digit one). Both sets are assigned in the same
    "New_Tai_Lue" script property value.

    So the additional stability proposal will not be enforceable.

On the contrary. Were we to want such a policy, the implication would be either to:

(a) change the type of 19DA from Nd to No (what I think would be the right thing to do)

(b) grandfather in the character.

This discussion doesn't make sense to me. The original proposal to encode 19DA says that there is one set of digits in New Tai Lue, but there is an extra digit '1' (the one that got put at 19DA), used when the other digit '1' is visually confusable with another character in the script, which it resembles. That makes it sound like the two are essentially used as glyph variants of each other, and are interchangeable as far as the computer recognizing an input number.

Thus, it is appropriate to keep it as Nd, and it isn't scattered, because it is adjacent to the block of 10 digits. My original proposal accounted for this case, asking that the slot or two immediately above the digit '9' be unassigned initially in a new script encoding, just in case a situation like this one arises again.

One thing that I should have brought up earlier in this discussion is that, as an implementor, I can deal with existing exceptions. I may not want to, and may choose not to if my subjective calculation of benefit/cost indicates it's not worthwhile. Given the existing pattern of code point assignments, I saw an efficient way to implement things. And, if future Unicode versions retain this pattern, neither I nor my successors will have to change our code to move to that new version. Changing code takes a significant amount of time and effort. Keeping new versions of Unicode using the same paradigms as previous versions means that implementations of those new versions will be available sooner than otherwise, and may even determine whether they get adopted at all. I was unaware of the subtleties in Han and Arabic; those can be handled as exceptions, but making new exceptions is really contrary to Unicode's interests. So it really isn't about current counterexamples; there's nothing much that can be done about them. It's about adopting guidelines to keep from unnecessarily creating new exceptions.
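
To make the implementation argument concrete, here is a minimal Python sketch of the kind of shortcut that a contiguous-digits pattern allows. It is an illustration only, not the code referred to above: the names build_zero_table, ZEROS and digit_value are hypothetical, and the sketch assumes every run of decimal digits occupies ten consecutive code points starting at its zero. It stores only the zero of each run and derives a digit's value by subtraction.

```python
import bisect
import unicodedata


def build_zero_table():
    """Return, in ascending order, every code point that is Nd with value 0."""
    zeros = []
    for cp in range(0x110000):
        ch = chr(cp)
        if unicodedata.category(ch) == "Nd" and unicodedata.decimal(ch, None) == 0:
            zeros.append(cp)
    return zeros


# Hypothetical lookup table: one entry per run of digits, not one per digit.
ZEROS = build_zero_table()


def digit_value(ch):
    """Return 0-9 if ch lies within ten code points of a known zero, else None.

    Assumption: digits 0-9 of a run are contiguous and ascending, so the
    value is simply the offset from the run's zero.
    """
    cp = ord(ch)
    i = bisect.bisect_right(ZEROS, cp) - 1  # nearest zero at or below cp
    if i < 0:
        return None
    offset = cp - ZEROS[i]
    return offset if offset <= 9 else None
```

A digit encoded away from its run, or a run that is not in ascending order, would need a separate exception table on top of this. What digit_value returns for U+19DA depends on the Unicode version bundled with the interpreter running it.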
