Mark Davis ☕ wrote:

Mark

/— Il meglio è l’inimico del bene —/


On Thu, Jul 29, 2010 at 05:57, Philippe Verdy <verd...@wanadoo.fr> wrote:

    "Martin J. Dürst" <due...@it.aoyama.ac.jp> wrote:
     >
     > On 2010/07/29 13:33, karl williamson wrote:
     > > Asmus Freytag wrote:
     > >> On 7/25/2010 6:05 PM, Martin J. Dürst wrote:
     >
     > >>> Well, there actually is such a script, namely Han. The digits
     > >>> (一、二、三、四、五、六、七、八、九、〇) are used both as letters
     > >>> and as decimal place-value digits, and they are scattered widely,
     > >>> and of course there is a lot of modern living practice.
     >
     > >> The situation is worse than you indicate, because the same
     > >> characters are also used as elements in a system that doesn't
     > >> use place-value, but uses special characters to show powers of 10.
     >
     > No. Sequences of numeric Kanji are also used in names and word-plays,
     > and as sequences of individual small numbers.

     (1) Existing exception :

    There's one example of a digit which has a numeric type = decimal, AND
    is encoded in a "scattered" way:

    19DA;6618;᧚;New Tai Lue Tham Digit One;Nd;0;L;...;1;1;1;N

    The other nine decimal digits for the Tham variant of the New Tai Lue
    digits are borrowed from another sequence of decimal digits, starting
    at U+19D0 (for digit zero) with the exception of U+19D1 which is
    replaced (for digit one). Both sets are assigned in the same
    "New_Tai_Lue" script property value.

    So the additional stability proposal will not be enforceable.


On the contrary. Were we to want such a policy, the implication would be either to:
(a) change the type of 19DA from Nd to No (which I think would be the right thing to do), or
(b) grandfather in the character.

This discussion doesn't make sense to me. The original proposal to encode 19DA says that there is one set of digits in New Tai Lue, but there is an extra digit '1' (the one that got put at 19DA), used when the other digit '1' is visually confusable with another character in the script, which it resembles. That makes it sound like the two are essentially used as glyph variants of each other, and are interchangeable as far as the computer recognizing an input number is concerned.
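
For anyone who wants to see how a given implementation classifies the two digit ones, here is a minimal sketch using Python's standard unicodedata module. The helper name describe is only for illustration, and the General_Category reported depends on which Unicode version the interpreter bundles, so it may not match the UnicodeData line quoted above.

```python
import unicodedata


def describe(cp):
    """Return the number-related properties of one code point.

    `describe` is a hypothetical helper name; the values come from the
    Unicode data bundled with the running Python interpreter.
    """
    ch = chr(cp)
    return {
        "code": f"U+{cp:04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),      # e.g. "Nd" or "No"
        "decimal": unicodedata.decimal(ch, None),  # None if not a decimal digit
        "digit": unicodedata.digit(ch, None),
        "numeric": unicodedata.numeric(ch, None),
    }


if __name__ == "__main__":
    # The regular New Tai Lue digit one and the Tham digit one.
    for cp in (0x19D1, 0x19DA):
        print(describe(cp))
```

If both code points report the same numeric value, they are at least interchangeable for the purpose of reading a number, whatever the general category says.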

Thus, it is appropriate to keep it as Nd, and it isn't scattered, because it is adjacent to the block of 10 digits. My original proposal accounted for this case, asking that the slot or two immediately above the digit '9' be unassigned initially in a new script encoding, just in case a situation like this one arises again.

One thing that I should have brought up earlier in this discussion is that, as an implementor, I can deal with existing exceptions. I may not want to, and may choose not to if my subjective calculation of benefit/cost indicates it's not worthwhile. Given the existing pattern of code point assignments, I saw an efficient way to implement things. And, if future Unicode versions retain this pattern, neither I nor my successors will have to change our code to move to that new version. Changing code takes a significant amount of time and effort. Keeping new versions of Unicode on the same paradigms as previous versions means that implementations of those new versions will be available sooner than otherwise, and may even determine whether they get adopted at all. I was unaware of the subtleties in Han and Arabic, and those can be handled as exceptions, but making new exceptions is really contrary to Unicode's interests. So it really isn't about current counterexamples; there's nothing much that can be done about them. It's about adopting guidelines to keep from unnecessarily creating new exceptions.
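
To make the implementation point concrete, here is a rough sketch in Python of the kind of shortcut the contiguous pattern allows. It is an illustration, not code from any real implementation. It assumes every decimal digit sits in a run of ten code points starting at the zero, so a digit's value is just its offset from that zero. The names ZEROS, build_zero_table, digit_value and offset_rule_exceptions are made up for this example.

```python
import bisect
import sys
import unicodedata


def build_zero_table():
    """Return the sorted code points of all Nd characters with value 0."""
    zeros = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if unicodedata.category(ch) == "Nd" and unicodedata.decimal(ch, None) == 0:
            zeros.append(cp)
    return zeros


# Hypothetical table name; only the ten-digit runs' starting points are stored.
ZEROS = build_zero_table()


def digit_value(ch):
    """Decimal value of ch as an offset from the nearest zero at or below it.

    Assumes each zero is followed by digits one through nine in order.
    Returns None when ch is more than nine code points past that zero.
    """
    cp = ord(ch)
    i = bisect.bisect_right(ZEROS, cp) - 1
    if i < 0:
        return None
    offset = cp - ZEROS[i]
    return offset if offset <= 9 else None


def offset_rule_exceptions():
    """List Nd characters whose value the offset rule gets wrong."""
    bad = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if unicodedata.category(ch) == "Nd" and digit_value(ch) != unicodedata.decimal(ch, None):
            bad.append(f"U+{cp:04X}")
    return bad
```

A digit placed outside its run of ten, such as one at the eleventh slot, falls through digit_value as None and has to be special-cased. Running offset_rule_exceptions() shows which characters, if any, the bundled Unicode version would force an implementation to handle that way.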
