On 7/24/2010 3:00 PM, Bill Poser wrote:
On Sat, Jul 24, 2010 at 1:00 PM, Michael Everson <ever...@evertype.com> wrote:

Digits can be scattered randomly about the code space and it wouldn't make any 
difference.

Having written a library for performing conversions between Unicode
strings and numbers, I disagree. While it is not all that hard to deal
with the case in which the characters for the digits are scattered
about the code space, if they occupy a contiguous set of code points
in order of their value as they do, e.g., in ASCII, it simplifies both
the conversion itself and such tasks as identifying the numeral system
of a numeric string and checking the validity of a string as a number
in a particular numeral system.
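
A minimal sketch of the identification and validity checks mentioned above, assuming each numeral system's digits occupy ten consecutive code points in value order. The table `ZEROS` and the function `identify_system` are hypothetical names made up for this illustration, and the table covers only three of the many digit ranges in Unicode.

```python
# Code point of the digit zero for a few contiguous decimal digit ranges.
# Illustrative subset only; a real table would list many more ranges.
ZEROS = {
    "ASCII": 0x0030,
    "Arabic-Indic": 0x0660,
    "Devanagari": 0x0966,
}


def identify_system(s):
    """Return the name of the one digit range that covers every character
    of s, or None if s is empty or mixes ranges or non-digits."""
    if not s:
        return None
    for name, zero in ZEROS.items():
        # With contiguous digits, membership is a single range comparison.
        if all(zero <= ord(ch) <= zero + 9 for ch in s):
            return name
    return None
```

Under that assumption, checking validity in a given system is the same range test: a string is a valid number in a system exactly when `identify_system` returns that system's name.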

It may well be that adopting such a policy is not realistic, but there
would be advantages to it if it were.

Bill,
Michael is no programmer, hence he doesn't have a firsthand understanding of
why programmers distinguish between character set mapping (normally requiring
look-up tables) and digit conversion (normally done by offset calculations).
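
To make the distinction concrete, here is a rough sketch of both approaches. The function names are invented for this illustration. The offset version assumes a contiguous range (Arabic-Indic digits, starting at U+0660, by default). The table version uses the superscript digits, which really are scattered (U+00B9, U+00B2 and U+00B3 sit in Latin-1, the rest in U+2070 and U+2074..U+2079); they are not ordinary decimal digits and serve here only as a familiar example of a non-contiguous encoding.

```python
def digits_by_offset(s, zero=0x0660):
    """Convert a digit string whose digits are contiguous from `zero`."""
    value = 0
    for ch in s:
        d = ord(ch) - zero  # digit value is just the distance from zero
        if not 0 <= d <= 9:
            raise ValueError(f"not a digit in this range: {ch!r}")
        value = value * 10 + d
    return value


# Scattered digits need an explicit table, one entry per character.
SUPERSCRIPT = {
    "\u2070": 0, "\u00b9": 1, "\u00b2": 2, "\u00b3": 3, "\u2074": 4,
    "\u2075": 5, "\u2076": 6, "\u2077": 7, "\u2078": 8, "\u2079": 9,
}


def digits_by_table(s, table=SUPERSCRIPT):
    """Convert a digit string by looking each character up in `table`."""
    value = 0
    for ch in s:
        if ch not in table:
            raise ValueError(f"not a digit in this table: {ch!r}")
        value = value * 10 + table[ch]
    return value
```

Both are short, but the offset version needs only one number per numeral system, while the table version needs ten entries per system, each of which has to be kept correct by hand.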

That said, there are enough programmers on the committees so that scattered 
encoding of digits, while not prevented, is at least not the method of choice.

The problem with making this a policy is that some scripts may not have a
decimal place-value number system (or such use is not documented) at the
time of their encoding. That means a digit zero may not be known or documented.

However, a prudent encoding policy would be to leave a gap in that case, 
because there have been scripts for which use of a decimal place-value system 
was later discovered.

A./


