Unicode specifies several different classes of numeric character. One of these is the decimal digit class, which we handle using the is_digit and get_digit functions; there are about 25 distinct flavours of these. Additionally, the 'other numeric' class covers numbers outside the digit range (e.g. circled numbers from 10 to 50, and several fractions).
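To make the distinction concrete, here is a small sketch using Python's standard unicodedata module. It is only an analogue for illustration, not our is_digit/get_digit code, and the helper name classify is made up for this example. It reports the general category, the decimal digit value (if any) and the numeric value (if any) of a character.

```python
import unicodedata


def classify(ch):
    """Return (general category, decimal digit value, numeric value) for one character.

    A value of None means the character does not have that property.
    'classify' is a hypothetical helper used only for this illustration.
    """
    return (
        unicodedata.category(ch),
        unicodedata.decimal(ch, None),
        unicodedata.numeric(ch, None),
    )


samples = [
    "1",       # DIGIT ONE
    "\u0BE9",  # TAMIL DIGIT THREE
    "\u2469",  # CIRCLED NUMBER TEN
    "\u00BC",  # VULGAR FRACTION ONE QUARTER
]

for ch in samples:
    print(unicodedata.name(ch), classify(ch))
```

The first two samples are decimal digits (category Nd) and carry a digit value; the circled number and the fraction fall in the 'other numeric' category (No) and have a numeric value only.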
Questions:

1) Should the string-to-number conversion handle these non-digit numeric characters? If so, should that include fractions?
2) Given a consecutive string of digits from different ranges, should conversion stop at the boundary? For example, should the sequence 'DIGIT ONE' 'DIGIT TWO' 'TAMIL DIGIT THREE' result in twelve or in one hundred and twenty-three?
3) What should 'DIGIT ONE' 'VULGAR FRACTION ONE QUARTER' produce?

Any thoughts regarding signs, decimal points, exponent indicators, etc. are also welcome.

--
Peter Gibbs
EmKel Systems