In message <[EMAIL PROTECTED]>
          James Mastros <[EMAIL PROTECTED]> wrote:

> On Mon, 3 Dec 2001, Tom Hughes wrote:
> > It's completely wrong I would have thought - the encoding layer
> > cannot know that a given code point is a digit so it can't possibly
> > do string to number conversion.
> >
> > You need to use the encoding layer to fetch each character and
> > then the character set layer to determine what digit it represents.
> Right.  And then you need to apply some unified logic to get from this
> vector of digits (and other such symbols) to a value.

Indeed, and that logic needs to be in the string layer where it can
use both the encoding routines and the character type routines. I have
just rearranged things to reflect that.
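
Roughly, the shape I have in mind is something like this (a minimal
sketch - the vtable and field names here are illustrative only, not
what is actually in the tree):

  #include <stddef.h>

  typedef unsigned long codepoint_t;

  /* Hypothetical encoding vtable: decode one code point at *p and
     advance *p past it. */
  typedef struct {
      codepoint_t (*decode)(const char **p);
  } ENCODING;

  /* Hypothetical character type vtable: classify code points. */
  typedef struct {
      int (*is_digit)(codepoint_t c);
  } CHARTYPE;

  typedef struct {
      const ENCODING *encoding;
      const CHARTYPE *type;
      const char     *bufstart;
      size_t          buflen;
  } STRING;

  /* String layer: the one shared string-to-number routine.  It uses
     the encoding layer to walk code points and the character type
     layer to classify them.  The accumulation step is still ASCII-ish
     for now - see below. */
  long
  string_to_int(const STRING *s)
  {
      const char *p   = s->bufstart;
      const char *end = s->bufstart + s->buflen;
      long value = 0;

      while (p < end) {
          codepoint_t c = s->encoding->decode(&p); /* encoding layer */
          if (!s->type->is_digit(c))               /* chartype layer */
              break;
          value = value * 10 + (long)(c - '0');    /* shared logic */
      }
      return value;
  }

The point being that neither the encoding layer nor the character type
layer ever does the conversion itself - only the string layer does.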

> I'm just having nightmares of subtly different definitions of what a
> numeric constant looks like depending on the string encoding, because of
> different bits o' code not being quite in sync.  Code duplication bad,
> code sharing good.

Absolutely. That code is now in one place.

> (The charset layer should still be involved somewhere, because Unicode
> (for example) has a "digit value" property.  This makes, say, Arabic
> numerals (which don't look at all like what a normal person calls Arabic
> numerals, BTW) work properly.  OTOH, it might also do strange things
> with e.g. Hebrew, where the letters are also numbers: Aleph is also 1,
> Bet is also 2, etc.)

So far I have added an is_digit() call to the character type layer
to replace the existing isdigit() calls. To do things completely right
we need to extend that with calls to get the digit value, check for
sign characters, etc., rather than assuming everything is ASCII-ish
as it does now.
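
Something like this, extending the character type vtable from the
sketch above (again, made-up names, not committed code):

  /* Extended character type vtable - illustrative only. */
  typedef struct {
      int (*is_digit)(codepoint_t c);
      /* Unicode's "digit value" property: e.g. ARABIC-INDIC DIGIT ONE
         (U+0661) reports 1; return -1 for non-digits. */
      int (*get_digit_value)(codepoint_t c);
      /* Plus/minus and any charset-specific sign characters. */
      int (*is_sign)(codepoint_t c);
  } CHARTYPE;

With that, the accumulation step in string_to_int() above becomes

  value = value * 10 + s->type->get_digit_value(c);

instead of assuming c - '0', and the sign handling can ask is_sign()
rather than comparing against ASCII '+' and '-'.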

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu/
