On Tue, 24 Nov 2015 21:02:30 +0100
Christoph Zwerschke <[email protected]> wrote:
> One solution is, as you say, to not cast to int, but to unsigned
> char, which is what isdigit expects. Or to use -funsigned-char, but
> we should not rely on that and also cast properly since the compiler
> flag may not be supported on all platforms (it's probably a gcc thing
> only).

No, isdigit takes an int.  That int must be EOF (usually -1) or else
representable as an unsigned char.  So, two things are required.  Cast
the arg to isdigit to int AND make sure that you use unsigned char for
strings.
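
A minimal sketch of that conversion, assuming a hypothetical helper
name (is_digit_char is not taken from the PyGreSQL source).  The value
goes through unsigned char first, so a byte with the high bit set never
reaches isdigit() as a negative int:

```c
#include <ctype.h>

/* Hypothetical helper, not from the PyGreSQL source.
 * isdigit() needs an int that is either EOF or representable as
 * unsigned char, so convert through unsigned char before the call.
 * The unsigned char value is then promoted to a non-negative int. */
static int is_digit_char(char c)
{
    return isdigit((unsigned char)c) != 0;
}
```

Declaring the string buffers as unsigned char in the first place has
the same effect without a cast at each call site.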

> However, I think my solution is better because calling isdigit() is 
> unnecessary overhead. Remember it's a function call, not a macro,

It may be a function call but the function simply does a table lookup.
It's very efficient.  Not sure what it does for Unicode.

> that also takes the locale into account. So checking >= '0' && <= '9'
> is faster, but moreover we want to be as restrictive as possible and
> not have other characters considered digits because of whatever
> strange interpretation of the locale. For instance, '\xb2' would be
> considered a digit on Windows because it is a superscript 2 in cp1252.

What about a Unicode character where the second or third octet falls
into the '0' to '9' range?  It seems to me that we really need Unicode
versions of ctype functions.
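
On the octet question, one case can be checked directly: in UTF-8,
every byte of a multi-byte sequence is 0x80 or above, so none of them
can compare equal to '0' through '9'.  That does not hold for encodings
such as UTF-16.  A rough sketch under the UTF-8 assumption, with
hypothetical helper names that are not from the PyGreSQL source:

```c
#include <stddef.h>

/* Hypothetical helper: locale-independent test for ASCII '0'..'9'.
 * The C standard guarantees that the digit characters are contiguous. */
static int is_ascii_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* Count ASCII digits in a NUL-terminated string, byte by byte.
 * Assumes the string is UTF-8 (or plain ASCII): bytes belonging to
 * multi-byte sequences are >= 0x80 and therefore never counted. */
static size_t count_ascii_digits(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (is_ascii_digit(*s))
            n++;
    }
    return n;
}
```

For example, count_ascii_digits("a\xc2\xb2" "7") counts only the '7';
the two bytes that encode the superscript 2 are skipped.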

Maybe the answer is to move these functions into the Python wrappers
since Python already does Unicode processing.

> You can still add the -funsigned-char, it cannot harm and should make 
> things a bit more deterministic.

OK, I'll do that.

-- 
D'Arcy J.M. Cain
PyGreSQL Development Group
http://www.PyGreSQL.org IM:[email protected]
_______________________________________________
PyGreSQL mailing list
[email protected]
https://mail.vex.net/mailman/listinfo.cgi/pygresql
