Hi,

Which Parrot strings are supposed to be false in a boolean context?
For instance, is "\x{FF10}" (FULLWIDTH DIGIT ZERO) false?

docs/strings.pod says[1] a string is false if it "consists of one
digit character whose numeric value (as decided by its character type)
is zero".

However, string.c says[2] 'A string is true if it is equal to anything
but "" and "0"', implying that "\x{FF10}" is true.  But then it
calls s->type->get_digit, and strangely enough, chartypes/unicode.c
has a FIXME comment which implies[3] that unicode_get_digit(U+FF10)
should return 0.
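
To make the current behaviour concrete, here is a minimal standalone
sketch of what I think string_bool does today with the ASCII-only
chartype functions.  It is not the real Parrot code: codepoint_t,
ascii_is_digit, ascii_get_digit and sketch_string_bool are names I made
up, and the "string" is just an array of code points so that no
decoding step is needed.

```c
#include <stddef.h>

/* Hypothetical stand-in for a decoded character; not a Parrot type. */
typedef unsigned long codepoint_t;

/* ASCII-only digit test. A range check is used instead of isdigit(),
 * because isdigit() is only defined for unsigned char values and EOF. */
static int ascii_is_digit(codepoint_t c) {
    return c >= '0' && c <= '9';
}

/* ASCII-only digit value, mirroring "c - '0'" in the quoted code. */
static long ascii_get_digit(codepoint_t c) {
    return (long)c - '0';
}

/* Truth test following the rule in docs/strings.pod:
 * false if unallocated, empty, or a single digit whose value is zero. */
static int sketch_string_bool(const codepoint_t *s, size_t len) {
    if (s == NULL || len == 0) {
        return 0;
    }
    if (len == 1 && ascii_is_digit(s[0]) && ascii_get_digit(s[0]) == 0) {
        return 0;
    }
    return 1; /* everything else is true */
}
```

With these ASCII-only helpers, a one-character string holding U+0030 is
false, while one holding U+FF10 is true, simply because the digit test
never recognises it.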

Allowing things like "\x{FF10}" to be false sounds like a bit of a
nightmare to me.  There are already over 20 forms of zero in Unicode
3.1; if the next version of Unicode adds another one at, say, U+33333,
does the next version of Parrot then start treating "\x{33333}" as
a false string?
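
As a rough illustration of that worry, here is a sketch of a
table-driven truth test.  Everything in it is hypothetical (the
zero_digits table, is_zero_digit and table_string_bool are my own
names), and the table is deliberately incomplete; the only point is
that the answer for a given string depends on which table the
interpreter was built with.

```c
#include <stddef.h>

/* Hypothetical stand-in for a decoded character; not a Parrot type. */
typedef unsigned long codepoint_t;

/* Illustrative, deliberately incomplete list of Unicode zero digits. */
static const codepoint_t zero_digits[] = {
    0x0030, /* DIGIT ZERO */
    0x0660, /* ARABIC-INDIC DIGIT ZERO */
    0x0966, /* DEVANAGARI DIGIT ZERO */
    0xFF10  /* FULLWIDTH DIGIT ZERO */
};

/* Linear search of the first n entries of a zero-digit table. */
static int is_zero_digit(codepoint_t c, const codepoint_t *table, size_t n) {
    size_t i;
    for (i = 0; i < n; i++) {
        if (table[i] == c) {
            return 1;
        }
    }
    return 0;
}

/* Same rule as docs/strings.pod, but "zero" is whatever the table says. */
static int table_string_bool(const codepoint_t *s, size_t len,
                             const codepoint_t *table, size_t n) {
    if (s == NULL || len == 0) {
        return 0;
    }
    if (len == 1 && is_zero_digit(s[0], table, n)) {
        return 0;
    }
    return 1;
}
```

Passing only the first table entry (n = 1) makes "\x{FF10}" true;
passing all four entries makes the same string false.  A program's
behaviour would then change whenever the table grows.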

Thanks,
-- 
David


[1] docs/strings.pod:
> To test a string for truth, use:
> 
>     BOOLVAL string_bool(struct Parrot_Interp *, STRING* s);
> 
> A string is false if it
> 
>  o  is not yet allocated
>  o  has zero length
>  o  consists of one digit character whose numeric value (as
>     decided by its character type) is zero.
> 
> Otherwise the string will be true.



[2] string.c:
> /* A string is "true" if it is equal to anything but "" and "0" */
> BOOLVAL string_bool (const STRING* s) {
[...]
>     if (len == 1) {
>         UINTVAL c = s->encoding->decode(s->bufstart);
>         if (s->type->is_digit(c) && s->type->get_digit(c) == 0) {
>             return 0;
>         }
>     }
> 
>     return 1; /* it must be true */
> }



[3] chartypes/unicode.c:
> static BOOLVAL
> unicode_is_digit(UINTVAL c) {
>     return (BOOLVAL)(isdigit(c) ? 1 : 0); /* FIXME - Other code points are also digits */
> }
> 
> static INTVAL
> unicode_get_digit(UINTVAL c) {
>     return c - '0'; /* FIXME - many more digits than this... */
> }
