Hi,

Which Parrot strings are supposed to be false in a boolean context? For instance, is "\x{FF10}" (FULLWIDTH DIGIT ZERO) false?

docs/strings.pod says[1] a string is false if it "consists of one digit character whose numeric value (as decided by its character type) is zero". However, string.c says[2] 'A string is true if it is equal to anything but "" and "0"', implying that "\x{FF10}" is true. But then it calls s->type->get_digit, and strangely enough, chartypes/unicode.c has a FIXME comment which implies[3] that unicode_get_digit(U+FF10) should return 0.

Allowing things like "\x{FF10}" to be false sounds like a bit of a nightmare to me. There are already over 20 forms of zero in Unicode 3.1; if the next version of Unicode adds another one at, say, U+33333, does the next version of Parrot change to think that "\x{33333}" is a false string?

Thanks,
-- David

[1] docs/strings.pod:

> To test a string for truth, use:
>
>     BOOLVAL string_bool(struct Parrot_Interp *, STRING* s);
>
> A string is false if it
>
>     o is not yet allocated
>     o has zero length
>     o consists of one digit character whose numeric value (as
>       decided by its character type) is zero.
>
> Otherwise the string will be true.

[2] string.c:

> /* A string is "true" if it is equal to anything but "" and "0" */
> BOOLVAL string_bool (const STRING* s) {
[...]
>     if (len == 1) {
>         UINTVAL c = s->encoding->decode(s->bufstart);
>         if (s->type->is_digit(c) && s->type->get_digit(c) == 0) {
>             return 0;
>         }
>     }
>
>     return 1; /* it must be true */
> }

[3] chartypes/unicode.c:

> static BOOLVAL
> unicode_is_digit(UINTVAL c) {
>     return (BOOLVAL)(isdigit(c) ? 1 : 0); /* FIXME - Other code points are also digits */
> }
>
> static INTVAL
> unicode_get_digit(UINTVAL c) {
>     return c - '0'; /* FIXME - many more digits than this... */
> }
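
P.S. To make the question concrete, here is a minimal standalone sketch of how I read the logic in [2] and [3] as they stand today. It is not Parrot code. The sketch_* names are made up for illustration, the input is assumed to be an array of already-decoded code points, and I have swapped isdigit() for a plain ASCII range check, because isdigit() is only defined for values representable as unsigned char (or EOF). Under those assumptions only "" and ASCII "0" come out false, and U+FF10 comes out true; that would change if the FIXMEs in [3] were resolved.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for unicode_is_digit(): ASCII digits only.
 * A range check is used instead of isdigit(), which is undefined for
 * arguments outside the unsigned char range. */
static int sketch_is_digit(uint32_t c) {
    return c >= '0' && c <= '9';
}

/* Hypothetical stand-in for unicode_get_digit(): same arithmetic as the
 * quoted code, so it is only meaningful for ASCII digits. */
static int sketch_get_digit(uint32_t c) {
    return (int)(c - '0');
}

/* Sketch of the truth test over decoded code points (an assumption; the
 * real function works on a STRING and decodes through s->encoding).
 * Returns 0 for false, 1 for true. */
int sketch_string_bool(const uint32_t *cps, size_t len) {
    if (cps == NULL || len == 0) {
        return 0; /* unallocated or empty string */
    }
    if (len == 1 && sketch_is_digit(cps[0]) && sketch_get_digit(cps[0]) == 0) {
        return 0; /* a single digit whose value is zero */
    }
    return 1; /* everything else is true */
}
```

If sketch_is_digit() and sketch_get_digit() were extended to cover every Unicode decimal digit, the same function would start returning 0 for U+FF10, which is exactly the version-dependent behaviour I am worried about.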