Roman Neuhauser wrote:
> # [EMAIL PROTECTED] / 2007-01-17 16:59:26 +0100:
>> Roman Neuhauser wrote:
>>> re_format(7) on FreeBSD:
>>>
>>>      A bracket expression is a list of characters enclosed in `[]'.
>>>      (...)
>>>      If two characters in the list are separated by `-', this is
>>>      shorthand for the full range of characters between those two
>>>      (inclusive) in the collating sequence, e.g. `[0-9]' in ASCII
>>>      matches any decimal digit.
>>>      (...)
>>>      Ranges are very collating-sequence-dependent, and portable programs
>>>      should avoid relying on them.
>> one other thing ...
>>
>> wouldn't it be fair to assume (safety through paranoia) that
>> ctype_alnum() would suffer the same problem? (given the manual's
>> indication that ctype_alnum() and the offending regexp are equivalent?)
> 
> isalnum(3) uses isalpha(3) and isdigit(3), so yes, their results are
> locale-dependent (LC_CTYPE, see setlocale(3)), but don't depend on
> collating sequence. 

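so really the docs are slightly misleading, or even incorrect:
ctype_alnum() depends on the locale (LC_CTYPE), whereas the regexp range
depends on the collating sequence, so the two are not strictly equivalent.
I will try to formulate a succinct question for internals@ to ask whether
this should be reported as a documentation bug.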
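
to make the distinction concrete, here is a minimal sketch in C (the
language of the man pages quoted above, not PHP). it compares isalnum(3)
with a hand-written check that spells out the three ranges by character
code. ascii_alnum() and count_disagreements() are names made up for this
illustration, and the code assumes an ASCII-based execution character set.

```c
#include <ctype.h>
#include <locale.h>

/* Hypothetical helper: explicit ASCII test. It compares character codes
   directly, so neither LC_CTYPE nor LC_COLLATE can change the answer.
   Assumes an ASCII-based execution character set. */
static int ascii_alnum(int c)
{
    return (c >= '0' && c <= '9')
        || (c >= 'A' && c <= 'Z')
        || (c >= 'a' && c <= 'z');
}

/* Hypothetical helper: counts the byte values in [0, limit) on which
   isalnum(3) and ascii_alnum() disagree under the named LC_CTYPE locale.
   limit must not exceed 256. Returns -1 if the locale cannot be set. */
static int count_disagreements(const char *locale_name, int limit)
{
    int c, n = 0;

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;

    for (c = 0; c < limit; c++) {
        /* Normalise both results to 0/1 before comparing, since
           isalnum() may return any non-zero value for "true". */
        if ((isalnum(c) != 0) != (ascii_alnum(c) != 0))
            n++;
    }
    return n;
}
```

in the "C" locale the two checks should agree on every 7-bit value, so
count_disagreements("C", 128) is expected to be 0. under a single-byte
locale with accented letters (the locale name varies by system, and I have
not checked any particular one) I would expect a non-zero count for
count_disagreements(name, 256), which is the locale dependence described
above.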

as a side note: do you have a real-world example of where this
collation issue might actually bite someone using the aforementioned
regexp range?

> isdigit(3):
> 
>      The isdigit() function tests for a decimal digit character.  Regardless
>      of locale, this includes the following characters only:
> 
>      ``0''         ``1''         ``2''         ``3''         ``4''
>      ``5''         ``6''         ``7''         ``8''         ``9''
> 

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
