----- Original Message -----
> From: Gustavo Lopes <glo...@nebm.ist.utl.pt>
> To: 'Patrick Schaaf' <p...@bof.de>; "internals@lists.php.net" 
> <internals@lists.php.net>; Frank Liepert <frank.liep...@gmx.de>; hakre 
> <hanskren...@yahoo.de>
> CC: 'Derick Rethans' <der...@php.net>; 'Martin Jansen' <mar...@divbyzero.net>
> Sent: 21:19 Friday, 8 February 2013
> Subject: Re: AW: [PHP-DEV] FILTER_VALIDATE_INT and +0/-0
> 

>> A special case still left is "±0". It is with the 'PLUS-MINUS SIGN'
>> (U+00B1).
> 
> By special case, I meant a deviation to the general rule on how the code
> handles the input. The code handles the characters 0-9 prefixed by an
> optional sign.

The general rule is to allow either + 'PLUS SIGN' (U+002B) or - 
'HYPHEN-MINUS' (U+002D) in front of all positive natural numbers excluding zero.

The discussion is about allowing those for zero as well.

The 'PLUS-MINUS SIGN' (U+00B1) is a relevant sign for the number zero in this 
context, but it has gone unnoticed so far in the discussion.

The general rule allows signs in front of all positive natural numbers 
excluding zero. So as not to deviate from it for the missing zero, all valid 
plus and minus signs, including *both at once* as is possible for zero, should 
be properly filtered as valid integers.
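
To make that rule concrete, here is a minimal sketch in PHP. It is not how 
ext/filter is implemented; the function name is hypothetical, the input is 
assumed to be raw ISO-8859-1 bytes, and rejecting leading zeros is my 
assumption about the intended behaviour.

```php
<?php
// Hypothetical helper, not part of ext/filter. The input is treated as raw
// ISO-8859-1 bytes, so "\xB1" is the PLUS-MINUS SIGN (U+00B1).
function is_signed_int_string(string $input): bool
{
    // Either an optional '+' or '-' before zero or a natural number without
    // leading zeros (assumption), or the plus-minus sign before a lone zero.
    // The D modifier keeps '$' from matching before a trailing newline.
    $pattern = '/^(?:[+-]?(?:0|[1-9][0-9]*)|\xB10)$/D';

    return preg_match($pattern, $input) === 1;
}

var_dump(is_signed_int_string("+0"));    // bool(true)
var_dump(is_signed_int_string("\xB10")); // bool(true)
var_dump(is_signed_int_string("\xB15")); // bool(false)
```

The plus-minus sign is accepted only before zero, since it is the one number 
for which both signs at once make sense.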

If you aim to have UTF-8 compatibility with the input, you should also consider 
'MINUS SIGN' (U+2212). I didn't mention it so far because PHP by default 
targets ISO-8859-1 (at least commonly, historically and by popularity), so I 
only covered the sign in Latin-1.
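
For reference, the byte sequences of the signs in question can be listed with 
a few lines of PHP. This is only an illustrative sketch: the function name is 
hypothetical, the "\u{...}" escape needs PHP 7.0 or later (it always produces 
UTF-8), and the ISO-8859-1 column is written out by hand.

```php
<?php
// Hypothetical lookup table of the signs under discussion as raw bytes.
// 'latin1' is null where ISO-8859-1 has no such character.
function sign_table(): array
{
    return [
        'PLUS SIGN (U+002B)'       => ['utf8' => "\u{002B}", 'latin1' => "\x2B"],
        'HYPHEN-MINUS (U+002D)'    => ['utf8' => "\u{002D}", 'latin1' => "\x2D"],
        'PLUS-MINUS SIGN (U+00B1)' => ['utf8' => "\u{00B1}", 'latin1' => "\xB1"],
        'MINUS SIGN (U+2212)'      => ['utf8' => "\u{2212}", 'latin1' => null],
    ];
}

// Print each sign with its hex bytes in both encodings.
foreach (sign_table() as $name => $bytes) {
    $latin1 = $bytes['latin1'] === null ? 'n/a' : bin2hex($bytes['latin1']);
    printf("%-26s UTF-8: %-6s ISO-8859-1: %s\n",
        $name, bin2hex($bytes['utf8']), $latin1);
}
```

The two ASCII signs have the same single byte in both encodings; only the 
PLUS-MINUS SIGN and the MINUS SIGN depend on the encoding.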

> The PLUS-MINUS SIGN -- or, for that matter, all the other numeric
> characters in the Unicode repertoire -- are irrelevant.

Unicode is never irrelevant; I use it to communicate clearly and specifically 
which signs I'm concerned about.

Unicode has no category called "numeric characters"; you probably meant 
'Number, Decimal Digit [Nd]', 'Symbol, Math [Sm]', 'Punctuation, Dash [Pd]' or 
'Number, Other [No]', but it remains unspecified in your email. Would you 
please elaborate?

> 
>> It's an equally incorrect sign for the number 0 as "-" or "+" is
>> incorrect. Available in internet standards ISO-8859-1 and more as "\xB1"
>> (UTF-8 as "\xC2\xB1"), FILTER_VALIDATE_INT should reflect hidden
>> dependency of input encoding here.
> 
> I'm not sure what you're arguing for here.

To make the feature complete, the input encoding needs to be hinted for those 
signs; otherwise FILTER_VALIDATE_INT won't work properly on strings in an 
unexpected encoding (UTF-8 since PHP 5.4 (?!); ISO-8859-1 in the past).
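
What I mean by hinting the encoding could look roughly like the sketch below. 
It is not an existing ext/filter option: the function name, the encoding 
parameter and the two-entry sign table are all hypothetical, and integer 
overflow is not handled.

```php
<?php
// Hypothetical sketch, not the ext/filter API: the caller names the input
// encoding so the non-ASCII signs can be recognised by their byte sequences.
function validate_int_hinted(string $input, string $encoding): ?int
{
    // Assumed table of non-ASCII sign bytes; only two encodings are covered.
    $table = [
        'ISO-8859-1' => ['plusMinus' => "\xB1",     'minus' => null],
        'UTF-8'      => ['plusMinus' => "\xC2\xB1", 'minus' => "\xE2\x88\x92"],
    ];
    if (!isset($table[$encoding])) {
        return null; // unknown encoding hint
    }
    $plusMinus = $table[$encoding]['plusMinus'];
    $minus     = $table[$encoding]['minus'];

    $sign     = 1;
    $zeroOnly = false; // set when the plus-minus sign was used
    $digits   = $input;

    if ($input !== '' && ($input[0] === '+' || $input[0] === '-')) {
        $sign   = $input[0] === '-' ? -1 : 1;
        $digits = substr($input, 1);
    } elseif ($minus !== null && strncmp($input, $minus, strlen($minus)) === 0) {
        $sign   = -1;
        $digits = substr($input, strlen($minus));
    } elseif (strncmp($input, $plusMinus, strlen($plusMinus)) === 0) {
        $zeroOnly = true;
        $digits   = substr($input, strlen($plusMinus));
    }

    // Zero or a natural number without leading zeros (assumption).
    if (preg_match('/^(?:0|[1-9][0-9]*)$/D', $digits) !== 1) {
        return null;
    }
    // Both signs at once are only meaningful for zero.
    if ($zeroOnly && $digits !== '0') {
        return null;
    }

    return $sign * (int) $digits; // overflow is not checked in this sketch
}

var_dump(validate_int_hinted("\xB10", 'ISO-8859-1'));     // int(0)
var_dump(validate_int_hinted("\xC2\xB10", 'ISO-8859-1')); // NULL
```

The second call shows the hidden dependency: the same UTF-8 bytes that would 
be a valid "±0" are rejected when the input is declared as ISO-8859-1.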

Otherwise, I'd say it's important to add a note to the documentation that the 
function is (only?) US-ASCII / ISO-8859-1 safe, as this is string input 
validation.

-- hakre

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
