[email protected] ("Christoph M. Becker") wrote:
> [...] I tend to prefer the non-locale aware behavior, i.e. float to
> string conversion should always produce a decimal *point*. Users still
> can explicitly use number_format() or NumberFormatter if they wish.
We all agree that the basic features of the language should NOT be
locale-aware, in order to make error reporting and logging, data file writing
and parsing, session management, and library portability easier. But I would
like to restate this goal more clearly:
FLOAT TO STRING CAST CONVERSION REPLACEMENT
Given a floating-point value, retrieve its canonical PHP source-code
string representation. By "canonical" I mean something that the PHP
interpreter parses as a floating-point number, not as an int or anything
else. So, for example, 123.0 must be rendered as "123.0", not as "123",
which looks like an int; the non-finite values NAN and INF must be
rendered as "NAN" and "INF". The "(string)" cast and the variable
embedded in a string "$f" are locale-aware, and so are all the printf()
and related functions, including var_dump() (the latter was a big surprise;
is anyone willing to send a data structure dump to the end user?).
The simplest way I found to get such a canonical representation is
$s = var_export($f, TRUE);
which returns exactly what I expect, does not depend on the current
locale, does not depend on exotic libraries, and it is very short and
simple. It depends only on the current serialize_precision php.ini
parameter, which should already be set right (or you are going to have
problems elsewhere).
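
A minimal sketch of the conversion described above. The helper name
floatToCanonical() is hypothetical and only wraps the var_export() call;
the sample values are chosen for illustration.

```php
<?php
// Hypothetical helper name; it only wraps the var_export() call shown above.
function floatToCanonical(float $f): string
{
    return var_export($f, true);
}

// 123.0 is rendered as "123.0", never as "123"; the non-finite values
// come out as "NAN", "INF" and "-INF".
$samples = [123.0, 1.5e300, NAN, INF, -INF];
foreach ($samples as $f) {
    echo floatToCanonical($f), "\n";
}
```

The number of digits printed for values such as 0.1 depends on the
serialize_precision setting, so the sketch does not rely on it.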
STRING TO FLOAT CAST CONVERSION REPLACEMENT
Given a string carrying the canonical representation of a floating-point
number, retrieve the floating-point number. Syntax errors must be
detectable. The result must be "float", not int or anything else.
I am unsure how strict the parser should be in these edge cases:
"+1.2" (redundant plus sign)
"123" (looks like int, not a float)
"0123" (looks like int octal base)
Getting all this is a bit more tricky. The "(float)" cast does not work
because it does not support the non-finite values NAN and INF and does not
allow errors to be detected. The simplest way I found is to use the
unserialize() function:
/**
 * Parses the PHP canonical representation of a floating point number. This
 * function parses any valid PHP source code representation of a "float",
 * including NAN, INF, -INF and -0 (IEEE 754 negative zero). Not locale aware.
 * @param string $s String to parse. No spaces allowed, apply trim() if needed.
 * @return float Parsed floating-point number.
 * @throws InvalidArgumentException Invalid syntax.
 */
function parseFloat($s)
{
    // Security: untrusted strings must be checked against a basic syntax
    // before being blindly submitted to unserialize():
    if( preg_match("/^[-+]?(NAN|INF|[-+.0-9eE]++)\$/sD", $s) !== 1 )
        throw new InvalidArgumentException(
            "cannot parse as a floating point number: '$s'");
    // unserialize() raises an E_NOTICE on parse error and then returns FALSE.
    $m = @unserialize("d:$s;");
    if( is_int($m) )
        return (float) $m; // always return what we promised
    if( is_float($m) )
        return $m;
    throw new InvalidArgumentException(
        "cannot parse as a floating point number: '$s'");
}
Here again, only core libraries are involved and there is no dependency on
the locale; it is not so short, but it is the best I have found up to now.
Things like NumberFormatter require the 'intl' extension to be enabled, and
often it isn't.
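
To illustrate why the "(float)" cast is not enough, here is a small
sketch comparing it with unserialize() on the "d:...;" form. The variable
names and sample strings are only for illustration.

```php
<?php
// The (float) cast silently accepts anything and knows nothing about
// the non-finite literals:
$castGarbage = (float) "abc";    // 0.0, no error reported
$castNan     = (float) "NAN";    // 0.0, not NAN
$castInf     = (float) "INF";    // 0.0, not INF
$castPartial = (float) "1.5xyz"; // 1.5, trailing garbage ignored

// unserialize() on the "d:...;" form understands the non-finite literals
// and returns FALSE on a syntax error (the @ hides the diagnostic):
$nan    = unserialize("d:NAN;");
$negInf = unserialize("d:-INF;");
$bad    = @unserialize("d:abc;");

var_dump(is_nan($nan), $negInf === -INF, $bad === false);
```

All three var_dump() arguments are expected to be true, while every cast
above returns a finite number without any way to tell that the input was
invalid.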
By using these functions, all possible "float" values survive the
round trip back and forth, including NAN, INF, -INF and -0 (negative zero,
for what it is worth), at the highest accuracy possible for the IEEE 754
representation.
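
The round-trip claim can be checked with a short sketch like the one
below. The helpers roundTrip() and sameFloat() are hypothetical names, the
sample values are arbitrary, and the sketch assumes serialize_precision is
set to 17 (enough digits for an IEEE 754 double). Since the string fed to
unserialize() comes straight from var_export(), no syntax check is done
here; untrusted input still needs one.

```php
<?php
// Assumption: 17 significant digits are enough to round-trip any double.
ini_set('serialize_precision', '17');

// Hypothetical helper: float -> canonical string -> float.
function roundTrip(float $f): float
{
    $s = var_export($f, true);
    // $s was produced by var_export(), so it is trusted here.
    $r = unserialize("d:$s;");
    if (!is_float($r)) {
        throw new UnexpectedValueException("round trip failed for '$s'");
    }
    return $r;
}

// Bit-for-bit comparison, so that -0.0 and 0.0 are told apart; NAN is
// checked with is_nan() because its bit pattern is not unique.
function sameFloat(float $a, float $b): bool
{
    if (is_nan($a) || is_nan($b)) {
        return is_nan($a) && is_nan($b);
    }
    return pack('d', $a) === pack('d', $b);
}

$values = [0.1, 1 / 3, 123.0, -0.0, 1.0e25, 1.7976931348623157e308,
           INF, -INF, NAN];
$failures = 0;
foreach ($values as $v) {
    if (!sameFloat($v, roundTrip($v))) {
        $failures++;
    }
}
echo "failures: $failures\n";
```

This only samples a handful of values, so it is a sanity check rather
than a proof.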
Regards,
___
/_|_\ Umberto Salsi
\/_\/ www.icosaedro.it
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php