The default serialize_precision is currently [1] set to 100. A little code inspection shows that precision, in this case, has its usual meaning: the number of significant digits.

Given that the implicit precision of a (normal) IEEE 754 double precision number is slightly less than 16 digits [2], this is serious overkill. Put another way, while the mantissa is composed of 52 bits plus 1 implicit bit, 100 decimal digits can carry up to 100*log2(10) =~ 332 bits of information, around 6 times more.
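To make the arithmetic above easy to check, here is a minimal sketch that recomputes the figures. The variable names are mine and purely illustrative; only log10() and log() from the standard math functions are used.

```php
<?php
// Significand width of an IEEE 754 double: 52 stored bits + 1 implicit bit.
$mantissaBits = 53;

// Decimal digits those bits can represent: 53 * log10(2), slightly under 16.
$decimalDigits = $mantissaBits * log10(2);

// Bits of information that 100 decimal digits can carry: 100 * log2(10).
$bitsIn100Digits = 100 * log(10, 2);

// How many times more information 100 digits hold than the significand.
$ratio = $bitsIn100Digits / $mantissaBits;

printf("digits in a double:    %.2f\n", $decimalDigits);
printf("bits in 100 digits:    %.2f\n", $bitsIn100Digits);
printf("ratio to 53 bits:      %.2f\n", $ratio);
```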

Given this, I propose changing the default precision to 17 (while the precision is slightly less than 16, a 17th digit is necessary because the first decimal digit carries little information when it is low).

From my tests, this makes serialization and unserialization of doubles around 3 times faster (the measurement counts the function calls to serialize/unserialize, plus a loop variable increment and stop condition check). It also makes the serialized data smaller.

Crucially, from my tests, the condition that the variable stays the same before and after serialization+unserialization still holds. The included test, for little-endian machines, verifies this.
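For anyone who wants to try the round-trip condition themselves, a rough sketch follows. It is not the test referred to above; survivesRoundTrip() is a hypothetical helper name, the sample values are arbitrary, and the type declarations assume PHP 7 or later. It also shows why fewer than 17 digits is not enough for some values.

```php
<?php
// Hypothetical helper: does $value come back bit-for-bit identical after
// serialize() + unserialize() at the given serialize_precision?
function survivesRoundTrip(float $value, int $precision): bool
{
    ini_set('serialize_precision', (string) $precision);
    $restored = unserialize(serialize($value));

    return $restored === $value;
}

// Arbitrary sample doubles, chosen for illustration only.
$samples = [0.1 + 0.2, 1 / 3, M_PI, pow(2.0, 50), 1.0e-300];

foreach ($samples as $sample) {
    // %.16e prints 17 significant digits (1 before the point, 16 after).
    printf(
        "%.16e  15 digits: %s  17 digits: %s\n",
        $sample,
        survivesRoundTrip($sample, 15) ? 'same' : 'CHANGED',
        survivesRoundTrip($sample, 17) ? 'same' : 'CHANGED'
    );
}
```

A handful of samples proves nothing in general, of course; it is only meant to make the 15 versus 17 difference visible.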

If no one objects, I'll change the default precision to 17.


I've always done this, and the default of 100 has been puzzling.

We have another problem. The counterpart of serialize_precision, the display "precision" INI setting, is currently lower than 17 on most PHP installs, and this leads to actual data loss in places where people least expect it.

All database drivers/APIs I've tested have escape/quote routines which are affected by this supposedly "display only" precision.

The test below was run on a 32-bit system (integers over 2^31 become floats):

$x = new PDO('mysql:host=localhost', '...', '...');

ini_set('precision', 12); // typical for many PHP installs
echo $x->quote(pow(2,50)); // (string) '1.12589990684E+15'

ini_set('precision', 17);
echo $x->quote(pow(2,50)); // (string) '1125899906842624'

In the first example, the value is quoted in E notation, so the database receives 1125899906840000 where 1125899906842624 was intended.
It's off by -2624.
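The same loss can be seen without a database connection, since it comes from the float-to-string conversion itself. A small sketch, assuming only a plain (string) cast; the variable names are illustrative, and pow(2.0, 50) is used so the value is a float on 64-bit builds too:

```php
<?php
// 2^50 as a float, regardless of the platform's integer size.
$exact = pow(2.0, 50);

// Convert to string under a low display precision, as in the first example.
ini_set('precision', '12');
$lossy = (string) $exact;   // the message above reports '1.12589990684E+15'

// Convert again under precision 17, as in the second example.
ini_set('precision', '17');
$full = (string) $exact;    // the message above reports '1125899906842624'

// What a consumer of the lossy string would read back, and the error.
$delta = (float) $lossy - $exact;

echo "precision 12: $lossy\n";
echo "precision 17: $full\n";
printf("difference:   %.0f\n", $delta);
```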

The immediate fix for this is to set *both* precisions to 17. In the long term, the drivers should switch to serialize_precision and stop using the display precision.

Stan Vass

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
