ID:               50894
 User updated by:  lee at projectmastermind dot com
 Reported By:      lee at projectmastermind dot com
 Status:           Open
 Bug Type:         Performance problem
 Operating System: Linux, OSX
-PHP Version:      6SVN-2010-02-01 (snap)
+PHP Version:      5.*,6
 New Comment:

Updated the version field to indicate that the problem exists in
versions 5.x and 6.


Previous Comments:
------------------------------------------------------------------------

[2010-02-01 06:13:33] lee at projectmastermind dot com

Description:
------------
Given a value of a particular type, casting it to that same type 
should essentially be a no-op -- once it is determined that the 
operand already has the correct type, no further action needs to be 
taken.

Ex:
  $a = array();
  $b = (array)$a;

In this example, $a is already an array, so this should be a simple 
assignment operation.  $b should get a "lazy" copy of $a via PHP's 
copy-on-write policy.  Instead, the cast operation seems to force an 
immediate (non-lazy) full copy.
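
A minimal sketch of the copy-on-write behavior described above, using 
memory_get_usage() as a rough proxy for when the data is really 
duplicated.  The array size and variable names are illustrative 
assumptions, not part of the original report, and the exact byte 
counts depend on the PHP version and build:

```php
<?php
// Illustrative size (an assumption, not taken from the report).
$a = array_fill(0, 100000, '0');

$before = memory_get_usage();
$b = $a;                          // plain assignment: should only share storage
$afterAssign = memory_get_usage();
$b[] = '1';                       // first write to $b: the real copy should happen here
$afterWrite = memory_get_usage();

$assignCost = $afterAssign - $before;
$writeCost  = $afterWrite - $afterAssign;

printf("assignment: %d bytes, first write: %d bytes\n", $assignCost, $writeCost);
```

If the copy is lazy, the assignment should cost close to nothing and 
the first write should account for nearly all of the extra memory.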

This creates a huge potential for hidden performance problems, as it 
causes code that *looks* like it would run in constant time [O(1)] to 
actually require linear time [O(n)] (where n represents the size of 
the data being copied).

I have verified that this issue does exist for string types as well.  
I assume that it applies to all PHP types.
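
For reference, a string variant of the test could look like the 
sketch below.  The string sizes and loop count are illustrative 
assumptions chosen so that a full copy would be measurable; they are 
not the values used for the verification mentioned above:

```php
<?php
// String variant of the benchmark; sizes and loop count are illustrative.
$iterations = 10000;

for ($z = 1; $z < 5; ++$z) {
    $s = str_repeat('0', 100000 * $z);

    $t_start = microtime(true);
    for ($i = 0; $i < $iterations; ++$i) {
        $t = (string)$s;   // same-type cast under test
    }
    $t_elapsed = (microtime(true) - $t_start) * 1000;

    printf("(%d bytes * %d casts): %f ms\n", strlen($s), $iterations, $t_elapsed);
}
```

If the elapsed time grows with the string length, the cast is copying 
the data; if it stays roughly flat, it is not.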

Of course it becomes a significant performance issue primarily for 
types that can hold large amounts of data, where the data is 
duplicated whenever the zval is duplicated (AFAIK, this is only string 
and array).

I have verified this on the following versions of PHP:
  5.2.6
  5.2.8
  6.0.0-dev (php6.0-201001312130)



Reproduce code:
---------------
<?php

for( $z=1; $z<5; ++$z ) {
   $a = array_fill(0, 100*$z, '0');

   $t_start = microtime(true);
   for($i=0;$i<100000;++$i) {

      // O(n) [should be constant time, but isn't]
      // cast triggers non-lazy copy
      //
      $b = (array)$a;

      // O(1) [constant time, as expected]
      // (comment above, and uncomment here for comparison)
      //
      //$b = $a;
   }
   $t_elapsed = (microtime(true)*1000)-($t_start*1000);
   printf(
      "(%d elements * %d copies): %f ms\n\n", 
      100*$z, $i, $t_elapsed
   );
}



Expected result:
----------------
(100 elements * 100000 loops): 11.264160 ms

(200 elements * 100000 loops): 11.363037 ms

(300 elements * 100000 loops): 11.208984 ms

(400 elements * 100000 loops): 11.809082 ms


NOTE: the time stays roughly constant as the number of elements 
increases -- the assignments are copy-on-write, so no significant 
performance hit is incurred.


Actual result:
--------------
(100 elements * 100000 copies): 736.453613 ms

(200 elements * 100000 copies): 1448.991211 ms

(300 elements * 100000 copies): 2130.541016 ms

(400 elements * 100000 copies): 2823.362793 ms


NOTE: the time increases roughly linearly with the size of the array.  
(This happens with large strings too.)  This is a good indicator that 
a copy is being made [non-lazily] when the cast is applied.
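
To compare both cases in a single run, without commenting lines in 
and out, the reproduce code could be rearranged roughly as follows.  
This is only a sketch: the variable names are illustrative, and the 
timings will differ by machine and PHP version:

```php
<?php
// Times the cast and the plain assignment for each array size.
$loops = 100000; // same loop count as the reproduce code

for ($z = 1; $z < 5; ++$z) {
    $a = array_fill(0, 100 * $z, '0');

    // Same-type cast (the case under suspicion).
    $t0 = microtime(true);
    for ($i = 0; $i < $loops; ++$i) {
        $b = (array)$a;
    }
    $castMs = (microtime(true) - $t0) * 1000;

    // Plain assignment (copy-on-write baseline).
    $t0 = microtime(true);
    for ($i = 0; $i < $loops; ++$i) {
        $b = $a;
    }
    $assignMs = (microtime(true) - $t0) * 1000;

    printf(
        "%d elements: cast %f ms, assignment %f ms\n",
        100 * $z, $castMs, $assignMs
    );
}
```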



------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=50894&edit=1

Reply via email to