From: [EMAIL PROTECTED]
Operating system: WinME
PHP version: 4.2.3
PHP Bug Type: Performance problem
Bug description: Populating a large array of arrays doesn't scale linearly
Steps to Reproduce:
0 - Standard PHP 4.2.3 installed using the installer.
1 - Create an array of strings using something like explode on a CSV line
2 - Add that array to another array
3 - Repeat steps 1 and 2 1000, 2000, 3000, 4000, 5000 and 6000 times
Example snippet (short):
$times = 6000; /* change this to 1000-6000 */
$line =
    "'x','s','n','f','n','a','c','b','y','e',?,'s','s','o','o','p','o','o','p','o','c','l','e'";
$storage = array ( );
for ( $a = 0; $a < $times; $a++ )
{
    $bits = explode ( ",", $line );
    $storage[] = $bits;
}
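To collect the timings for all six sizes in one run, the short snippet can be wrapped in an outer loop. This is only a sketch: the helper name now_seconds and the $results array are assumptions added for illustration, and the timing uses the same microtime ( ) string-splitting approach as the long repro script.

```php
<?php
// Hypothetical helper: current time in seconds as a float, built from the
// "usec sec" string that microtime ( ) returns.
function now_seconds ( )
{
    list ( $usec, $sec ) = explode ( " ", microtime ( ) );
    return ( (float) $usec + (float) $sec );
}

$line =
    "'x','s','n','f','n','a','c','b','y','e',?,'s','s','o','o','p','o','o','p','o','c','l','e'";
$sizes = array ( 1000, 2000, 3000, 4000, 5000, 6000 );
$results = array ( ); // iterations => elapsed seconds

foreach ( $sizes as $times )
{
    $start = now_seconds ( );

    // Same body as the short snippet: a fresh array is created by explode
    // and appended to $storage on every iteration.
    $storage = array ( );
    for ( $a = 0; $a < $times; $a++ )
    {
        $bits = explode ( ",", $line );
        $storage[] = $bits;
    }

    $results[$times] = now_seconds ( ) - $start;
    printf ( "%d - %.6f seconds\n", $times, $results[$times] );
}
```

With linear scaling, each printed time should be roughly the 1000-iteration time multiplied by $times / 1000.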
Expected Behaviour:
The time it takes to perform this is linearly proportional to the
number of iterations.
Actual Behaviour:
The results of running (on a Celeron 800) are as follows:
1000 - 2.174001 seconds
2000 - 22.422081 seconds
3000 - 74.593858 seconds
4000 - 148.223771 seconds
5000 - 254.329387 seconds
6000 - 371.621738 seconds
The time grows far faster than linearly: going from 1000 to 6000
iterations (6 times the work) takes roughly 171 times as long.
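As a quick check on how far these numbers are from linear, the per-iteration cost and an apparent growth exponent can be computed from the timings above. This is a rough sketch: the two-endpoint estimate of the exponent is my own simplification, not a proper curve fit, and the variable names are illustrative.

```php
<?php
// Timings copied from the results above: iterations => seconds.
$timings = array (
    1000 => 2.174001,
    2000 => 22.422081,
    3000 => 74.593858,
    4000 => 148.223771,
    5000 => 254.329387,
    6000 => 371.621738
);

// Milliseconds per iteration; this would stay constant under linear scaling.
$perIteration = array ( );
foreach ( $timings as $times => $seconds )
{
    $perIteration[$times] = 1000 * $seconds / $times;
    printf ( "%d - %.3f ms per iteration\n", $times, $perIteration[$times] );
}

// Apparent exponent k in "time ~ iterations^k", estimated from the first
// and last measurements only (k = 1 is linear, k = 2 is quadratic).
$exponent = log ( $timings[6000] / $timings[1000] ) / log ( 6000 / 1000 );
printf ( "apparent exponent: %.2f\n", $exponent );
```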
The funny thing is I was able to do similar operations independently with
linear time. For example, a slight modification to the script:
$times = 6000;
$line =
    "'x','s','n','f','n','a','c','b','y','e',?,'s','s','o','o','p','o','o','p','o','c','l','e'";
$storage = array ( );
$otherBits = explode ( ",", $line );
for ( $a = 0; $a < $times; $a++ )
{
    $bits = explode ( ",", $line );
    $storage[] = $otherBits;
}
...and it runs in less than half of a second, even at $times = 6000. So I
know I can add arrays to an array.
A much longer repro script (that demonstrates that these operations work
fine independently in various combinations) follows:
------start long repro script----------
<pre>
<?php
$trace = true;
$start_time = 0;
$times = 6000;
function getmicrotime ( )
{
    list ( $usec, $sec ) = explode ( " ", microtime () );
    return ( (float) $usec + (float) $sec );
}
function start ( $message )
{
    global $trace, $start_time;
    $start_time = getmicrotime ( );
    if ( $trace )
    {
        printf ( "%s... ", $message );
        flush ( );
    }
}
function stop ( )
{
    global $trace, $start_time;
    if ( $trace )
    {
        $current_time = getmicrotime ( );
        $runningTime = $current_time - $start_time;
        printf ( "%.6f seconds\n", $runningTime );
        flush ( );
    }
}
set_time_limit ( 0 );
error_reporting ( E_ALL );
// allocate some memory so as to not bias the first result
$storage = array ( );
$line =
    "'x','s','n','f','n','a','c','b','y','e',?,'s','s','o','o','p','o','o','p','o','c','l','e'";
$otherBits = explode ( ",", $line );
for ( $a = 0; $a < $times; $a++ )
{
    $storage[] = $line;
}
start ( "A: Adding $times times the same string to an array" );
$storage = array ( );
for ( $a = 0; $a < $times; $a++ )
{
    $storage[] = $line;
}
stop ( );
start ( "B: Adding $times times the same array to an array" );
$storage = array ( );
for ( $a = 0; $a < $times; $a++ )
{
    $storage[] = $otherBits;
}
stop ( );
start ( "C: Creating $times different arrays" );
for ( $a = 0; $a < $times; $a++ )
{
    $bits = explode ( ",", $line );
}
stop ( );
start ( "D: Tests A and C" );
$storage = array ( );
for ( $a = 0; $a < $times; $a++ )
{
    $storage[] = $line;
    $bits = explode ( ",", $line );
}
stop ( );
start ( "E: Tests A and B" );
$storage = array ( );
for ( $a = 0; $a < $times; $a++ )
{
$storage[] = $line;
$storage[] = $otherBits;
}
stop ( );
start ( "F: Tests B and C with different arrays" );
$storage = array ( );
for ( $a = 0; $a < $times; $a++ )
{
    $bits = explode ( ",", $line );
    $storage[] = $otherBits;
}
stop ( );
start ( "G: Tests B and C with the array just created" );
$storage = array ( );
for ( $a = 0; $a < $times; $a++ )
{
    $bits = explode ( ",", $line );
    $storage[] = $bits;
}
stop ( );
?>
</pre>
------end long repro script------------
Test "G" in the long repro script is the one that takes "forever" compared
to the other operations. Note that although this script explodes the same
$line on every iteration, in my real script each line is different (read
from a file), hence the need to explode inside the loop.
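For context, the real use case looks roughly like the sketch below. The helper name read_rows, the 4096-byte line limit, and the three sample lines are assumptions made up for illustration, and tmpfile ( ) only stands in for the actual data file.

```php
<?php
// Hypothetical helper (not part of the repro script): reads CSV lines from
// an open file handle and stores each exploded line in an array of arrays.
function read_rows ( $handle )
{
    $storage = array ( );
    while ( ( $line = fgets ( $handle, 4096 ) ) !== false )
    {
        $line = rtrim ( $line, "\r\n" );
        if ( $line === "" )
        {
            continue; // skip blank lines
        }
        // A different array is created for every line, as in test "G".
        $storage[] = explode ( ",", $line );
    }
    return $storage;
}

// Stand-in for the real data file: a temporary file with made-up lines.
$handle = tmpfile ( );
fwrite ( $handle, "'x','s','n'\n'e','f','y'\n'p','k','w'\n" );
rewind ( $handle );

$storage = read_rows ( $handle );
fclose ( $handle );

printf ( "%d rows, %d fields in the first row\n",
    count ( $storage ), count ( $storage[0] ) );
```

Because every line yields a different array, the explode call cannot be hoisted out of the loop the way $otherBits is in test "F".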
I also checked the bug database. Unlike
http://bugs.php.net/bug.php?id=13598 I am not concerned with how much
memory my script takes. And I don't think that I am experiencing the same
problem as http://bugs.php.net/bug.php?id=6333 because I *am* able to
create an array of arrays of the same size (i.e. test "F" in the long repro
script). I didn't find any other bugs that were similar to mine, although
I might have missed something (sorry) because I wasn't sure how to express
the problem.
Work-arounds appreciated (I tried pre-allocating with array_fill, to no
avail), but a fix would be even better.
Good luck and thanks in advance!
--
Edit bug report at http://bugs.php.net/?id=19499&edit=1