ID:               33070
 User updated by:  lindsay at bitleap dot com
 Reported By:      lindsay at bitleap dot com
-Status:           Feedback
+Status:           Open
 Bug Type:         Performance problem
 Operating System: Linux 2.6.10 kernel
 PHP Version:      5.0.3
 New Comment:

Given that bzip2 can be used as a stream, could the data be decompressed
in chunks?
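A minimal sketch of what that could look like, using PHP's existing
bzopen()/bzread() stream functions (sketch only, not a proposed patch;
the function name and file paths are made up for illustration):

```php
<?php
// Sketch only: stream-decompress a .bz2 file in fixed-size chunks so peak
// memory stays bounded no matter how large the decompressed data is.
function bz_decompress_file($in_path, $out_path)
{
    $bz  = bzopen($in_path, "r");
    $out = fopen($out_path, "wb");

    while (!feof($bz)) {
        // bzread() returns at most 8192 decompressed bytes per call.
        $chunk = bzread($bz, 8192);
        if ($chunk === false) {
            bzclose($bz);
            fclose($out);
            return false;
        }
        fwrite($out, $chunk);
    }

    bzclose($bz);
    fclose($out);
    return true;
}

// e.g. with one of the sample files from below:
// bz_decompress_file("4M.bz2", "4M.out");
?>
```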


Previous Comments:
------------------------------------------------------------------------

[2005-05-22 15:20:22] [EMAIL PROTECTED]

And how do you think it should work?
I don't see a more effective way that would not hit memory usage instead
of CPU usage.

------------------------------------------------------------------------

[2005-05-20 16:56:48] lindsay at bitleap dot com

Script:
<?php

time_decompress("256K", file_get_contents("256K.bz2"));
time_decompress("512K", file_get_contents("512K.bz2"));
time_decompress("1M",   file_get_contents("1M.bz2"));
time_decompress("2M",   file_get_contents("2M.bz2"));
time_decompress("3M",   file_get_contents("3M.bz2"));
time_decompress("4M",   file_get_contents("4M.bz2"));

// Time how long bzdecompress() takes on an in-memory compressed string.
function time_decompress($file_name, $file_data)
{
        echo "file length $file_name ran in: ";

        $time_start = time();
        bzdecompress($file_data, false);
        $time_end = time();

        echo ($time_end - $time_start) . " seconds\n";
}

?>


Data:
If you run Linux:
dd if=/dev/urandom bs=1024 count=256 of=256K
dd if=/dev/urandom bs=1024 count=512 of=512K
dd if=/dev/urandom bs=1024 count=1024 of=1M
dd if=/dev/urandom bs=1024 count=2048 of=2M
dd if=/dev/urandom bs=1024 count=3072 of=3M
dd if=/dev/urandom bs=1024 count=4096 of=4M
bzip2 256K 512K 1M 2M 3M 4M

If not, let me know and I'll upload or email data samples.

------------------------------------------------------------------------

[2005-05-19 23:44:05] [EMAIL PROTECTED]

Please provide a short but complete reproduce script and (if possible)
the data too.

------------------------------------------------------------------------

[2005-05-19 17:22:47] lindsay at bitleap dot com

Description:
------------
I found bug #13860 regarding bzdecompress speeds on 4.2.x.  That bug
suggests it was fixed for 4.2.0, but I still see slowness in 5.0.3.

On 5.0.3, bzdecompress seems to get dramatically slower as data sizes
increase -- roughly quadratically: doubling the input size about
quadruples the time.  I timed decompression of increasing file sizes
containing compressed random data:

file length 256K ran in: 2 seconds
file length 512K ran in: 10 seconds
file length 1M ran in: 41 seconds
file length 2M ran in: 135 seconds
file length 3M ran in: 278 seconds
file length 4M ran in: 476 seconds


I'm not a C coder, but the do-while loop at line 472 of:

http://cvs.php.net/co.php/php-src/ext/bz2/bz2.c?r=1.9

seems to decompress the string over and over.  Each iteration attempts
a decompress with a buffer size based on the iteration count.  If the
decompress fails, a larger buffer is tried.  If I'm reading this
correctly, the same data could be decompressed hundreds of times until
the buffer gets large enough.
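A plain-PHP illustration (not the ext/bz2 C code) of why that retry
pattern hurts: if every failed attempt redoes the whole decompress, a
buffer that grows linearly with the iteration count needs far more
attempts than one that doubles each try.  The sizes below are made-up
examples:

```php
<?php
// Illustration only: count "retry with a bigger buffer" attempts needed
// to reach a target size under two growth strategies.
function attempts_linear($step, $needed) {
    $attempts = 0;
    // Buffer grows by a fixed step: 1x, 2x, 3x, ... the starting size.
    for ($size = $step; $size < $needed; $size += $step) {
        $attempts++;          // each failed attempt re-decompresses everything
    }
    return $attempts + 1;     // the final, successful attempt
}

function attempts_doubling($start, $needed) {
    $attempts = 1;            // first attempt at the starting size
    // Buffer doubles after every failed attempt.
    for ($size = $start; $size < $needed; $size *= 2) {
        $attempts++;
    }
    return $attempts;
}

// e.g. a 4 MB result with a 64 KB starting buffer:
echo attempts_linear(65536, 4194304), " linear attempts\n";     // 64
echo attempts_doubling(65536, 4194304), " doubling attempts\n"; // 7
?>
```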



------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=33070&edit=1
