Edit report at http://bugs.php.net/bug.php?id=51056&edit=1
ID:               51056
Updated by:       lbarn...@php.net
Reported by:      magical...@php.net
Summary:          fread() on blocking stream will block even if data is available
Status:           Feedback
Type:             Documentation Problem
Package:          Streams related
Operating System: Linux Gentoo 2.6.32
PHP Version:      5.3.1
New Comment:

I see your point in wanting read() behavior. Whether PHP's fread() should follow fread() or read() semantics is arguable. However, the specific behavior you are asking for is not reliable, for several reasons, and IMHO (I may be wrong) you want this behavior for bad reasons. Let me explain:

> By the way using nonblocking mode makes no sense with provided example. It would just make the program use 100% cpu.

This is your reason for not wanting non-blocking streams, but if you use stream_select() you will never end up using 100% CPU: your PHP process will only do an idle wait in stream_select() and consume no CPU at all. Example:

    stream_set_blocking($stream, 0);
    $r = array($stream); $w = $e = null;
    while (stream_select($r, $w, $e, $sec, $usec)) {
        /* stream_select() blocks until data is available for reading in $stream. */
        $data = fread($stream, 8192);
        /* Reads all available data, up to 8192 bytes. Returns only 1 byte if only 1 byte is available, and never blocks. */
        $r = array($stream); /* stream_select() overwrites $r, so reset it */
    }

> If end of email is reached while a read is in progress and a new read is called, it will block until the server closes connections

With your patch (or with the read() behavior you want) it will still block. And it will block randomly, in an unpredictable manner. Please see the following example:

Say the buffer has 250 bytes in it.

    fread(100) -> buffer.length -= 100, buffer.length == 150
    fread(100) -> buffer.length -= 100, buffer.length == 50
    fread(100) -> with your patch it would return the last 50 available bytes

Now this other example, with a buffer with only 200 bytes in it:

Say the buffer has 200 bytes in it.
    fread(100) -> buffer.length -= 100, buffer.length == 100
    fread(100) -> buffer.length -= 100, buffer.length == 0
    fread(100) -> buffer is empty, this blocks, and you can't control this (you don't control the buffer, and don't know anything about it in a PHP script)

Please see 51056-3.phpt. With current behavior it will block too, but in a predictable manner.


Previous Comments:
------------------------------------------------------------------------
[2010-03-12 03:12:35] magical...@php.net

So, it is normal for PHP's fread() to return immediately when less data than asked is available, unless this data arrived during a previous call to fread() and there was too much data?

Let me just state that this doesn't make sense. I tested stdc's fread() and could confirm that its behaviour is consistent: it will only return when it has collected the data it needed, when EOF is reached or when an error occurs. It seems that PHP's php_stream_read() is closer to the read() syscall than to stdc's fread(), except for this one specific behaviour.

> It follows fread() behavior since years and I believe it should not change.

I believe the problem comes from the new streams API, which is an attempt to make the socket API obsolete. In fact stream functions (including fread()) behave the same way the old socket counterpart did when passed a socket.

The correct behaviour (as defined by common sense, and confirmed by PHP 4.4.9):

Testing PHP version: 4.4.9
socket_read took 0.06ms to read 8 bytes
socket_read took 5.08ms to read 256 bytes
socket_read took 0.01ms to read 45 bytes
socket_read took 0.08ms to read 8 bytes
socket_read took 5.06ms to read 256 bytes
socket_read took 0.01ms to read 45 bytes
socket_read took 0.07ms to read 8 bytes
socket_read took 5.05ms to read 256 bytes
socket_read took 0.01ms to read 45 bytes
socket_read took 0.08ms to read 8 bytes

Testing with PHP 5.1.0 (first version containing stream_socket_pair()) exhibits a change of behaviour due to the new stream API.
Both tests 51056.phpt and 51056-2.phpt pass on PHP 4.4.9.

By the way using nonblocking mode makes no sense with provided example. It would just make the program use 100% cpu.

For example a PHP program reading an email from a POP3 server might lock up because of this bug in blocking mode. If end of email is reached while a read is in progress and a new read is called, it will block until the server closes connections (expected behaviour = return remaining data).

As a PHP sockets programmer (I believe my experience when it comes to PHP and sockets is not negligible) I say once more that *this* behaviour of fread() is not consistent. fread() in blocking mode should either block until it has enough bytes or return as soon as some bytes are available. Blocking should not depend on when data has arrived.

------------------------------------------------------------------------
[2010-03-11 22:03:51] lbarn...@php.net

> I still believe fread() should not hang when it has data it can return.

It follows fread() behavior since years and I believe it should not change.

> The C counterpart doesn't

C's fread() does :)

> and the manual says it doesn't.

The manual looks wrong on this point: "reading will stop after a packet is available" is never true, whatever "packet" means. fread() (both PHP's and C's) returns less data than asked only on EOF or errors.

The only reliable way of doing non-blocking i/o is still to use non-blocking streams ;-)

------------------------------------------------------------------------
[2010-03-11 21:39:48] magical...@php.net

I still believe fread() should not hang when it has data it can return. The C counterpart doesn't, and the manual says it doesn't.
Regarding test 51056-2.phpt.txt, the manual explicitly says that this *can happen* on anything other than files (read the warning in example #3 on http://php.net/fread).

While I understand your concern for people who might be relying on the current bogus behaviour, I find this very unlikely, considering that network streams are subject to lag and to different kinds of behaviour due to the large number of TCP implementations on the internet. In the worst case, the manual explicitly warns against relying on fread() returning as many bytes as requested, and says buffering must be used.

------------------------------------------------------------------------
[2010-03-11 21:23:43] lbarn...@php.net

> Apache [...] uses timeouts [...] to detect dead clients

This is what I meant :) (and I thought you meant this too: "application handling data from network should handle cases when received data is not complete")

Dead clients, or situations like this, are not the "normal case", and sometimes this can be handled with timeouts.

If you are in situations where this is the normal case, one solution is to use non-blocking streams. The following code does exactly what you are asking for (if there is something to read, return it; else, block):

    stream_set_blocking(..., 0);
    while (stream_select(...)) {
        $data = fread(...);
    }

If it does not work with SSL streams, then the SSL stuff should be fixed instead.

------------------------------------------------------------------------
[2010-03-11 20:26:59] magical...@php.net

> This will block anyway when the buffer is empty and you won't be able to know when it is empty, so you can't rely on this (sometimes it will block, sometimes not).

PHP always calls poll() before read, so it knows if there is nothing to read. stream_select() will return the socket as "ready" if there is data pending in PHP's buffer (even if there's no data on the socket), just so we can read it.

> Also, some applications may rely on the blocking and will break if it is changed.
This behavior exists since at least PHP 5.1.

The fread() manual explicitly warns about this:

    When reading from anything that is not a regular local file, such as streams returned when reading remote files or from popen() and fsockopen(), reading will stop after a packet is available. This means that you should collect the data together in chunks as shown in the examples below.

On the contrary, using blocking streams together with stream_select() may cause an async program to block, because stream_select() saw there was pending data but a new packet will not arrive anytime soon.

> As this is not the normal case I would suggest to introduce some timeout handling (this is what applications like e.g. Apache does, I guess), or fixing what prevents you from using non blocking i/o with SSL streams instead.

It is the normal case to receive less data than expected, as documented in the PHP manual. Apache (or any correctly coded networking app) does not use timeouts (except to detect dead clients); instead it uses read(), which is reliable (i.e. it does not hang when there is data that can be returned).

By the way, I have looked at what causes the problem I have with SSL streams, and it could be worked around by switching the stream between blocking and non-blocking mode depending on the situation. However, I would prefer to avoid that (and it doesn't change the fact that fread() does not comply with what is expected of it, both from the read() syscall's behaviour and from PHP's manual).

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view the rest of the comments, please view the bug report online at

    http://bugs.php.net/bug.php?id=51056

-- 
Edit this bug report at http://bugs.php.net/bug.php?id=51056&edit=1
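
For reference, the non-blocking pattern suggested in this thread (stream_set_blocking() plus stream_select(), then fread()) can be sketched as a self-contained script. This is only an illustrative sketch, not code from the report or its .phpt tests: the helper name read_available(), the one-second timeout, and the use of a local stream_socket_pair() in place of a real server are all assumptions, and it presumes a Unix-like system where STREAM_PF_UNIX is available.

```php
<?php
// Hypothetical helper (not from the bug report): wait in stream_select()
// rather than in fread(), then return whatever data is available.
function read_available($stream, int $maxBytes = 8192, int $timeoutSec = 1): string
{
    $read = [$stream];
    $write = null;
    $except = null;
    // Idle wait: returns the number of ready streams, 0 on timeout,
    // or false on error.
    $ready = stream_select($read, $write, $except, $timeoutSec, 0);
    if ($ready === false || $ready === 0) {
        return '';
    }
    $data = fread($stream, $maxBytes);
    return $data === false ? '' : $data;
}

// Demo on a local socket pair, so no network access is needed.
[$a, $b] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
stream_set_blocking($a, false);

fwrite($b, 'hello');          // only 5 bytes are available
$first = read_available($a);  // asks for up to 8192 bytes, gets the 5 available

fclose($b);                   // peer closes the connection
$second = read_available($a); // EOF makes the stream readable; nothing is left to read

fclose($a);
```

In this sketch the waiting happens in stream_select() rather than in fread(), so the script idles instead of spinning, and a short read simply returns what has arrived so far.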