Doc #51056 [Fbk]: fread() on blocking stream will block even if data is available

lbarnaud Fri, 12 Mar 2010 05:54:08 -0800

Edit report at http://bugs.php.net/bug.php?id=51056&edit=1


 ID:               51056
 Updated by:       lbarn...@php.net
 Reported by:      magical...@php.net
 Summary:          fread() on blocking stream will block even if data is
                   available
 Status:           Feedback
 Type:             Documentation Problem
 Package:          Streams related
 Operating System: Linux Gentoo 2.6.32
 PHP Version:      5.3.1

 New Comment:

I see your point in wanting read() behavior. Whether to implement
fread() semantics or read() semantics is arguable. However, the specific
behavior you are asking for is not reliable for several reasons, and
IMHO (I may be wrong) you want this behavior for bad reasons. Let me
explain:



> By the way using nonblocking mode makes no sense with provided
> example. It would just make the program use 100% cpu.



This is why you don't want to use non-blocking streams on their own. If
you combine them with stream_select() you will never end up using 100%
CPU: your PHP process does an idle wait in stream_select() and consumes
no CPU at all while waiting.



Example :



stream_set_blocking($stream, 0);

$r = array($stream); $w = null; $e = null;

/* block until data is available for reading on $stream */
while (stream_select($r, $w, $e, $sec, $usec)) {

  /* read all available data, up to 8192 bytes. Returns only 1 byte if
     only 1 byte is available, and never blocks. */
  $data = fread($stream, 8192);

  $r = array($stream); /* stream_select() modifies $r; reset it */

}
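
A self-contained sketch of the same idea, for readers who want to try it. It is only an illustration under assumptions: it uses stream_socket_pair() with STREAM_PF_UNIX (so a Unix-like system), and the names $a, $b, $timeoutSec and $received are made up for this example.

```php
<?php
// Sketch: idle-wait in stream_select(), then read whatever is available.
// Assumes a Unix-like system; all variable names are illustrative.
list($a, $b) = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);

stream_set_blocking($a, false);   // fread() on $a will not block
fwrite($b, "hello");              // 5 bytes become readable on $a

$timeoutSec = 5;
$received = '';

// stream_select() takes its arrays by reference, so use variables.
$r = array($a);
$w = null;
$e = null;

// The process sleeps here until $a is readable or the timeout expires.
if (stream_select($r, $w, $e, $timeoutSec) > 0) {
    // Asks for up to 8192 bytes, gets only what is currently available.
    $received = fread($a, 8192);
}

echo strlen($received), " byte(s) read\n";

fclose($a);
fclose($b);
```

In a real loop the read array has to be rebuilt before every stream_select() call, because the function overwrites it with the streams that are ready.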





> If end of email is reached while a read is in progress and a new read
> is called, it will block until the server closes connections



With your patch (or with the read behavior you want) it will still
block. And it will block randomly, in an unpredictable manner.



Please see the following example :



Say the buffer has 250 bytes in it.

fread(100) -> buffer.length-=100, buffer.length == 150

fread(100) -> buffer.length-=100, buffer.length == 50

fread(100) -> with your patch it would return the last 50 available
bytes



Now this other example with a buffer with only 200 bytes in it :



Say the buffer has 200 bytes in it.

fread(100) -> buffer.length-=100, buffer.length == 100

fread(100) -> buffer.length-=100, buffer.length == 0

fread(100) -> buffer is empty, this blocks, and you can't control this
(you don't control the buffer, and don't know anything about it in a PHP
script)
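
The bookkeeping of the 250-byte case can be observed from a script with a rough sketch like the one below. It is an illustration only: it assumes a Unix-like system (STREAM_PF_UNIX), the names $reader, $writer and $sizes are invented here, and it deliberately puts the reading end in non-blocking mode so that it cannot hang. It therefore shows the sizes returned by successive fread(100) calls, not the blocking itself.

```php
<?php
// Sketch: 250 bytes are written at once, then read back 100 at a time.
// Assumes a Unix-like system; names are illustrative.
list($reader, $writer) = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);

// Non-blocking on purpose: the last, short read must not wait for more data.
stream_set_blocking($reader, false);

fwrite($writer, str_repeat('x', 250));

$sizes = array();
for ($i = 0; $i < 3; $i++) {
    $chunk = fread($reader, 100);
    // Cast guards against a false return value on failure.
    $sizes[] = strlen((string) $chunk);
}

// Prints the size of each of the three reads.
echo implode(', ', $sizes), "\n";

fclose($reader);
fclose($writer);
```

The script never sees how many bytes PHP holds in its internal buffer; it only sees the length of what each fread() call returns.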



Please see 51056-3.phpt.



With current behavior it will block too, but in a predictable manner.


Previous Comments:
------------------------------------------------------------------------
[2010-03-12 03:12:35] magical...@php.net

So, it is normal for PHP's fread() to return immediately when less data
than asked is available, unless this data arrived while a previous call
of fread() was in progress and there was too much data?



Let me just state that this doesn't make sense.



I tested stdc's fread() and could confirm that its behaviour is
consistent: it will only return when it has collected the data it
needed, when EOF is reached or when an error occurs.



It seems that PHP's php_stream_read() is closer to read() syscall than
to stdc's fread(), except for this one specific behaviour.



> It has followed fread() behavior for years and I believe it should not
> change.



I believe the problem comes from the new streams API, which is an
attempt to make the socket API obsolete. In fact stream functions
(including fread()) behave the same way the old socket counterpart did
when passed a socket.



The correct behaviour (as defined by common sense, and confirmed by PHP
4.4.9) :



Testing PHP version: 4.4.9

socket_read took 0.06ms to read 8 bytes

socket_read took 5.08ms to read 256 bytes

socket_read took 0.01ms to read 45 bytes

socket_read took 0.08ms to read 8 bytes

socket_read took 5.06ms to read 256 bytes

socket_read took 0.01ms to read 45 bytes

socket_read took 0.07ms to read 8 bytes

socket_read took 5.05ms to read 256 bytes

socket_read took 0.01ms to read 45 bytes

socket_read took 0.08ms to read 8 bytes



Testing with PHP 5.1.0 (first version containing stream_socket_pair())
exhibits a change of behaviour due to the new stream api.



Both tests 51056.phpt and 51056-2.phpt pass on PHP 4.4.9.



By the way, using nonblocking mode makes no sense with the provided
example. It would just make the program use 100% CPU. For example, a PHP
program reading an email from a POP3 server might lock up because of
this bug in blocking mode. If end of email is reached while a read is in
progress and a new read is called, it will block until the server closes
connections (expected behaviour = return remaining data).



As a PHP sockets programmer (I believe my experience when it comes to
PHP and sockets is not negligible) I say once more that *this*
behaviour of fread() is not consistent. fread() in blocking mode should
either block until it has enough bytes or return as soon as some bytes
are available. Blocking should not depend on when data has arrived.

------------------------------------------------------------------------
[2010-03-11 22:03:51] lbarn...@php.net

> I still believe fread() should not hang when it has data it can
> return.



It has followed fread() behavior for years and I believe it should not
change.



> The C counterpart doesn't



C's fread() does :)



> and the manual says it doesn't.



The manual looks wrong on this point, "reading will stop after a packet
is available" is never true, whatever packet means.



fread() (both PHP's and C's) returns less data than asked only on EOF or
errors.



The only reliable way of doing non-blocking i/o is still to use
non-blocking streams ;-)

------------------------------------------------------------------------
[2010-03-11 21:39:48] magical...@php.net

I still believe fread() should not hang when it has data it can return.
The C counterpart doesn't, and the manual says it doesn't.



Regarding test 51056-2.phpt.txt, the manual explicitly says that this
*can happen* on anything other than files (read the warning in example
#3 on http://php.net/fread )



While I understand your concern for people who might be relying on the
current bogus behaviour, I find this very unlikely considering network
streams are subject to lags and different kinds of behaviour due to the
large number of TCP implementations on the internet.



In the worst case, the manual explicitly warns against relying on
fread() returning as many bytes as requested, and says buffering must be
used.

------------------------------------------------------------------------
[2010-03-11 21:23:43] lbarn...@php.net

> Apache [...] uses timeouts [...] to detect dead clients



This is what I meant :) (and I thought you meant this too:
"application handling data from network should handle cases when
received data is not complete")



Dead clients, or situations like this are not the "normal case", and
sometimes this can be handled with timeouts.
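
One way such a timeout could look in a script is sketched below. This is an assumption about what "handled with timeouts" might mean in PHP, not something taken from the report: it uses stream_set_timeout() and the 'timed_out' key of stream_get_meta_data(), on a stream_socket_pair() (Unix-like system assumed). The names $a, $b, $data and $timedOut are illustrative.

```php
<?php
// Sketch: a blocking read that gives up after a read timeout.
// Assumes a Unix-like system; names are illustrative.
list($a, $b) = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);

stream_set_blocking($a, true);
stream_set_timeout($a, 0, 200000);   // 0 seconds + 200000 microseconds

// Nothing is ever written to $b, so this read can only end by timing out.
$data = fread($a, 100);

// 'timed_out' tells a timeout apart from EOF or a short read.
$meta = stream_get_meta_data($a);
$timedOut = $meta['timed_out'];

echo $timedOut ? "read timed out\n" : "read returned data or EOF\n";

fclose($a);
fclose($b);
```

The timeout bounds how long a dead peer can stall the script; it does not make the normal path any faster.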



If you are in situations where this is the normal case, one solution is
to use non-blocking streams.



The following code does exactly what you are asking for (if there is
something to read, return it; else, block) :



stream_set_blocking(..., 0);

while (stream_select(...)) {

  $data = fread(...);

}



If it does not work with SSL streams, then the SSL stuff should be fixed
instead.

------------------------------------------------------------------------
[2010-03-11 20:26:59] magical...@php.net

> This will block anyway when the buffer is empty and you won't be able
> to know when it is empty, so you can't rely on this (sometimes it will
> block, sometimes not).



PHP always calls poll() before read, so it knows if there is nothing to
read. stream_select() will return the socket as "ready" if there is data
pending in PHP's buffer (even if there's no data on the socket), just so
we can read it.



> Also, some applications may rely on the blocking and will break if it
> is changed. This behavior exists since at least PHP 5.1.



fread() manual explicitly warns about this:



When reading from anything that is not a regular local file, such as
streams returned when reading remote files or from popen() and
fsockopen(), reading will stop after a packet is available. This means
that you should collect the data together in chunks as shown in the
examples below.
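
A minimal sketch of the "collect the data together in chunks" pattern that the manual text describes. The helper name read_exactly() is hypothetical (it is not a PHP function), and the demonstration assumes a Unix-like system for stream_socket_pair() with STREAM_PF_UNIX.

```php
<?php
// Hypothetical helper: keep calling fread() until $length bytes have been
// collected, or until EOF, an error, or a timeout stops the stream.
function read_exactly($stream, $length)
{
    $buffer = '';
    while (strlen($buffer) < $length) {
        // Ask only for what is still missing; fread() may return less.
        $chunk = fread($stream, $length - strlen($buffer));
        if ($chunk === false || $chunk === '') {
            break;   // nothing more to collect
        }
        $buffer .= $chunk;
    }
    return $buffer;
}

// Demonstration on a local socket pair (names are illustrative).
list($a, $b) = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);

fwrite($b, "hello ");
fwrite($b, "world");
fclose($b);          // signals EOF so the loop cannot wait forever

$message = read_exactly($a, 11);
echo $message, "\n";

fclose($a);
```

The caller decides how many bytes make up a complete message; fread() alone only promises "up to" the requested length.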



On the contrary, using blocking streams together with stream_select()
may lead to async program blocking because stream_select() saw there was
pending data, but a new packet will not arrive anytime soon.



> As this is not the normal case I would suggest to introduce some
> timeout handling (this is what applications like e.g. Apache does, I
> guess), or fixing what prevents you from using non blocking i/o with
> SSL streams instead.



It is the normal case to receive less data than expected, as documented
in the PHP manual.
Apache (or any correctly coded networking app) does not use timeouts
(except to detect dead clients); instead it uses read(), which is
reliable (i.e. it does not hang when there is data that can be
returned).



By the way, I have looked at what causes the problem I have with SSL
streams, and it could be worked around by switching the stream between
blocking mode and non-blocking mode depending on the situation. However,
I would prefer to avoid that (and it doesn't change the fact that
fread() does not comply with what is expected from it, both from the
read() syscall behaviour and PHP's manual).

------------------------------------------------------------------------


The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    http://bugs.php.net/bug.php?id=51056

