Hi Yann,

On 2016-02-04 15:41, Yann Ylavic wrote:
On Thu, Feb 4, 2016 at 11:53 PM, Joachim Achtzehnter <joac...@kraut.ca> wrote:
On 2016-02-04 0:54, Yann Ylavic wrote:

On Thu, Feb 4, 2016 at 9:41 AM, Yann Ylavic <ylavic....@gmail.com> wrote:

Doesn't the socket_bucket_read() call (frame #3) enter
apr_socket_timeout_set(p, 0), and hence sononblock() which puts the
socket in O_NONBLOCK?


and resets APR_INCOMPLETE_READ.


Yes, it does:

     if (t <= 0) {
         sock->options &= ~APR_INCOMPLETE_READ;
     }

But isn't this precisely the problem? The socket_buffer_read()
implementation in "srclib/apr/network_io/unix/sendrecv.c" uses select() only
if the APR_INCOMPLETE_READ flag is SET:

We don't want to select/poll here (timeout is 0), we want to directly
read non-blocking on the socket.

Now I understand. The select() should not be needed for the zero timeout case of non-blocking read.

Since apr_socket_timeout_set(sock, 0)::sononblock() is called (and
succeeds!),

Actually, this was not the case. The sononblock() function was never called for the accepted sockets. It was called for the listen socket(s), but then never again. When I tried to figure out why it was not called I eventually discovered the problem, see below.

recv() should never block (or the system is flawed, or the
socket is in some inconsistent state).
If the timeout were <0, we'd want recv() to block (directly still) and
hence need soblock() before.
If timeout >0, we don't want to wait more than that timeout, so we use
select/poll (timed wait).

Right, I get it now.

How about this small patch:

No, we don't want to poll() before recv() in non-blocking mode, one
system call is enough.

Yes

This seems to get it working,

You mean poll()+recv() succeeds? or fails: at poll(), at recv()?
immediatly in any case?
I'm puzzled, why direct non-blocking recv() would block (or not fail)?

It was all working, but for the wrong reasons.

Are you sure the earlier sononblock() call succeeded (ie. my patch on
socket_bucket_read() does not help)?

It did not get called. The reason was that it assumed the socket was already in non-blocking mode. This happened because the build was configured with 'ac_cv_o_nonblock_inherited=yes', assuming that this property is inherited from the listen socket, but it is not.

I have to apologize for having wasted your time. I feel really bad about this. The only small excuse I have is that we were given the build configuration for this cross-compile setup by another company, so in some sense we are also a victim ourselves.

It doesn't help that the configure script for the apr library silently makes the above assumption:

if test "$cross_compiling" = yes; then :

    ac_cv_o_nonblock_inherited="yes"

This makes it non-obvious that one is creating an incorrect configuration. There are several other places in the configure script that make assumptions like this. In other places the configure script produces an error instead, forcing the user to specify a value, which is much better:

  if test "$cross_compiling" = yes; then :
  as_fn_error $? "cannot check setpgrp when cross compiling" "$LINENO" 5

Again, sorry about the wasted time.

Joachim


--
joac...@kraut.ca http://www.kraut.ca

Reply via email to