https://bz.apache.org/bugzilla/show_bug.cgi?id=62799

--- Comment #8 from m...@blackmans.org ---
(In reply to Rainer Jung from comment #7)

> > For the particular
> > log messages above, the connection is showing failed due to EINPROGRESS long
> > before the 5 second expiry, usually within milliseconds.
> 
> OK, here I suggest to check using 1.2.44 (and switching temporarily to the
> value 5 due to the "seconds instead of milliseconds" bug in 1.2.44) or
> 1.2.46 pre-release and recheck. It could well be that the poll() system call
> used there for RHEL is more reliable due to the known limitations of
> select() on Linux.

Thanks, that's a clear path to test.


> > What I want:
> > 
> > 1. Backend connection failures should be caught by a timeout, probably
> > socket_connect_timeout
> 
> For clarity: socket_connect_timeout is not a generic connection timeout, it
> is instead a timeout for socket connect, that means only for the process of
> setting up a new connection (connect() system call). That's what it is meant
> for, nothing else.

That is absolutely perfect, that is 100% the desired case I want to handle
correctly. Any backend that we cannot set up a new connection to, should be
considered failed, but only after the timeout has expired.

> 
> If you need to tackle more types of "connection failures", there's probably
> other features handling them, but not socket_connect_timeout.

For all post-new-connection cases, I am expecting to rely on the full suite of
CPING/CPONG tests and I am hopeful for these cases.

> 
> > 2. incomplete connections before the timeout expiry should never be flagged
> > as failures.
> 
> Agreed but as I wrote before, this could happen due to the select()
> limitation in 1.2.42 (not sure but could be), or time jumping forward

I will lean towards 'select' letting us down rather than time jumping forward
for now.

> 
> > What I observe:
> > 
> > If I set a socket_connect_timeout of 5000 milliseconds, loads of perfectly
> > good backends are getting failed due to EINPROGRESS DESPITE the timeout not
> > yet expiring.
> 
> How do you know that "the timeout" is "not expiring"?

I can see the initial connection attempt at time t0 and I can see failure
notifications at time t0+delta where delta is much less than 5 seconds.

> 
> > If I don't set a socket_connect_timeout at all (i.e. no timeout at all),
> > then the request hangs indefinitely when the backend is missing.
> 
> More precisely *setup of new connections* will hang (if the network doesn't
> respond with a no route to host or similar).
> 
> > In other words, I have found no configuration that satisfies goals 1 and 2
> > simultaneously, primarily, I believe, because the current code is treating
> > EINPROGRESS as a failure way before the timeout expires.
> 
> As I tried to explain several times now: the current code handles EINPROGRESS
> as a normal return value and after EINPROGRESS waits for the timeout to
> happen and then checks the socket status again. We might have done this
> wrong, but it is not as obvious as you think.

Ok, great, perhaps under some circumstances that handling is thwarted by other
effects, like a dodgy 'select'.  I do find the code a little hard to follow,
but I completely agree that on the face of it, it should not be failing the
connection because of EINPROGRESS.
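
For reference, the general pattern as I understand it is: a non-blocking
connect() that returns EINPROGRESS, then a wait for writability bounded by
the timeout, then a check of SO_ERROR. The sketch below uses poll() and is
not the mod_jk implementation; the function name connect_with_timeout is
hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not mod_jk code): connect fd to addr, waiting at most
 * timeout_ms milliseconds. Returns 0 on success, -1 on failure with errno
 * set (ETIMEDOUT if the timeout expired). Leaves fd in non-blocking mode. */
int connect_with_timeout(int fd, const struct sockaddr *addr,
                         socklen_t len, int timeout_ms)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    if (connect(fd, addr, len) == 0)
        return 0;               /* connected immediately */
    if (errno != EINPROGRESS)
        return -1;              /* immediate, real failure */

    /* EINPROGRESS is not a failure: the connection is still being set up.
     * Wait until the socket becomes writable or the timeout expires. */
    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int rc;
    do {
        /* Simplification: a retry after EINTR restarts the full timeout. */
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc == 0) {
        errno = ETIMEDOUT;      /* only now has the timeout really expired */
        return -1;
    }
    if (rc < 0)
        return -1;

    /* Writable (or error/hangup): SO_ERROR holds the result of the connect. */
    int err = 0;
    socklen_t errlen = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0)
        return -1;
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}
```

In this pattern the only paths that report failure before the timeout are an
immediate connect() error other than EINPROGRESS, a poll() error, or a
non-zero SO_ERROR, so an early failure attributed to EINPROGRESS itself would
point at the wait step misbehaving.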

> 
> > I can accept my interpretation of the underlying behaviours is wrong, but
> > the fact is that mod_jk is marking backends down that are not down.
> > 
> > I am open to suggestions on how I can demonstrate this behaviour more
> > clearly, in ways that are achievable outside of our production environments.
> 
> We could add some lines to the code which track the connection setup in
> mod_jk and log the details only if the setup fails due to
> socket_connect_timeout. But before doing this I would be interested in the
> behavior of recent code (1.2.44 with config "5" or 1.2.46 pre-release).

Sounds like a plan, thank you.

