Hi Tobias,

On Thu, Jul 14, 2016 at 04:52:29PM +0200, Tobias Vau wrote:
> Hi,
> 
> a small follow-up to an older thread from November 2015, where massive
> numbers of epoll_wait calls led to 100% CPU consumption.
> 
> My installation showed the same pattern. As it's also easily
> reproducible for me with very moderate client traffic (1-10 conns/s),
> you might maybe be interested in more debug info.
> 
> For me it usually happens within some minutes into running from a clean
> start, as soon as a single "rogue" session from a client IP address
> suddenly pops up with the call counter being at millions within seconds
> (all other connections / sessions by this client finished correctly).
> 
> Some points discussed on the other threads related to epoll busy loops:
> 
> - "option abortonclose" is set, didn't test to disable it, as the
>   old thread mentioned it as not being relevant
> - "option http-server-close" is set, switched it to
>   "option http-keep-alive" plus "option prefer-last-server"
>   for testing, no change here
> - "timeout client-fin" and "timeout server-fin" were set,
>   disabling them prevented the quick appearance of new epoll_wait loops
> 
> As I can very easily reproduce the behaviour with the current config
> and a very moderate traffic pattern (1-5 conns / s), just let me know,
> if you'd like to see some other debug info, than what's provided below.

That's very useful.

The detailed session state would be needed. You can get it either by
requesting "show sess <id>" (e.g. "show sess 0x7fd3aaa28040" for the session
below), or by issuing "show sess all", which will dump them all (much more
useful, as it allows us to validate a theory across other sessions).
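
For illustration, here is a minimal sketch of collecting that dump from a
script, assuming the stats socket is enabled with a "stats socket" line in the
global section. The path /var/run/haproxy.sock and the helper name
query_haproxy are assumptions made for this example, not something taken from
your configuration.

```python
import socket


def query_haproxy(command, socket_path="/var/run/haproxy.sock", timeout=5.0):
    """Send one command to an HAProxy stats socket and return the reply text.

    The default socket_path is an assumption; use the path given on the
    "stats socket" line of your own configuration.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(socket_path)
        # Commands on the stats socket are terminated by a newline.
        sock.sendall(command.encode("ascii") + b"\n")
        chunks = []
        while True:
            # In non-interactive mode the peer closes the connection after
            # the reply, so an empty read marks the end of the output.
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling query_haproxy("show sess all") and writing the result to a file
gives a dump that can be reviewed and anonymized before being shared.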

The problem I've been facing was how to reproduce the condition. If you
manage to reproduce it within a minute or so at 10 cps, it would be very
useful to also take a tcpdump capture in parallel of the traffic between
the client and haproxy and the traffic between haproxy and the server.
That will help us understand what traffic sequence triggers the issue and
possibly which headers, if any, are involved. It also lets us eliminate
some theories based on the configuration.

You need to be fully aware that such a capture will disclose a lot of private
information, so you definitely don't want to post it here. You may want
to follow up with an anonymized example of a "show sess" output, and/or
with any other possibly relevant new information.
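
As a rough sketch of one anonymization step, the snippet below replaces each
distinct IPv4 address in a text dump with a stable placeholder, so that
sessions from the same client can still be correlated. The helper name
anonymize_ips and the "ip-N" placeholder format are made up for this example.
It only covers IPv4 addresses: IPv6 addresses, host names, cookies and
anything else private still need to be reviewed by hand, and any dotted
four-number token will be masked as well.

```python
import re

# Matches dotted-quad tokens such as 192.168.0.10 (no range validation).
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def anonymize_ips(text):
    """Replace every distinct IPv4 address with a stable placeholder."""
    mapping = {}

    def repl(match):
        ip = match.group(0)
        # The same address always maps to the same placeholder.
        if ip not in mapping:
            mapping[ip] = "ip-%d" % (len(mapping) + 1)
        return mapping[ip]

    return IPV4.sub(repl, text)
```

Port numbers and the rest of each line are left untouched, so the session
state itself remains readable.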

That makes me think that I should probably add something like "show fd" to
report the status of all FDs, which may make such debugging easier in the
future.

Thanks!
Willy
