Re: CPU 100% when waiting for the client timeout

baiyang Fri, 20 Nov 2015 03:00:07 -0800

Hi Willy, 

> This one seems to have missed 3 years of bugfixes
I've just done a "apt-get update && apt-get upgrade" successfully and reboot 
the machine this week. I think the OS is fresh enough, but I'll try to upgrade 
the kernal to a newer one. :-)

> I find this very strange here because the request time is high, indicating a 
> reused keep-alive connection.
Yes, we enabled http keep-alive with a 5 minutes idle timeout. You can get my 
full config file in my previous mail.

Also, we are using the long-polling mechanism to doing the message push (comet) 
task. The long-polling request will be timed-out (return a empty http response) 
within 285 seconds if there is no message to push to the client. Of course it 
could be returned earlier if there are something to notify the client.

The HTTP long-polling API is "/FFC/App/lpWaitMessage", it should only appeared 
in the "https-in" front-end. If you saw a large request time, as you said, it 
maybe caused by a connection which previously doing a long-polling request has 
been reused.

> When the problem happens, could you please try to dump the output of "show 
> sess all" sent on the CLI ?
> So that makes me think that the output of one second of strace when the 
> trouble appears will also help 
It's very difficault, because I could not predict when it will happen 
accurately. Anyway, I will give it a try.

> I remember having seen a kernel bug affecting epoll a long time ago where a 
> deleted even was still reported...
Amm... I think it not the kernal's issue here. Because:
1. As I said, I've just done a update of it.

2. Our backend servers are all using the same Ubuntu 12.04 LTS Server image, 
and all of them are using epoll + non-blocking IO + thread pool on linux. It's 
working fine for many years, we have never seen any trouble on it.

3. Even if the epoll_wait have returned a invalid file descriptor, we still get 
into a cpu exhausion here. As far as I can remembered:
3.1. epoll can works on two mode: "edge" triggered and "level" triggered. If we 
are in level triggered mode, we must set the EPOLLONESHOT flag to avoid 
duplicated events. Otherwise we need to set EPOLLET flag to turn it to edge 
triggered mode.
3.2. The "events" field of the epoll event should be carefully checked before 
using it.
3.3. Using epoll on a full-duplex connection need some special care. Or some 
bad things may occured in some special event sequence (e.g.: read after write 
fault, write after a read fault, etc.).
3.4. Maybe we have a loop to manipulate a invalid file descriptor incorrectly 
(e.g.: Are we correctly processes all possiable return values and error codes 
of some system call like read, write, close, etc.)?

> I find this quite strange but given that we had no such report in more than 
> one year of 1.5 deployed everywhere
I remembered it firstly appeared after I upgraded haproxy to the 1.6.2, at that 
time, I seems added some options to the config file:
    timeout client-fin      10s  # 对客户端连接完成 TCP 4 次挥手超时
    timeout server-fin      5s   # 对后端 Server 连接完成 TCP 4 次挥手超时
    http-reuse always            # 对不同的前端请求，尽量重用相同的后端空闲连接（1.6+）
The "http-reuse" option could be omitted because it's 1.6+ only and I have been 
disabled it when downgrade to 1.5.15 and 1.5.14.
I have been upgrade back to 1.6.2 with all these three options disabled now, 
waiting to see what will happen. If it appeared again, I'll upgrade the linux 
kernel to the newest one of Ubuntu 12.04.

Thanks :-)

--

   Best Regards
  BaiYang
  baiy...@gmail.com
  http://baiy.cn
**** < END OF EMAIL > **** 

From: Willy Tarreau
Date: 2015-11-20 17:26
To: baiyang
CC: Lukas Tribus; haproxy
Subject: Re: Re: CPU 100% when waiting for the client timeout
Hi,

On Fri, Nov 20, 2015 at 04:11:48PM +0800, baiyang wrote:
> Sorry for send it again, mail.formilux.org told me my attachment is too 
> large: /var/spool/mail/archives-haproxy: error writing message: File too large
> So I striped it.

Thanks. I don't remember the file size limit but the list delivers to about
1000 persons so for sure we have to set sane limits.

> Unfortunately, we have been downgraded to 1.5.14, but the issue still there:
> HA-Proxy version 1.5.14 2015/07/02
> Copyright 2000-2015 Willy Tarreau <wi...@haproxy.org>

Ah interesting, at least it makes more sense!

> Build options :
(...)

I'm not seeing anything wrong there.

> # uname -a
> Linux WD-G0-SRP1 3.2.0-29-generic #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012 
> x86_64 x86_64 x86_64 GNU/Linux

This one seems to have missed 3 years of bugfixes but anyway I don't see
how any kernel bug could make haproxy fail, and if it did we'd have to
find a workaround anyway (or a way to detect it).

> Again, there are some 15 minutes timeout records at the end of the spike:
> Nov 20 11:04:56 WD-G0-SRP1 haproxy[28443]: 125.71.115.147:15351 
> [20/Nov/2015:10:48:43.099] http-in WebSvr/ws0 17791/0/0/3/973281 200 251352 - 
> - CD-- 51/7/1/0/0 0/0 "GET /whitedolphin/wd.dbc.pdf HTTP/1.1"
> ...
> Nov 20 11:05:14 WD-G0-SRP1 haproxy[28443]: 125.71.115.147:15352 
> [20/Nov/2015:10:48:43.106] http-in WebSvr/ws0 17550/0/0/3/991660 200 405432 - 
> - CD-- 50/6/0/0/0 0/0 "GET /whitedolphin/wd.overview.pdf HTTP/1.1"
> 
> But we also find a 15 minutes timeout at the early stage of the spike:
> Nov 20 10:52:31 WD-G0-SRP1 haproxy[28443]: 125.71.115.147:15231 
> [20/Nov/2015:10:36:10.183] https-in~ g0n0/n0 52555/0/0/1/981749 200 138069 - 
> - CD-- 29/24/11/11/0 0/0 "GET /mobile/lib/sencha-touch-all.js?v=1 HTTP/1.1"

Note, it's not a timeout in haproxy, the client closed first ('CD').
I find this very strange here because the request time is high,
indicating a reused keep-alive connection. If the total time is that
long, it probably means a low transfer rate, which doesn't make much
sense. This doesn't match an idle connection because idle connections
wouldn't produce a log. Or we're facing an asynchronous error that is
reported once we wake-up due to the timeout, though I have no idea when
this could happen.

Anyway your logs are very useful because they make me scratch my head.

> Again, We have been reviewed all log records involved in this case, and
> nothing special there, they are all working fine.

OK.

> I suspect if the connection is broken by a link level or physical level
> interruption (e.g.: 4G signal lost, Unplug the network cable, etc.), or other
> cases cause the Four-way handshake or even the RST signal could not to be
> sent.

I suspect it has something to do with the very long timeout. Maybe
we're facing an error while closing on timeout and we're looping on
this. I find this quite strange but given that we had no such report
in more than one year of 1.5 deployed everywhere, I'm trying to find
what could cause this. Network timeouts are fairly common these days
with mobile devices everywhere. Long connections are common as well
in certain environments (websocket, RDP, etc).

When the problem happens, could you please try to dump the output
of "show sess all" sent on the CLI ? Typically :

    $ echo "show sess all" | socat - /var/run/haproxy.stat > all-sess.txt

(or wherever you put it).

Be careful, the list can be huge, no need to send it to the mailing list.
It may also disclose some IP information that you don't necessarily want
to disclose publicly. I'll be interested in matching this output against
the log line indicating the issue. The session's flags will greatly help.

> Then some infinity look (e.g.: spin lock?) may be triggered within the
> haproxy's timeout thread until it is surely timed out?

No, there's no lock, no thread or any such thing here. The fact that it
continues to work indicates a missed event that we attempt to process
and that cannot be processed.

Note, I remember having seen a kernel bug affecting epoll a long time
ago where a deleted even was still reported. It was a problem because
when epoll_wait() reports a non-existing file descriptor, you try to
close it but you can't, as it doesn't exist anymore. So that makes me
think that the output of one second of strace when the trouble appears
will also help :

  $ strace -ttT -o haproxy-trace.txt -p $(pidof haproxy)Hi,
  <wait one second>
  [Ctrl-C]

You'll have to press Ctrl-C to stop it once you run it. It can be large
as well if the process is spinning. But it will also tell us a lot of
things.

Best regards,
Willy

Re: CPU 100% when waiting for the client timeout

Reply via email to