Hi there, > when the problem happens, does haproxy still serve traffic but at 100% CPU or > does it spin on itself and stop processing traffic ? No, the proxy service is not hang even in the spike time. you can verify it by check "haproxy.log" in my first mail. There are a lot of reqeusts have been processed normally even under the busy time.
> could it be that sometimes your processes are reloaded ? I don't very clear what's your means, If you are suspecting the haproxy process: I'm pretty sure there is no any reload when the symptoms appear. Again, you can check the pidstat.log and haproxy.log to verify it. If you mean the backend server: We have been checked the two servers involved in this case. And: 1.The server will emit if it has been reloaded or restarted, but there are no such records even in a whole day. 2.We have been located the problematic application level session: [2015-11-19 08:20:52.313756 UTC+0800][Info] (audit.fend) user logged on from 58.246.225.90 uid = xxx number = xxx user agent = Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36 @{tid = xxxxxx} @{nid = 0} And there is pretty normal just before and after this log record. 3.We have been reviewed our distributed coordinate service (something like zookeeper and etcd), and there is no log to show the serivce has any down time today. PS: I remembered the new version of haproxy has some strange statistics maybe related to this issue: http://stackoverflow.com/questions/33168469/whats-the-exact-meaning-of-session-in-haproxy/33540826?noredirect=1#comment55014864_33540826 Thanks :-) -- Best Regards BaiYang baiy...@gmail.com http://baiy.cn **** < END OF EMAIL > **** From: Willy Tarreau Date: 2015-11-20 02:21 To: baiyang CC: Lukas Tribus; haproxy Subject: Re: RE: CPU 100% when waiting for the client timeout Hello, On Fri, Nov 20, 2015 at 12:09:03AM +0800, baiyang wrote: > > Which release did you use previously that worked fine? > I'm using 1.5.14 before upgraded to 1.6.2. I believe everythins is ok before > the last upgrade. I download it from launchpad.net. That's quite problematic. I've just reviewed the changelog between 1.5.14 and 1.5.15 and am not seeing anything there which could explain this. Just to be clear, when the problem happens, does haproxy still serve traffic but at 100% CPU or does it spin on itself and stop processing traffic ? I would guess the former, which could be a but in a timeout handling somewhere. Another point while I'm thinking about it, could it be that sometimes your processes are reloaded ? I mean, if one of the last request takes 15mn to complete and you perform a soft reload, the process will not exit until this request is completed. And by definition, when it completes, it's the last one in the log before the process tries to leave. In this case you'll see another process in parallel still doing the job (the new one). Willy