Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-15 Thread Willy Tarreau
Hi John, On Thu, May 15, 2014 at 07:41:25AM +0200, John-Paul Bader wrote: Hey, good news! HA-Proxy version 1.5-dev25-a339395 2014/05/10 is running since yesterday morning with a 1/4 of our traffic and without coredumps, infinite loops or other problems. We have nbproc and kqueue enabled

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-14 Thread Chris Burroughs
On 05/07/2014 12:35 PM, Vincent Bernat wrote: ❦ 7 May 2014 11:15 +0200, Willy Tarreau w...@1wt.eu : haproxy does not include DTrace probes by any chance right? :) No, and I have no idea how this works either. But if you feel like it can provide some value and be done without too much

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-14 Thread John-Paul Bader
Hey, good news! HA-Proxy version 1.5-dev25-a339395 2014/05/10 is running since yesterday morning with a 1/4 of our traffic and without coredumps, infinite loops or other problems. We have nbproc and kqueue enabled and everything seems to behave fine. In peak times we had 11000 sockets open

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-10 Thread Dmitry Sivachenko
On 07 May 2014, at 18:24, Emeric Brun eb...@exceliance.fr wrote: Hi All, I suspect FreeBSD does not support process-shared mutexes (supported on both Linux and Solaris). I've just made a patch to add error checks on mutex init, and to fall back on the SSL private session cache in error
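The idea described in this message (check whether a process-shared mutex can be set up, and fall back to a private session cache when it cannot) can be sketched as follows. This is a minimal illustration under stated assumptions, not Emeric's actual patch: `init_pshared_mutex`, `select_cache_mode` and the `cache_mode` enum are hypothetical names, and only the standard pthread calls are real.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical names, for illustration only (not taken from haproxy). */
enum cache_mode { CACHE_SHARED, CACHE_PRIVATE };

/* Try to initialise a process-shared mutex.
 * Returns 0 on success, or the error number reported by the failing
 * pthread call. For real cross-process use, *m must live in memory
 * that is shared between the processes (e.g. an mmap'ed MAP_SHARED
 * region); that part is omitted here. */
static int init_pshared_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(m != NULL);

    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    /* This is the call that may be rejected on platforms without
     * process-shared mutex support. */
    rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Decide which cache model to use from the result of the init attempt. */
static enum cache_mode select_cache_mode(int init_rc)
{
    return init_rc == 0 ? CACHE_SHARED : CACHE_PRIVATE;
}
```

A caller would run `init_pshared_mutex()` once at startup and pass the result to `select_cache_mode()`, emitting a warning rather than aborting when the private mode is selected.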

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-10 Thread Willy Tarreau
On Sun, May 11, 2014 at 12:19:45AM +0400, Dmitry Sivachenko wrote: On 07 May 2014, at 18:24, Emeric Brun eb...@exceliance.fr wrote: Hi All, I suspect FreeBSD does not support process-shared mutexes (supported on both Linux and Solaris). I've just made a patch to add error

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-09 Thread John-Paul Bader
Hey Willy, I have just applied the patch and will run another test after lunch. Since we're testing with live traffic I can't leave it unattended :) Just out of curiosity, is this a bug that affects also Linux or is it FreeBSD specific? Kind regards, John Willy Tarreau wrote: Good news!

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-09 Thread Willy Tarreau
Hi John, On Fri, May 09, 2014 at 11:54:56AM +0200, John-Paul Bader wrote: Hey Willy, I have just applied the patch and will run another test after lunch. Since we're testing with live traffic I can't leave it unattended :) Just out of curiosity, is this a bug that affects also Linux or

RE: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-09 Thread Lukas Tribus
to discover a crash of haproxy on an ALOHA, and we couldn't explain it, it was in relation with a high memory usage with SSL. Now I know where it was, so we'll issue an update :-) Thanks again for all your traces and tests! Willy Can the 100% cpu load initially reported along with the segfaults

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-09 Thread Willy Tarreau
Hi Lukas, On Fri, May 09, 2014 at 12:40:11PM +0200, Lukas Tribus wrote: Can the 100% cpu load initially reported along with the segfaults be explained by the unsupported process shared mutex on FreeBSD, btw? Absolutely. Some lists are updated within the lock. I let you imagine what a double

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-09 Thread John-Paul Bader
Hey again, so for the first time haproxy is running for more than 4 hours without a coredump and without cpu spikes - so far so good. I'll observe it until tomorrow morning and report again. Kind regards, John Willy Tarreau wrote: Hi John, On Fri, May 09, 2014 at 11:54:56AM +0200,

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-09 Thread Willy Tarreau
On Fri, May 09, 2014 at 04:52:04PM +0200, John-Paul Bader wrote: Hey again, so for the first time haproxy is running for more than 4 hours without a coredump and without cpu spikes - so far so good. I'll observe it until tomorrow morning and report again. Great, so we might have addressed

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-08 Thread John-Paul Bader
Hey, so I have downloaded the haproxy-ss-Latest from the website and applied your patches. I have compiled it with: make TARGET=freebsd USE_PCRE=1 USE_OPENSSL=1 USE_ZLIB=1 It ran very well for 2 hours but then 6 out of 12 processes coredumped, this time however in the haproxy code realm and

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-08 Thread John-Paul Bader
Maybe the full backtrace is more helpful: (gdb) bt full #0 kill_mini_session (s=0x804269c00) at src/session.c:299 level = 6 conn = (struct connection *) 0x0 err_msg = value optimized out #1 0x00463928 in conn_session_complete (conn=0x8039f2a80) at

RE: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-08 Thread Lukas Tribus
Hi Willy, When it uses the private cache, I would also have to change the configuration to allow ssl sessions over multiple http requests right? No you don't need to change anymore, what Emeric's patch does is to reimplement a hand-crafted spinlock mechanism. Two slightly unrelated

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-08 Thread Willy Tarreau
Hi John, On Thu, May 08, 2014 at 09:15:20AM +0200, John-Paul Bader wrote: Hey, so I have downloaded the haproxy-ss-Latest from the website and applied your patches. I have compiled it with: make TARGET=freebsd USE_PCRE=1 USE_OPENSSL=1 USE_ZLIB=1 It ran very well for 2 hours but then 6

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-08 Thread John-Paul Bader
Hey, Sure thing! I've just put a nokqueue in the config and it's running again. Let's see :) Kind regards, John Willy Tarreau wrote: Hi John, On Thu, May 08, 2014 at 09:15:20AM +0200, John-Paul Bader wrote: Hey, so I have downloaded the haproxy-ss-Latest from the website and applied your

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-08 Thread John-Paul Bader
Ok after about roughly another 2 hours the next core dump happened - this time with the nokqueue option set. The coredump looks very similar and now crashes with ev_poll instead of ev_kqueue: (gdb) bt full #0 kill_mini_session (s=0x80337f800) at src/session.c:299 level = 6

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-08 Thread Willy Tarreau
On Thu, May 08, 2014 at 12:47:06PM +0200, John-Paul Bader wrote: Ok after about roughly another 2 hours the next core dump happened - this time with the nokqueue option set. The coredump looks very similar and now crashes with ev_poll instead of ev_kqueue: Indeed it's exactly the same.

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-08 Thread Willy Tarreau
On Thu, May 08, 2014 at 01:46:30PM +0200, Willy Tarreau wrote: On Thu, May 08, 2014 at 12:47:06PM +0200, John-Paul Bader wrote: Ok after about roughly another 2 hours the next core dump happened - this time with the nokqueue option set. The coredump looks very similar and now crashes

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-08 Thread John-Paul Bader
Ok - I will happily test more patches to pin those issues down :) There is one thing where I'm not sure if it's related. We have currently set up the system with a rather small msl setting: net.inet.tcp.msl=5000 Also we have fast_finwait2_recycle enabled so we don't have that many TIME_WAIT

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-08 Thread Willy Tarreau
Good news! I found it by reading the code :-) And I could even reproduce it. The bug happens when running with a handshake (typically SSL) and when an out of memory or a socket error happens when calling setsockopt(TCP_NODELAY), which might possibly be made more common by your TCP settings,
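The failure mode mentioned here, an error returned by `setsockopt(TCP_NODELAY)` on a freshly accepted connection, can be illustrated with a small sketch. It only shows checking the return value and releasing the descriptor exactly once on the error path; it is not the haproxy fix, and `enable_nodelay` and `prepare_client_socket` are hypothetical helper names.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: enable TCP_NODELAY on fd.
 * Returns 0 on success, -1 on failure with errno set by setsockopt(). */
static int enable_nodelay(int fd)
{
    int one = 1;

    if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)) == -1)
        return -1;
    return 0;
}

/* Hypothetical accept-path step: on failure, close the descriptor once
 * and report the error, so that no later cleanup code touches a
 * half-initialised connection. errno is preserved across close(). */
static int prepare_client_socket(int fd)
{
    if (enable_nodelay(fd) == -1) {
        int saved = errno;

        if (fd >= 0)
            close(fd);
        errno = saved;
        return -1;
    }
    return 0;
}
```

The design point is that the error path owns the cleanup completely: the caller must not use or close `fd` again after `prepare_client_socket()` returns -1.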

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-08 Thread Willy Tarreau
Hi Lukas, On Thu, May 08, 2014 at 10:20:28AM +0200, Lukas Tribus wrote: Hi Willy, When it uses the private cache, I would also have to change the configuration to allow ssl sessions over multiple http requests right? No you don't need to change anymore, what Emeric's patch does is to

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-07 Thread John-Paul Bader
Hey Willy, this morning I was running another test without kqueue but sadly with the same result. Here is my test protocol: Running fine with nokqueue for about an hour at about 20% CPU per process, then sudden CPU spike on all processes up to 90%, I started ktrace but meanwhile the CPU

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-07 Thread Willy Tarreau
Hi John-Paul, On Wed, May 07, 2014 at 09:22:32AM +0200, John-Paul Bader wrote: Hey Willy, this morning I was running another test without kqueue but sadly with the same result. OK so let's rule out any possible kqueue issue there for now. Here is my test protocol: Running fine with

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-07 Thread John-Paul Bader
Willy Tarreau wrote: It's very interesting, it contains a call to ssl_update_cache(). I didn't know you were using SSL, but in multi-process mode we have the shared context model to share the SSL sessions between processes. Yes, sorry. In the initial email on this thread I posted our

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-07 Thread Willy Tarreau
On Wed, May 07, 2014 at 10:28:18AM +0200, John-Paul Bader wrote: Willy Tarreau wrote: It's very interesting, it contains a call to ssl_update_cache(). I didn't know you were using SSL, but in multi-process mode we have the shared context model to share the SSL sessions between processes.

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-07 Thread Emeric Brun
On 05/07/2014 11:15 AM, Willy Tarreau wrote: On Wed, May 07, 2014 at 10:28:18AM +0200, John-Paul Bader wrote: Willy Tarreau wrote: It's very interesting, it contains a call to ssl_update_cache(). I didn't know you were using SSL, but in multi-process mode we have the shared context model to

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-07 Thread John-Paul Bader
Hey Emeric, I have just consulted the Readme of the haproxy source and it says in the OpenSSL section: »The BSD and OSX makefiles do not support build options for OpenSSL nor zlib. Also, at least on OpenBSD, pthread_mutexattr_setpshared() does not exist so the SSL session cache cannot be

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-07 Thread Vincent Bernat
❦ 7 May 2014 11:15 +0200, Willy Tarreau w...@1wt.eu : haproxy does not include DTrace probes by any chance right? :) No, and I have no idea how this works either. But if you feel like it can provide some value and be done without too much effort, feel free to try :-) Here is a proof of

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-07 Thread Willy Tarreau
Hi John, On Wed, May 07, 2014 at 06:14:13PM +0200, John-Paul Bader wrote: Ok, I have just built haproxy with your patches like this: gmake TARGET=freebsd USE_PCRE=1 USE_OPENSSL=1 USE_ZLIB=1 When trying to start haproxy it failed with: [ALERT] 126/160108 (25333) : Unable to allocate

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-07 Thread Willy Tarreau
Hi Vincent, On Wed, May 07, 2014 at 06:35:06PM +0200, Vincent Bernat wrote: ❦ 7 May 2014 11:15 +0200, Willy Tarreau w...@1wt.eu : haproxy does not include DTrace probes by any chance right? :) No, and I have no idea how this works either. But if you feel like it can provide some

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-07 Thread Emeric BRUN
My fix is broken: it should only show a warning and fall back on the private cache. I've just pinpointed the issue. I will try to send you a workaround patch soon. Emeric Original message: From: John-Paul Bader john-paul.ba...@wooga.net To: Willy Tarreau w...@1wt.eu Cc

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-07 Thread John-Paul Bader
Hmm yeah I noticed from what you wrote in the mail and by reading through the patch - but still it confirmed that the shared pthread thing was not available on FreeBSD right? Would I also need to compile with USE_PRIVATE_CACHE=1 or would your patch take care of that? When it uses the private

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-07 Thread Vincent Bernat
❦ 7 May 2014 22:19 +0200, Willy Tarreau w...@1wt.eu : Here is a proof of concept. To test, use `make TARGET=linux2628 USE_DTRACE=1`. On Linux, you need systemtap-sdt-dev or something like that. Then, there is a quick example in example/haproxy.stp. Interesting, but just for my

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-07 Thread Vincent Bernat
❦ 7 May 2014 22:56 +0200, Vincent Bernat ber...@luffy.cx : So the main interest of those probes are: * low overhead, they can be left in production to be here when you really need them And you enable/disable them while the program is running. -- panic (No CPUs found. System

Dtrace for haproxy (Was: haproxy 1.5-dev24: 100% CPU Load or Core Dumped)

2014-05-07 Thread Willy Tarreau
On Wed, May 07, 2014 at 10:59:43PM +0200, Vincent Bernat wrote: ❦ 7 May 2014 22:56 +0200, Vincent Bernat ber...@luffy.cx : So the main interest of those probes are: * low overhead, they can be left in production to be here when you really need them And you enable/disable them

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-07 Thread Willy Tarreau
Hi John, On Wed, May 07, 2014 at 10:54:33PM +0200, John-Paul Bader wrote: Hmm yeah I noticed from what you wrote in the mail and by reading through the patch - but still it confirmed that the shared pthread thing was not available on FreeBSD right? Yes that's it. Old freebsd code did not

haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-06 Thread John-Paul Bader
Hey, I'm currently attempting to replace our commercial Loadbalancer with SSL termination with haproxy. I'm running it on FreeBSD 9.2 Stable. We have thousands of requests per second and for a while everything runs extremely smooth. No queues are running full, machine load is at 0.5,

RE: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-06 Thread Lukas Tribus
Hi, I'm currently attempting to replace our commercial Loadbalancer with SSL termination with haproxy. I'm running it on FreeBSD 9.2 Stable. We have thousands of requests per second and for a while everything runs extremely smooth. No queues are running full, machine load is at 0.5,

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-06 Thread Willy Tarreau
Hi, On Tue, May 06, 2014 at 12:02:59PM +0200, Lukas Tribus wrote: Hi, I'm currently attempting to replace our commercial Loadbalancer with SSL termination with haproxy. I'm running it on FreeBSD 9.2 Stable. We have thousands of requests per second and for a while everything runs

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-06 Thread John-Paul Bader
Hey, I will do more elaborate test runs in the next couple of days. I will create traces with ktrace which is not as nice as strace but at least will provide more context. Is there anything in particular you'd be interested in like only syscalls? Meanwhile I have built haproxy with debug

Re: haproxy 1.5-dev24: 100% CPU Load or Core Dumped

2014-05-06 Thread Willy Tarreau
Hi John-Paul, On Tue, May 06, 2014 at 11:57:08PM +0200, John-Paul Bader wrote: Hey, I will do more elaborate test runs in the next couple of days. No problem. I will create traces with ktrace which is not as nice as strace but at least will provide more context. Is there anything in

Re: FIXED: Re: 100% cpu load....

2013-07-22 Thread Willy Tarreau
On Mon, Jul 22, 2013 at 06:07:03PM +0200, Mark Janssen wrote: The setup has been running for a few days now (still with nbproc 1) and is performing admirably. I'd say this issue is fixed/resolved. Excellent, thank you very much for your feedback Mark. I'm merging the patches then. Best

Re: FIXED: Re: 100% cpu load....

2013-07-22 Thread Mark Janssen
The setup has been running for a few days now (still with nbproc 1) and is performing admirably. I'd say this issue is fixed/resolved. Thanks! On Fri, Jul 19, 2013 at 9:58 PM, Mark Janssen maniac...@gmail.com wrote: I've applied the patches and am running with the new version now. I'll let

Re: FIXED: Re: 100% cpu load....

2013-07-19 Thread Mark Janssen
I've applied the patches and am running with the new version now. I'll let it run overnight (with nbproc back at 1...). I'll probably switch back to nbproc 1 in the morning, as traffic starts ramping up. Mark On Thu, Jul 18, 2013 at 10:42 PM, Willy Tarreau w...@1wt.eu wrote: Hi Mark, OK I

Re: FIXED: Re: 100% cpu load....

2013-07-19 Thread Willy Tarreau
On Fri, Jul 19, 2013 at 09:58:19PM +0200, Mark Janssen wrote: I've applied the patches and am running with the new version now. I'll let it run overnight (with nbproc back at 1...). I'll probably switch back to nbproc 1 in the morning, as traffic starts ramping up. Thank you Mark, then I'm

FIXED: Re: 100% cpu load....

2013-07-18 Thread Willy Tarreau
Hi Mark, OK I could reproduce, debug and fix. It was a tough one, really... More a problem of internal semantics than anything else, so I had to test several possibilities and study their impacts and the corner cases. In the end we get something that's fixed and better :-) The issue was mostly

RE: 100% cpu load....

2013-07-17 Thread Lukas Tribus
Hi Willy, This explains why this only happens for short durations (at most the duration of a client timeout). Good to hear you pinpointed this. What is important is to know that CPU usage aside, there is no loss of information nor service. Connections are correctly handled Mark early

Re: 100% cpu load....

2013-07-17 Thread Willy Tarreau
On Wed, Jul 17, 2013 at 08:16:18PM +0200, Lukas Tribus wrote: Hi Willy, This explains why this only happens for short durations (at most the duration of a client timeout). Good to hear you pinpointed this. What is important is to know that CPU usage aside, there is no loss

Re: 100% cpu load....

2013-07-16 Thread Willy Tarreau
? I haven't tried yet... and can't really test this currently Do you have the possibility to run without nbproc? We were running without nbproc, and saw the 100% cpu load (though this was still on dev18). We switched to nbproc=7 to work-around the 100% cpu load, as I was told

Re: 100% cpu load....

2013-07-16 Thread Mark Janssen
I've tried with 'noepoll' and 'nosplice' ... and this seems to have solved the cpu spikes... though the base-load is now a lot higher, due to using poll instead of epoll. Next I tried with noepoll, but with splice enabled. This resulted in the same higher base-load, but still the occasional peak

Re: 100% cpu load....

2013-07-13 Thread Mark Janssen
On Fri, Jul 12, 2013 at 5:30 PM, Tomas Pospisek t...@sourcepole.ch wrote: On 11.07.2013 11:45, Mark Janssen wrote: I did see large amounts of sequential epoll_wait calls in the processes with 100% cpu load, and not with the other processes. epoll_wait(0, {}, 200, 0) = 0
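The trace line quoted above shows `epoll_wait` being called with a timeout of 0 and returning 0 events. The Linux-only sketch below illustrates why a run of such calls consumes CPU: with a zero timeout the call never sleeps, so every iteration returns immediately even when nothing is ready. It says nothing about why haproxy ended up in that state; `count_empty_wakeups` is a hypothetical name, and the array size of 200 simply mirrors the `maxevents` value in the trace.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Call epoll_wait() `rounds` times with a zero timeout on an epoll
 * instance that watches nothing, and count how many calls came back
 * with no events. Returns -1 if the epoll instance cannot be created. */
static int count_empty_wakeups(int rounds)
{
    struct epoll_event events[200];
    int empty = 0;
    int epfd = epoll_create1(0);

    if (epfd == -1)
        return -1;

    for (int i = 0; i < rounds; i++) {
        /* timeout == 0: return at once instead of blocking, which is
         * what turns a loop like this into a busy loop. */
        if (epoll_wait(epfd, events, 200, 0) == 0)
            empty++;
    }

    close(epfd);
    return empty;
}
```

With a positive timeout (or -1) the same loop would block inside the kernel between events instead of spinning.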

Re: 100% cpu load....

2013-07-13 Thread Mark Janssen
the possibility to run without nbproc? We were running without nbproc, and saw the 100% cpu load (though this was still on dev18). We switched to nbproc=7 to work-around the 100% cpu load, as I was told that the sites weren't responsive at the 100% cpu load times (though I couldn't reproduce this) How

Re: 100% cpu load....

2013-07-12 Thread Tomas Pospisek
On 11.07.2013 11:45, Mark Janssen wrote: I've noticed that the HAProxy processes occasionally jump to 100% cpu load, while the load before and after these peaks is only 3-5%, and the traffic is also the same as outside of these cpu-peaks. I saw a thread about this earlier (april/may

RE: 100% cpu load....

2013-07-12 Thread Lukas Tribus
Hi Mark, Hi list... I've noticed that the HAProxy processes occasionally jump to 100% cpu load, while the load before and after these peaks is only 3-5%, and the traffic is also the same as outside of these cpu-peaks. I saw a thread about this earlier (april/may), which concluded

100% cpu load....

2013-07-11 Thread Mark Janssen
Hi list... I've noticed that the HAProxy processes occasionally jump to 100% cpu load, while the load before and after these peaks is only 3-5%, and the traffic is also the same as outside of these cpu-peaks. I saw a thread about this earlier (april/may), which concluded that there was a bug