Hi John,

On Thu, May 08, 2014 at 09:15:20AM +0200, John-Paul Bader wrote:
> Hey,
>
> so I have downloaded the haproxy-ss-Latest from the website and applied
> your patches. I have compiled it with:
>
> make TARGET=freebsd USE_PCRE=1 USE_OPENSSL=1 USE_ZLIB=1
>
> It ran very good for 2 hours but then 6 out of 12 processes coredumped,
> this time however in the haproxy code realm and apparently session related:

Great, so good news and bad news at the same time. Good news being that
the shared context was definitely causing the trouble, the bad news being
that the slightly-tested kqueue is still having some trouble.

> Maybe the full backtrace is more helpful:
>
> (gdb) bt full
> #0 kill_mini_session (s=0x804269c00) at src/session.c:299
>         level = 6
>         conn = (struct connection *) 0x0
>         err_msg = <value optimized out>
> #1 0x0000000000463928 in conn_session_complete (conn=0x8039f2a80) at
> src/session.c:355
>         s = (struct session *) 0x804269c00
> #2 0x0000000000432769 in conn_fd_handler (fd=<value optimized out>) at
> src/connection.c:88
>         conn = <value optimized out>
>         flags = 41997063
> #3 0x00000000004127db in fd_process_polled_events (fd=<value optimized
> out>) at src/fd.c:271
>         new_updt = <value optimized out>
>         old_updt = 1
> #4 0x000000000046ed85 in _do_poll (p=<value optimized out>, exp=<value
> optimized out>)
> at src/ev_kqueue.c:141
>         status = 1
>         count = 0
>         fd = <value optimized out>
>         delta_ms = <value optimized out>
>         timeout = {tv_sec = 0, tv_nsec = 27000000}
>         updt_idx = <value optimized out>
>         en = <value optimized out>
>         eo = <value optimized out>
>         changes = <value optimized out>
(...)

OK this trace tends to show that we were called for an event on an FD in
a strange state, half-open, half-closed :-/

If the fd were closed, in conn_fd_handler() it would have simply returned
because .owner == NULL. Since it managed to go to conn_session_complete(),
it means that fd.owner was correct but the connection was not attached to
this session, quite strange. It seems like something was deinitialized,
but I can't find a code sequence which could produce this.

Would you please retry with poll? I'm not really convinced that a bug in
kqueue could cause this :-/

Thanks
Willy
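
PS: to make the reasoning above a bit more concrete, here is a deliberately
simplified model of the three states in C. It is not the code from
src/connection.c or src/session.c: the structure layouts, the field names
"owner_session" and "conn", and the function handle_event() are all made up
for illustration. It only shows why a closed fd would have been ignored,
while the trace corresponds to the third case, where the fd still has an
owner but the session no longer points to a connection.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins, NOT the real HAProxy structures. */
struct session;

struct connection {
    struct session *owner_session;  /* hypothetical back-pointer to the session */
};

struct session {
    struct connection *conn;        /* hypothetical: connection attached to the session */
};

struct fd_entry {
    void *owner;                    /* NULL once the fd has been closed */
};

enum outcome {
    FD_CLOSED_IGNORED,              /* owner == NULL: the handler just returns */
    SESSION_OK,                     /* fd, connection and session are consistent */
    SESSION_WITHOUT_CONN            /* the state seen in the trace (conn = 0x0) */
};

/* Models the chain "fd event -> connection -> session" from the backtrace. */
static enum outcome handle_event(const struct fd_entry *fdtab, int fd)
{
    assert(fd >= 0);

    struct connection *conn = fdtab[fd].owner;
    if (conn == NULL)
        return FD_CLOSED_IGNORED;   /* closed fd: nothing to do */

    struct session *s = conn->owner_session;
    if (s->conn == NULL)
        return SESSION_WITHOUT_CONN; /* owner is set, but the session lost its connection */

    return SESSION_OK;
}
```

With a table where entry 0 has no owner, entry 1 is fully linked and entry 2
has an owner whose session has conn == NULL, handle_event() returns
FD_CLOSED_IGNORED, SESSION_OK and SESSION_WITHOUT_CONN respectively. The
crash corresponds to the last one.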