Hi John,

On Thu, May 08, 2014 at 09:15:20AM +0200, John-Paul Bader wrote:
> Hey,
> 
> so I have downloaded the haproxy-ss-Latest from the website and applied 
> your patches. I have compiled it with:
> 
> make TARGET=freebsd USE_PCRE=1 USE_OPENSSL=1 USE_ZLIB=1
> 
> It ran very good for 2 hours but then 6 out of 12 processes coredumped, 
> this time however in the haproxy code realm and apparently session related:

Great, so good news and bad news at the same time. Good news being that
the shared context was definitely causing the trouble, the bad news being
that the slightly-tested kqueue is still having some trouble.

> Maybe the full backtrace is more helpful:
> 
> (gdb) bt full
> #0  kill_mini_session (s=0x804269c00) at src/session.c:299
>       level = 6
>       conn = (struct connection *) 0x0
>       err_msg = <value optimized out>
> #1  0x0000000000463928 in conn_session_complete (conn=0x8039f2a80) at src/session.c:355
>       s = (struct session *) 0x804269c00
> #2  0x0000000000432769 in conn_fd_handler (fd=<value optimized out>) at src/connection.c:88
>       conn = <value optimized out>
>       flags = 41997063
> #3  0x00000000004127db in fd_process_polled_events (fd=<value optimized out>) at src/fd.c:271
>       new_updt = <value optimized out>
>       old_updt = 1
> #4  0x000000000046ed85 in _do_poll (p=<value optimized out>, exp=<value optimized out>) at src/ev_kqueue.c:141
>       status = 1
>       count = 0
>       fd = <value optimized out>
>       delta_ms = <value optimized out>
>       timeout = {tv_sec = 0, tv_nsec = 27000000}
>       updt_idx = <value optimized out>
>       en = <value optimized out>
>       eo = <value optimized out>
>       changes = <value optimized out>

(...)

OK this trace tends to show that we were called for an event on
an FD in a strange state, half-open, half-closed :-/

If the fd had been closed, conn_fd_handler() would simply have
returned because fd.owner would be NULL. Since it managed to reach
conn_session_complete(), it means that fd.owner was valid but
the connection was not attached to this session, which is quite strange.
It seems like something was deinitialized, but I can't find a
code sequence which could produce this.
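
To make the reasoning above concrete, here is a minimal sketch of the two
checks involved. It is not the HAProxy source: the structures, the
handle_event() function and the outcome names are all hypothetical and
heavily simplified, and only model what the backtrace suggests (a non-NULL
fd owner whose session has conn == NULL).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real HAProxy structures. */
struct connection;

struct session {
    struct connection *conn;   /* connection attached to the session, or NULL */
};

struct connection {
    struct session *owner;     /* session this connection points back to */
};

struct fd_entry {
    struct connection *owner;  /* NULL once the fd has been closed */
};

enum outcome {
    FD_CLOSED_IGNORED,   /* fd owner is NULL: the event is silently dropped */
    SESSION_OK,          /* fd, connection and session are consistent */
    SESSION_DETACHED     /* owner is set, but the session has no connection */
};

/* Models the two checks discussed above, in the order they would apply. */
static enum outcome handle_event(const struct fd_entry *e)
{
    assert(e != NULL);

    const struct connection *conn = e->owner;
    if (conn == NULL)
        return FD_CLOSED_IGNORED;   /* the "closed fd" case: nothing to do */

    const struct session *s = conn->owner;
    if (s->conn == NULL)
        return SESSION_DETACHED;    /* the inconsistent state seen in the trace */

    return SESSION_OK;
}
```

In this model a closed fd can never reach the session code, so the crash
corresponds only to the SESSION_DETACHED case: the fd still has an owner,
but the session no longer references a connection.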

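Would you please retry with poll (for instance by disabling kqueue
with the "nokqueue" global keyword or the -dk command-line option)?
I'm not really convinced that a bug in kqueue could cause this :-/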

Thanks
Willy

