I forgot to mention that the backtrace is from 2.2.11, built from
http://git.haproxy.org/?p=haproxy-2.2.git;a=commit;h=601704962bc9d82b3b1cc97d90d2763db0ae4479

Wed, 31 Mar 2021 at 13:28 Maciej Zdeb <mac...@zdeb.pl> wrote:

> Hi,
>
> Well, the situation is a bit better than before, because only one thread
> is looping forever and the rest are working properly. I tried to pinpoint
> where exactly the thread was looping, but stepping with "n" in gdb fixed
> the problem :( After I quit the gdb session, all threads were idle. Before
> I started gdb, it had been looping for about 3 hours without serving any
> traffic, because I put it into maintenance as soon as I observed the
> abnormal CPU usage.
>
> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> 0x00007f2cf0df6a47 in epoll_wait (epfd=3, events=0x55d7aaa04920,
> maxevents=200, timeout=timeout@entry=39) at
> ../sysdeps/unix/sysv/linux/epoll_wait.c:30
> 30 ../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
> (gdb) thread 11
> [Switching to thread 11 (Thread 0x7f2c3c53d700 (LWP 20608))]
> #0  trace (msg=..., cb=<optimized out>, a4=<optimized out>, a3=<optimized
> out>, a2=<optimized out>, a1=<optimized out>, func=<optimized out>,
> where=..., src=<optimized out>, mask=<optimized out>,
>     level=<optimized out>) at include/haproxy/trace.h:149
> 149 if (unlikely(src->state != TRACE_STATE_STOPPED))
> (gdb) bt
> #0  trace (msg=..., cb=<optimized out>, a4=<optimized out>, a3=<optimized
> out>, a2=<optimized out>, a1=<optimized out>, func=<optimized out>,
> where=..., src=<optimized out>, mask=<optimized out>,
>     level=<optimized out>) at include/haproxy/trace.h:149
> #1  h2_resume_each_sending_h2s (h2c=h2c@entry=0x7f2c18dca740,
> head=head@entry=0x7f2c18dcabf8) at src/mux_h2.c:3255
> #2  0x000055d7a426c8e2 in h2_process_mux (h2c=0x7f2c18dca740) at
> src/mux_h2.c:3329
> #3  h2_send (h2c=h2c@entry=0x7f2c18dca740) at src/mux_h2.c:3479
> #4  0x000055d7a42734bd in h2_process (h2c=h2c@entry=0x7f2c18dca740) at
> src/mux_h2.c:3624
> #5  0x000055d7a4276678 in h2_io_cb (t=<optimized out>, ctx=0x7f2c18dca740,
> status=<optimized out>) at src/mux_h2.c:3583
> #6  0x000055d7a4381f62 in run_tasks_from_lists 
> (budgets=budgets@entry=0x7f2c3c51a35c)
> at src/task.c:454
> #7  0x000055d7a438282d in process_runnable_tasks () at src/task.c:679
> #8  0x000055d7a4339467 in run_poll_loop () at src/haproxy.c:2942
> #9  0x000055d7a4339819 in run_thread_poll_loop (data=<optimized out>) at
> src/haproxy.c:3107
> #10 0x00007f2cf1e606db in start_thread (arg=0x7f2c3c53d700) at
> pthread_create.c:463
> #11 0x00007f2cf0df671f in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> (gdb) bt full
> #0  trace (msg=..., cb=<optimized out>, a4=<optimized out>, a3=<optimized
> out>, a2=<optimized out>, a1=<optimized out>, func=<optimized out>,
> where=..., src=<optimized out>, mask=<optimized out>,
>     level=<optimized out>) at include/haproxy/trace.h:149
> No locals.
> #1  h2_resume_each_sending_h2s (h2c=h2c@entry=0x7f2c18dca740,
> head=head@entry=0x7f2c18dcabf8) at src/mux_h2.c:3255
>         h2s = <optimized out>
>         h2s_back = <optimized out>
>         __FUNCTION__ = "h2_resume_each_sending_h2s"
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
> #2  0x000055d7a426c8e2 in h2_process_mux (h2c=0x7f2c18dca740) at
> src/mux_h2.c:3329
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
> #3  h2_send (h2c=h2c@entry=0x7f2c18dca740) at src/mux_h2.c:3479
>         flags = <optimized out>
>         released = <optimized out>
>         buf = <optimized out>
>         conn = 0x7f2bf658b8d0
>         done = 0
>         sent = 0
>         __FUNCTION__ = "h2_send"
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
> #4  0x000055d7a42734bd in h2_process (h2c=h2c@entry=0x7f2c18dca740) at
> src/mux_h2.c:3624
>         conn = 0x7f2bf658b8d0
>         __FUNCTION__ = "h2_process"
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
> #5  0x000055d7a4276678 in h2_io_cb (t=<optimized out>, ctx=0x7f2c18dca740,
> status=<optimized out>) at src/mux_h2.c:3583
>         conn = 0x7f2bf658b8d0
>         tl = <optimized out>
>         conn_in_list = 0
>         h2c = 0x7f2c18dca740
>         ret = <optimized out>
>         __FUNCTION__ = "h2_io_cb"
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
>         __x = <optimized out>
>         __l = <optimized out>
> #6  0x000055d7a4381f62 in run_tasks_from_lists 
> (budgets=budgets@entry=0x7f2c3c51a35c)
> at src/task.c:454
>         process = <optimized out>
>         tl_queues = <optimized out>
>         t = 0x7f2c0d3fa1c0
>         budget_mask = 7 '\a'
>         done = <optimized out>
>         queue = <optimized out>
>         state = <optimized out>
>         ctx = <optimized out>
>         __ret = <optimized out>
>         __n = <optimized out>
>         __p = <optimized out>
> #7  0x000055d7a438282d in process_runnable_tasks () at src/task.c:679
>         tt = 0x55d7a47a6d00 <task_per_thread+1280>
>         lrq = <optimized out>
>         grq = <optimized out>
>         t = <optimized out>
>         max = {0, 0, 141}
>         max_total = <optimized out>
>         tmp_list = <optimized out>
>         queue = 3
>         max_processed = <optimized out>
> #8  0x000055d7a4339467 in run_poll_loop () at src/haproxy.c:2942
>         next = <optimized out>
>         wake = <optimized out>
> #9  0x000055d7a4339819 in run_thread_poll_loop (data=<optimized out>) at
> src/haproxy.c:3107
>         ptaf = <optimized out>
>         ptif = <optimized out>
>         ptdf = <optimized out>
>         ptff = <optimized out>
>         init_left = 0
>         init_mutex = pthread_mutex_t = {Type = Normal, Status = Not
> acquired, Robust = No, Shared = No, Protocol = None}
>         init_cond = pthread_cond_t = {Threads known to still execute a
> wait function = 0, Clock ID = CLOCK_REALTIME, Shared = No}
> #10 0x00007f2cf1e606db in start_thread (arg=0x7f2c3c53d700) at
> pthread_create.c:463
>         pd = 0x7f2c3c53d700
>         now = <optimized out>
>         unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139827967416064,
> 7402574823425717764, 139827967272192, 0, 10, 140729081389088,
> -7430022153605859836, -7430137459590154748}, mask_was_saved = 0}},
>           priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup
> = 0x0, canceltype = 0}}}
>         not_first_call = <optimized out>
> #11 0x00007f2cf0df671f in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
> No locals.
> (gdb) n
> h2_resume_each_sending_h2s (h2c=h2c@entry=0x7f2c18dca740, 
> head=head@entry=0x7f2c18dcabf8)
> at src/mux_h2.c:3255
> 3255 TRACE_ENTER(H2_EV_H2C_SEND|H2_EV_H2S_WAKE, h2c->conn);
> (gdb)
> 3257 list_for_each_entry_safe(h2s, h2s_back, head, list) {
> (gdb)
> 3289 TRACE_LEAVE(H2_EV_H2C_SEND|H2_EV_H2S_WAKE, h2c->conn);
> (gdb)
> 3290 }
> (gdb)
> h2_process_mux (h2c=0x7f2c18dca740) at src/mux_h2.c:3330
> 3330 h2_resume_each_sending_h2s(h2c, &h2c->send_list);
> (gdb)
> 3334 if (h2c->st0 == H2_CS_ERROR) {
> (gdb)
> 3345 TRACE_LEAVE(H2_EV_H2C_WAKE, h2c->conn);
> (gdb)
> h2_send (h2c=h2c@entry=0x7f2c18dca740) at src/mux_h2.c:3478
> 3478 while (((h2c->flags & (H2_CF_MUX_MFULL|H2_CF_MUX_MALLOC)) == 0) &&
> !done)
> (gdb)
> 3479 done = h2_process_mux(h2c);
> (gdb)
> 3482 done = 1; // we won't go further without extra buffers
> (gdb)
> 3484 if ((conn->flags & (CO_FL_SOCK_WR_SH|CO_FL_ERROR)) ||
> (gdb)
> 3485    (h2c->st0 == H2_CS_ERROR2) || (h2c->flags & H2_CF_GOAWAY_FAILED))
> (gdb)
> 3491 for (buf = br_head(h2c->mbuf); b_size(buf); buf =
> br_del_head(h2c->mbuf)) {
> (gdb)
> 3488 if (h2c->flags & (H2_CF_MUX_MFULL | H2_CF_DEM_MBUSY |
> H2_CF_DEM_MROOM))
> (gdb)
> 3491 for (buf = br_head(h2c->mbuf); b_size(buf); buf =
> br_del_head(h2c->mbuf)) {
> (gdb)
> 3514 if (sent)
> (gdb)
> 3472 while (!done) {
> (gdb)
> 3518 if (conn->flags & CO_FL_SOCK_WR_SH) {
> (gdb)
> 3525 if (!(h2c->flags & (H2_CF_MUX_MFULL | H2_CF_DEM_MROOM)) && h2c->st0
> >= H2_CS_FRAME_H)
> (gdb)
> 3526 h2_resume_each_sending_h2s(h2c, &h2c->send_list);
> (gdb)
> 3529 if (!br_data(h2c->mbuf)) {
> (gdb)
> 3530 TRACE_DEVEL("leaving with everything sent", H2_EV_H2C_SEND,
> h2c->conn);
> (gdb)
> 3541 }
> (gdb)
> h2_process (h2c=h2c@entry=0x7f2c18dca740) at src/mux_h2.c:3626
> 3626 if (unlikely(h2c->proxy->state == PR_STSTOPPED) && !(h2c->flags &
> H2_CF_IS_BACK)) {
> (gdb)
> 3643 if (!(h2c->flags & H2_CF_WAIT_FOR_HS) &&
> (gdb)
> 3644    (conn->flags & (CO_FL_EARLY_SSL_HS | CO_FL_WAIT_XPRT |
> CO_FL_EARLY_DATA)) == CO_FL_EARLY_DATA) {
> (gdb)
> 3643 if (!(h2c->flags & H2_CF_WAIT_FOR_HS) &&
> (gdb)
> 3659 if (conn->flags & CO_FL_ERROR || h2c_read0_pending(h2c) ||
> (gdb)
> 3660    h2c->st0 == H2_CS_ERROR2 || h2c->flags & H2_CF_GOAWAY_FAILED ||
> (gdb)
> 3659 if (conn->flags & CO_FL_ERROR || h2c_read0_pending(h2c) ||
> (gdb)
> 3660    h2c->st0 == H2_CS_ERROR2 || h2c->flags & H2_CF_GOAWAY_FAILED ||
> (gdb)
> 3661    (eb_is_empty(&h2c->streams_by_id) && h2c->last_sid >= 0 &&
> (gdb)
> 3677 else if (h2c->st0 == H2_CS_ERROR) {
> (gdb)
> 3684 if (!b_data(&h2c->dbuf))
> (gdb)
> 3687 if ((conn->flags & CO_FL_SOCK_WR_SH) ||
> (gdb)
> 3688    h2c->st0 == H2_CS_ERROR2 || (h2c->flags & H2_CF_GOAWAY_FAILED) ||
> (gdb)
> 3687 if ((conn->flags & CO_FL_SOCK_WR_SH) ||
> (gdb)
> 3688    h2c->st0 == H2_CS_ERROR2 || (h2c->flags & H2_CF_GOAWAY_FAILED) ||
> (gdb)
> 3690     !br_data(h2c->mbuf) &&
> (gdb)
> 3689    (h2c->st0 != H2_CS_ERROR &&
> (gdb)
> 3690     !br_data(h2c->mbuf) &&
> (gdb)
> 3691     (h2c->mws <= 0 || LIST_ISEMPTY(&h2c->fctl_list)) &&
> (gdb)
> 3692     ((h2c->flags & H2_CF_MUX_BLOCK_ANY) ||
> LIST_ISEMPTY(&h2c->send_list))))
> (gdb)
> 3680 MT_LIST_DEL((struct mt_list *)&conn->list);
> (gdb)
> 3693 h2_release_mbuf(h2c);
> (gdb)
> 3695 if (h2c->task) {
> (gdb)
> 3696 if (h2c_may_expire(h2c))
> (gdb)
> 3697 h2c->task->expire = tick_add(now_ms, h2c->last_sid < 0 ? h2c->timeout
> : h2c->shut_timeout);
> (gdb)
> 3700 task_queue(h2c->task);
> (gdb)
> 3703 h2_send(h2c);
> (gdb)
> 3704 TRACE_LEAVE(H2_EV_H2C_WAKE, conn);
> (gdb)
> 3705 return 0;
> (gdb)
> 3704 TRACE_LEAVE(H2_EV_H2C_WAKE, conn);
> (gdb)
> 3706 }
> (gdb)
> h2_io_cb (t=<optimized out>, ctx=0x7f2c18dca740, status=<optimized out>)
> at src/mux_h2.c:3590
> 3590 if (!ret && conn_in_list) {
> (gdb)
> 3600 TRACE_LEAVE(H2_EV_H2C_WAKE);
> (gdb)
> 3602 }
> (gdb)
> run_tasks_from_lists (budgets=budgets@entry=0x7f2c3c51a35c) at
> src/task.c:456
> 456 sched->current = NULL;
> (gdb)
> 455 done++;
> (gdb)
> 456 sched->current = NULL;
> (gdb)
> 457 __ha_barrier_store();
> (gdb)
> 458 continue;
> (gdb)
> 398 if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
> (gdb)
> 399 if (unlikely(sched->tl_class_mask & budget_mask & ((1 << queue) - 1)))
> {
> (gdb)
> 398 if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
> (gdb)
> 424 if (LIST_ISEMPTY(&tl_queues[queue])) {
> (gdb)
> 430 if (!budgets[queue]) {
> (gdb)
> 436 budgets[queue]--;
> (gdb)
> 442 ctx = t->context;
> (gdb)
> 443 process = t->process;
> (gdb)
> 436 budgets[queue]--;
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 437 t = (struct task *)LIST_ELEM(tl_queues[queue].n, struct tasklet *,
> list);
> (gdb)
> 438 state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 438 state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 441 activity[tid].ctxsw++;
> (gdb)
> 444 t->calls++;
> (gdb)
> 445 sched->current = t;
> (gdb)
> 447 _HA_ATOMIC_SUB(&tasks_run_queue, 1);
> (gdb)
> 449 if (TASK_IS_TASKLET(t)) {
> (gdb)
> 450 LIST_DEL_INIT(&((struct tasklet *)t)->list);
> (gdb)
> 449 if (TASK_IS_TASKLET(t)) {
> (gdb)
> 451 __ha_barrier_store();
> (gdb)
> 452 state = _HA_ATOMIC_XCHG(&t->state, state);
> (gdb)
> 454 process(t, ctx, state);
> (gdb)
> 456 sched->current = NULL;
> (gdb)
> 455 done++;
> (gdb)
> 456 sched->current = NULL;
> (gdb)
> 457 __ha_barrier_store();
> (gdb)
> 458 continue;
> (gdb)
> 398 if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
> (gdb)
> 399 if (unlikely(sched->tl_class_mask & budget_mask & ((1 << queue) - 1)))
> {
> (gdb)
>
> 398 if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
> (gdb)
> 424 if (LIST_ISEMPTY(&tl_queues[queue])) {
> (gdb)
> 430 if (!budgets[queue]) {
> (gdb)
> 436 budgets[queue]--;
> (gdb)
> 442 ctx = t->context;
> (gdb)
> 443 process = t->process;
> (gdb)
> 436 budgets[queue]--;
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 437 t = (struct task *)LIST_ELEM(tl_queues[queue].n, struct tasklet *,
> list);
> (gdb)
> 438 state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 438 state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 441 activity[tid].ctxsw++;
> (gdb)
> 444 t->calls++;
> (gdb)
> 445 sched->current = t;
> (gdb)
> 447 _HA_ATOMIC_SUB(&tasks_run_queue, 1);
> (gdb)
> 449 if (TASK_IS_TASKLET(t)) {
> (gdb)
> 450 LIST_DEL_INIT(&((struct tasklet *)t)->list);
> (gdb)
> 449 if (TASK_IS_TASKLET(t)) {
> (gdb)
> 451 __ha_barrier_store();
> (gdb)
> 452 state = _HA_ATOMIC_XCHG(&t->state, state);
> (gdb)
> 454 process(t, ctx, state);
> (gdb)
> 456 sched->current = NULL;
> (gdb)
>
> 455 done++;
> (gdb)
> 456 sched->current = NULL;
> (gdb)
> 457 __ha_barrier_store();
> (gdb)
> 458 continue;
> (gdb)
>
> 398 if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
> (gdb)
>
> 399 if (unlikely(sched->tl_class_mask & budget_mask & ((1 << queue) - 1)))
> {
> (gdb)
> 398 if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
> (gdb)
> 424 if (LIST_ISEMPTY(&tl_queues[queue])) {
> (gdb)
> 430 if (!budgets[queue]) {
> (gdb)
> 436 budgets[queue]--;
> (gdb)
> 442 ctx = t->context;
> (gdb)
> 443 process = t->process;
> (gdb)
> 436 budgets[queue]--;
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 437 t = (struct task *)LIST_ELEM(tl_queues[queue].n, struct tasklet *,
> list);
> (gdb)
> 438 state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 438 state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 441 activity[tid].ctxsw++;
> (gdb)
> 444 t->calls++;
> (gdb)
> 445 sched->current = t;
> (gdb)
> 447 _HA_ATOMIC_SUB(&tasks_run_queue, 1);
> (gdb)
> 449 if (TASK_IS_TASKLET(t)) {
> (gdb)
> 450 LIST_DEL_INIT(&((struct tasklet *)t)->list);
> (gdb)
> 449 if (TASK_IS_TASKLET(t)) {
> (gdb)
> 451 __ha_barrier_store();
> (gdb)
> 452 state = _HA_ATOMIC_XCHG(&t->state, state);
> (gdb)
> 454 process(t, ctx, state);
> (gdb)
> 456 sched->current = NULL;
> (gdb)
> 455 done++;
> (gdb)
> 456 sched->current = NULL;
> (gdb)
> 457 __ha_barrier_store();
> (gdb)
> 458 continue;
> (gdb)
> 398 if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
> (gdb)
> 399 if (unlikely(sched->tl_class_mask & budget_mask & ((1 << queue) - 1)))
> {
> (gdb)
> 398 if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
> (gdb)
> 424 if (LIST_ISEMPTY(&tl_queues[queue])) {
> (gdb)
> 430 if (!budgets[queue]) {
> (gdb)
> 436 budgets[queue]--;
> (gdb)
> 442 ctx = t->context;
> (gdb)
> 443 process = t->process;
> (gdb)
> 436 budgets[queue]--;
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 437 t = (struct task *)LIST_ELEM(tl_queues[queue].n, struct tasklet *,
> list);
> (gdb)
> 438 state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 438 state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 441 activity[tid].ctxsw++;
> (gdb)
> 444 t->calls++;
> (gdb)
> 445 sched->current = t;
> (gdb)
> 447 _HA_ATOMIC_SUB(&tasks_run_queue, 1);
> (gdb)
> 449 if (TASK_IS_TASKLET(t)) {
> (gdb)
> 450 LIST_DEL_INIT(&((struct tasklet *)t)->list);
> (gdb)
> 449 if (TASK_IS_TASKLET(t)) {
> (gdb)
> 451 __ha_barrier_store();
> (gdb)
> 452 state = _HA_ATOMIC_XCHG(&t->state, state);
> (gdb)
> 454 process(t, ctx, state);
> (gdb)
> 456 sched->current = NULL;
> (gdb)
> 455 done++;
> (gdb)
> 456 sched->current = NULL;
> (gdb)
> 457 __ha_barrier_store();
> (gdb)
> 458 continue;
> (gdb)
> 398 if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
> (gdb)
> 399 if (unlikely(sched->tl_class_mask & budget_mask & ((1 << queue) - 1)))
> {
> (gdb)
> 398 if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
> (gdb)
> 424 if (LIST_ISEMPTY(&tl_queues[queue])) {
> (gdb)
> 430 if (!budgets[queue]) {
> (gdb)
> 436 budgets[queue]--;
> (gdb)
> 442 ctx = t->context;
> (gdb)
> 443 process = t->process;
> (gdb)
> 436 budgets[queue]--;
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 437 t = (struct task *)LIST_ELEM(tl_queues[queue].n, struct tasklet *,
> list);
> (gdb)
> 438 state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 438 state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 441 activity[tid].ctxsw++;
> (gdb)
> 444 t->calls++;
> (gdb)
> 445 sched->current = t;
> (gdb)
> 447 _HA_ATOMIC_SUB(&tasks_run_queue, 1);
> (gdb)
> 449 if (TASK_IS_TASKLET(t)) {
> (gdb)
> 450 LIST_DEL_INIT(&((struct tasklet *)t)->list);
> (gdb)
> 449 if (TASK_IS_TASKLET(t)) {
> (gdb)
> 451 __ha_barrier_store();
> (gdb)
> 452 state = _HA_ATOMIC_XCHG(&t->state, state);
> (gdb)
> 454 process(t, ctx, state);
> (gdb)
> 456 sched->current = NULL;
> (gdb)
> 455 done++;
> (gdb)
> 456 sched->current = NULL;
> (gdb)
> 457 __ha_barrier_store();
> (gdb)
> 458 continue;
> (gdb)
> 398 if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
> (gdb)
> 399 if (unlikely(sched->tl_class_mask & budget_mask & ((1 << queue) - 1)))
> {
> (gdb)
> 398 if (global.tune.options & GTUNE_SCHED_LOW_LATENCY) {
> (gdb)
> 424 if (LIST_ISEMPTY(&tl_queues[queue])) {
> (gdb)
> 430 if (!budgets[queue]) {
> (gdb)
> 436 budgets[queue]--;
> (gdb)
> 442 ctx = t->context;
> (gdb)
> 443 process = t->process;
> (gdb)
> 436 budgets[queue]--;
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 437 t = (struct task *)LIST_ELEM(tl_queues[queue].n, struct tasklet *,
> list);
> (gdb)
> 438 state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 438 state = t->state & (TASK_SHARED_WQ|TASK_SELF_WAKING|TASK_KILLED);
> (gdb)
> 440 ti->flags &= ~TI_FL_STUCK; // this thread is still running
> (gdb)
> 441 activity[tid].ctxsw++;
> (gdb)
> 444 t->calls++;
> (gdb)
> 445 sched->current = t;
> (gdb)
> 447 _HA_ATOMIC_SUB(&tasks_run_queue, 1);
> (gdb)
> 449 if (TASK_IS_TASKLET(t)) {
> (gdb)
> 450 LIST_DEL_INIT(&((struct tasklet *)t)->list);
> (gdb)
> 449 if (TASK_IS_TASKLET(t)) {
> (gdb)
> 451 __ha_barrier_store();
> (gdb)
> 452 state = _HA_ATOMIC_XCHG(&t->state, state);
> (gdb)
> 454 process(t, ctx, state);
> (gdb)
> 456 sched->current = NULL;
> (gdb)
> 455 done++;
> (gdb)
> 456 sched->current = NULL;
> (gdb)
> 457 __ha_barrier_store();
> (gdb) q
> A debugging session is active.
>
> Inferior 1 [process 20598] will be detached.
>
> Quit anyway? (y or n) y
> Detaching from program: /usr/sbin/haproxy, process 20598
>
> Thu, 25 Mar 2021 at 13:51 Christopher Faulet <cfau...@haproxy.com>
> wrote:
>
>> On 25/03/2021 at 13:38, Maciej Zdeb wrote:
>> > Hi,
>> >
>> > I deployed a patched HAProxy (with volatile hlua_not_dumpable) and so
>> > far so good, no looping. Christopher, I saw the new patches that use
>> > hlua_traceback instead; they look much cleaner to me. Should I verify
>> > them instead? :)
>> >
>> > Christopher & Willy, I forgot to thank you for your help!
>> >
>> Yes please, try the latest 2.2 snapshot. It is really a better way to
>> fix this issue because the Lua traceback is never ignored, and it is
>> much safer not to allocate memory in the debugger.
>>
>> So now we should be able to figure out why Lua fires the watchdog,
>> because, under the hood, that is the true issue :)
>>
>> --
>> Christopher Faulet
>>
>
