On Tue, Nov 10, 2020 at 04:14:52PM +0100, Willy Tarreau wrote:
> Seems like we're getting closer. Will continue digging now.

I found that among the 5 crashes I got, 3 were under pool_flush(), which
is precisely what gets called during the soft stopping. I tried to
disable that function with the patch below and I can't reproduce the
problem anymore. It would be nice if you could test it.

I suspect that either it copes badly with the lockless pools, or that
pool_gc() itself, called from the signal handler, could possibly damage
some of the pools and cause some loose objects to be used, returned and
reused once reallocated. I see no reason for the relation with SPOE like
this, but maybe it just helps trigger the complex condition.

diff --git a/src/pool.c b/src/pool.c
index 321f8bc67..5e2f41fe9 100644
--- a/src/pool.c
+++ b/src/pool.c
@@ -246,7 +246,7 @@ void pool_flush(struct pool_head *pool)
 	void **next, *temp;
 	int removed = 0;
 
-	if (!pool)
+	//if (!pool)
 		return;
 	HA_SPIN_LOCK(POOL_LOCK, &pool->lock);
 	do {

I'm continuing to investigate.

Willy