On Tue, Nov 10, 2020 at 04:14:52PM +0100, Willy Tarreau wrote:
> Seems like we're getting closer. Will continue digging now.

I found that among the 5 crashes I got, 3 were under pool_flush(),
which is called precisely during soft-stop. I tried to disable
that function with the patch below and I can't reproduce the
problem anymore. It would be nice if you could test it. I suspect
that either it copes badly with the lockless pools, or that
pool_gc() itself, called from the signal handler, could damage
some of the pools and cause some loose objects to be used,
returned and reused once reallocated. I see no reason for such a
relation with SPOE, but maybe it just helps trigger the complex
condition.

diff --git a/src/pool.c b/src/pool.c
index 321f8bc67..5e2f41fe9 100644
--- a/src/pool.c
+++ b/src/pool.c
@@ -246,7 +246,7 @@ void pool_flush(struct pool_head *pool)
        void **next, *temp;
        int removed = 0;
 
-       if (!pool)
+       //if (!pool)
                return;
        HA_SPIN_LOCK(POOL_LOCK, &pool->lock);
        do {
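
In case the intent of the patch is not obvious: commenting out the
test leaves the "return" unconditional, so the function exits before
taking the lock and never touches the pool. Below is a minimal
standalone sketch of that effect. It is only an illustration: struct
toy_pool, toy_flush() and toy_flush_disabled() are made-up names, not
the real code from src/pool.c, and they return a count only so that
the difference can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pool; NOT the real struct pool_head. */
struct toy_pool {
	int cached;	/* number of objects currently held in the pool */
};

/* Original shape: bail out only when no pool was passed. */
static int toy_flush(struct toy_pool *pool)
{
	int removed = 0;

	if (!pool)
		return 0;
	removed = pool->cached;	/* release everything the pool holds */
	pool->cached = 0;
	return removed;
}

/* Patched shape: with the test commented out, the return becomes
 * unconditional and everything after it is dead code.
 */
static int toy_flush_disabled(struct toy_pool *pool)
{
	int removed = 0;

	/* if (!pool) */
		return 0;
	removed = pool->cached;	/* never reached */
	pool->cached = 0;
	return removed;
}
```

Calling toy_flush() on a pool holding 3 objects empties it, while
toy_flush_disabled() leaves the same pool untouched, which is all the
patch is meant to achieve for the test.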

I'm continuing to investigate.

Willy
