Hi William,

Thanks for looking into it! The timeout also would work very well.
I just tried it out and no issues at all.

Regarding the CI problem: I found a way to avoid using signals by using "debug 
dev delay" + "nbthread 1" for the vtests.
Please let me know if that would work in the CI environment.

I composed everything into a new patch and added you as the Co-Author.
Please let me know if the patch looks good now.

Best regards,

Alexander


-----Original Message-----
From: William Lallemand <[email protected]> 
Sent: Thursday, April 30, 2026 3:49 PM
To: Stephan, Alexander <[email protected]>
Cc: [email protected]
Subject: Re: [PATCH] BUG/MEDIUM: cli: fix master CLI connection slot leak on 
client disconnect

Hello Alexander,

On Wed, Apr 29, 2026 at 05:04:20PM +0000, Stephan, Alexander wrote:
> The patch works like this:
>
>   1.  In pcli_wait_for_request(), when the response analyzer is active 
> and the frontend stream connector shows a client disconnect (SC_FL_EOS 
> or SC_FL_ABRT_DONE on scf), explicitly call
> sc_abort(s->scb) to propagate the disconnect to the backend.
>   2.  In pcli_wait_for_response(), extend the error condition to also 
> check for SC_FL_ABRT_DONE on scb. This flag is only set by the 
> explicit sc_abort() above, so normal one-shot CLI tools that close 
> their TCP connection after receiving a response are not affected.

The disconnect should already be propagated to the backend, but if the backend 
does not respond anymore, we never receive the FIN/ACK so we can't clean 
correctly. It's suppose to be handled with timeouts in a standard proxy. But we 
don't have timeouts on the master CLI. Adding a server timeout might be 
problematic because it would change the behavior on long waiting commands, 
however adding a timeout server-fin might help closing without waiting for the 
worker.


diff --git a/src/cli.c b/src/cli.c
index 3bd7098981..3043ceb8e2 100644
--- a/src/cli.c
+++ b/src/cli.c
@@ -3834,8 +3834,9 @@ int mworker_cli_create_master_proxy(char **errmsg)
        mworker_proxy->mode = PR_MODE_CLI;
        /* default to 10 concurrent connections */
        mworker_proxy->maxconn = 10;
-       /* no timeout */
-       mworker_proxy->timeout.client = 0;
+       mworker_proxy->timeout.client = 0; /* no timeout */
+       mworker_proxy->timeout.serverfin = MS_TO_TICKS(1000); /* 1s 
+ timeout in case worker is not responding on shutdown */
+
        mworker_proxy->conf.file = strdup("MASTER");
        mworker_proxy->conf.line = 0;
        mworker_proxy->accept = frontend_accept;


I tested the above change and it seems to fix the issue, could you confirm with 
your setup?

> A regression test is included.
> This should be backported to all stable branches.
> 

That kind of test might be problematic in the CI, signals are not really 
reliable in the CI environment, it might cause flappy results.


Regards,

--
William Lallemand

Attachment: 0001-BUG-MEDIUM-cli-fix-master-CLI-connection-slot-leak-o.patch
Description: 0001-BUG-MEDIUM-cli-fix-master-CLI-connection-slot-leak-o.patch

Reply via email to