Re: [3.0.5] Unexpected SD-- on (almost) successful requests

Christopher Faulet Thu, 26 Sep 2024 06:35:22 -0700

Hi Luke,

On 26/09/2024 at 12:28, Luke Seelenbinder wrote:

On upgrading to 3.0.5, we began to see a lot of failed backend requests. They are successful status codes but fail with connection state `SD--`. On the upstream side, the request succeeds (the upstream is also HAProxy, its state is `----`).
The data appears to be fully transferred without error, but something goes wrong towards the end of the request. This happens on a rather small percentage of requests, but I'm struggling to determine how to isolate the problem further. Timing and bytes transferred on both sides match up. Varnish is in the loop for most of these requests (but not all), and it ends up returning an error response, so it's not a spurious log line where the client doesn't register an error. To make matters worse, the response status code from the backend is successful, so the requests can't be retried using L7.

Sorry, I don't understand: was the response successfully sent to the client when this happens, or not? Is it "just" an issue with the termination state, or is there also an issue with the response itself?

The only thing that was changed should be the upgrade between 3.0.4 and 3.0.5.
Our settings are pretty standard. TLS on both sides; a mix of H3, H2, and H1.1 for the frontend; exclusively client-cert TLS + H1.1 for the backend. Errors happen on all FE protocols.
Any tips on how to debug this further? Possibly relevant config below.

Well, if it is an issue with the termination state while the response is fully sent to the client, it may be a server shutdown that is caught too early, when it is received with the last bytes of data.

At first glance, there are not many fixes that could explain that. Maybe the following one, but I'm not sure:


commit e2a93b649286b30245333eec5851acd3991fda47
Author: Christopher Faulet <[email protected]>
Date:   Mon Jul 29 17:48:16 2024 +0200

    BUG/MEDIUM: stconn: Report error on SC on send if a previous SE error was set

    When a send on a connection is performed, if a SE error (or a pending error)
    was already reported earlier, we leave immediately. No send is performed.
    However, we must be sure to report the error at the SC level if necessary.
    Indeed, the SE error may have been reported during the zero-copy data
    forwarding. So during receive on the opposite side. In that case, we may
    have missed the opportunity to report it at the SC level.

    The patch must be backported as far as 2.8.

    (cherry picked from commit 5dc45445ff18207dbacebf1f777e1f1abcd5065d)
    Signed-off-by: Christopher Faulet <[email protected]>

You may try to disable the zero-copy data forwarding with the -dZ command-line option.
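
To judge whether running with -dZ changes anything, it may help to count how often `SD--` shows up alongside a successful status code before and after the change. The following is only a rough sketch in Python, assuming the default HTTP log format (status code, bytes, two cookie fields, then the termination state, then the connection counters). The function name `count_states` and the sample log lines are made up for illustration and do not come from the original report.

```python
import re
from collections import Counter

# Assumes the default HAProxy HTTP log layout around the termination state:
#   ... <status> <bytes> <req_cookie> <resp_cookie> <state> a/b/c/d/e ...
# A custom log-format would need a different pattern.
LOG_RE = re.compile(
    r"\s(?P<status>\d{3})\s+\+?\d+\s+\S+\s+\S+\s+"
    r"(?P<state>[A-Za-z-]{4})\s+\d+/\d+/\d+/\d+/\d+\s"
)


def count_states(lines):
    """Count log lines per (termination state, status class) pair."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if match is None:
            continue  # not an HTTP log line in the assumed format
        status_class = match.group("status")[0] + "xx"
        counts[(match.group("state"), status_class)] += 1
    return counts


# Hypothetical sample lines, invented for this sketch.
SAMPLE = [
    'haproxy[1]: 192.0.2.10:40000 [26/Sep/2024:12:00:00.000] fe be/srv1 '
    '0/0/1/20/25 200 5120 - - ---- 2/2/1/1/0 0/0 "GET /a HTTP/1.1"',
    'haproxy[1]: 192.0.2.11:40001 [26/Sep/2024:12:00:01.000] fe be/srv1 '
    '0/0/1/22/30 200 5120 - - SD-- 2/2/1/1/0 0/0 "GET /b HTTP/1.1"',
    'haproxy[1]: 192.0.2.12:40002 [26/Sep/2024:12:00:02.000] fe be/srv2 '
    '0/0/1/-1/5 502 208 - - SH-- 2/2/1/1/0 0/0 "GET /c HTTP/1.1"',
    'haproxy[1]: Proxy fe started.',
]

if __name__ == "__main__":
    for (state, status_class), n in sorted(count_states(SAMPLE).items()):
        print(state, status_class, n)
```

On a real log, something like `count_states(open("haproxy.log"))` (path assumed) could be run once per time window, with and without -dZ, and the `("SD--", "2xx")` counts compared.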

--
Christopher Faulet
