Re: [users@httpd] graceful-stop closes established connections without response
On 1/29/24 12:25 PM, Yann Ylavic wrote: That's where we are, I think, if this first/light patch eventually helps significantly with the "local" graceful-stop which you care about still, it's possibly worth it since it requires no opt-in (but needs testing..), but going further looks overkill/risky for httpd. i (theoretically) applied your patch to the debian source and rebuilt and installed the packages, but the behavior still persists. it has been a long time since i have had to patch debian source (it was apache back then as well :-) thundering herd patch IIRC) so i'm not 100% sure that i've gotten it right. mod_mpm_event.so does indeed have an updated modtime -rw-r--r-- 1 root root 75984 Jan 30 00:13 /usr/lib/apache2/modules/mod_mpm_event.so vs -rw-r--r-- 1 root root 75984 Apr 13 2023 /usr/lib/apache2/modules/mod_mpm_event.so but the identical file size makes me wonder whether i was successful in applying the patch. i was going to add some debugging lines, but when i took a quick look at the patch, i wasn't clear on which sections of the code i should be guaranteed to hit. can you be so kind as to send an updated patch with some gratuitous logging in the appropriate sections so that there will be positive affirmation that the patch has (or hasn't) been applied and is falling into the expected sections? - To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org For additional commands, e-mail: users-h...@httpd.apache.org
Re: [users@httpd] graceful-stop closes established connections without response
On Mon, Jan 29, 2024 at 4:59 PM Sherrard Burton wrote: > > On 1/29/24 10:17 AM, Yann Ylavic wrote: > > On Mon, Jan 29, 2024 at 3:06 PM Eric Covener wrote: > > > > The patch helps in this case because we no longer close the listening > > sockets unconditionally, I mean without first checking if there are > > new connections in the backlog. So I thought the option was needed > > because if nothing stops new connections from arriving it could > > prevent the child from stopping indefinitely? How could we know if a > > LB/VIP is in place? > > it sounds like this issue is all but resolved, but i would like to > understand whether the above (preventing the child from stopping > indefinitely) is an actual possibility. > > my (naive) expectation is that if a given child has been signaled while > handling an existing request then it "knows" not to accept() a new > request after completing the existing request. so it seems that the > child is not under any danger of continuing indefinitely, regardless of > the contents of the backlog. Yes, a stopping child won't accept any new connection currently in httpd, but this is what I proposed to change: each child continues to accept new connections after the graceful signal, until there is nothing to accept anymore. Though this needs an opt-in obviously. Sorry for the confusion because this is not what the patch I initially proposed is doing either, the patch simply allows for one more try at emptying the backlog after the signal was received, so it won't by itself prevent the child from stopping, but it might (likely) not be enough if resets don't happen mainly because of some bad timing in the listener thread (which this patch addresses, only). So before we go to the opt-in, as Eric said, we might as well consider that since it's not fully addressable in httpd anyway (without races), we'd rather let this be handled outside httpd (better/fully). That's where we are, I think, if this first/light patch eventually helps significantly with the "local" graceful-stop which you care about still, it's possibly worth it since it requires no opt-in (but needs testing..), but going further looks overkill/risky for httpd. - To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org For additional commands, e-mail: users-h...@httpd.apache.org
Re: [users@httpd] graceful-stop closes established connections without response
On 1/29/24 10:17 AM, Yann Ylavic wrote: On Mon, Jan 29, 2024 at 3:06 PM Eric Covener wrote: The patch helps in this case because we no longer close the listening sockets unconditionally, I mean without first checking if there are new connections in the backlog. So I thought the option was needed because if nothing stops new connections from arriving it could prevent the child from stopping indefinitely? How could we know if a LB/VIP is in place? it sounds like this issue is all but resolved, but i would like to understand whether the above (preventing the child from stopping indefinitely) is an actual possibility. my (naive) expectation is that if a given child has been signaled while handling an existing request then it "knows" not to accept() a new request after completing the existing request. so it seems that the child is not under any danger of continuing indefinitely, regardless of the contents of the backlog. - To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org For additional commands, e-mail: users-h...@httpd.apache.org
Re: [users@httpd] graceful-stop closes established connections without response
On 1/29/24 9:05 AM, Eric Covener wrote: Maybe I wasn't clear enough but this patch makes sense only if there is something in place that prevents new connections from arriving at the stopping httpd children processes (like a frontend/load-balancer or a tcp/bpf filter), otherwise they may never really stop which does not help for a graceful stop/restart obviously. So this change (if useful) should be guarded by a GracefulDrain on/off or something config option to not hurt the other use cases. Thanks Yann! It seems to me If there is no such LB/VIP that stops new connections from landing on this server, the new option should be avoided. But if there is such a LB/VIP, the option is not really needed. Is it fair? indeed. i tried to convey this in my most-recent response, but i think you put it much more clearly. we have essentially starting removing servers from the backend pool before issuing the graceful-stop. _when_ this is feasible, it does obviate the need for any changes at the httpd level. but it still leaves us with the possibility of resetting established connections during unplanned shutdown, such as an unexpected host reboot. but, as indicated previously, i think i am satisfied with this approach after gaining understanding the underlying mechanisms. - To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org For additional commands, e-mail: users-h...@httpd.apache.org
Re: [users@httpd] graceful-stop closes established connections without response
On 1/29/24 8:59 AM, Yann Ylavic wrote: Maybe I wasn't clear enough but this patch makes sense only if there is something in place that prevents new connections from arriving at the stopping httpd children processes (like a frontend/load-balancer or a tcp/bpf filter), otherwise they may never really stop which does not help for a graceful stop/restart obviously. So this change (if useful) should be guarded by a GracefulDrain on/off or something config option to not hurt the other use cases. Yann, thank you for the feedback and the thorough explanation (i hadn't thought about the OS-level backlog). after discovery of this problem i began removing the target server from the backend pool in the load-balancer before issuing the graceful-stop. this seems to work as expected for planned maintenance, but the problem can still occur if the backend server ends up needing to be stopped outside of my direct command. we know that we cannot guard against all situations, but the systemd unit files on debian seem to trigger graceful-stop for its 'stop' action ExecStop=/usr/sbin/apachectl graceful-stop so if indeed this were addressable within httpd (which it sounds like it is not) then we would theoretically be protected from all but unexpected hard crashes in httpd or at the OS level. my takeaway at this point, not having yet finished reading Eric's other responses on this thread, is that this mostly "is what it is", and i think that i am (provisionally) satisfied with knowing that the behavior that i am observing is not a bug, per-se. ps--is there any possibility of "shrinking" this window by tuning either ListenBacklog or any analogous settings at the OS level? or do i risk creating a different problem? i ask because theoretically, the backlog/queuing (from the client perspective) should mostly be handled by the load-balancer, which is currently L4, not L7. so if reducing/eliminating the backlog on the server might positively impact this situation then i may spend some time on it. thanks again - To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org For additional commands, e-mail: users-h...@httpd.apache.org
Re: [users@httpd] graceful-stop closes established connections without response
On Mon, Jan 29, 2024 at 4:21 PM Eric Covener wrote: > > > > It seems to me If there is no such LB/VIP that stops new connections > > > from landing on this server, the new option should be avoided. > > > > Correct. > > > > > But if there is such a LB/VIP, the option is not really needed. Is it > > > fair? > > > > The patch helps in this case because we no longer close the listening > > sockets unconditionally, I mean without first checking if there are > > new connections in the backlog. So I thought the option was needed > > because if nothing stops new connections from arriving it could > > prevent the child from stopping indefinitely? How could we know if a > > LB/VIP is in place? > > I mean the initial patch vs. the status quo, not just the opt-in part. Even if there is a LB that stops routing new connections to the stopping httpd we might kill the ones that are in the backlog already. But yes I suppose that the switch on the LB could precede the graceful-stop by a few seconds to let httpd drain the backlog normally, in any case the race is hardly addressable fully in httpd so we might consider doing nothing to minimize it too, that's fair enough :) - To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org For additional commands, e-mail: users-h...@httpd.apache.org
Re: [users@httpd] graceful-stop closes established connections without response
> > It seems to me If there is no such LB/VIP that stops new connections > > from landing on this server, the new option should be avoided. > > Correct. > > > But if there is such a LB/VIP, the option is not really needed. Is it fair? > > The patch helps in this case because we no longer close the listening > sockets unconditionally, I mean without first checking if there are > new connections in the backlog. So I thought the option was needed > because if nothing stops new connections from arriving it could > prevent the child from stopping indefinitely? How could we know if a > LB/VIP is in place? I mean the initial patch vs. the status quo, not just the opt-in part. - To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org For additional commands, e-mail: users-h...@httpd.apache.org
Re: [users@httpd] graceful-stop closes established connections without response
On Mon, Jan 29, 2024 at 3:06 PM Eric Covener wrote: > > > Maybe I wasn't clear enough but this patch makes sense only if there > > is something in place that prevents new connections from arriving at > > the stopping httpd children processes (like a frontend/load-balancer > > or a tcp/bpf filter), otherwise they may never really stop which does > > not help for a graceful stop/restart obviously. So this change (if > > useful) should be guarded by a GracefulDrain on/off or something > > config option to not hurt the other use cases. > > Thanks Yann! > > It seems to me If there is no such LB/VIP that stops new connections > from landing on this server, the new option should be avoided. Correct. > But if there is such a LB/VIP, the option is not really needed. Is it fair? The patch helps in this case because we no longer close the listening sockets unconditionally, I mean without first checking if there are new connections in the backlog. So I thought the option was needed because if nothing stops new connections from arriving it could prevent the child from stopping indefinitely? How could we know if a LB/VIP is in place? - To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org For additional commands, e-mail: users-h...@httpd.apache.org
Re: [users@httpd] graceful-stop closes established connections without response
> Maybe I wasn't clear enough but this patch makes sense only if there > is something in place that prevents new connections from arriving at > the stopping httpd children processes (like a frontend/load-balancer > or a tcp/bpf filter), otherwise they may never really stop which does > not help for a graceful stop/restart obviously. So this change (if > useful) should be guarded by a GracefulDrain on/off or something > config option to not hurt the other use cases. Thanks Yann! It seems to me If there is no such LB/VIP that stops new connections from landing on this server, the new option should be avoided. But if there is such a LB/VIP, the option is not really needed. Is it fair? - To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org For additional commands, e-mail: users-h...@httpd.apache.org
Re: [users@httpd] graceful-stop closes established connections without response
On Mon, Jan 29, 2024 at 2:23 PM Yann Ylavic wrote: > > On Sun, Jan 28, 2024 at 5:26 AM Sherrard Burton wrote: > > > > On 1/27/24 09:46 PM, Eric Covener wrote: > > > > > > Both worker and event MPMs have a dedicated listener thread per child > > > process, so it will close those copies of the listening sockets much > > > more quickly. > > > > so that i am clear, are you saying that this behavior is still possible, > > although less likely under the worker and event MPMs? > > I think it's possible regardless of the MPM, and there is quite little > a server can do about it without the help of the system or some > tcp/bpf filter (something that prepares the graceful shutdown at the > system level to prevent the 3-way handshake from completing). > This is because when the connections are ready to be accept()ed (i.e. > in the listening socket's backlog), they are already fully established > and likely contain the request data (at least partly), the system has > done this underneath httpd already. > So if/when httpd closes its listening socket(s) all the connections in > the backlog(s) are lost/reset anyway, and there is always going to be > a race condition with the draining of the backlog if nothing stops new > connections from being established at the system level. > > To minimize the race condition maybe httpd can do better at trying to > drain the backlog before closing the listeners. Does the attached > patch help for instance (it's against mpm_event 2.4.x)? > But I don't think it can be fully solved at httpd level anyway, with > this change the effective stop could be longer (so long as there are > incoming/pending connections routed to each child by the system), it > could even last forever theoretically if connections keep coming > indefinitely.. Maybe I wasn't clear enough but this patch makes sense only if there is something in place that prevents new connections from arriving at the stopping httpd children processes (like a frontend/load-balancer or a tcp/bpf filter), otherwise they may never really stop which does not help for a graceful stop/restart obviously. So this change (if useful) should be guarded by a GracefulDrain on/off or something config option to not hurt the other use cases. > > Regards; > Yann. - To unsubscribe, e-mail: users-unsubscr...@httpd.apache.org For additional commands, e-mail: users-h...@httpd.apache.org
Re: [users@httpd] graceful-stop closes established connections without response
On Sun, Jan 28, 2024 at 5:26 AM Sherrard Burton wrote: > > On 1/27/24 09:46 PM, Eric Covener wrote: > > > > Both worker and event MPMs have a dedicated listener thread per child > > process, so it will close those copies of the listening sockets much > > more quickly. > > so that i am clear, are you saying that this behavior is still possible, > although less likely under the worker and event MPMs? I think it's possible regardless of the MPM, and there is quite little a server can do about it without the help of the system or some tcp/bpf filter (something that prepares the graceful shutdown at the system level to prevent the 3-way handshake from completing). This is because when the connections are ready to be accept()ed (i.e. in the listening socket's backlog), they are already fully established and likely contain the request data (at least partly), the system has done this underneath httpd already. So if/when httpd closes its listening socket(s) all the connections in the backlog(s) are lost/reset anyway, and there is always going to be a race condition with the draining of the backlog if nothing stops new connections from being established at the system level. To minimize the race condition maybe httpd can do better at trying to drain the backlog before closing the listeners. Does the attached patch help for instance (it's against mpm_event 2.4.x)? But I don't think it can be fully solved at httpd level anyway, with this change the effective stop could be longer (so long as there are incoming/pending connections routed to each child by the system), it could even last forever theoretically if connections keep coming indefinitely.. Regards; Yann. Index: server/mpm/event/event.c === --- server/mpm/event/event.c (revision 1915442) +++ server/mpm/event/event.c (working copy) @@ -174,7 +174,7 @@ static int had_healthy_child = 0; static volatile int dying = 0; static volatile int workers_may_exit = 0; static volatile int start_thread_may_exit = 0; -static volatile int listener_may_exit = 0; +static volatile apr_uint32_t listener_may_exit = 0; static int listener_is_wakeable = 0;/* Pollset supports APR_POLLSET_WAKEABLE */ static int num_listensocks = 0; static apr_int32_t conns_this_child;/* MaxConnectionsPerChild, only access @@ -481,8 +481,7 @@ static void disable_listensocks(void) static void enable_listensocks(void) { int i; -if (listener_may_exit -|| apr_atomic_cas32(&listensocks_disabled, 0, 1) != 1) { +if (dying || apr_atomic_cas32(&listensocks_disabled, 0, 1) != 1) { return; } ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, ap_server_conf, APLOGNO(00457) @@ -575,8 +574,7 @@ static void wakeup_listener(void) ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, ap_server_conf, "wake up listener%s", listener_may_exit ? " again" : ""); -listener_may_exit = 1; -disable_listensocks(); +apr_atomic_cas32(&listener_may_exit, 1, 0); /* Unblock the listener if it's poll()ing */ if (event_pollset && listener_is_wakeable) { @@ -1184,12 +1182,9 @@ read_request: cs->pub.state = CONN_STATE_READ_REQUEST_LINE; goto read_request; } -else if (!listener_may_exit) { +else { cs->pub.state = CONN_STATE_CHECK_REQUEST_LINE_READABLE; } -else { -cs->pub.state = CONN_STATE_LINGER; -} } if (cs->pub.state == CONN_STATE_CHECK_REQUEST_LINE_READABLE) { @@ -1654,7 +1649,7 @@ static void * APR_THREAD_FUNC listener_thread(apr_ proc_info *ti = dummy; int process_slot = ti->pslot; struct process_score *ps = ap_get_scoreboard_process(process_slot); -int closed = 0; +int may_exit = 0, closed = 0; int have_idle_worker = 0; apr_time_t last_log; @@ -1678,7 +1673,7 @@ static void * APR_THREAD_FUNC listener_thread(apr_ if (conns_this_child <= 0) check_infinite_requests(); -if (listener_may_exit) { +if (may_exit) { int first_close = close_listeners(&closed); if (terminate_mode == ST_UNGRACEFUL @@ -1899,7 +1894,7 @@ static void * APR_THREAD_FUNC listener_thread(apr_ "Idle workers: %u", ap_queue_info_num_idlers(worker_queue_info)); } -else if (!listener_may_exit) { +else { void *csd = NULL; ap_listen_rec *lr = (ap_listen_rec *) pt->baton; apr_pool_t *ptrans; /* Pool for per-transaction stuff */ @@ -1960,6 +1955,14 @@ static void * APR_THREAD_FUNC listener_thread(apr_ } /* if:else on pt->type */ } /* for processing poll */ +/* On graceful shutdown/stop we can close the listening sockets + * since the backlog should be drained now. +