Re: child exit on self-proxy
> On 07.07.2021 at 18:34, Yann Ylavic wrote:
>
> Does this attached patch help? This is to verify a hypothesis where
> killed keepalive connections are still in the defer_linger_chain
> without a worker to handle them.
>
> I don't see "interrupted, but continuing" triggering in your log but
> possibly it happened earlier..

[Wed Jul 07 16:48:04.237865 2021] [mpm_event:debug] [pid 55134:tid 4643098112] event.c(599): AH: wakeup_listener: start
[Wed Jul 07 16:48:04.237895 2021] [mpm_event:debug] [pid 55134:tid 4643098112] event.c(603): AH: wakeup_listener: apr_pollset_wakeup
[Wed Jul 07 16:48:04.238042 2021] [mpm_event:debug] [pid 55134:tid 4643098112] event.c(610): AH: wakeup_listener: ap_queue_info_term
[Wed Jul 07 16:48:04.238224 2021] [mpm_event:debug] [pid 55134:tid 123145362530304] event.c(1971): (4)Interrupted system call: AH: pollset returned listener_may_exit=1 connection_count=1
[Wed Jul 07 16:48:04.238381 2021] [mpm_event:trace1] [pid 55134:tid 123145362530304] event.c(1809): All workers are busy or dying, will close 1 keep-alive connections
[Wed Jul 07 16:48:04.238196 2021] [mpm_event:debug] [pid 55134:tid 4643098112] event.c(629): AH: wakeup_listener: pthread_kill
[Wed Jul 07 16:48:04.238513 2021] [mpm_event:debug] [pid 55134:tid 123145362530304] event.c(1971): (4)Interrupted system call: AH: pollset returned listener_may_exit=1 connection_count=1
[Wed Jul 07 16:48:04.238506 2021] [mpm_event:debug] [pid 55134:tid 4643098112] event.c(635): AH: wakeup_listener: end
[Wed Jul 07 16:48:04.238555 2021] [mpm_event:trace1] [pid 55134:tid 4643098112] event.c(2928): graceful termination received, join workers
[Wed Jul 07 16:48:08.443130 2021] [mpm_event:debug] [pid 55134:tid 123145362530304] event.c(1971): (70007)The timeout specified has expired: AH: pollset returned listener_may_exit=1 connection_count=1
[Wed Jul 07 16:48:08.443149 2021] [mpm_event:trace1] [pid 55134:tid 123145362530304] event.c(1809): All workers are busy or dying, will close 0 keep-alive connections

> On Wed, Jul 7, 2021 at 5:51 PM Stefan Eissing wrote:
>>
>> The things I do for you...
>>
>> [Wed Jul 07 15:48:52.591983 2021] [mpm_event:debug] [pid 48747:tid 4385197568] event.c(599): AH: wakeup_listener: start
>> [Wed Jul 07 15:48:52.592000 2021] [mpm_event:debug] [pid 48747:tid 4385197568] event.c(603): AH: wakeup_listener: apr_pollset_wakeup
>> [Wed Jul 07 15:48:52.592046 2021] [mpm_event:debug] [pid 48747:tid 4385197568] event.c(610): AH: wakeup_listener: ap_queue_info_term
>> [Wed Jul 07 15:48:52.592188 2021] [mpm_event:debug] [pid 48747:tid 4385197568] event.c(629): AH: wakeup_listener: pthread_kill
>> [Wed Jul 07 15:48:52.592166 2021] [mpm_event:debug] [pid 48747:tid 123145502474240] event.c(1985): (4)Interrupted system call: AH: pollset returned listener_may_exit=1 connection_count=2
>> [Wed Jul 07 15:48:52.592523 2021] [mpm_event:debug] [pid 48747:tid 4385197568] event.c(635): AH: wakeup_listener: end
>> [Wed Jul 07 15:48:52.592727 2021] [mpm_event:trace1] [pid 48747:tid 4385197568] event.c(2942): graceful termination received, join workers
>> [Wed Jul 07 15:48:52.592515 2021] [mpm_event:trace1] [pid 48747:tid 123145502474240] event.c(1809): All workers are busy or dying, will close 2 keep-alive connections
>> [Wed Jul 07 15:48:52.593381 2021] [mpm_event:debug] [pid 48747:tid 123145502474240] event.c(1985): AH: pollset returned listener_may_exit=1 connection_count=2
>> [Wed Jul 07 15:48:52.593478 2021] [mpm_event:debug] [pid 48747:tid 123145502474240] event.c(1985): (4)Interrupted system call: AH: pollset returned listener_may_exit=1 connection_count=1
>> [Wed Jul 07 15:48:57.552623 2021] [mpm_event:debug] [pid 48747:tid 123145502474240] event.c(1985): (70007)The timeout specified has expired: AH: pollset returned listener_may_exit=1 connection_count=1
>> [Wed Jul 07 15:48:57.552658 2021] [mpm_event:trace1] [pid 48747:tid 123145502474240] event.c(1809): All workers are busy or dying, will close 0 keep-alive connections
>>
>> Btw. I opened a bottle of nice wine - this is entertaining. ;-)
>>
>> - Stefan
>>
>>> On 07.07.2021 at 17:48, Yann Ylavic wrote:
>>>
>>> On Wed, Jul 7, 2021 at 5:45 PM Stefan Eissing wrote:
>>>>
>>>>> On 07.07.2021 at 17:39, Ruediger Pluem wrote:
>>>>>
>>>>> Hm. The ap_log_error statically writes listener_may_exit=0. Can you put
>>>>> the actual value of listener_may_exit in the log message?
>>>>
>>>> I put it statically since that was the value in the "if ()" 6 lines above.
>>>> It should have been visible then and there to work.
>>>
>>> The above if () does not always break.
>>>
>>> Cheers;
>>> Yann.
Re: child exit on self-proxy
Does this attached patch help? This is to verify a hypothesis where
killed keepalive connections are still in the defer_linger_chain
without a worker to handle them.

I don't see "interrupted, but continuing" triggering in your log but
possibly it happened earlier..

On Wed, Jul 7, 2021 at 5:51 PM Stefan Eissing wrote:
>
> The things I do for you...
>
> [Wed Jul 07 15:48:52.591983 2021] [mpm_event:debug] [pid 48747:tid 4385197568] event.c(599): AH: wakeup_listener: start
> [Wed Jul 07 15:48:52.592000 2021] [mpm_event:debug] [pid 48747:tid 4385197568] event.c(603): AH: wakeup_listener: apr_pollset_wakeup
> [Wed Jul 07 15:48:52.592046 2021] [mpm_event:debug] [pid 48747:tid 4385197568] event.c(610): AH: wakeup_listener: ap_queue_info_term
> [Wed Jul 07 15:48:52.592188 2021] [mpm_event:debug] [pid 48747:tid 4385197568] event.c(629): AH: wakeup_listener: pthread_kill
> [Wed Jul 07 15:48:52.592166 2021] [mpm_event:debug] [pid 48747:tid 123145502474240] event.c(1985): (4)Interrupted system call: AH: pollset returned listener_may_exit=1 connection_count=2
> [Wed Jul 07 15:48:52.592523 2021] [mpm_event:debug] [pid 48747:tid 4385197568] event.c(635): AH: wakeup_listener: end
> [Wed Jul 07 15:48:52.592727 2021] [mpm_event:trace1] [pid 48747:tid 4385197568] event.c(2942): graceful termination received, join workers
> [Wed Jul 07 15:48:52.592515 2021] [mpm_event:trace1] [pid 48747:tid 123145502474240] event.c(1809): All workers are busy or dying, will close 2 keep-alive connections
> [Wed Jul 07 15:48:52.593381 2021] [mpm_event:debug] [pid 48747:tid 123145502474240] event.c(1985): AH: pollset returned listener_may_exit=1 connection_count=2
> [Wed Jul 07 15:48:52.593478 2021] [mpm_event:debug] [pid 48747:tid 123145502474240] event.c(1985): (4)Interrupted system call: AH: pollset returned listener_may_exit=1 connection_count=1
> [Wed Jul 07 15:48:57.552623 2021] [mpm_event:debug] [pid 48747:tid 123145502474240] event.c(1985): (70007)The timeout specified has expired: AH: pollset returned listener_may_exit=1 connection_count=1
> [Wed Jul 07 15:48:57.552658 2021] [mpm_event:trace1] [pid 48747:tid 123145502474240] event.c(1809): All workers are busy or dying, will close 0 keep-alive connections
>
> Btw. I opened a bottle of nice wine - this is entertaining. ;-)
>
> - Stefan
>
>> On 07.07.2021 at 17:48, Yann Ylavic wrote:
>>
>> On Wed, Jul 7, 2021 at 5:45 PM Stefan Eissing wrote:
>>>
>>>> On 07.07.2021 at 17:39, Ruediger Pluem wrote:
>>>>
>>>> Hm. The ap_log_error statically writes listener_may_exit=0. Can you put
>>>> the actual value of listener_may_exit in the log message?
>>>
>>> I put it statically since that was the value in the "if ()" 6 lines above.
>>> It should have been visible then and there to work.
>>
>> The above if () does not always break.
>>
>> Cheers;
>> Yann.

Index: server/mpm/event/event.c
===================================================================
--- server/mpm/event/event.c	(revision 1891217)
+++ server/mpm/event/event.c	(working copy)
@@ -1940,15 +1940,6 @@ static void * APR_THREAD_FUNC listener_thread(apr_
         rc = apr_pollset_poll(event_pollset, timeout_interval, &num, &out_pfd);
         if (rc != APR_SUCCESS) {
             if (APR_STATUS_IS_EINTR(rc)) {
-                /* Woken up, if we are exiting or listeners are disabled we
-                 * must fall through to kill kept-alive connections or test
-                 * whether listeners should be re-enabled. Otherwise we only
-                 * need to update timeouts (logic is above, so simply restart
-                 * the loop).
-                 */
-                if (!listener_may_exit && !listeners_disabled()) {
-                    continue;
-                }
                 timeout_time = 0;
             }
             else if (!APR_STATUS_IS_TIMEUP(rc)) {
Re: child exit on self-proxy
The things I do for you...

[Wed Jul 07 15:48:52.591983 2021] [mpm_event:debug] [pid 48747:tid 4385197568] event.c(599): AH: wakeup_listener: start
[Wed Jul 07 15:48:52.592000 2021] [mpm_event:debug] [pid 48747:tid 4385197568] event.c(603): AH: wakeup_listener: apr_pollset_wakeup
[Wed Jul 07 15:48:52.592046 2021] [mpm_event:debug] [pid 48747:tid 4385197568] event.c(610): AH: wakeup_listener: ap_queue_info_term
[Wed Jul 07 15:48:52.592188 2021] [mpm_event:debug] [pid 48747:tid 4385197568] event.c(629): AH: wakeup_listener: pthread_kill
[Wed Jul 07 15:48:52.592166 2021] [mpm_event:debug] [pid 48747:tid 123145502474240] event.c(1985): (4)Interrupted system call: AH: pollset returned listener_may_exit=1 connection_count=2
[Wed Jul 07 15:48:52.592523 2021] [mpm_event:debug] [pid 48747:tid 4385197568] event.c(635): AH: wakeup_listener: end
[Wed Jul 07 15:48:52.592727 2021] [mpm_event:trace1] [pid 48747:tid 4385197568] event.c(2942): graceful termination received, join workers
[Wed Jul 07 15:48:52.592515 2021] [mpm_event:trace1] [pid 48747:tid 123145502474240] event.c(1809): All workers are busy or dying, will close 2 keep-alive connections
[Wed Jul 07 15:48:52.593381 2021] [mpm_event:debug] [pid 48747:tid 123145502474240] event.c(1985): AH: pollset returned listener_may_exit=1 connection_count=2
[Wed Jul 07 15:48:52.593478 2021] [mpm_event:debug] [pid 48747:tid 123145502474240] event.c(1985): (4)Interrupted system call: AH: pollset returned listener_may_exit=1 connection_count=1
[Wed Jul 07 15:48:57.552623 2021] [mpm_event:debug] [pid 48747:tid 123145502474240] event.c(1985): (70007)The timeout specified has expired: AH: pollset returned listener_may_exit=1 connection_count=1
[Wed Jul 07 15:48:57.552658 2021] [mpm_event:trace1] [pid 48747:tid 123145502474240] event.c(1809): All workers are busy or dying, will close 0 keep-alive connections

[attachment: event_debugv2.diff]

Btw. I opened a bottle of nice wine - this is entertaining. ;-)

- Stefan

> On 07.07.2021 at 17:48, Yann Ylavic wrote:
>
> On Wed, Jul 7, 2021 at 5:45 PM Stefan Eissing wrote:
>>
>>> On 07.07.2021 at 17:39, Ruediger Pluem wrote:
>>>
>>> Hm. The ap_log_error statically writes listener_may_exit=0. Can you put
>>> the actual value of listener_may_exit in the log message?
>>
>> I put it statically since that was the value in the "if ()" 6 lines above.
>> It should have been visible then and there to work.
>
> The above if () does not always break.
>
> Cheers;
> Yann.
Re: child exit on self-proxy
On Wed, Jul 7, 2021 at 5:45 PM Stefan Eissing wrote:
>
>> On 07.07.2021 at 17:39, Ruediger Pluem wrote:
>>
>> Hm. The ap_log_error statically writes listener_may_exit=0. Can you put
>> the actual value of listener_may_exit in the log message?
>
> I put it statically since that was the value in the "if ()" 6 lines above.
> It should have been visible then and there to work.

The above if () does not always break.

Cheers;
Yann.
Re: child exit on self-proxy
On Wed, Jul 7, 2021 at 5:39 PM Ruediger Pluem wrote:
>
> On 7/7/21 5:25 PM, Stefan Eissing wrote:
>> In order to reproduce the logs I see on this weird behaviour, I'll attach
>> the patch I made:
>>
>> With this, I see in a hanging process:
>>
>> [Wed Jul 07 15:20:23.014980 2021] [core:trace1] [pid 42471:tid 123145448968192] core_filters.c(429): (32)Broken pipe: [client 127.0.0.1:49824] core_output_filter: writing data to the network
>> [Wed Jul 07 15:20:23.044541 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(599): AH: wakeup_listener: start
>> [Wed Jul 07 15:20:23.044601 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(603): AH: wakeup_listener: apr_pollset_wakeup
>> [Wed Jul 07 15:20:23.044715 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(610): AH: wakeup_listener: ap_queue_info_term
>> [Wed Jul 07 15:20:23.044861 2021] [mpm_event:debug] [pid 42471:tid 123145461309440] event.c(1985): (4)Interrupted system call: AH: pollset returned listener_may_exit=0
>
> Hm. The ap_log_error statically writes listener_may_exit=0. Can you put
> the actual value of listener_may_exit in the log message?

It would be interesting to log apr_atomic_read32(&connection_count) too.
Re: child exit on self-proxy
> On 07.07.2021 at 17:39, Ruediger Pluem wrote:
>
> On 7/7/21 5:25 PM, Stefan Eissing wrote:
>> In order to reproduce the logs I see on this weird behaviour, I'll attach
>> the patch I made:
>>
>> With this, I see in a hanging process:
>>
>> [Wed Jul 07 15:20:23.014980 2021] [core:trace1] [pid 42471:tid 123145448968192] core_filters.c(429): (32)Broken pipe: [client 127.0.0.1:49824] core_output_filter: writing data to the network
>> [Wed Jul 07 15:20:23.044541 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(599): AH: wakeup_listener: start
>> [Wed Jul 07 15:20:23.044601 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(603): AH: wakeup_listener: apr_pollset_wakeup
>> [Wed Jul 07 15:20:23.044715 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(610): AH: wakeup_listener: ap_queue_info_term
>> [Wed Jul 07 15:20:23.044861 2021] [mpm_event:debug] [pid 42471:tid 123145461309440] event.c(1985): (4)Interrupted system call: AH: pollset returned listener_may_exit=0
>
> Hm. The ap_log_error statically writes listener_may_exit=0. Can you put
> the actual value of listener_may_exit in the log message?

I put it statically since that was the value in the "if ()" 6 lines above.
It should have been visible then and there to work.

>> [Wed Jul 07 15:20:23.044817 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(629): AH: wakeup_listener: pthread_kill
>> [Wed Jul 07 15:20:23.044954 2021] [mpm_event:trace1] [pid 42471:tid 123145461309440] event.c(1809): All workers are busy or dying, will close 1 keep-alive connections
>> [Wed Jul 07 15:20:23.044987 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(635): AH: wakeup_listener: end
>> [Wed Jul 07 15:20:23.045154 2021] [mpm_event:trace1] [pid 42471:tid 4525448704] event.c(2940): graceful termination received, join workers
>> [Wed Jul 07 15:20:25.927437 2021] [mpm_event:debug] [pid 42471:tid 123145461309440] event.c(1985): (70007)The timeout specified has expired: AH: pollset returned listener_may_exit=0
>
> Looks like we get back in the poll and have to wait for the timeout.
>
>> [Wed Jul 07 15:20:25.927468 2021] [mpm_event:trace1] [pid 42471:tid 123145461309440] event.c(1809): All workers are busy or dying, will close 0 keep-alive connections
>>
>> Which says (to my eyes) that the pollset_wakeup happened, but the listener
>> saw "listener_may_exit == 0". Is the volatile maybe not enough?
>
> Regards
>
> Rüdiger
Re: child exit on self-proxy
On 7/7/21 5:25 PM, Stefan Eissing wrote:
> In order to reproduce the logs I see on this weird behaviour, I'll attach
> the patch I made:
>
> With this, I see in a hanging process:
>
> [Wed Jul 07 15:20:23.014980 2021] [core:trace1] [pid 42471:tid 123145448968192] core_filters.c(429): (32)Broken pipe: [client 127.0.0.1:49824] core_output_filter: writing data to the network
> [Wed Jul 07 15:20:23.044541 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(599): AH: wakeup_listener: start
> [Wed Jul 07 15:20:23.044601 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(603): AH: wakeup_listener: apr_pollset_wakeup
> [Wed Jul 07 15:20:23.044715 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(610): AH: wakeup_listener: ap_queue_info_term
> [Wed Jul 07 15:20:23.044861 2021] [mpm_event:debug] [pid 42471:tid 123145461309440] event.c(1985): (4)Interrupted system call: AH: pollset returned listener_may_exit=0

Hm. The ap_log_error statically writes listener_may_exit=0. Can you put the
actual value of listener_may_exit in the log message?

> [Wed Jul 07 15:20:23.044817 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(629): AH: wakeup_listener: pthread_kill
> [Wed Jul 07 15:20:23.044954 2021] [mpm_event:trace1] [pid 42471:tid 123145461309440] event.c(1809): All workers are busy or dying, will close 1 keep-alive connections
> [Wed Jul 07 15:20:23.044987 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(635): AH: wakeup_listener: end
> [Wed Jul 07 15:20:23.045154 2021] [mpm_event:trace1] [pid 42471:tid 4525448704] event.c(2940): graceful termination received, join workers
> [Wed Jul 07 15:20:25.927437 2021] [mpm_event:debug] [pid 42471:tid 123145461309440] event.c(1985): (70007)The timeout specified has expired: AH: pollset returned listener_may_exit=0

Looks like we get back in the poll and have to wait for the timeout.

> [Wed Jul 07 15:20:25.927468 2021] [mpm_event:trace1] [pid 42471:tid 123145461309440] event.c(1809): All workers are busy or dying, will close 0 keep-alive connections
>
> Which says (to my eyes) that the pollset_wakeup happened, but the listener
> saw "listener_may_exit == 0". Is the volatile maybe not enough?

Regards

Rüdiger
Re: child exit on self-proxy
In order to reproduce the logs I see on this weird behaviour, I'll attach the
patch I made:

[attachment: event_debug.diff]

With this, I see in a hanging process:

[Wed Jul 07 15:20:23.014980 2021] [core:trace1] [pid 42471:tid 123145448968192] core_filters.c(429): (32)Broken pipe: [client 127.0.0.1:49824] core_output_filter: writing data to the network
[Wed Jul 07 15:20:23.044541 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(599): AH: wakeup_listener: start
[Wed Jul 07 15:20:23.044601 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(603): AH: wakeup_listener: apr_pollset_wakeup
[Wed Jul 07 15:20:23.044715 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(610): AH: wakeup_listener: ap_queue_info_term
[Wed Jul 07 15:20:23.044861 2021] [mpm_event:debug] [pid 42471:tid 123145461309440] event.c(1985): (4)Interrupted system call: AH: pollset returned listener_may_exit=0
[Wed Jul 07 15:20:23.044817 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(629): AH: wakeup_listener: pthread_kill
[Wed Jul 07 15:20:23.044954 2021] [mpm_event:trace1] [pid 42471:tid 123145461309440] event.c(1809): All workers are busy or dying, will close 1 keep-alive connections
[Wed Jul 07 15:20:23.044987 2021] [mpm_event:debug] [pid 42471:tid 4525448704] event.c(635): AH: wakeup_listener: end
[Wed Jul 07 15:20:23.045154 2021] [mpm_event:trace1] [pid 42471:tid 4525448704] event.c(2940): graceful termination received, join workers
[Wed Jul 07 15:20:25.927437 2021] [mpm_event:debug] [pid 42471:tid 123145461309440] event.c(1985): (70007)The timeout specified has expired: AH: pollset returned listener_may_exit=0
[Wed Jul 07 15:20:25.927468 2021] [mpm_event:trace1] [pid 42471:tid 123145461309440] event.c(1809): All workers are busy or dying, will close 0 keep-alive connections

Which says (to my eyes) that the pollset_wakeup happened, but the listener
saw "listener_may_exit == 0". Is the volatile maybe not enough?

> On 07.07.2021 at 16:56, Stefan Eissing wrote:
>
>> On 07.07.2021 at 13:51, Yann Ylavic wrote:
>>
>> On Wed, Jul 7, 2021 at 12:16 PM Stefan Eissing wrote:
>>>
>>> I added a TRACE1 log in event.c before/after join_workers (line 2921) and see:
>>>
>>> [Wed Jul 07 10:06:03.144044 2021] [mpm_event:trace1] [pid 72886:tid 4493635072] event.c(2921): graceful termination received, join workers
>>> [Wed Jul 07 10:06:03.144213 2021] [mpm_event:trace1] [pid 72886:tid 123145435672576] event.c(1799): All workers are busy or dying, will close 1 keep-alive connections
>>> [Wed Jul 07 10:06:08.079690 2021] [mpm_event:trace1] [pid 72886:tid 123145435672576] event.c(1799): All workers are busy or dying, will close 0 keep-alive connections
>>> [Wed Jul 07 10:06:10.78 2021] [core:warn] [pid 72813:tid 4493635072] AH00045: child process 72886 still did not exit, sending a SIGTERM
>>> [Wed Jul 07 10:06:12.789335 2021] [core:warn] [pid 72813:tid 4493635072] AH00045: child process 72886 still did not exit, sending a SIGTERM
>>> [Wed Jul 07 10:06:14.791281 2021] [core:warn] [pid 72813:tid 4493635072] AH00045: child process 72886 still did not exit, sending a SIGTERM
>>> [Wed Jul 07 10:06:16.792983 2021] [core:error] [pid 72813:tid 4493635072] AH00046: child process 72886 still did not exit, sending a SIGKILL
>>>
>>> So, I assume the keep-alive connection is the mod_proxy_http connection to
>>> the server itself. Since the join_workers() never returns, there seems to
>>> be a thread not finishing.
>>
>> No, MPM event doesn't handle mod_proxy kept alive connections, those
>> are client connections.
>>
>>> On another run, I got a stacktrace of the child:
>>>
>>> Call graph:
>>>     264 Thread_36766542   DispatchQueue_1: com.apple.main-thread  (serial)
>>>     + 264 start  (in libdyld.dylib) + 1  [0x7fff2032cf5d]
>>>     +   264 main  (in httpd) + 2278  [0x104729b86]  main.c:862
>>>     +     264 ap_run_mpm  (in httpd) + 75  [0x10473917b]  mpm_common.c:100
>>>     +       264 event_run  (in mod_mpm_event.so) + 2994  [0x10502ae62]  event.c:3398
>>>     +         264 make_child  (in mod_mpm_event.so) + 436  [0x10502bbc4]  event.c:2997
>>>     +           264 child_main  (in mod_mpm_event.so) + 1734  [0x10502c2e6]  event.c:2924
>>>     +             264 join_workers  (in mod_mpm_event.so) + 386  [0x10502cc72]  event.c:2717
>>>     +               264 apr_thread_join  (in libapr-2.0.dylib) + 44  [0x1048b347c]  thread.c:256
>>>     +                 264 _pthread_join  (in libsystem_pthread.dylib) + 362  [0x7fff20312f60]
>>>     +                   264 __ulock_wait  (in libsystem_kernel.dylib) + 10  [0x7fff202dd9ee]
>>
>> OK, that's where the child waits for worker threads.
>>
>>>     264 Thread_36766548
>>>     + 264 thread_start  (in libsystem_pthread.dylib) + 15  [0x7fff2030d443]
>>>     +   264 _pthread_start  (in libsystem_pthread.dylib) + 224  [0x7fff203118fc]
>>>     +     264 dummy_worker  (in libapr-2.0.dylib) + 30  [0x1048b33ee]  thread.c:148
>>>     +       264 slot_run  (in mod_http2.so) + 189  [0x104f99a8d]  h2_workers.c:260
>>>     +         264 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1298  [0x7fff20311e49]
>>>     +           264 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff202decde]
>>>     264 Thread_36766549
>>>     + 264 thread_start  (in libsystem_pthread.dylib) + 15  [0x7fff2030d443]
>>>     +   264 _pthread_start  (in libsystem_pthread.dylib) + 224  [0x7fff203118fc]
>>>     +     264 dummy_worker  (in libapr-2.0.dylib) + 30  [0x1048b33ee]  thread.c:148
>>>     +       264 slot_run  (in mod_http2.so) + 189  [0x104f99a8d]  h2_workers.c:260
>>>     +         264 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1298  [0x7fff20311e49]
>>>     +           264 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff202decde]
>>>     264 Thread_36766550
>>>     + 264 thread_start  (in libsystem_pthread.dylib) + 15  [0x7fff2030d443]
>>>     +   264 _pthread_start  (in libsystem_pthread.dylib) + 224  [0x7fff203118fc]
>>>     +     264 dummy_worker  (in l
Re: child exit on self-proxy
> On 07.07.2021 at 13:51, Yann Ylavic wrote:
>
> On Wed, Jul 7, 2021 at 12:16 PM Stefan Eissing wrote:
>>
>> I added a TRACE1 log in event.c before/after join_workers (line 2921) and see:
>>
>> [Wed Jul 07 10:06:03.144044 2021] [mpm_event:trace1] [pid 72886:tid 4493635072] event.c(2921): graceful termination received, join workers
>> [Wed Jul 07 10:06:03.144213 2021] [mpm_event:trace1] [pid 72886:tid 123145435672576] event.c(1799): All workers are busy or dying, will close 1 keep-alive connections
>> [Wed Jul 07 10:06:08.079690 2021] [mpm_event:trace1] [pid 72886:tid 123145435672576] event.c(1799): All workers are busy or dying, will close 0 keep-alive connections
>> [Wed Jul 07 10:06:10.78 2021] [core:warn] [pid 72813:tid 4493635072] AH00045: child process 72886 still did not exit, sending a SIGTERM
>> [Wed Jul 07 10:06:12.789335 2021] [core:warn] [pid 72813:tid 4493635072] AH00045: child process 72886 still did not exit, sending a SIGTERM
>> [Wed Jul 07 10:06:14.791281 2021] [core:warn] [pid 72813:tid 4493635072] AH00045: child process 72886 still did not exit, sending a SIGTERM
>> [Wed Jul 07 10:06:16.792983 2021] [core:error] [pid 72813:tid 4493635072] AH00046: child process 72886 still did not exit, sending a SIGKILL
>>
>> So, I assume the keep-alive connection is the mod_proxy_http connection to
>> the server itself. Since the join_workers() never returns, there seems to
>> be a thread not finishing.
>
> No, MPM event doesn't handle mod_proxy kept alive connections, those
> are client connections.
>
>> On another run, I got a stacktrace of the child:
>>
>> Call graph:
>>     264 Thread_36766542   DispatchQueue_1: com.apple.main-thread  (serial)
>>     + 264 start  (in libdyld.dylib) + 1  [0x7fff2032cf5d]
>>     +   264 main  (in httpd) + 2278  [0x104729b86]  main.c:862
>>     +     264 ap_run_mpm  (in httpd) + 75  [0x10473917b]  mpm_common.c:100
>>     +       264 event_run  (in mod_mpm_event.so) + 2994  [0x10502ae62]  event.c:3398
>>     +         264 make_child  (in mod_mpm_event.so) + 436  [0x10502bbc4]  event.c:2997
>>     +           264 child_main  (in mod_mpm_event.so) + 1734  [0x10502c2e6]  event.c:2924
>>     +             264 join_workers  (in mod_mpm_event.so) + 386  [0x10502cc72]  event.c:2717
>>     +               264 apr_thread_join  (in libapr-2.0.dylib) + 44  [0x1048b347c]  thread.c:256
>>     +                 264 _pthread_join  (in libsystem_pthread.dylib) + 362  [0x7fff20312f60]
>>     +                   264 __ulock_wait  (in libsystem_kernel.dylib) + 10  [0x7fff202dd9ee]
>
> OK, that's where the child waits for worker threads.
>
>>     264 Thread_36766548
>>     + 264 thread_start  (in libsystem_pthread.dylib) + 15  [0x7fff2030d443]
>>     +   264 _pthread_start  (in libsystem_pthread.dylib) + 224  [0x7fff203118fc]
>>     +     264 dummy_worker  (in libapr-2.0.dylib) + 30  [0x1048b33ee]  thread.c:148
>>     +       264 slot_run  (in mod_http2.so) + 189  [0x104f99a8d]  h2_workers.c:260
>>     +         264 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1298  [0x7fff20311e49]
>>     +           264 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff202decde]
>>     264 Thread_36766549
>>     + 264 thread_start  (in libsystem_pthread.dylib) + 15  [0x7fff2030d443]
>>     +   264 _pthread_start  (in libsystem_pthread.dylib) + 224  [0x7fff203118fc]
>>     +     264 dummy_worker  (in libapr-2.0.dylib) + 30  [0x1048b33ee]  thread.c:148
>>     +       264 slot_run  (in mod_http2.so) + 189  [0x104f99a8d]  h2_workers.c:260
>>     +         264 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1298  [0x7fff20311e49]
>>     +           264 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff202decde]
>>     264 Thread_36766550
>>     + 264 thread_start  (in libsystem_pthread.dylib) + 15  [0x7fff2030d443]
>>     +   264 _pthread_start  (in libsystem_pthread.dylib) + 224  [0x7fff203118fc]
>>     +     264 dummy_worker  (in libapr-2.0.dylib) + 30  [0x1048b33ee]  thread.c:148
>>     +       264 slot_run  (in mod_http2.so) + 189  [0x104f99a8d]  h2_workers.c:260
>>     +         264 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1298  [0x7fff20311e49]
>>     +           264 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff202decde]
>>     264 Thread_36766551
>>     + 264 thread_start  (in libsystem_pthread.dylib) + 15  [0x7fff2030d443]
>>     +   264 _pthread_start  (in libsystem_pthread.dylib) + 224  [0x7fff203118fc]
>>     +     264 dummy_worker  (in libapr-2.0.dylib) + 30  [0x1048b33ee]  thread.c:148
>>     +       264 slot_run  (in mod_http2.so) + 189  [0x104f99a8d]  h2_workers.c:260
>>     +         264 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1298  [0x7fff20311e49]
>>     +           264 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff202decde]
>
> The above mod_http2 threads are not accounted for by the MPM
Re: child exit on self-proxy
Was busy on other things. Will try to get answers...

> On 07.07.2021 at 13:51, Yann Ylavic wrote:
>
> On Wed, Jul 7, 2021 at 12:16 PM Stefan Eissing wrote:
>>
>> I added a TRACE1 log in event.c before/after join_workers (line 2921) and see:
>>
>> [Wed Jul 07 10:06:03.144044 2021] [mpm_event:trace1] [pid 72886:tid 4493635072] event.c(2921): graceful termination received, join workers
>> [Wed Jul 07 10:06:03.144213 2021] [mpm_event:trace1] [pid 72886:tid 123145435672576] event.c(1799): All workers are busy or dying, will close 1 keep-alive connections
>> [Wed Jul 07 10:06:08.079690 2021] [mpm_event:trace1] [pid 72886:tid 123145435672576] event.c(1799): All workers are busy or dying, will close 0 keep-alive connections
>> [Wed Jul 07 10:06:10.78 2021] [core:warn] [pid 72813:tid 4493635072] AH00045: child process 72886 still did not exit, sending a SIGTERM
>> [Wed Jul 07 10:06:12.789335 2021] [core:warn] [pid 72813:tid 4493635072] AH00045: child process 72886 still did not exit, sending a SIGTERM
>> [Wed Jul 07 10:06:14.791281 2021] [core:warn] [pid 72813:tid 4493635072] AH00045: child process 72886 still did not exit, sending a SIGTERM
>> [Wed Jul 07 10:06:16.792983 2021] [core:error] [pid 72813:tid 4493635072] AH00046: child process 72886 still did not exit, sending a SIGKILL
>>
>> So, I assume the keep-alive connection is the mod_proxy_http connection to
>> the server itself. Since the join_workers() never returns, there seems to
>> be a thread not finishing.
>
> No, MPM event doesn't handle mod_proxy kept alive connections, those
> are client connections.
>
>> On another run, I got a stacktrace of the child:
>>
>> Call graph:
>>     264 Thread_36766542   DispatchQueue_1: com.apple.main-thread  (serial)
>>     + 264 start  (in libdyld.dylib) + 1  [0x7fff2032cf5d]
>>     +   264 main  (in httpd) + 2278  [0x104729b86]  main.c:862
>>     +     264 ap_run_mpm  (in httpd) + 75  [0x10473917b]  mpm_common.c:100
>>     +       264 event_run  (in mod_mpm_event.so) + 2994  [0x10502ae62]  event.c:3398
>>     +         264 make_child  (in mod_mpm_event.so) + 436  [0x10502bbc4]  event.c:2997
>>     +           264 child_main  (in mod_mpm_event.so) + 1734  [0x10502c2e6]  event.c:2924
>>     +             264 join_workers  (in mod_mpm_event.so) + 386  [0x10502cc72]  event.c:2717
>>     +               264 apr_thread_join  (in libapr-2.0.dylib) + 44  [0x1048b347c]  thread.c:256
>>     +                 264 _pthread_join  (in libsystem_pthread.dylib) + 362  [0x7fff20312f60]
>>     +                   264 __ulock_wait  (in libsystem_kernel.dylib) + 10  [0x7fff202dd9ee]
>
> OK, that's where the child waits for worker threads.
>
>>     264 Thread_36766548
>>     + 264 thread_start  (in libsystem_pthread.dylib) + 15  [0x7fff2030d443]
>>     +   264 _pthread_start  (in libsystem_pthread.dylib) + 224  [0x7fff203118fc]
>>     +     264 dummy_worker  (in libapr-2.0.dylib) + 30  [0x1048b33ee]  thread.c:148
>>     +       264 slot_run  (in mod_http2.so) + 189  [0x104f99a8d]  h2_workers.c:260
>>     +         264 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1298  [0x7fff20311e49]
>>     +           264 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff202decde]
>>     264 Thread_36766549
>>     + 264 thread_start  (in libsystem_pthread.dylib) + 15  [0x7fff2030d443]
>>     +   264 _pthread_start  (in libsystem_pthread.dylib) + 224  [0x7fff203118fc]
>>     +     264 dummy_worker  (in libapr-2.0.dylib) + 30  [0x1048b33ee]  thread.c:148
>>     +       264 slot_run  (in mod_http2.so) + 189  [0x104f99a8d]  h2_workers.c:260
>>     +         264 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1298  [0x7fff20311e49]
>>     +           264 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff202decde]
>>     264 Thread_36766550
>>     + 264 thread_start  (in libsystem_pthread.dylib) + 15  [0x7fff2030d443]
>>     +   264 _pthread_start  (in libsystem_pthread.dylib) + 224  [0x7fff203118fc]
>>     +     264 dummy_worker  (in libapr-2.0.dylib) + 30  [0x1048b33ee]  thread.c:148
>>     +       264 slot_run  (in mod_http2.so) + 189  [0x104f99a8d]  h2_workers.c:260
>>     +         264 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1298  [0x7fff20311e49]
>>     +           264 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff202decde]
>>     264 Thread_36766551
>>     + 264 thread_start  (in libsystem_pthread.dylib) + 15  [0x7fff2030d443]
>>     +   264 _pthread_start  (in libsystem_pthread.dylib) + 224  [0x7fff203118fc]
>>     +     264 dummy_worker  (in libapr-2.0.dylib) + 30  [0x1048b33ee]  thread.c:148
>>     +       264 slot_run  (in mod_http2.so) + 189  [0x104f99a8d]  h2_workers.c:260
>>     +         264 _pthread_cond_wait  (in libsystem_pthread.dylib) + 1298  [0x7fff20311e49]
>>     +           264 __psynch_cvwait  (in libsystem_kernel.dylib) + 10  [0x7fff202decde]
>
> The above
Re: migration of the HTTPD project website
On 07/07/2021 14.14, Eric Covener wrote:
> https://bz.apache.org/bugzilla/show_bug.cgi?id=65439
>
> whoops, looks like the manual links for the following are not working:
> https://httpd.apache.org/mod_ftp/
> https://httpd.apache.org/mod_cgid/
>
> Daniel, do you recall how this worked before? Maybe we lost some of this
> extra content when you moved where httpd.apache.org points?

I know where it originally pointed to, and I'll try to weave that into the
new site checkout. Should apply within 30 minutes.

> On Mon, Jul 5, 2021 at 4:15 AM Yann Ylavic wrote:
>> On Mon, Jul 5, 2021 at 9:56 AM Ruediger Pluem wrote:
>>>
>>> I don't think that we want to release them. I am fine with omitting trunk
>>> from the path. Does anyone object to
>>>
>>> svn mv https://svn.apache.org/repos/asf/httpd/site/trunk/tools https://svn.apache.org/repos/asf/httpd/dev-tools/
>>>
>>> ?
>>
>> +1
>>
>> Regards;
>> Yann.
>
> --
> Eric Covener
> cove...@gmail.com
Re: migration of the HTTPD project website
https://bz.apache.org/bugzilla/show_bug.cgi?id=65439

whoops, looks like the manual links for the following are not working:

https://httpd.apache.org/mod_ftp/
https://httpd.apache.org/mod_cgid/

Daniel, do you recall how this worked before? Maybe we lost some of this extra content when you moved where httpd.apache.org points?

On Mon, Jul 5, 2021 at 4:15 AM Yann Ylavic wrote:
>
> On Mon, Jul 5, 2021 at 9:56 AM Ruediger Pluem wrote:
> >
> > I don't think that we want to release them. I am fine with omitting trunk
> > from the path. Anyone objected to
> >
> > svn mv https://svn.apache.org/repos/asf/httpd/site/trunk/tools
> > https://svn.apache.org/repos/asf/httpd/dev-tools/
> >
> > ?
>
> +1
>
> Regards,
> Yann.

--
Eric Covener
cove...@gmail.com
Re: child exit on self-proxy
On Wed, Jul 7, 2021 at 12:16 PM Stefan Eissing wrote:
>
> I added a TRACE1 log in event.c before/after join_workers (line 2921) and see:
>
> [Wed Jul 07 10:06:03.144044 2021] [mpm_event:trace1] [pid 72886:tid 4493635072] event.c(2921): graceful termination received, join workers
> [Wed Jul 07 10:06:03.144213 2021] [mpm_event:trace1] [pid 72886:tid 123145435672576] event.c(1799): All workers are busy or dying, will close 1 keep-alive connections
> [Wed Jul 07 10:06:08.079690 2021] [mpm_event:trace1] [pid 72886:tid 123145435672576] event.c(1799): All workers are busy or dying, will close 0 keep-alive connections
> [Wed Jul 07 10:06:10.78 2021] [core:warn] [pid 72813:tid 4493635072] AH00045: child process 72886 still did not exit, sending a SIGTERM
> [Wed Jul 07 10:06:12.789335 2021] [core:warn] [pid 72813:tid 4493635072] AH00045: child process 72886 still did not exit, sending a SIGTERM
> [Wed Jul 07 10:06:14.791281 2021] [core:warn] [pid 72813:tid 4493635072] AH00045: child process 72886 still did not exit, sending a SIGTERM
> [Wed Jul 07 10:06:16.792983 2021] [core:error] [pid 72813:tid 4493635072] AH00046: child process 72886 still did not exit, sending a SIGKILL
>
> So, I assume the keep-alive connection is the mod_proxy_http connection to the server itself. Since join_workers() never returns, there seems to be a thread not finishing.

No, MPM event doesn't handle mod_proxy kept-alive connections; those are client connections.
> On another run, I got a stacktrace of the child:
>
> Call graph:
> 264 Thread_36766542 DispatchQueue_1: com.apple.main-thread (serial)
> + 264 start (in libdyld.dylib) + 1 [0x7fff2032cf5d]
> + 264 main (in httpd) + 2278 [0x104729b86] main.c:862
> + 264 ap_run_mpm (in httpd) + 75 [0x10473917b] mpm_common.c:100
> + 264 event_run (in mod_mpm_event.so) + 2994 [0x10502ae62] event.c:3398
> + 264 make_child (in mod_mpm_event.so) + 436 [0x10502bbc4] event.c:2997
> + 264 child_main (in mod_mpm_event.so) + 1734 [0x10502c2e6] event.c:2924
> + 264 join_workers (in mod_mpm_event.so) + 386 [0x10502cc72] event.c:2717
> + 264 apr_thread_join (in libapr-2.0.dylib) + 44 [0x1048b347c] thread.c:256
> + 264 _pthread_join (in libsystem_pthread.dylib) + 362 [0x7fff20312f60]
> + 264 __ulock_wait (in libsystem_kernel.dylib) + 10 [0x7fff202dd9ee]

OK, that's where the child waits for worker threads.

> 264 Thread_36766548
> + 264 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff2030d443]
> + 264 _pthread_start (in libsystem_pthread.dylib) + 224 [0x7fff203118fc]
> + 264 dummy_worker (in libapr-2.0.dylib) + 30 [0x1048b33ee] thread.c:148
> + 264 slot_run (in mod_http2.so) + 189 [0x104f99a8d] h2_workers.c:260
> + 264 _pthread_cond_wait (in libsystem_pthread.dylib) + 1298 [0x7fff20311e49]
> + 264 __psynch_cvwait (in libsystem_kernel.dylib) + 10 [0x7fff202decde]
> 264 Thread_36766549
> + 264 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff2030d443]
> + 264 _pthread_start (in libsystem_pthread.dylib) + 224 [0x7fff203118fc]
> + 264 dummy_worker (in libapr-2.0.dylib) + 30 [0x1048b33ee] thread.c:148
> + 264 slot_run (in mod_http2.so) + 189 [0x104f99a8d] h2_workers.c:260
> + 264 _pthread_cond_wait (in libsystem_pthread.dylib) + 1298 [0x7fff20311e49]
> + 264 __psynch_cvwait (in libsystem_kernel.dylib) + 10 [0x7fff202decde]
> 264 Thread_36766550
> + 264 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff2030d443]
> + 264 _pthread_start (in libsystem_pthread.dylib) + 224 [0x7fff203118fc]
> + 264 dummy_worker (in libapr-2.0.dylib) + 30 [0x1048b33ee] thread.c:148
> + 264 slot_run (in mod_http2.so) + 189 [0x104f99a8d] h2_workers.c:260
> + 264 _pthread_cond_wait (in libsystem_pthread.dylib) + 1298 [0x7fff20311e49]
> + 264 __psynch_cvwait (in libsystem_kernel.dylib) + 10 [0x7fff202decde]
> 264 Thread_36766551
> + 264 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff2030d443]
> + 264 _pthread_start (in libsystem_pthread.dylib) + 224 [0x7fff203118fc]
> + 264 dummy_worker (in libapr-2.0.dylib) + 30 [0x1048b33ee] thread.c:148
> + 264 slot_run (in mod_http2.so) + 189 [0x104f99a8d] h2_workers.c:260
> + 264 _pthread_cond_wait (in libsystem_pthread.dylib) + 1298 [0x7fff20311e49]
> + 264 __psynch_cvwait (in libsystem_kernel.dylib) + 10 [0x7fff202decde]

The above mod_http2 threads are not accounted for by the MPM I suppose (they should be stopped by the future child_exit hook, but still the MPM will only care about its own worker threads, i.e. those
Re: child exit on self-proxy
> Am 07.07.2021 um 11:49 schrieb Ruediger Pluem:
>
> On 7/7/21 11:45 AM, Stefan Eissing wrote:
>> In my h2 test suite, I do a setup where I use proxy configs against the
>> server itself. We seem to have a problem performing a clean child exit
>> with that. Tested on 2.4.48 and trunk:
>> - run tests with several graceful restarts
>> - no proxied request, clean exit
>> - with proxied requests
>>   AH00045: child process 53682 still did not exit, sending a SIGTERM
>>   AH00045: child process 53682 still did not exit, sending a SIGTERM
>>   [often stops here, sometimes]
>>   ...
>>   AH00046: child process 53682 still did not exit, sending a SIGKILL
>>
>> Questions:
>> - is such a test setup doomed to fail in general?
>> - are we sure that we cannot encounter such states in "normal" setups?
>>
>> If someone wants a log at whatever LogLevels, let me know. It seems
>> highly reproducible.
>
> Do you have stack traces where these processes are hanging and a simple
> config that causes this?

I added a TRACE1 log in event.c before/after join_workers (line 2921) and see:

[Wed Jul 07 10:06:03.144044 2021] [mpm_event:trace1] [pid 72886:tid 4493635072] event.c(2921): graceful termination received, join workers
[Wed Jul 07 10:06:03.144213 2021] [mpm_event:trace1] [pid 72886:tid 123145435672576] event.c(1799): All workers are busy or dying, will close 1 keep-alive connections
[Wed Jul 07 10:06:08.079690 2021] [mpm_event:trace1] [pid 72886:tid 123145435672576] event.c(1799): All workers are busy or dying, will close 0 keep-alive connections
[Wed Jul 07 10:06:10.78 2021] [core:warn] [pid 72813:tid 4493635072] AH00045: child process 72886 still did not exit, sending a SIGTERM
[Wed Jul 07 10:06:12.789335 2021] [core:warn] [pid 72813:tid 4493635072] AH00045: child process 72886 still did not exit, sending a SIGTERM
[Wed Jul 07 10:06:14.791281 2021] [core:warn] [pid 72813:tid 4493635072] AH00045: child process 72886 still did not exit, sending a SIGTERM
[Wed Jul 07 10:06:16.792983 2021] [core:error] [pid 72813:tid 4493635072] AH00046: child process 72886 still did not exit, sending a SIGKILL

So, I assume the keep-alive connection is the mod_proxy_http connection to the server itself. Since join_workers() never returns, there seems to be a thread not finishing.

On another run, I got a stacktrace of the child:

Call graph:
264 Thread_36766542 DispatchQueue_1: com.apple.main-thread (serial)
+ 264 start (in libdyld.dylib) + 1 [0x7fff2032cf5d]
+ 264 main (in httpd) + 2278 [0x104729b86] main.c:862
+ 264 ap_run_mpm (in httpd) + 75 [0x10473917b] mpm_common.c:100
+ 264 event_run (in mod_mpm_event.so) + 2994 [0x10502ae62] event.c:3398
+ 264 make_child (in mod_mpm_event.so) + 436 [0x10502bbc4] event.c:2997
+ 264 child_main (in mod_mpm_event.so) + 1734 [0x10502c2e6] event.c:2924
+ 264 join_workers (in mod_mpm_event.so) + 386 [0x10502cc72] event.c:2717
+ 264 apr_thread_join (in libapr-2.0.dylib) + 44 [0x1048b347c] thread.c:256
+ 264 _pthread_join (in libsystem_pthread.dylib) + 362 [0x7fff20312f60]
+ 264 __ulock_wait (in libsystem_kernel.dylib) + 10 [0x7fff202dd9ee]
264 Thread_36766548
+ 264 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff2030d443]
+ 264 _pthread_start (in libsystem_pthread.dylib) + 224 [0x7fff203118fc]
+ 264 dummy_worker (in libapr-2.0.dylib) + 30 [0x1048b33ee] thread.c:148
+ 264 slot_run (in mod_http2.so) + 189 [0x104f99a8d] h2_workers.c:260
+ 264 _pthread_cond_wait (in libsystem_pthread.dylib) + 1298 [0x7fff20311e49]
+ 264 __psynch_cvwait (in libsystem_kernel.dylib) + 10 [0x7fff202decde]
264 Thread_36766549
+ 264 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff2030d443]
+ 264 _pthread_start (in libsystem_pthread.dylib) + 224 [0x7fff203118fc]
+ 264 dummy_worker (in libapr-2.0.dylib) + 30 [0x1048b33ee] thread.c:148
+ 264 slot_run (in mod_http2.so) + 189 [0x104f99a8d] h2_workers.c:260
+ 264 _pthread_cond_wait (in libsystem_pthread.dylib) + 1298 [0x7fff20311e49]
+ 264 __psynch_cvwait (in libsystem_kernel.dylib) + 10 [0x7fff202decde]
264 Thread_36766550
+ 264 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff2030d443]
+ 264 _pthread_start (in libsystem_pthread.dylib) + 224 [0x7fff203118fc]
+ 264 dummy_worker (in libapr-2.0.dylib) + 30 [0x1048b33ee] thread.c:148
+ 264 slot_run (in mod_http2.so) + 189 [0x104f99a8d] h2_workers.c:260
+ 264 _pthread_cond_wait (in libsystem_pthread.dylib) + 1298 [0x7fff20311e49]
+ 264 __psynch_cvwait (in libsystem_kernel.dylib) + 10 [0x7fff202decde]
264 Thread_36766551
+ 264 thread_start (in libsystem_pthread.dylib) + 15 [0x7fff2030d443]
+
Re: child exit on self-proxy
On 7/7/21 11:45 AM, Stefan Eissing wrote:
> In my h2 test suite, I do a setup where I use proxy configs against the
> server itself. We seem to have a problem performing a clean child exit
> with that. Tested on 2.4.48 and trunk:
> - run tests with several graceful restarts
> - no proxied request, clean exit
> - with proxied requests
>   AH00045: child process 53682 still did not exit, sending a SIGTERM
>   AH00045: child process 53682 still did not exit, sending a SIGTERM
>   [often stops here, sometimes]
>   ...
>   AH00046: child process 53682 still did not exit, sending a SIGKILL
>
> Questions:
> - is such a test setup doomed to fail in general?
> - are we sure that we cannot encounter such states in "normal" setups?
>
> If someone wants a log at whatever LogLevels, let me know. It seems
> highly reproducible.

Do you have stack traces where these processes are hanging and a simple config that causes this?

Regards

Rüdiger
child exit on self-proxy
In my h2 test suite, I do a setup where I use proxy configs against the server itself. We seem to have a problem performing a clean child exit with that. Tested on 2.4.48 and trunk:

- run tests with several graceful restarts
- no proxied request: clean exit
- with proxied requests:
  AH00045: child process 53682 still did not exit, sending a SIGTERM
  AH00045: child process 53682 still did not exit, sending a SIGTERM
  [often stops here, sometimes]
  ...
  AH00046: child process 53682 still did not exit, sending a SIGKILL

Questions:
- is such a test setup doomed to fail in general?
- are we sure that we cannot encounter such states in "normal" setups?

If someone wants a log at whatever LogLevels, let me know. It seems highly reproducible.

- Stefan
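For readers who want to reproduce this, a minimal self-proxy setup of the kind described above might look like the following sketch. This is a hypothetical illustration (port, paths, and ServerName are assumptions, not the actual test-suite config):

```
# Hypothetical minimal self-proxy config: the server proxies
# requests under /proxy/ back to itself on the same listener,
# so mod_proxy_http opens client connections to this same child.
Listen 8080

LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so

<VirtualHost *:8080>
    ServerName localhost

    ProxyPass        "/proxy/" "http://localhost:8080/"
    ProxyPassReverse "/proxy/" "http://localhost:8080/"
</VirtualHost>
```

With a config like this, a graceful restart (`apachectl graceful`) while proxied requests are in flight should exercise the child-exit path discussed in this thread.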
Re: svn commit: r1891321 - in /httpd/test/framework/trunk/t: conf/proxy.conf.in modules/proxy_balancer.t
On 07/07/2021 09:00, Ruediger Pluem wrote:
> On 7/6/21 6:56 PM, jfcl...@apache.org wrote:
>> Author: jfclere
>> Date: Tue Jul 6 16:56:47 2021
>> New Revision: 1891321
>>
>> URL: http://svn.apache.org/viewvc?rev=1891321&view=rev
>> Log:
>> Add tests for dynamic part of mod_proxy_balancer.
>>
>> [...]
>>
>> +# First try with the referer it should fail.
>
> s/with/without/ ?

Oops, fixed, thanks.

--
Cheers
Jean-Frederic
Re: svn commit: r1891321 - in /httpd/test/framework/trunk/t: conf/proxy.conf.in modules/proxy_balancer.t
On 7/6/21 6:56 PM, jfcl...@apache.org wrote: > Author: jfclere > Date: Tue Jul 6 16:56:47 2021 > New Revision: 1891321 > > URL: http://svn.apache.org/viewvc?rev=1891321&view=rev > Log: > Add tests for dynamic part of mod_proxy_balancer. > > Modified: > httpd/test/framework/trunk/t/conf/proxy.conf.in > httpd/test/framework/trunk/t/modules/proxy_balancer.t > > Modified: httpd/test/framework/trunk/t/modules/proxy_balancer.t > URL: > http://svn.apache.org/viewvc/httpd/test/framework/trunk/t/modules/proxy_balancer.t?rev=1891321&r1=1891320&r2=1891321&view=diff > == > --- httpd/test/framework/trunk/t/modules/proxy_balancer.t (original) > +++ httpd/test/framework/trunk/t/modules/proxy_balancer.t Tue Jul 6 16:56:47 > 2021 > @@ -54,4 +87,39 @@ foreach my $t (@echos) { > skip $skipbodyfailover, t_cmp($r->content, $t, "response body echoed"); > } > > +# test dynamic part > +$r = GET("/balancer-manager"); > +ok t_cmp($r->code, 200, "Can't find balancer-manager"); > + > +# get the nonce and add a worker > +my $result = GetNonce("/balancer-manager", "dynproxy"); > + > +my $query = > "b_lbm=byrequests&b_tmo=0&b_max=0&b_sforce=0&b_ss=&b_nwrkr=ajp%3A%2F%2F%5B0%3A0%3A0%3A0%3A0%3A0%3A0%3A1%5D%3A8080&b_wyes=1&b=dynproxy&nonce=" > . $result; > +my @proxy_balancer_headers; > +my $vars = Apache::Test::vars(); > +push @proxy_balancer_headers, "Referer" => "http://"; . $vars->{servername} . > ":" . $vars->{port} . "/balancer-manager"; > + > +# First try with the referer it should fail. s/with/without/ ? > +if (have_min_apache_version("2.4.41")) { > + $r = POST("/balancer-manager", content => $query); > + ok t_cmp($r->code, 200, "request failed"); > + ok !t_cmp($r->content, qr/ajp/, "AJP worker created"); > +} > Regards Rüdiger