[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627

Graham Leggett changed:

What|Removed |Added
Resolution|--- |FIXED
Status|NEW |RESOLVED

--- Comment #19 from Graham Leggett ---
Backported to 2.4.52.

--
You are receiving this mail because:
You are the assignee for the bug.
-
To unsubscribe, e-mail: bugs-unsubscr...@httpd.apache.org
For additional commands, e-mail: bugs-h...@httpd.apache.org
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627 Jean Traullé changed: What|Removed |Added CC||jtrau...@opencomp.fr
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627 Arkadiusz Miskiewicz changed: What|Removed |Added CC||ar...@maven.pl
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627

--- Comment #18 from Yann Ylavic ---
(In reply to acmondor from comment #16)
>
> From that, the reason that mod_itk is not in the stack trace is because its
> hook has run its course and returned to prefork_run in mpm-prefork.

Ah indeed, I misread the mpm_itk code and thought that ap_lingering_close() was also called explicitly in the parent process.

Thanks for testing, I'll propose a backport to 2.4.x ASAP.
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627 --- Comment #17 from acmondor --- Yann's patch works fine with httpd 2.4.51 on Gentoo.
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627

--- Comment #16 from acmondor ---
Created attachment 38065
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=38065=edit
error log with additional info from mpm_itk

Out of curiosity I added some additional error log statements to mod_itk to answer the question "why stack trace does not seem to involve mod_itk". The result is shown in the attached log file. The lines with 'itk_fork_process:' are from the itk_fork_process function shown above in comment #3; their locations should be obvious.

From that, the reason that mod_itk is not in the stack trace is because its hook has run its course and returned to prefork_run in mpm-prefork.
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627

--- Comment #15 from Ruediger Pluem ---
(In reply to Yann Ylavic from comment #10)
> I checked r1894171 in trunk but it would be useful to hear from Jean in
> comment 2.
>
> Jean, your stack trace does not seem to involve mod_itk but simply
> mod_prefork, am I correct? Do you know which other module sets the socket to
> NULL?

This remains somewhat of a mystery. Hence the same data as provided for the mpm-itk case would be very helpful.

> The fix in trunk allows for the socket to be NULL in ap_lingering_close()
> but not in ap_start_lingering_close() which is called by mpm_event only
> (supposedly), so there I added an ap_assert() to catch this unexpected
> situation (it will kill the process should that happen), but it won't help
> if any module could be run by mpm_event and set the socket to NULL..

I guess this should be fine. We may run into a lot of trouble with event MPM if people fiddle around with the socket on their own. Hence failing with the assert seems better than strange, hard-to-debug other issues.
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627 --- Comment #14 from Ruediger Pluem --- Thanks. My understanding now is that this only happens with mpm-itk and the reason why this happens in this case seems to be clear from the analysis of Yann. His patch should fix this.
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627 acmondor changed: What|Removed |Added Attachment #38063|0 |1 is obsolete|| --- Comment #13 from acmondor --- Created attachment 38064 --> https://bz.apache.org/bugzilla/attachment.cgi?id=38064=edit server info output The html output didn't come through properly, so here's the same info as plain text.
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627 --- Comment #12 from acmondor --- Created attachment 38063 --> https://bz.apache.org/bugzilla/attachment.cgi?id=38063=edit server info output
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627

acmondor changed:

What|Removed |Added
Status|NEEDINFO|NEW

--- Comment #11 from acmondor ---
Here is the additional info requested. First the gdb info:

Core was generated by `/usr/sbin/apache2 -D INFO -D SSL -D PHP -D MPM_ITK -D STATUS -D SECURITY -d /us'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x7f26d402fdac in apr_socket_close (thesocket=0x0) at network_io/unix/sockets.c:213
213 return apr_pool_cleanup_run(thesocket->pool, thesocket, socket_cleanup);
(gdb)
(gdb) bt
#0 0x7f26d402fdac in apr_socket_close (thesocket=0x0) at network_io/unix/sockets.c:213
#1 0x563794dd8292 in ap_lingering_close (c=0x563797134010) at connection.c:159
#2 0x563794df5829 in child_main (child_num_arg=4, child_bucket=0) at prefork.c:655
#3 0x563794df5a87 in make_child (s=0x563796e4e888, slot=4) at prefork.c:756
#4 0x563794df5ae7 in startup_children (number_to_start=1) at prefork.c:774
#5 0x563794df6159 in prefork_run (_pconf=0x563796e253c8, plog=0x563796e52608, s=0x563796e4e888) at prefork.c:936
#6 0x563794d9ba43 in ap_run_mpm (pconf=0x563796e253c8, plog=0x563796e52608, s=0x563796e4e888) at mpm_common.c:95
#7 0x563794d915fc in main (argc=19, argv=0x7ffcb763ef38) at main.c:819
(gdb)
(gdb) frame 1
#1 0x563794dd8292 in ap_lingering_close (c=0x563797134010) at connection.c:159
159 apr_socket_close(csd);
(gdb) print *(c->conn_config)
$1 =
(gdb) print *(c)
$2 = {pool = 0x563797133da8, base_server = 0x563796f456b0, vhost_lookup_data = 0x563796f56b18,
  local_addr = 0x563797133e70, client_addr = 0x563797133f30, client_ip = 0x563797134418 "192.168.1.13",
  remote_host = 0x0, remote_logname = 0x0, local_ip = 0x563797134408 "192.168.1.13", local_host = 0x0,
  id = 4, conn_config = 0x5637971340e0, notes = 0x563797134268, input_filters = 0x563797134440,
  output_filters = 0x5637971344b0, sbh = 0x563797131ef0, bucket_alloc = 0x563797138e58, cs = 0x0,
  data_in_input_filters = 0, data_in_output_filters = 0, clogging_input_filters = 0,
  double_reverse = 0, aborted = 0, keepalive = AP_CONN_UNKNOWN, keepalives = 0, log = 0x0, log_id = 0x0,
  current_thread = 0x563797131e10, master = 0x0, outgoing = 0}
(gdb) print c->aborted
$3 = 0
(gdb) x/100xw 0x5637971340e0
0x5637971340e0: 0x 0x 0x 0x
0x5637971340f0: 0x 0x 0x 0x
0x563797134100: 0x 0x 0x 0x
0x563797134110: 0x 0x 0x 0x
0x563797134120: 0x 0x 0x 0x
0x563797134130: 0x 0x 0x 0x
0x563797134140: 0x 0x 0x 0x
0x563797134150: 0x 0x 0x 0x
0x563797134160: 0x 0x 0x 0x
0x563797134170: 0x 0x 0x 0x
0x563797134180: 0x 0x 0x 0x
0x563797134190: 0x 0x 0x 0x
0x5637971341a0: 0x 0x 0x 0x
0x5637971341b0: 0x 0x 0x 0x
0x5637971341c0: 0x 0x 0x 0x
0x5637971341d0: 0x 0x 0x 0x
0x5637971341e0: 0x97134428 0x5637 0x 0x
0x5637971341f0: 0x 0x 0x 0x
0x563797134200: 0x 0x 0x 0x
0x563797134210: 0x 0x 0x 0x
0x563797134220: 0x 0x 0x 0x
0x563797134230: 0x 0x 0x 0x
0x563797134240: 0x 0x 0x 0x
0x563797134250: 0x 0x 0x 0x
0x563797134260: 0x 0x 0x97133da8 0x5637

Even after looking through the httpd code I couldn't figure out how to get around that '' from 'print *(c->conn_config)', so I just did a memory dump.

Here is the trace8 error log:

[Tue Oct 12 10:54:04.886976 2021] [ELF1] [default] [default] [core:trace5] [pid 11324:tid protocol.c(711): [client 192.168.1.13:41666] Request received from client: GET / HTTP/1.1
[Tue Oct 12 10:54:04.887549 2021] [ELF1] [test001.acmondor.ca] [test001.acmondor.ca] [setenvif:trace4] [pid 11324:tid util_expr_eval.c(859): [client 192.168.1.13:41666] Evaluation of expression from /home/www.mac003/config/request_denied.include:84 gave: 0
[Tue Oct 12 10:54:04.887585 2021] [ELF1]
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627

Yann Ylavic changed:

What|Removed |Added
Keywords||FixedInTrunk

--- Comment #10 from Yann Ylavic ---
I checked r1894171 in trunk but it would be useful to hear from Jean in comment 2.

Jean, your stack trace does not seem to involve mod_itk but simply mod_prefork, am I correct? Do you know which other module sets the socket to NULL?

The fix in trunk allows for the socket to be NULL in ap_lingering_close() but not in ap_start_lingering_close() which is called by mpm_event only (supposedly), so there I added an ap_assert() to catch this unexpected situation (it will kill the process should that happen), but it won't help if any module could be run by mpm_event and set the socket to NULL..
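The split described in comment #10, tolerating a NULL socket in one entry point while asserting in the other, might look roughly like the C sketch below. This is a simplified model and not the httpd source: demo_socket, demo_conn, demo_start_lingering_close() and demo_lingering_close() are hypothetical stand-ins for apr_socket_t, conn_rec and the real functions, and plain assert() stands in for ap_assert().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real apr_socket_t and conn_rec are far richer. */
typedef struct { int closed; } demo_socket;
typedef struct { demo_socket *csd; } demo_conn;

/* Event-style entry point: a NULL socket is unexpected here, so fail loudly.
 * Returns 1 when the caller should close right away, 0 when it should linger.
 * In this model there is nothing to flush, so it always returns 0. */
int demo_start_lingering_close(demo_conn *c)
{
    assert(c->csd != NULL);   /* models ap_assert(): kills the process */
    return 0;
}

/* Prefork-style entry point: a third-party module may have detached the
 * socket on purpose, so NULL means "nothing to do".
 * Returns 1 if a socket was closed here, 0 otherwise. */
int demo_lingering_close(demo_conn *c)
{
    if (c->csd == NULL) {
        return 0;             /* no-op instead of dereferencing NULL */
    }
    if (demo_start_lingering_close(c)) {
        c->csd->closed = 1;   /* close immediately */
        return 1;
    }
    /* A real implementation would drain the socket here before closing. */
    c->csd->closed = 1;
    return 1;
}
```

With this shape, a connection whose socket pointer was cleared passes through demo_lingering_close() untouched, while the same connection handed directly to demo_start_lingering_close() aborts, which is the trade-off the comment describes.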
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627

--- Comment #9 from Ruediger Pluem ---
(In reply to Yann Ylavic from comment #8)
> (In reply to Ruediger Pluem from comment #7)
> >
> > The question is, how we want to allow if at all another module to say that
> > we should get out of the way with regards to lingering closes. Do we allow
> > to set the socket to NULL via ap_set_core_module_config or do we demand
> > that it has to set c->aborted to 1 as you suggest.
>
> Yeah indeed that's the question. Thinking more about it, c->aborted = 1 will
> still call the output filter chain so in the case of mod_itk it may cause
> issues (no request_rec in the forking/parent process).
> We have supported the NULL socket so far so we probably still need to in
> 2.4.x, mpm_prefork (which mpm_itk is still requiring AFAICT) will call
> ap_lingering_close() after ap_process_connection() in any case, so it seems
> that NULL socket is the only safe option for third-party modules as of now.

Fair enough. Then I am fine with the patch.
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627

--- Comment #8 from Yann Ylavic ---
(In reply to Ruediger Pluem from comment #7)
>
> The question is, how we want to allow if at all another module to say that
> we should get out of the way with regards to lingering closes. Do we allow
> to set the socket to NULL via ap_set_core_module_config or do we demand
> that it has to set c->aborted to 1 as you suggest.

Yeah indeed that's the question. Thinking more about it, c->aborted = 1 will still call the output filter chain so in the case of mod_itk it may cause issues (no request_rec in the forking/parent process).

We have supported the NULL socket so far so we probably still need to in 2.4.x, mpm_prefork (which mpm_itk is still requiring AFAICT) will call ap_lingering_close() after ap_process_connection() in any case, so it seems that NULL socket is the only safe option for third-party modules as of now.
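The difference between the two opt-out mechanisms discussed in comment #8 can be illustrated with a toy model in C. Everything here is an assumption made for illustration: toy_conn, toy_flush() and toy_lingering_close() are hypothetical, and the counter merely records whether the (modelled) output filter chain was entered. It is not how httpd implements either path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for apr_socket_t and conn_rec. */
typedef struct { int closed; } toy_socket;
typedef struct {
    toy_socket *csd;
    int aborted;        /* models c->aborted */
    int filter_calls;   /* how often the output filter chain was entered */
    int lingered;       /* set when the slow, draining close path is taken */
} toy_conn;

/* Models flushing the connection, which walks the output filters. */
void toy_flush(toy_conn *c)
{
    c->filter_calls++;
}

void toy_lingering_close(toy_conn *c)
{
    assert(c != NULL);
    if (c->csd == NULL) {
        return;              /* NULL socket: nothing runs at all */
    }
    toy_flush(c);            /* still runs when the connection is aborted */
    if (c->aborted) {
        c->csd->closed = 1;  /* aborted: hard close, no draining */
        return;
    }
    c->lingered = 1;         /* normal path: drain, then close */
    c->csd->closed = 1;
}
```

In this model only the NULL-socket path leaves filter_calls at zero, which matches the stated concern that setting c->aborted alone would still reach code expecting request state that the forking parent does not have.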
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627

--- Comment #7 from Ruediger Pluem ---
(In reply to Yann Ylavic from comment #6)
> (In reply to Yann Ylavic from comment #5)
> >
> > So I think that attachment 38061 [details] is the right thing to do, for
> > compatibility.
>
> The other option is to change mpm_itk to:
> - apr_socket_close(ap_get_conn_socket(c));
> - ap_set_core_module_config(c->conn_config, NULL);
> + c->aborted = 1;

The question is, how we want to allow if at all another module to say that we should get out of the way with regards to lingering closes. Do we allow to set the socket to NULL via ap_set_core_module_config or do we demand that it has to set c->aborted to 1 as you suggest.

Depending on this we could modify the proposed patch to either make it an assert that csd != NULL or log an error message in case csd == NULL as this is not expected. If setting the socket to NULL is the accepted way though, the patch proposed here is correct as it is.
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627

--- Comment #6 from Yann Ylavic ---
(In reply to Yann Ylavic from comment #5)
>
> So I think that attachment 38061 [details] is the right thing to do, for
> compatibility.

The other option is to change mpm_itk to:
- apr_socket_close(ap_get_conn_socket(c));
- ap_set_core_module_config(c->conn_config, NULL);
+ c->aborted = 1;
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627

--- Comment #5 from Yann Ylavic ---
(In reply to Yann Ylavic from comment #3)
>
> So:
> ap_set_core_module_config(c->conn_config, NULL);
> is what sets csd to NULL in ap_lingering_close.

But before r1891721, the apr_socket_close(csd) was protected by:

AP_DECLARE(int) ap_start_lingering_close(conn_rec *c)
{
    apr_socket_t *csd = ap_get_conn_socket(c);

    if (!csd) {
        return 1;
    }
    [...]
    return 0;
}

So I think that attachment 38061 is the right thing to do, for compatibility.
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627

--- Comment #4 from Ruediger Pluem ---
Can you please issue the following gdb commands with this core dump:

frame 1
print *(c->conn_config)
print c->aborted

Furthermore, an error log set to trace8 around the time the crash happens could be quite helpful, and if possible the request that triggered it.

Can you also temporarily configure mod_info (http://httpd.apache.org/docs/2.4/mod/mod_info.html) for such a server and provide the output of the server info page?
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627

--- Comment #3 from Yann Ylavic ---
Does this always imply mpm_itk?

I see that mpm_itk does this:

    ap_hook_process_connection(itk_fork_process, NULL, NULL, APR_HOOK_REALLY_FIRST);

then:

int itk_fork_process(conn_rec *c)
{
    if (have_forked) {
        return DECLINED;
    }

    pid_t pid = fork(), child_pid;
    int status;
    switch (pid) {
    case -1:
        [...]
        return HTTP_INTERNAL_SERVER_ERROR;
    case 0:
        /* Child; runs processing as usual, then dies.
         * This is a bit tricky in that we need to run ap_run_process_connection()
         * even though we are a process_connection hook ourselves!
         * That is the only way we can exit cleanly after the hook
         * is done. Thus, we set have_forked to signal that we don't
         * want to end up in infinite recursion.
         */
        have_forked = 1;
        ap_close_listeners();
        ap_run_process_connection(c);
        ap_lingering_close(c);
        exit(0);
    default:
        /* parent; just wait for child to be done */
        do {
            child_pid = waitpid(pid, &status, 0);
        } while (child_pid == -1 && errno == EINTR);
        [...]
        /*
         * It is important that ap_lingering_close() is called in the child
         * and not here, ...
         * However, we close the socket itself here so that we don't keep a
         * reference to it around, and then set the socket pointer to NULL so
         * that when prefork tries to close it, it goes into early exit.
         */
        apr_socket_close(ap_get_conn_socket(c));
        ap_set_core_module_config(c->conn_config, NULL);

        /* make sure the MPM does not process this connection */
        return OK;
    }
}

So:
    ap_set_core_module_config(c->conn_config, NULL);
is what sets csd to NULL in ap_lingering_close.
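The fork-then-wait pattern quoted in comment #3 can be tried in isolation with the small POSIX C sketch below. It is not mpm_itk code: run_in_child() and sample_work() are hypothetical names, the per-connection processing is reduced to a callback that returns an exit code, and _exit() is used in the child (where mpm_itk calls exit(0)) so that the sketch does not flush inherited stdio buffers twice.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run work() in a forked child and return the child's
 * exit code, or -1 if fork/waitpid failed or the child died abnormally. */
int run_in_child(int (*work)(void))
{
    pid_t pid, child_pid;
    int status = 0;

    assert(work != NULL);

    pid = fork();
    switch (pid) {
    case -1:
        return -1;                      /* fork failed */
    case 0:
        /* Child: do the work, then terminate without returning. */
        _exit(work() & 0xff);
    default:
        /* Parent: wait for the child, retrying if a signal interrupts us. */
        do {
            child_pid = waitpid(pid, &status, 0);
        } while (child_pid == -1 && errno == EINTR);

        if (child_pid != pid || !WIFEXITED(status)) {
            return -1;
        }
        return WEXITSTATUS(status);
    }
}

/* Example worker, standing in for per-connection processing. */
int sample_work(void)
{
    return 7;
}
```

Calling run_in_child(sample_work) returns the worker's exit code to the parent. The EINTR retry loop is the part that corresponds to the waitpid() loop in the quoted hook.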
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627

--- Comment #2 from Jean Weisbuch ---
The exact same bug happens on 2.4.49 to 2.4.51; here is a full backtrace with 2.4.51:

Program received signal SIGSEGV, Segmentation fault.
0x7fcac0ea5113 in apr_socket_close (thesocket=0x0) at ./network_io/unix/sockets.c:183
183 ./network_io/unix/sockets.c: No such file or directory.
(gdb) bt f
#0 0x7fcac0ea5113 in apr_socket_close (thesocket=0x0) at ./network_io/unix/sockets.c:183
No locals.
#1 0x558958ff7735 in ap_lingering_close (c=0x7fcac1699290) at connection.c:159
        dummybuf = "\002", '\000' , "(\260i\301\312\177\000\000\360\241\",\374\177\000\000\020\027\373X\211U\000\000p\250\",\374\177", '\000' ...
        nbytes = 4294967295
        now = 94048392017996
        timeup = 0
        csd = 0x0
#2 0x55895901392f in child_main (child_num_arg=9, child_bucket=0) at prefork.c:655
        current_conn = 0x7fcac1699290
        csd = 0x7fcac16990a0
        thd = 0x7fcac169b0a0
        osthd = 140508805886016
        sig_mask = {__val = {0, 0, 0, 0, 0, 0, 0, 0, 0, 536870912, 0, 0, 94048392146393, 0, 0, 0}}
        ptrans = 0x7fcac1699028
        allocator = 0x55895ab6ad20
        status = 0
        i = -1
        lr = 0x7fcac1743380
        pollset = 0x7fcac169b138
        sbh = 0x7fcac169b130
        bucket_alloc = 0x7fcac1695028
        last_poll_idx = 1
        lockfile = 0x0
#3 0x558959013b92 in make_child (s=0x7fcac173e328, slot=9) at prefork.c:756
        bucket = 0
        pid = 0
#4 0x558959013f6a in perform_idle_server_maintenance (p=0x7fcac176d028) at prefork.c:860
        i = 1
        idle_count = 8
        ws = 0x7fcac16a0ad0
        free_length = 4
        free_slots = {8, 9, 10, 11, 0, 0, 0, 0, 0, 0, -1058364471, 32714, 0, 0, -1049187848, 32714, 199, 0, -1058360763, 32714, 0, 0, 0, 0, 1492850448, 21897, 1410853888, 1240662886, 740468064, 32764, 1492901478, 21897}
        last_non_dead = 7
        total_non_dead = 8
#5 0x558959014718 in prefork_run (_pconf=0x7fcac176d028, plog=0x7fcac173a028, s=0x7fcac173e328) at prefork.c:1053
        status = 1493285039
        pid = {pid = -1, in = 0x49f303665417ec00, out = 0x55895901b8af, err = 0x7fcac16bc478}
        child_slot = -1058447294
        exitwhy = (APR_PROC_EXIT | unknown: 21896)
        processed_status = 32714
        index = 21897
        remaining_children_to_start = 0
        i = 32764
#6 0x558958fbd272 in ap_run_mpm (pconf=0x7fcac176d028, plog=0x7fcac173a028, s=0x7fcac173e328) at mpm_common.c:95
        pHook = 0x7fcac16d71d8
        n = 0
        rv = -1
#7 0x558958fb3653 in main (argc=3, argv=0x7ffc2c22a878) at main.c:819
        c = 0 '\000'
        showcompile = 0
        showdirectives = 0
        confname = 0x558959017f63 "conf/httpd.conf"
        def_server_root = 0x558959017f73 "/opt/apache"
        temp_error_log = 0x0
        error = 0x0
        process = 0x7fcac176f118
        pconf = 0x7fcac176d028
        plog = 0x7fcac173a028
        ptemp = 0x7fcac173c028
        pcommands = 0x7fcac1744028
        opt = 0x7fcac1744118
        rv = 0
        mod = 0x558959237b40
        opt_arg = 0x6562b026
        signal_server = 0x558958ffb2ee
        rc = 0
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627

Ruediger Pluem changed:

What|Removed |Added
Status|NEW |NEEDINFO

--- Comment #1 from Ruediger Pluem ---
While the patch is useful, csd should not really be NULL here. There was an issue in 2.4.49 that caused these segfaults under certain conditions, but this was fixed in 2.4.50 via https://svn.apache.org/viewvc?view=revision=1893654.

Can you please provide a stack trace with 2.4.50 or 2.4.51 that shows the crash? Something seems to be wrong somewhere else, and this might need fixing as well.
[Bug 65627] apache httpd segfault on child exit
https://bz.apache.org/bugzilla/show_bug.cgi?id=65627 Sam James changed: What|Removed |Added CC||s...@gentoo.org