https://bz.apache.org/bugzilla/show_bug.cgi?id=60487

            Bug ID: 60487
           Summary: Core dumps in mpm_event during graceful restart
           Product: Apache httpd-2
           Version: 2.4-HEAD
          Hardware: PC
                OS: FreeBSD
            Status: NEW
          Severity: normal
          Priority: P2
         Component: mpm_event
          Assignee: bugs@httpd.apache.org
          Reporter: apa...@wheelhouse.org
  Target Milestone: ---

Created attachment 34528
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34528&action=edit
Don't dereference retained if it isn't set yet.

It appears that if signals are sent to an httpd process in a particular order
or at a particular rate, Apache segfaults in mpm_event's ap_start_restart()
function on the line:

retained->is_graceful = graceful;

This behavior has been observed on a number of different systems at different
times.

When explored from gdb:

#0  0x00000008019dd2cc in ap_start_restart (graceful=1) at event.c:696
696         retained->is_graceful = graceful;
[New Thread 802006400 (LWP 101541/<unknown>)]
Current language:  auto; currently minimal
(gdb) print retained
$1 = (event_retained_data *) 0x0

The "retained" static variable is never explicitly set to NULL in the code.  It
is only directly assigned in two places, both in event_pre_config():

    retained = ap_retained_data_get(userdata_key);
    if (!retained) {
        retained = ap_retained_data_create(userdata_key, sizeof(*retained));

The processes this happens to frequently crash after a couple of days of
uptime, and each one is the top-level run-as-root parent process, so this is
not a case of the signal arriving before the pointer was first allocated.

This doesn't appear to be related to receiving a restart signal while shutting
down; gdb reports shutdown_pending is not set:

(gdb) print shutdown_pending
$1 = 0

According to the call stack, this is happening from inside config parsing on
the LoadModule directive for mpm_event:

#0  0x00000008019dd2cc in ap_start_restart (graceful=1) at event.c:696
#1  0x00000008019dd27f in restart (sig=30) at event.c:706
#2  0x0000000801408b4a in pthread_sigmask () from /lib/libthr.so.3
#3  0x0000000801407c08 in pthread_getspecific () from /lib/libthr.so.3
#4  0x0000000801407abd in pthread_getspecific () from /lib/libthr.so.3
#5  0x000000080140cbd7 in pthread_timedjoin_np () from /lib/libthr.so.3
#6  0x00000008006c79fb in r_debug_state () from /libexec/ld-elf.so.1
#7  0x00000008006cc437 in _rtld_is_dlopened () from /libexec/ld-elf.so.1
#8  0x00000008006c8ea0 in dlopen () from /libexec/ld-elf.so.1
#9  0x0000000800fc0b00 in apr_dso_load () from /usr/local/lib/libapr-1.so.0
#10 0x0000000000493c40 in dso_load (cmd=0x7fffffffcfd0, 
    modhandlep=0x7fffffffca88, 
    filename=0x802095138 "libexec/mod_mpm_event.so", 
    used_filename=0x7fffffffca70) at mod_so.c:162
#11 0x0000000000493705 in load_module (cmd=0x7fffffffcfd0, 
    dummy=0x7fffffffce80, modname=0x802095120 "mpm_event_module", 
    filename=0x802095138 "libexec/mod_mpm_event.so") at mod_so.c:263
#12 0x0000000000478ef3 in invoke_cmd (cmd=0x4b5130, parms=0x7fffffffcfd0, 
    mconfig=0x7fffffffce80, args=0x80207b445 "") at config.c:923
#13 0x00000000004799c0 in execute_now (cmd_line=0x80207b4e0 "LoadModule", 
    args=0x80207b41b "mpm_event_module libexec/mod_mpm_event.so", 
    parms=0x7fffffffcfd0, p=0x802021028, ptemp=0x80207b028, 
    sub_tree=0x7fffffffce80, parent=0x0) at config.c:1688

This makes me think that two restart signals are arriving in rapid
succession.  The first initiates a restart, and the second then requests a
restart after the previous one has begun but before event_pre_config() has
initialized the retained variable in the newly-loaded mod_mpm_event.so.

If that's the case, then it may be sufficient simply to check retained before
writing to it and just return if it is NULL (similar to what's done if
restart_pending is already set).  If so, the (trivial) attached patch
accomplishes that.
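
For readers without the attachment, the shape of that check is roughly as
follows.  This is a minimal standalone sketch, not the attached patch or the
real event.c: the sketch_* names are hypothetical, the struct is reduced to
the one field discussed here, and the surrounding control flow is inferred
from the description in this report.

```c
#include <stddef.h>

/* Simplified stand-in for event_retained_data; only one field is modelled. */
typedef struct {
    int is_graceful;
} sketch_retained_data;

static sketch_retained_data *sketch_retained = NULL;
static volatile int sketch_restart_pending = 0;

static void sketch_start_restart(int graceful)
{
    if (sketch_restart_pending == 1) {
        /* A restart is already in progress; nothing more to do. */
        return;
    }
    sketch_restart_pending = 1;

    if (sketch_retained == NULL) {
        /* Proposed guard: the pointer is not set yet, so do not write to it. */
        return;
    }
    sketch_retained->is_graceful = graceful;
}
```

Note that with the guard in this position, restart_pending is still set even
when retained is NULL.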

However, if a NULL value for retained indicates that the server hasn't finished
a previous restart, perhaps the check should be one line higher (above
"restart_pending = 1;") to short-circuit the second restart completely.
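
The alternative placement would look something like this.  Again this is a
standalone sketch under the same assumptions, with hypothetical alt_* names
and a one-field stand-in struct; it is not taken from event.c or from the
attachment.

```c
#include <stddef.h>

/* Simplified stand-in for event_retained_data; only one field is modelled. */
typedef struct {
    int is_graceful;
} alt_retained_data;

static alt_retained_data *alt_retained = NULL;
static volatile int alt_restart_pending = 0;

static void alt_start_restart(int graceful)
{
    if (alt_restart_pending == 1) {
        /* A restart is already in progress; nothing more to do. */
        return;
    }
    if (alt_retained == NULL) {
        /* Guard moved above the assignment: the request is dropped
         * completely and restart_pending stays untouched. */
        return;
    }
    alt_restart_pending = 1;
    alt_retained->is_graceful = graceful;
}
```

The only behavioural difference from checking after the assignment is whether
restart_pending ends up set when retained is NULL.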

It's also entirely possible that there's much more going on here and the NULL
value for retained is indicative of a deeper problem.

If the simple solution is not the correct one, this is something I'm happy to
look into further and work to fix if someone would be willing to shove me in
the right direction.

