https://issues.apache.org/bugzilla/show_bug.cgi?id=49504

           Summary: Solaris 10/x64 worker graceful restart problem
           Product: Apache httpd-2
           Version: 2.2.15
          Platform: Sun
        OS/Version: Solaris
            Status: NEW
          Severity: normal
          Priority: P2
         Component: worker
        AssignedTo: [email protected]
        ReportedBy: [email protected]


On Solaris 10 u8, HTTPD 2.2.15 occasionally has one child process hang during a
graceful restart.

Symptoms:
1. At debug-level logging, the error log shows:
[Wed Jun 23 14:38:21 2010] [debug] worker.c(1083): the listener thread didn't exit

I understand this is not a major issue in itself
(https://issues.apache.org/bugzilla/show_bug.cgi?id=9011), but it provides
insight into the shutdown sequence.

2. pstack of the hanging child shows the main thread is hanging while shutting
down worker threads:

-----------------  lwp# 1 / thread# 1  --------------------
 fffffd7fff06cdea lwp_wait (3, fffffd7fffdff964)
 fffffd7fff063eee _thrp_join () + 3e
 fffffd7fff0640cc pthread_join () + 1c
 fffffd7fff27b195 apr_thread_join () + 25
 0000000000470a19 join_workers () + e9
 0000000000470de3 child_main () + 353
 0000000000471137 make_child () + 147
 0000000000471a6e ap_mpm_run () + 8be
 000000000042fd81 main () + 8b1
 000000000042f08c _start () + 6c
-----------------  lwp# 3 / thread# 3  --------------------
 fffffd7fff067527 lwp_park (0, 0, 0)
 fffffd7fff0610b9 cond_wait_queue () + 59
 fffffd7fff061647 _cond_wait () + 57
 fffffd7fff061676 cond_wait () + 26
 fffffd7fff0616b9 pthread_cond_wait () + 9
 0000000000472cc2 ap_queue_pop () + 72
 000000000047032d worker_thread () + 11d
 fffffd7fff06727b _thr_setup () + 5b
 fffffd7fff0674b0 _lwp_start ()
-----------------  lwp# 4 / thread# 4  --------------------
 fffffd7fff067527 lwp_park (0, 0, 0)
 fffffd7fff0610b9 cond_wait_queue () + 59
 fffffd7fff061647 _cond_wait () + 57
 fffffd7fff061676 cond_wait () + 26
 fffffd7fff0616b9 pthread_cond_wait () + 9
 0000000000472cc2 ap_queue_pop () + 72
 000000000047032d worker_thread () + 11d
 fffffd7fff06727b _thr_setup () + 5b
 fffffd7fff0674b0 _lwp_start ()

---SNIP---
...lots more threads in lwp_park(0, 0, 0)...
---SNIP---

-----------------  lwp# 28 / thread# 28  --------------------
 fffffd7fff06ce2a lwp_mutex_timedlock (fffffd7ffeee0000, 0)
 fffffd7fff05fb78 mutex_lock_internal () + 328
 fffffd7fff05ff62 mutex_lock_impl () + 112
 fffffd7fff06002b mutex_lock () + b
 fffffd7fff26e5a5 proc_mutex_proc_pthread_acquire () + 15
 000000000046ff4c listener_thread () + 3bc
 fffffd7fff06727b _thr_setup () + 5b
 fffffd7fff0674b0 _lwp_start ()

It appears that join_workers() is hanging on a call to apr_thread_join(...), in
line 1104 of worker.c.
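
One possible reading of the trace, offered as a sketch rather than a diagnosis:
the worker threads are parked in pthread_cond_wait() inside ap_queue_pop(), so
the main thread's join can only return once something marks the queue as
finished and wakes them. The standalone program below is not httpd source; the
names demo_queue, demo_queue_pop, demo_queue_term, demo_worker and demo_run are
hypothetical. It only shows the generic pattern under which a join on such
workers completes.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a worker queue; NOT httpd's real queue type. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
    int nelts;       /* number of pending items */
    int terminated;  /* set once at shutdown */
} demo_queue;

static demo_queue demo_q = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0
};

/* Returns 0 when an item was taken, -1 once the queue has been terminated. */
static int demo_queue_pop(demo_queue *q)
{
    int rv = 0;

    pthread_mutex_lock(&q->lock);
    /* Re-check the predicate after every wake-up (spurious wake-ups happen). */
    while (q->nelts == 0 && !q->terminated)
        pthread_cond_wait(&q->not_empty, &q->lock);
    if (q->terminated)
        rv = -1;
    else
        q->nelts--;
    pthread_mutex_unlock(&q->lock);
    return rv;
}

/* Mark the queue finished and wake every waiter. If this step never runs,
 * the waiters stay parked and any join on them blocks. */
static void demo_queue_term(demo_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->terminated = 1;
    pthread_cond_broadcast(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

static void *demo_worker(void *arg)
{
    while (demo_queue_pop((demo_queue *)arg) == 0) {
        /* a real worker would process the popped item here */
    }
    return NULL;
}

/* Starts up to 16 workers, terminates the queue, and joins them all.
 * Returns the number of workers successfully joined. */
int demo_run(int nworkers)
{
    pthread_t tids[16];
    int i, joined = 0;

    if (nworkers > 16)
        nworkers = 16;
    for (i = 0; i < nworkers; i++) {
        int rc = pthread_create(&tids[i], NULL, demo_worker, &demo_q);
        assert(rc == 0);
        (void)rc;
    }
    demo_queue_term(&demo_q);
    for (i = 0; i < nworkers; i++) {
        if (pthread_join(tids[i], NULL) == 0)
            joined++;
    }
    return joined;
}
```

Calling demo_run(4) joins all four workers because demo_queue_term() runs
before the joins. Whether the corresponding wake-up is skipped in the hanging
httpd child (for example because the listener thread is still blocked on the
mutex shown in lwp# 28) is an assumption I have not verified.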


HTTPD was compiled with Solaris's default GCC (3.4.3), with the following
flags:

CFLAGS="-O3 -m64 -march=athlon64"
LDFLAGS="-R$INSTALL_SSL/lib -L$INSTALL_SSL/lib"
./configure -C \
                --prefix=$INSTALL \
                --enable-mods-shared="deflate expires headers proxy proxy-ajp
proxy-balancer proxy-connect proxy-http rewrite ssl usertrack dav status
log-config logio" \
                -with-ssl=$INSTALL_SSL \
                --with-mpm=worker \
                --enable-nonportable-atomics 

Is there any other information I can provide to help diagnose this issue?
