Hello,

I'm running apache 2.2.24 (worker MPM) with mod_jk 1.2.37 under Solaris 11, 
compiled as follows (from config.log):

--with-included-apr --with-mpm=worker --enable-so --enable-rewrite 
--enable-headers --enable-proxy --enable-proxy-http --enable-expires 
--enable-nonportable-atomics=yes --disable-include --disable-autoindex 
--disable-imap --disable-userdir CC=/usr/sfw/bin/gcc

We are running Tomcat 7.0.32.

Since moving to Solaris 11 I'm noticing over time that apache children are 
getting left in an idle state (and usually not showing up on the scoreboard at 
all) when doing graceful restarts.  If I do a hard restart, the error_log notes 
that the process had to be forcibly killed:

[Wed May 15 11:41:24 2013] [warn] child process 10057 still did not exit, sending a SIGTERM
[Wed May 15 11:41:26 2013] [error] child process 10057 still did not exit, sending a SIGKILL

If I let apache go unchecked, it will eventually stop passing traffic 
completely and a hard restart is required.  Example ps output looks like this:

nobody 24429 20925   0 11:43:59 ?           0:02 /usr/local/apache2/bin/httpd -k start
nobody  9750 20925   0 23:59:02 ?           0:00 /usr/local/apache2/bin/httpd -k start
nobody 20925  2440   0   May 15 ?           3:07 /usr/local/apache2/bin/httpd -k start
nobody 24689 20925   0 11:47:52 ?           0:00 /usr/local/apache2/bin/httpd -k start
nobody 24628 20925   0 11:46:18 ?           0:01 /usr/local/apache2/bin/httpd -k start
nobody 24428 20925   0 11:43:39 ?           0:02 /usr/local/apache2/bin/httpd -k start

Note PID 9750 is lingering, doing nothing according to pfiles and truss, and 
its timestamp coincides with the last graceful restart (log rotation).  Two 
main differences between this web server and ones that are working include:

a) This is Solaris 11 (vs. Solaris 10)
b) I have hardened apache by putting it in a Solaris 11 zone, and I'm starting 
apache as the "nobody" user with the net_privaddr privilege so it can function 
as the parent process.  It talks to Tomcat on another zone and everything works 
great (other than the problem described here).
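
For anyone who wants to look for the same symptom, here is a minimal sketch (Python, not part of my actual setup) that picks out children like PID 9750 from ps -ef output by comparing their STIME against the time of the last graceful restart. The function names, the 120-second window and the 23:59:00 restart time are assumptions for illustration only; the sample data is the ps listing above.

```python
def seconds_of_day(hms):
    """Convert 'HH:MM:SS' to seconds since midnight."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def children_started_near(ps_text, parent_pid, restart_hms, window=120):
    """Return PIDs of parent_pid's children whose STIME lies within
    `window` seconds of restart_hms.

    Assumes `ps -ef` columns: UID PID PPID C STIME TTY TIME CMD.
    Rows whose STIME is a date (e.g. 'May 15') are skipped, so only
    processes started within the last day are considered, and the
    day itself cannot be told apart.
    """
    target = seconds_of_day(restart_hms)
    hits = []
    for line in ps_text.splitlines():
        fields = line.split()
        # Skip headers, blank lines and anything too short to be a ps row.
        if len(fields) < 8 or not fields[1].isdigit() or not fields[2].isdigit():
            continue
        pid, ppid, stime = int(fields[1]), int(fields[2]), fields[4]
        if ppid != parent_pid or stime.count(":") != 2:
            continue
        diff = abs(seconds_of_day(stime) - target)
        diff = min(diff, 86400 - diff)  # handle wrap-around at midnight
        if diff <= window:
            hits.append(pid)
    return hits


SAMPLE = """\
nobody 24429 20925   0 11:43:59 ?           0:02 /usr/local/apache2/bin/httpd -k start
nobody  9750 20925   0 23:59:02 ?           0:00 /usr/local/apache2/bin/httpd -k start
nobody 20925  2440   0   May 15 ?           3:07 /usr/local/apache2/bin/httpd -k start
nobody 24689 20925   0 11:47:52 ?           0:00 /usr/local/apache2/bin/httpd -k start
nobody 24628 20925   0 11:46:18 ?           0:01 /usr/local/apache2/bin/httpd -k start
nobody 24428 20925   0 11:43:39 ?           0:02 /usr/local/apache2/bin/httpd -k start
"""

# 23:59:00 is an assumed log-rotation time, not taken from the logs.
print(children_started_near(SAMPLE, 20925, "23:59:00"))  # -> [9750]
```

This only flags candidates; whether a flagged child is really stuck still has to be confirmed with pfiles and truss as described below.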

Apache has permission to write to /logs, and /logs/apache2 is where I set these:

JkLogFile /logs/apache2/mod_jk.log
JkShmFile /logs/apache2/jk-runtime-status

And this:
PidFile /logs/apache2/run/httpd.pid


Can anyone think of a reason why children are not being recycled, or are 
getting stranded like this, over successive graceful restarts?  We do use 
multiple listeners, so I don't know if I'm dealing with a 
locking/mutex/serialization type of issue.  I'm not a C programmer.  There 
seems to be little recent info out there for Solaris platforms.

I'd be happy to post more info if needed.  I appreciate your time.


Chris

