Re: daedalus is running 2.0.31

2002-02-01 Thread jean-frederic clere

Ian Holsman wrote:
 
 Greg Ames wrote:
  ...since Thursday, 31-Jan-2002 19:04:06 PST.
 Cool.

Something hangs now on daedalus!

 we're running it for our developers internally starting tomorrow.
 
 
  Besides checking out the tag, it has the usual patch to save the input buffers
  for debugging, and a quick-n-dirty hack to exit the child without killing the
  parent if accept() gets ENFILE (system out of fd's).
 
  I did have to futz with the config file a bit.  The main thing was the change
  from mod_auth_db to mod_auth_dbm.  I hope the following is correct, please
  holler if not:
 
  @@ -762,7 +764,8 @@
  Options All
  </Directory>
  <Directory /da1/www/bugs.apache.org/private>
  -  AuthDBUserFile /home/apmail/bugdbaccounts
  +  AuthDBMUserFile /home/apmail/bugdbaccounts
  +  AuthDBMType DB
 you're running berkeleyDB ???
 if so that is the right config.
 
 AuthName ApacheBugDatabaseUsers
 AuthType Basic
 require valid-user
 
  Are there any changes to the utility to manage the passwords that Brian and
  Manoj should know about?
 
 htdbm has been changed as well so as to allow for multiple DB types
 I think it is -T DB as an extra parameter if you don't have berkeleyDB
 support as default.
 
 --Ian
 
 
  Greg
 
 



Re: daedalus is running 2.0.31

2002-02-01 Thread Greg Ames

jean-frederic clere wrote:
 
 Ian Holsman wrote:
 
  Greg Ames wrote:
   ...since Thursday, 31-Jan-2002 19:04:06 PST.
  Cool.
 
 Something hangs now on daedalus!

yep.  It ran for nearly 5 hours.  Then the clock struck midnight, a cron job
kicked off a graceful restart, and:

[Fri Feb 01 00:00:04 2002] [notice] Apache/2.0.31 (Unix) configured -- resuming normal operations
[Fri Feb 01 00:00:04 2002] [notice] seg fault or similar nasty error detected in the parent process

...and we were down until I woke up, took a look, momentarily panicked, then put
us back on 2.0.29 at Friday Feb 01 05:18:04 daedalus time.

At least we have both a log message and a coredump now.  That wasn't the case
not too long ago.  The dump is /usr/local/apache2.0.31/corefiles/httpd.core.1 . 
The backtrace is pretty simple:

#0  0x806fdeb in find_child_by_pid (pid=0xbfbffa4c) at scoreboard.c:355
#1  0x8063e9a in ap_mpm_run (_pconf=0x8099010, plog=0x80bf010, s=0x809a8d8) at prefork.c:1078
#2  0x8069676 in main (argc=1, argv=0xbfbffb30) at main.c:498
#3  0x805d95d in _start ()

Greg

(writes on the blackboard 50 times:)

I will test graceful restart
I will test graceful restart
I will test graceful restart
...



Re: daedalus is running 2.0.31

2002-02-01 Thread Aaron Bannert

On Fri, Feb 01, 2002 at 10:20:09AM -0500, Greg Ames wrote:
 
 yep.  It ran for nearly 5 hours.  Then the clock struck midnight, a cron job
 kicked off a graceful restart, and:

...all the horses turned back into mice... ;)


 [Fri Feb 01 00:00:04 2002] [notice] Apache/2.0.31 (Unix) configured -- resuming normal operations
 [Fri Feb 01 00:00:04 2002] [notice] seg fault or similar nasty error detected in the parent process
 
 ...and we were down until I woke up, took a look, momentarily panicked, then put
 us back on 2.0.29 at Friday Feb 01 05:18:04 daedalus time.
 
 At least we have both a log message and a coredump now.  That wasn't the case
 not too long ago.  The dump is /usr/local/apache2.0.31/corefiles/httpd.core.1 . 
 The backtrace is pretty simple:
 
 #0  0x806fdeb in find_child_by_pid (pid=0xbfbffa4c) at scoreboard.c:355
 #1  0x8063e9a in ap_mpm_run (_pconf=0x8099010, plog=0x80bf010, s=0x809a8d8) at prefork.c:1078
 #2  0x8069676 in main (argc=1, argv=0xbfbffb30) at main.c:498
 #3  0x805d95d in _start ()

I have a feeling the scoreboard is getting destroyed by its pool cleanup
routine before this routine gets called:

(gdb) p ap_scoreboard_image->parent
$2 = (process_score *) 0x2823500c
(gdb) p i
$3 = 0
(gdb) p ap_scoreboard_image->parent[0]
Cannot access memory at address 0x2823500c.

Did we intend to reuse the scoreboard across restarts, or recreate it?

It looks like the scoreboard is currently being created in the pconf
pool, which is cleared shortly after ap_mpm_run() decides to do a graceful.
OTOH, we don't call ap_run_pre_mpm() to create the scoreboard if we are
doing a graceful restart. I think that's where the SEGV is coming from.
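
A minimal sketch of that lifetime problem, using a toy pool rather than APR.
Every name prefixed with toy_ is hypothetical: toy_pool stands in for pconf,
toy_pre_mpm() for the scoreboard creation done via ap_run_pre_mpm(), and the
skip_on_graceful flag for the "if (!is_graceful)" guard. One deliberate
difference: the toy cleanup resets the pointer to NULL so the effect can be
checked safely, whereas the gdb output above shows the real pointer left
dangling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a pool: just a list of cleanup callbacks. */
typedef struct toy_cleanup {
    void (*fn)(void *);
    void *data;
    struct toy_cleanup *next;
} toy_cleanup;

typedef struct toy_pool {
    toy_cleanup *cleanups;
} toy_pool;

static void toy_pool_register(toy_pool *p, void (*fn)(void *), void *data)
{
    toy_cleanup *c = malloc(sizeof(*c));
    assert(c != NULL);
    c->fn = fn;
    c->data = data;
    c->next = p->cleanups;
    p->cleanups = c;
}

/* Clearing the pool runs every registered cleanup, like a restart does. */
static void toy_pool_clear(toy_pool *p)
{
    while (p->cleanups != NULL) {
        toy_cleanup *c = p->cleanups;
        p->cleanups = c->next;
        c->fn(c->data);
        free(c);
    }
}

/* Toy scoreboard: a global image whose lifetime is tied to the pool. */
typedef struct toy_scoreboard {
    int pids[4];
} toy_scoreboard;

static toy_scoreboard *toy_scoreboard_image = NULL;

static void toy_scoreboard_cleanup(void *unused)
{
    (void)unused;
    free(toy_scoreboard_image);
    toy_scoreboard_image = NULL; /* the toy resets it; see lead-in */
}

/* Creates the scoreboard and registers its cleanup in the given pool. */
static int toy_pre_mpm(toy_pool *p)
{
    toy_scoreboard_image = calloc(1, sizeof(*toy_scoreboard_image));
    if (toy_scoreboard_image == NULL) {
        return -1;
    }
    toy_pool_register(p, toy_scoreboard_cleanup, NULL);
    return 0;
}

/* Simulates one restart. Returns 1 if a usable scoreboard exists afterwards. */
static int toy_restart(toy_pool *pconf, int is_graceful, int skip_on_graceful)
{
    toy_pool_clear(pconf); /* destroys the scoreboard via its cleanup */
    if (!(is_graceful && skip_on_graceful)) {
        if (toy_pre_mpm(pconf) != 0) {
            return 0;
        }
    }
    return toy_scoreboard_image != NULL;
}
```

With skip_on_graceful set, a graceful restart leaves no scoreboard behind;
with it clear, the scoreboard is recreated on every restart.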

-aaron

BTW, if we're ok to recreate during restart, I think this would fix prefork
(untested):

Index: server/mpm/prefork/prefork.c
===================================================================
RCS file: /home/cvs/httpd-2.0/server/mpm/prefork/prefork.c,v
retrieving revision 1.236
diff -u -u -r1.236 prefork.c
--- server/mpm/prefork/prefork.c	30 Jan 2002 22:35:56 -  1.236
+++ server/mpm/prefork/prefork.c	1 Feb 2002 15:57:38 -
@@ -1000,10 +1000,8 @@
     }
 
     SAFE_ACCEPT(accept_mutex_init(pconf));
-    if (!is_graceful) {
-        if (ap_run_pre_mpm(pconf, SB_SHARED) != OK) {
-            return 1;
-        }
+    if (ap_run_pre_mpm(pconf, SB_SHARED) != OK) {
+        return 1;
     }
 #ifdef SCOREBOARD_FILE
     else {




Re: daedalus is running 2.0.31

2002-02-01 Thread Greg Ames

Aaron Bannert wrote:
 
 On Fri, Feb 01, 2002 at 10:20:09AM -0500, Greg Ames wrote:
 
  yep.  It ran for nearly 5 hours.  Then the clock struck midnight, a cron job
  kicked off a graceful restart, and:
 
 ...all the horses turned back into mice... ;)

hee, hee :)  very appropriate

 I have a feeling the scoreboard is getting destroyed by its pool cleanup
 routine before this routine gets called:
 
 (gdb) p ap_scoreboard_image->parent
 $2 = (process_score *) 0x2823500c
 (gdb) p i
 $3 = 0
 (gdb) p ap_scoreboard_image->parent[0]
 Cannot access memory at address 0x2823500c.
 
 Did we intend to reuse the scoreboard across restarts, or recreate it?

Reuse it.  

I think it would be cool if we could recreate it, but we are not there yet.  The
problem is that after a graceful restart, we don't know when the guys in third
world countries with 28.8Kb modems and noisy phone lines will be done
downloading the latest 4.6MB tomcat nightly build.  We can't free the scoreboard
until all of the old generation children go away, and that's complex to figure
out, so we reuse it.
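
As a rough illustration of why that is awkward, here is a sketch in C of the
check that would be needed before freeing an old scoreboard. The toy_slot
layout and both function names are hypothetical and are not httpd's real
scoreboard structures; the only idea taken from the text is that old
generation children may still be running after a graceful restart.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot: pid 0 means the slot is free. */
typedef struct toy_slot {
    int pid;
    int generation; /* restart generation the child was started in */
} toy_slot;

/* Counts live children that belong to any generation but the current one. */
static size_t toy_count_old_generation(const toy_slot *slots, size_t n,
                                       int current_gen)
{
    size_t old = 0;
    assert(slots != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (slots[i].pid != 0 && slots[i].generation != current_gen) {
            old++;
        }
    }
    return old;
}

/* The old scoreboard is only safe to free once no old child remains. */
static int toy_can_free_scoreboard(const toy_slot *slots, size_t n,
                                   int current_gen)
{
    return toy_count_old_generation(slots, n, current_gen) == 0;
}
```

The catch is that a check like this has to be repeated until the last slow
download finishes, which is the complexity that reusing the scoreboard avoids.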

 It looks like the scoreboard is currently being created in the pconf
 pool, which is cleared shortly after ap_mpm_run() decides to do a graceful.

That sounds bad.  

Greg



Re: daedalus is running 2.0.31

2002-02-01 Thread Ian Holsman

Aaron Bannert wrote:
 On Fri, Feb 01, 2002 at 11:26:47AM -0500, Greg Ames wrote:
  
 
It looks like the scoreboard is currently being created in the pconf
pool, which is cleared shortly after ap_mpm_run() decides to do a graceful.

That sounds bad.  

 
 Switched to use a global pool, and this gets rid of the SEGVs for
 me on graceful.
 
 Ian: you may wish to push the tag up on this file, but it's up to you.
 
 -aaron
 
 

pushing it now.

(luckily we haven't rolled yet eh)




Re: daedalus is running 2.0.31

2002-02-01 Thread Greg Ames

Aaron Bannert wrote:

 Switched to use a global pool, and this gets rid of the SEGVs for
 me on graceful.
 
 Ian: you may wish to push the tag up on this file, but it's up to you.

daedalus has been updated, and graceful and non-graceful restarts are working
fine once again.  I'll probably try it live again tonight.  

Thanks, Aaron and Ian.
Greg



daedalus is running 2.0.31

2002-01-31 Thread Greg Ames

...since Thursday, 31-Jan-2002 19:04:06 PST.  

Besides checking out the tag, it has the usual patch to save the input buffers
for debugging, and a quick-n-dirty hack to exit the child without killing the
parent if accept() gets ENFILE (system out of fd's).
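
The idea behind that ENFILE hack can be sketched in C roughly as follows. This
is not the actual patch: the function name, the enum, the TOY_EXIT_* values
and the choice of which errors count as retryable are all assumptions made
for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical exit codes for this sketch; not httpd's real constants. */
#define TOY_EXIT_CLEAN 0 /* parent simply replaces the child */
#define TOY_EXIT_FATAL 1 /* parent treats this as a reason to shut down */

enum toy_accept_action { TOY_RETRY, TOY_CHILD_EXIT };

/*
 * Decides what a child does after accept() fails with errno value err.
 * On TOY_CHILD_EXIT, *exit_status is set to the status the child should
 * exit with; on TOY_RETRY it is left untouched.
 */
static enum toy_accept_action toy_on_accept_error(int err, int *exit_status)
{
    assert(exit_status != NULL);
    switch (err) {
    case EINTR:
    case ECONNABORTED:
        /* Assumed transient: go around the accept loop again. */
        return TOY_RETRY;
    case ENFILE:
        /* System out of fd's: give up this child, but spare the parent. */
        *exit_status = TOY_EXIT_CLEAN;
        return TOY_CHILD_EXIT;
    default:
        /* Anything else is treated as fatal in this sketch. */
        *exit_status = TOY_EXIT_FATAL;
        return TOY_CHILD_EXIT;
    }
}
```

The only point of the sketch is the ENFILE branch: the child goes away
without telling the parent that something fatal happened.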

I did have to futz with the config file a bit.  The main thing was the change
from mod_auth_db to mod_auth_dbm.  I hope the following is correct, please
holler if not:

@@ -762,7 +764,8 @@
Options All
</Directory>
<Directory /da1/www/bugs.apache.org/private>
-  AuthDBUserFile /home/apmail/bugdbaccounts
+  AuthDBMUserFile /home/apmail/bugdbaccounts
+  AuthDBMType DB
   AuthName ApacheBugDatabaseUsers
   AuthType Basic
   require valid-user

Are there any changes to the utility to manage the passwords that Brian and
Manoj should know about?

Greg



Re: daedalus is running 2.0.31

2002-01-31 Thread Ian Holsman

Greg Ames wrote:
 ...since Thursday, 31-Jan-2002 19:04:06 PST.  
Cool.
we're running it for our developers internally starting tomorrow.

 
 Besides checking out the tag, it has the usual patch to save the input buffers
 for debugging, and a quick-n-dirty hack to exit the child without killing the
 parent if accept() gets ENFILE (system out of fd's).
 
 I did have to futz with the config file a bit.  The main thing was the change
 from mod_auth_db to mod_auth_dbm.  I hope the following is correct, please
 holler if not:
 
 @@ -762,7 +764,8 @@
 Options All
 </Directory>
 <Directory /da1/www/bugs.apache.org/private>
 -  AuthDBUserFile /home/apmail/bugdbaccounts
 +  AuthDBMUserFile /home/apmail/bugdbaccounts
 +  AuthDBMType DB
you're running berkeleyDB ???
if so that is the right config.

AuthName ApacheBugDatabaseUsers
AuthType Basic
require valid-user
 
 Are there any changes to the utility to manage the passwords that Brian and
 Manoj should know about?

htdbm has been changed as well so as to allow for multiple DB types
I think it is -T DB as an extra parameter if you don't have berkeleyDB 
support as default.

--Ian

 
 Greg