Re: daedalus is running 2.0.31
Aaron Bannert wrote:
> Switched to use a global pool, and this gets rid of the SEGVs for
> me on graceful.
>
> Ian: you may wish to push the tag up on this file, but it's up to you.

daedalus has been updated, and graceful and non-graceful restarts are working fine once again. I'll probably try it live again tonight.

Thanks, Aaron and Ian.

Greg
Re: daedalus is running 2.0.31
Aaron Bannert wrote:
> On Fri, Feb 01, 2002 at 11:26:47AM -0500, Greg Ames wrote:
>
>>>It looks like the scoreboard is currently being created in the pconf
>>>pool, which is cleared shortly after ap_run_mpm() decides to do a graceful.
>>>
>>That sounds bad.
>>
>
> Switched to use a global pool, and this gets rid of the SEGVs for
> me on graceful.
>
> Ian: you may wish to push the tag up on this file, but it's up to you.
>
> -aaron
>

pushing it now.
(luckily we haven't rolled yet eh)
Re: daedalus is running 2.0.31
On Fri, Feb 01, 2002 at 11:26:47AM -0500, Greg Ames wrote:
> > It looks like the scoreboard is currently being created in the pconf
> > pool, which is cleared shortly after ap_run_mpm() decides to do a graceful.
>
> That sounds bad.

Switched to use a global pool, and this gets rid of the SEGVs for me on graceful.

Ian: you may wish to push the tag up on this file, but it's up to you.

-aaron
Re: daedalus is running 2.0.31
Aaron Bannert wrote:
>
> On Fri, Feb 01, 2002 at 10:20:09AM -0500, Greg Ames wrote:
>
> > yep. It ran for nearly 5 hours. Then the clock struck midnight, a cron job
> > kicked off a graceful restart, and:
>
> ...all the horses turned back into mice... ;)

hee, hee :) very appropriate

> I have a feeling the scoreboard is getting destroyed by its pool cleanup
> routine before this routine gets called:
>
> (gdb) p ap_scoreboard_image->parent
> $2 = (process_score *) 0x2823500c
> (gdb) p i
> $3 = 0
> (gdb) p ap_scoreboard_image->parent[0]
> Cannot access memory at address 0x2823500c.
>
> Did we intend to reuse the scoreboard across restarts, or recreate it?

Reuse it. I think it would be cool if we could recreate it, but we are not there yet. The problem is that after a graceful restart, we don't know when the guys in third world countries with 28.8Kb modems and noisy phone lines will be done downloading the latest 4.6MB tomcat nightly build. We can't free the scoreboard until all of the old generation children go away, and that's complex to figure out, so we reuse it.

> It looks like the scoreboard is currently being created in the pconf
> pool, which is cleared shortly after ap_run_mpm() decides to do a graceful.

That sounds bad.

Greg
RE: daedalus is running 2.0.31
> I have a feeling the scoreboard is getting destroyed by its pool cleanup
> routine before this routine gets called:
>
> (gdb) p ap_scoreboard_image->parent
> $2 = (process_score *) 0x2823500c
> (gdb) p i
> $3 = 0
> (gdb) p ap_scoreboard_image->parent[0]
> Cannot access memory at address 0x2823500c.
>
> Did we intend to reuse the scoreboard across restarts, or recreate it?

The scoreboard needs to be re-used across restarts.

Ryan
Re: daedalus is running 2.0.31
On Fri, Feb 01, 2002 at 10:20:09AM -0500, Greg Ames wrote:
> yep. It ran for nearly 5 hours. Then the clock struck midnight, a cron job
> kicked off a graceful restart, and:

...all the horses turned back into mice... ;)

> [Fri Feb 01 00:00:04 2002] [notice] Apache/2.0.31 (Unix) configured -- resuming
> normal operations
> [Fri Feb 01 00:00:04 2002] [notice] seg fault or similar nasty error detected in
> the parent process
>
> ...and we were down until I woke up, took a look, momentarily panicked, then put
> us back on 2.0.29 at Friday Feb 01 05:18:04 daedalus time.
>
> At least we have both a log message and a coredump now. That wasn't the case
> not too long ago. The dump is /usr/local/apache2.0.31/corefiles/httpd.core.1 .
> The backtrace is pretty simple:
>
> #0 0x806fdeb in find_child_by_pid (pid=0xbfbffa4c) at scoreboard.c:355
> #1 0x8063e9a in ap_mpm_run (_pconf=0x8099010, plog=0x80bf010, s=0x809a8d8)
>    at prefork.c:1078
> #2 0x8069676 in main (argc=1, argv=0xbfbffb30) at main.c:498
> #3 0x805d95d in _start ()

I have a feeling the scoreboard is getting destroyed by its pool cleanup routine before this routine gets called:

(gdb) p ap_scoreboard_image->parent
$2 = (process_score *) 0x2823500c
(gdb) p i
$3 = 0
(gdb) p ap_scoreboard_image->parent[0]
Cannot access memory at address 0x2823500c.

Did we intend to reuse the scoreboard across restarts, or recreate it?

It looks like the scoreboard is currently being created in the pconf pool, which is cleared shortly after ap_run_mpm() decides to do a graceful. OTOH, we don't call ap_run_pre_mpm() to create the scoreboard if we are doing a graceful restart. I think that's where the SEGV is coming from.

-aaron

BTW, if we're ok to recreate during restart, I think this would fix prefork (untested):

Index: server/mpm/prefork/prefork.c
===
RCS file: /home/cvs/httpd-2.0/server/mpm/prefork/prefork.c,v
retrieving revision 1.236
diff -u -u -r1.236 prefork.c
--- server/mpm/prefork/prefork.c 30 Jan 2002 22:35:56 - 1.236
+++ server/mpm/prefork/prefork.c 1 Feb 2002 15:57:38 -
@@ -1000,10 +1000,8 @@
     }
     SAFE_ACCEPT(accept_mutex_init(pconf));
-    if (!is_graceful) {
-        if (ap_run_pre_mpm(pconf, SB_SHARED) != OK) {
-            return 1;
-        }
+    if (ap_run_pre_mpm(pconf, SB_SHARED) != OK) {
+        return 1;
     }
 #ifdef SCOREBOARD_FILE
     else {
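
[Editor's sketch] The lifetime problem described above can be modelled in a few lines of plain C. This is only an illustration under stated assumptions, not the httpd or APR code: toy_pool, toy_palloc, toy_pool_clear and survives_graceful are hypothetical names invented here, and the toy pool resets the caller's pointer to NULL when it is cleared so the effect can be observed safely (a real pool leaves a dangling pointer instead, which is what gdb reported).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy pool (NOT the APR API): every block allocated from it
 * is released together by toy_pool_clear(). */
typedef struct toy_block {
    struct toy_block *next;
    void **slot;              /* caller's pointer, reset to NULL on clear */
} toy_block;

typedef struct {
    toy_block *blocks;
} toy_pool;

/* Allocate size bytes from pool and store the address in *slot. */
void *toy_palloc(toy_pool *pool, size_t size, void **slot)
{
    toy_block *b = malloc(sizeof(*b) + size);
    if (b == NULL) {
        return NULL;
    }
    b->next = pool->blocks;
    b->slot = slot;
    pool->blocks = b;
    *slot = b + 1;            /* payload starts right after the header */
    return *slot;
}

/* Free everything in the pool. Resetting *slot stands in for a pool
 * cleanup; it makes the lost allocation visible without touching
 * freed memory. */
void toy_pool_clear(toy_pool *pool)
{
    while (pool->blocks != NULL) {
        toy_block *b = pool->blocks;
        pool->blocks = b->next;
        *b->slot = NULL;
        free(b);
    }
}

/* Plays the role of the scoreboard pointer in this sketch. */
void *toy_scoreboard;

/* Simulate one graceful restart. Returns 1 if the scoreboard is still
 * there afterwards, 0 if it went away with pconf, -1 on malloc failure. */
int survives_graceful(int use_global_pool)
{
    toy_pool global = { NULL };
    toy_pool pconf = { NULL };
    toy_pool *home = use_global_pool ? &global : &pconf;
    int alive;

    if (toy_palloc(home, sizeof(int), &toy_scoreboard) == NULL) {
        return -1;
    }
    toy_pool_clear(&pconf);   /* pconf is cleared on a graceful restart */
    alive = (toy_scoreboard != NULL);
    toy_pool_clear(&global);  /* tidy up so the sketch does not leak */
    return alive;
}
```

In this model, survives_graceful(0) corresponds to creating the scoreboard in pconf and returns 0, while survives_graceful(1) corresponds to creating it in a longer-lived pool and returns 1. The sketch says nothing about the harder question raised in the thread, namely when old-generation children are finished with the scoreboard.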
Re: daedalus is running 2.0.31
jean-frederic clere wrote:
>
> Ian Holsman wrote:
> >
> > Greg Ames wrote:
> > > ...since Thursday, 31-Jan-2002 19:04:06 PST.
> > Cool.
>
> Something hangs now on daedalus!

yep. It ran for nearly 5 hours. Then the clock struck midnight, a cron job kicked off a graceful restart, and:

[Fri Feb 01 00:00:04 2002] [notice] Apache/2.0.31 (Unix) configured -- resuming normal operations
[Fri Feb 01 00:00:04 2002] [notice] seg fault or similar nasty error detected in the parent process

...and we were down until I woke up, took a look, momentarily panicked, then put us back on 2.0.29 at Friday Feb 01 05:18:04 daedalus time.

At least we have both a log message and a coredump now. That wasn't the case not too long ago. The dump is /usr/local/apache2.0.31/corefiles/httpd.core.1 . The backtrace is pretty simple:

#0 0x806fdeb in find_child_by_pid (pid=0xbfbffa4c) at scoreboard.c:355
#1 0x8063e9a in ap_mpm_run (_pconf=0x8099010, plog=0x80bf010, s=0x809a8d8)
   at prefork.c:1078
#2 0x8069676 in main (argc=1, argv=0xbfbffb30) at main.c:498
#3 0x805d95d in _start ()

Greg

(writes on the blackboard 50 times:)
I will test graceful restart
I will test graceful restart
I will test graceful restart
...
Re: daedalus is running 2.0.31
Ian Holsman wrote:
>
> Greg Ames wrote:
> > ...since Thursday, 31-Jan-2002 19:04:06 PST.
> Cool.

Something hangs now on daedalus!

> we're running it for our developers internally starting tomorrow.
>
> >
> > Beside checking out the tag, it has the usual patch to save the input buffers
> > for debugging, and a quick-n-dirty hack to exit the child without killing the
> > parent if accept() gets ENFILE (system out of fd's).
> >
> > I did have to futz with the config file a bit. The main thing was the change
> > from mod_auth_db to mod_auth_dbm. I hope the following is correct, please
> > holler if not:
> >
> > @@ -762,7 +764,8 @@
> > Options All
> >
> >
> > - AuthDBUserFile /home/apmail/bugdbaccounts
> > + AuthDBMUserFile /home/apmail/bugdbaccounts
> > + AuthDBMType DB
> you're running berkeleyDB ???
> if so that is the right config.
>
> >AuthName ApacheBugDatabaseUsers
> >AuthType Basic
> >require valid-user
> >
> > Are there any changes to the utility to manage the passwords that Brian and
> > Manoj should know about?
>
> htdbm has been changed as well so as to allow for multiple DB types
> I think it is -T DB as an extra parameter if you don't have berkeleyDB
> support as default.
>
> --Ian
>
> >
> > Greg
> >
Re: daedalus is running 2.0.31
Greg Ames wrote:
> ...since Thursday, 31-Jan-2002 19:04:06 PST.

Cool.
we're running it for our developers internally starting tomorrow.

>
> Beside checking out the tag, it has the usual patch to save the input buffers
> for debugging, and a quick-n-dirty hack to exit the child without killing the
> parent if accept() gets ENFILE (system out of fd's).
>
> I did have to futz with the config file a bit. The main thing was the change
> from mod_auth_db to mod_auth_dbm. I hope the following is correct, please
> holler if not:
>
> @@ -762,7 +764,8 @@
> Options All
>
>
> - AuthDBUserFile /home/apmail/bugdbaccounts
> + AuthDBMUserFile /home/apmail/bugdbaccounts
> + AuthDBMType DB

you're running berkeleyDB ???
if so that is the right config.

>AuthName ApacheBugDatabaseUsers
>AuthType Basic
>require valid-user
>
> Are there any changes to the utility to manage the passwords that Brian and
> Manoj should know about?

htdbm has been changed as well so as to allow for multiple DB types
I think it is -T DB as an extra parameter if you don't have berkeleyDB support as default.

--Ian

>
> Greg
>