Hi Chris,

For the following supervisor config (memmon limit set low intentionally):

[inet_http_server]
port=127.0.0.1:9036

[supervisord]
logfile=%(here)s/../log/supervisord.log
logfile_maxbytes=50MB
logfile_backups=10
loglevel=info
pidfile=%(here)s/../var/supervisord.pid
nodaemon=false

[rpcinterface:supervisor]
supervisor.rpcinterface_factory=supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=http://127.0.0.1:9036

[program:osi]
command = %(here)s/../bin/paster serve %(here)s/../etc/karl.ini
redirect_stderr = true

[program:zeo]
command = %(here)s/../bin/runzeo -C %(here)s/zeo.conf
redirect_stderr = true

[eventlistener:memmon]
command=%(here)s/../bin/memmon -a 20MB
events=TICK_60

When I run it under "supervisord -n -edebug" (using supervisor 3.0a7) on MacOS 
10.6, I see this (see lines that being "restarting"):

[chr...@snowpro bfgenv]$ bin/supervisord -n -edebug
2009-10-03 13:34:41,101 INFO Increased RLIMIT_NOFILE limit to 1024
2009-10-03 13:34:41,221 INFO RPC interface 'supervisor' initialized
2009-10-03 13:34:41,221 WARN cElementTree not installed, using slower XML 
parser for XML-RPC
2009-10-03 13:34:41,221 CRIT Server 'inet_http_server' running without any HTTP 
authentication checking
2009-10-03 13:34:41,222 INFO supervisord started with pid 49246
2009-10-03 13:34:42,226 INFO spawned: 'memmon' with pid 49248
2009-10-03 13:34:42,231 INFO spawned: 'osi' with pid 49249
2009-10-03 13:34:42,239 INFO spawned: 'zeo' with pid 49250
2009-10-03 13:34:43,241 INFO success: memmon entered RUNNING state, process has 
stayed up for > than 1 seconds (startsecs)
2009-10-03 13:34:43,241 INFO success: osi entered RUNNING state, process has 
stayed up for > than 1 seconds (startsecs)
2009-10-03 13:34:43,250 INFO success: zeo entered RUNNING state, process has 
stayed up for > than 1 seconds (startsecs)
2009-10-03 13:34:43,370 DEBG 'memmon' stdout output:
READY

2009-10-03 13:34:43,371 DEBG memmon: ACKNOWLEDGED -> READY
2009-10-03 13:34:43,705 DEBG 'zeo' stdout output:
------
2009-10-03T13:34:43 INFO ZEO.runzeo (49250) created PID file './var/ZEO.pid'

2009-10-03 13:34:43,706 DEBG 'zeo' stdout output:
------
2009-10-03T13:34:43 INFO ZEO.runzeo (49250) opening storage '1' using 
BlobStorage

2009-10-03 13:34:43,749 DEBG 'zeo' stdout output:
------
2009-10-03T13:34:43 INFO ZODB.blob (49250) Blob directory `./var/blobs` has 
layout marker set. Selected `bushy` layout.

2009-10-03 13:34:43,793 DEBG 'zeo' stdout output:
------
2009-10-03T13:34:43 INFO ZEO.StorageServer (49250) StorageServer created RW 
with storages: 1:RW:./var/osi.db

2009-10-03 13:34:44,156 DEBG 'zeo' stdout output:
------
2009-10-03T13:34:44 INFO ZEO.zrpc (49250) listening on ('', 8884)

2009-10-03 13:34:45,279 DEBG 'zeo' stdout output:
------
2009-10-03T13:34:45 INFO ZEO.StorageServer (49250) new connection ('127.0.0.1', 
59853): <ManagedServerConnection ('127.0.0.1', 59853)>

2009-10-03 13:34:45,280 DEBG 'zeo' stdout output:
------
2009-10-03T13:34:45 INFO ZEO.zrpc.Connection(S) (127.0.0.1:59853) received 
handshake 'Z308'

2009-10-03 13:34:45,414 DEBG 'zeo' stdout output:
------
2009-10-03T13:34:45 INFO ZEO.StorageServer (49250) new connection ('127.0.0.1', 
59854): <ManagedServerConnection ('127.0.0.1', 59854)>

2009-10-03 13:34:45,415 DEBG 'zeo' stdout output:
------
2009-10-03T13:34:45 INFO ZEO.zrpc.Connection(S) (127.0.0.1:59854) received 
handshake 'Z308'

2009-10-03 13:35:01,422 DEBG event 0 sent to listener memmon
2009-10-03 13:35:01,423 DEBG 'memmon' stderr output:
Checking any=20971520

2009-10-03 13:35:01,441 DEBG 'memmon' stderr output:
RSS of memmon:memmon is 5775360

2009-10-03 13:35:01,447 DEBG 'memmon' stderr output:
RSS of osi:osi is 28737536
Restarting osi:osi

2009-10-03 13:35:01,448 DEBG fd 10 closed, stopped monitoring 
<POutputDispatcher at 7294408 for <Subprocess at 7054920 with name osi in state 
RUNNING> (stdout)>
2009-10-03 13:35:01,449 DEBG killing osi (pid 49249) with signal SIGTERM
2009-10-03 13:35:01,456 INFO stopped: osi (terminated by SIGTERM)
2009-10-03 13:35:01,457 DEBG received SIGCHLD indicating a child quit
2009-10-03 13:35:01,457 DEBG 'zeo' stdout output:
------
2009-10-03T13:35:01 INFO ZEO.StorageServer (49250/127.0.0.1:59854) disconnected

2009-10-03 13:35:01,457 DEBG 'zeo' stdout output:
------
2009-10-03T13:35:01 INFO ZEO.StorageServer (49250/127.0.0.1:59853) disconnected

2009-10-03 13:35:02,465 INFO spawned: 'osi' with pid 49256
2009-10-03 13:35:03,468 INFO success: osi entered RUNNING state, process has 
stayed up for > than 1 seconds (startsecs)
2009-10-03 13:35:03,477 DEBG 'memmon' stdout output:
RESULT 2
OKREADY

2009-10-03 13:35:03,477 DEBG memmon: BUSY -> ACKNOWLEDGED (processed)
2009-10-03 13:35:03,477 DEBG memmon: ACKNOWLEDGED -> READY
2009-10-03 13:35:03,478 DEBG 'memmon' stderr output:
RSS of zeo:zeo is 10936320

2009-10-03 13:35:05,143 DEBG 'zeo' stdout output:
------
2009-10-03T13:35:05 INFO ZEO.StorageServer (49250) new connection ('127.0.0.1', 
59856): <ManagedServerConnection ('127.0.0.1', 59856)>

2009-10-03 13:35:05,144 DEBG 'zeo' stdout output:
------
2009-10-03T13:35:05 INFO ZEO.zrpc.Connection(S) (127.0.0.1:59856) received 
handshake 'Z308'

2009-10-03 13:35:05,276 DEBG 'zeo' stdout output:
------
2009-10-03T13:35:05 INFO ZEO.StorageServer (49250) new connection ('127.0.0.1', 
59857): <ManagedServerConnection ('127.0.0.1', 59857)>

2009-10-03 13:35:05,277 DEBG 'zeo' stdout output:
------
2009-10-03T13:35:05 INFO ZEO.zrpc.Connection(S) (127.0.0.1:59857) received 
handshake 'Z308'


Chris Shenton wrote:
> I've got supervisor with a memmon eventlistener from a buildout:
> 
>> [supervisor]
>> recipe = collective.recipe.supervisor
>> user=nfw
>> password=nfw
>> eggs = supervisor
>> programs =
>>       10 zeo ${zeo:location}/bin/runzeo ${zeo:location}
>>       20 instance ${instance:location}/bin/runzope ${instance:location}  
>> true
>> plugins =
>>      superlance
>> eventlisteners =
>>      MemmonAll      TICK_60 ${buildout:bin-directory}/memmon [-a 100MB]
> 
> It generates a plausible parts/supervisor/supervisord.conf:
> 
>> ...
>> [program:instance]
>> command = /usr/local/cshenton/Projects/cjp/svn-buildout/parts/ 
>> instance/bin/runzope
>> process_name = instance
>> directory = /usr/local/cshenton/Projects/cjp/svn-buildout/parts/ 
>> instance
>> priority = 20
>> redirect_stderr = true
>>
>> [eventlistener:MemmonAll]
>> command = /usr/local/cshenton/Projects/cjp/svn-buildout/bin/memmon - 
>> a 100MB
>> events = TICK_60
>> process_name=MemmonAll
>> environment 
>> = 
>> SUPERVISOR_USERNAME 
>> =nfw,SUPERVISOR_PASSWORD=nfw,SUPERVISOR_SERVER_URL=http://localhost:9001
> 
> I start up supervisor, and  the plone instance stabilizes at 97MB,  
> then after connecting to plone with a browser, jumps to 113MB:
> 
>> 2009-10-03 10:30:34,798 INFO spawned: 'MemmonAll' with pid 11608
>> 2009-10-03 10:30:34,816 INFO spawned: 'zeo' with pid 11609
>> 2009-10-03 10:30:34,826 INFO spawned: 'instance' with pid 11610
>> ch...@mackeral:~$ ps -orss 11610
>>    RSS
>>  97444
>> ch...@mackeral:~$ ps -orss 11610
>>    RSS
>> 113512
> 
> But supervisor never restarts the instance. I don't see any info in  
> the logs that its checking all processes or anything else.
> 
> I'm not sure how to debug it with pdb since supervisor spawns memmon.   
> Sticking print and logging statements into superlance/memmon.py  
> indicates to me that it never comes back from this wait call:
> 
>>     def runforever(self, test=False):
>>         while 1:
>>             # we explicitly use self.stdin, self.stdout, and  
>> self.stderr
>>             # instead of sys.* so we can unit test this code
>>             headers, payload = childutils.listener.wait(self.stdin,  
>> self.stdout)
> 
> Any suggestions for debugging this? Thanks.
> 
> 
> 
> _______________________________________________
> Supervisor-users mailing list
> [email protected]
> http://lists.supervisord.org/mailman/listinfo/supervisor-users
> 

_______________________________________________
Supervisor-users mailing list
[email protected]
http://lists.supervisord.org/mailman/listinfo/supervisor-users

Reply via email to