Hi Roger,

I'm actually running into a separate issue: my procs mysteriously
start dying if I leave them up for too long. It's quite weird: there
are no errors and nothing in the log to suggest they have exited, but
after 20-30 minutes of load I start getting these:

2009/04/15 09:34:15 [error] 18840#0: *127695 upstream timed out (110:
Connection timed out) while reading response header from upstream,
client: 174.129.xxx.xxx, server: xxxxxxxx, request: "GET /xxxxxxxxxx
HTTP/1.0", upstream: "fastcgi://127.0.0.1:8080", host: "xxx.com"

Do you have any idea why this is?  Everything works fine, supervisord
doesn't report any issues, and there is nothing in the stderr or
stdout logs, and then all of a sudden I'm getting thousands of 504
errors (I get about 5 req/sec).
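
To pin down when the timeouts begin relative to the last restart, the
nginx error log can be bucketed per minute. This is only a rough
sketch: the function name timeouts_per_minute is made up for this
example, and it assumes each log entry sits on one line in the format
shown above.

```python
import re
from collections import Counter

# Captures the timestamp up to the minute for "upstream timed out" entries.
TIMEOUT_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}):\d{2} \[error\] .*upstream timed out"
)


def timeouts_per_minute(lines):
    """Return a Counter mapping 'YYYY/MM/DD HH:MM' to the number of timeouts."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it an open file object for the nginx error log (the path
depends on the install) and printing the sorted items shows whether
the errors ramp up gradually or start all at once.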

This one is really killing me.  I hate having a cron job run
"supervisorctl restart all" every 15 minutes, and that also adds some
504s to the mix :(

Best,
Jacob

On Wed, Apr 15, 2009 at 10:51 PM, Roger Hoover <[email protected]> wrote:
> Hi Jacob,
>
> If you use supervisorctl to restart just the gate processes (and not a full
> shutdown of supervisord), requests should not get dropped during the
> restart.  The reason is that the fastcgi socket will still be open.  There
> will be no processes accepting connections on it temporarily but the
> connection requests will queue up in the OS until restart is successful and
> the new gate processes are accepting connections again.
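
The queueing behaviour described above can be seen with plain sockets.
A minimal sketch, not tied to supervisor or nginx: the listener stands
in for the fastcgi socket that stays open, the client for nginx, and
the late accept() for a freshly restarted gate process. The port and
the backlog of 5 are arbitrary choices for the demo.

```python
import socket

# Listening socket with nobody accepting yet (the "restart in progress" state).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(5)               # up to 5 pending connections may queue
port = listener.getsockname()[1]

# The client connects and sends before any accept() call has happened.
client = socket.create_connection(("127.0.0.1", port), timeout=5)
client.sendall(b"queued request")

# "Restart finished": a worker accepts and finds the request waiting.
conn, _ = listener.accept()
conn.settimeout(5)
received = conn.recv(1024)
print(received)

for sock in (conn, client, listener):
    sock.close()
```

The connect succeeds even though nothing has called accept() yet,
which is why a short restart need not drop requests. Once the backlog
is full, further connection attempts are no longer queued this way.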
>
> The only potential issue I see is if the restart takes too long and either
> the OS queue of pending inbound connections fills up or nginx's FastCGI
> module times out the request.
>
> If stopping and starting all the gate processes in one command takes too
> long, you can restart them one by one with supervisorctl.
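
The one-by-one restart could also be scripted against supervisor's
XML-RPC interface instead of calling supervisorctl from cron. A rough
sketch: rolling_restart and its pause parameter are made up for this
example, while getAllProcessInfo, stopProcess and startProcess are
methods of supervisor's XML-RPC API. It is written for Python 3 (on
Python 2 the client module is xmlrpclib), and the URL in the usage
comment assumes an [inet_http_server] section listening on port 9001.

```python
import time


def rolling_restart(supervisor, group, pause=1.0):
    """Restart every process in `group` one at a time; return the names restarted."""
    restarted = []
    for info in supervisor.getAllProcessInfo():
        if info["group"] != group:
            continue
        name = "%s:%s" % (info["group"], info["name"])
        supervisor.stopProcess(name, True)   # wait=True: block until stopped
        supervisor.startProcess(name, True)  # wait=True: block until started
        restarted.append(name)
        time.sleep(pause)  # give the new process a moment before the next one
    return restarted


# Usage (assumed URL, adjust to the local supervisord config):
#   from xmlrpc.client import ServerProxy
#   server = ServerProxy("http://localhost:9001/RPC2")
#   rolling_restart(server.supervisor, "gate")
```

With numprocs=10, nine gate processes keep accepting connections
while the tenth is being restarted.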
>
> Hope that helps,
>
> Roger
>
> On Wed, Apr 15, 2009 at 5:50 AM, Jacob Singh <[email protected]> wrote:
>>
>> Hi Roger,
>>
>> I'm having some issues since going into production.
>>
>> Pardon my ignorance, but...
>>
>> I've got the following config:
>>
>> [fcgi-program:gate]
>> socket=tcp://127.0.0.1:8080  ; We reference this later in nginx
>> command = /usr/local/mything/bin/gate2.py
>> environment=PYTHON_EGG_CACHE=/tmp/.pythoneggs,LOG_LEVEL=WARNING #warning
>> numprocs=10
>> process_name=gate_%(process_num)s
>>
>> I then use:
>>
>> fastcgi_pass 127.0.0.1:8080 in nginx to pass the requests in.  It works
>> fine, but after a while I started getting timeout errors, so I
>> added a cron job to restart them every hour.
>>
>> This works okay, but I really don't want to restart them all together,
>> because then requests are dropped.
>>
>> Best,
>> Jacob
>>
>>
>> On Wed, Apr 8, 2009 at 1:11 AM, Roger Hoover <[email protected]> wrote:
>> > Sweet.  Glad it worked for you.  A release of supervisor should be
>> > coming soon.
>> >
>> >
>> > http://www.mail-archive.com/[email protected]/msg00144.html
>> >
>> > On Mon, Apr 6, 2009 at 10:46 PM, Jacob Singh <[email protected]> wrote:
>> >>
>> >> Never mind, I got it figured out.
>> >>
>> >> Thanks! This is awesome.  I hope it gets into a release because my
>> >> company is wary of using something that hasn't had a release for a
>> >> year.
>> >>
>> >> On Tue, Apr 7, 2009 at 11:02 AM, Jacob Singh <[email protected]> wrote:
>> >> > Hmm...
>> >> >
>> >> > Okay, I've got it making the request through nginx; however, the
>> >> > environ variable is empty in my WSGI script.  The same script works
>> >> > fine when I create my own named sockets and add them to an nginx
>> >> > upstream...
>> >> >
>> >> > Not sure how to proceed on that.
>> >> >
>> >> > Best,
>> >> > Jacob
>> >> >
>> >> > On Mon, Apr 6, 2009 at 9:46 PM, Roger Hoover <[email protected]> wrote:
>> >> >> Hi Jacob,
>> >> >>
>> >> >> Your configuration has the FastCGI process listening on 127.0.0.1:1212,
>> >> >> so that socket is expecting the client to speak FCGI.  If you use curl
>> >> >> to send an HTTP request, it won't understand the request.  You need to
>> >> >> configure a web server such as nginx that will proxy HTTP requests over
>> >> >> FastCGI.  Nginx will need to listen on another port (say 5000) and proxy
>> >> >> requests to your FastCGI processes listening on 127.0.0.1:1212.
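
To illustrate the protocol mismatch: a FastCGI client opens with a
binary record, not with HTTP text. A rough sketch that builds the
first record by hand from the FastCGI 1.0 record layout and puts it
next to what curl sends. The function name begin_request_record is
made up for this example.

```python
import struct

FCGI_VERSION_1 = 1
FCGI_BEGIN_REQUEST = 1  # record type
FCGI_RESPONDER = 1      # role


def begin_request_record(request_id=1):
    """Build the binary FCGI_BEGIN_REQUEST record a web server sends first."""
    # Body: role (2 bytes), flags (1 byte), 5 reserved bytes.
    body = struct.pack("!HB5x", FCGI_RESPONDER, 0)
    # Header: version, type, request id, content length, padding length, reserved.
    header = struct.pack(
        "!BBHHBx", FCGI_VERSION_1, FCGI_BEGIN_REQUEST, request_id, len(body), 0
    )
    return header + body


http_request = b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n"  # what curl sends
fcgi_record = begin_request_record()

print(fcgi_record)       # starts with version byte 1 and type byte 1
print(http_request[:8])  # the 8 bytes a FastCGI server would read as a header
```

A server that reads the first 8 bytes of the HTTP request as a record
header sees nonsense values for version, type and content length,
which may be why curl gets no reply and appears to hang.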
>> >> >>
>> >> >> Hope that helps,
>> >> >>
>> >> >> Roger
>> >> >>
>> >> >> On Sun, Apr 5, 2009 at 10:28 PM, Jacob Singh <[email protected]> wrote:
>> >> >>>
>> >> >>> Hi folks!
>> >> >>>
>> >> >>> I just found out about this project from:
>> >> >>>
>> >> >>>
>> >> >>>
>> >> >>> http://just-another.net/2009/01/18/byteflowdjangosupervisordnginx-win/#comments
>> >> >>>
>> >> >>> I've been trying to accomplish the same goal, but not using django.
>> >> >>>
>> >> >>> It all *kinda* works, but when I try to curl my fcgi program, I get
>> >> >>> nada, and it just hangs forever with no logs... don't know where to
>> >> >>> start.  I'm using trunk.
>> >> >>>
>> >> >>>
>> >> >>> Server info:
>> >> >>> Python 2.4.3 (#1, Mar 14 2007, 18:51:08)
>> >> >>> [GCC 4.1.1 20070105 (Red Hat 4.1.1-52)] on linux2
>> >> >>>
>> >> >>>
>> >> >>> Here's my config (relevant bits):
>> >> >>> --------------------------------------------------
>> >> >>> [supervisord]
>> >> >>> logfile=/tmp/supervisord.log ; (main log file;default $CWD/supervisord.log)
>> >> >>> logfile_maxbytes=50MB       ; (max main logfile bytes b4 rotation;default 50MB)
>> >> >>> logfile_backups=10          ; (num of main logfile rotation backups;default 10)
>> >> >>> loglevel=debug               ; (log level;default info; others: debug,warn,trace)
>> >> >>> pidfile=/tmp/supervisord.pid ; (supervisord pidfile;default supervisord.pid)
>> >> >>> nodaemon=true             ; (start in foreground if true;default false)
>> >> >>> minfds=1024                 ; (min. avail startup file descriptors;default 1024)
>> >> >>> minprocs=200                ; (min. avail process descriptors;default 200)
>> >> >>> ;umask=022                  ; (process file creation umask;default 022)
>> >> >>> user=nobody                ; (default is current user, required if root)
>> >> >>>
>> >> >>> ; Production setup
>> >> >>> [fcgi-program:gate]
>> >> >>> socket=tcp://127.0.0.1:1212  ; We reference this later in nginx
>> >> >>> #command = /usr/local/solrflare/bin/gate.py  ; Calls the above code
>> >> >>> command = /tmp/new.py
>> >> >>> environment=PYTHON_EGG_CACHE=/tmp  ; Setup needed environment
>> >> >>>
>> >> >>>
>> >> >>> And here is new.py:
>> >> >>> ----------------------------------------------
>> >> >>>
>> >> >>> #!/usr/bin/python
>> >> >>> from flup.server.fcgi import WSGIServer
>> >> >>>
>> >> >>> # Written once at startup, to confirm the script was launched.
>> >> >>> open('/tmp/new.log', 'a').write('something')
>> >> >>>
>> >> >>> def app(environ, start_response):
>> >> >>>     # Written on every request, to confirm the app was called.
>> >> >>>     open('/tmp/new.log', 'a').write('else')
>> >> >>>     status = "200 OK"
>> >> >>>     response_headers = [('Content-type', 'text/plain')]
>> >> >>>     start_response(status, response_headers)
>> >> >>>     return ['LOALALA\n']
>> >> >>>
>> >> >>> WSGIServer(app).run()
>> >> >>>
>> >> >>>
>> >> >>> My Log:
>> >> >>> -------------------------------------------------
>> >> >>> [r...@balancer:/tmp] supervisord
>> >> >>> 2009-04-06 01:19:01,308 CRIT Set uid to user 99
>> >> >>> 2009-04-06 01:19:01,500 INFO RPC interface 'supervisor' initialized
>> >> >>> 2009-04-06 01:19:01,501 INFO RPC interface 'supervisor' initialized
>> >> >>> 2009-04-06 01:19:01,501 INFO supervisord started with pid 5886
>> >> >>> 2009-04-06 01:19:02,499 DEBG fd 8 closed, stopped monitoring <PInputDispatcher at -1216741876 for <Subprocess at -1216915476 with name gate in state STARTING> (stdin)>
>> >> >>> 2009-04-06 01:19:02,510 INFO spawned: 'gate' with pid 5888
>> >> >>> 2009-04-06 01:19:03,508 INFO success: gate entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
>> >> >>>
>> >> >>>
>> >> >>> curl localhost:1212
>> >> >>> Just sits there forever...
>> >> >>>
>> >> >>>
>> >> >>> Help!?
>> >> >>>
>> >> >>> Thanks,
>> >> >>> Jacob
>> >> >>>
>> >> >>>
>> >> >>> --
>> >> >>>
>> >> >>> +1 510 277-0891 (o)
>> >> >>> +91 9999 33 7458 (m)
>> >> >>>
>> >> >>> web: http://pajamadesign.com
>> >> >>>
>> >> >>> Skype: pajamadesign
>> >> >>> Yahoo: jacobsingh
>> >> >>> AIM: jacobsingh
>> >> >>> gTalk: [email protected]
>> >> >>> _______________________________________________
>> >> >>> Supervisor-users mailing list
>> >> >>> [email protected]
>> >> >>> http://lists.supervisord.org/mailman/listinfo/supervisor-users