> Hi, we are using uWSGI in emperor mode on AWS and we recently encountered a
> bad case of EBS drive degradation that forced us to restart uWSGI once the
> hard drive was working correctly again. My main question is: is there any
> configuration option that would have allowed us to automatically restart
> uWSGI.
>
> The details:
>
> 1. We use uWSGI 2.0.17 in emperor mode and nginx is used as a reverse
> proxy.
>
> 2. Our main application has the following configuration:
> ```
> [uwsgi]
> socket = 127.0.0.1:8999
> pythonpath = /path/to/app
> virtualenv = /path/to/virtualenv
> processes = 6
> max-requests = 100
> reload-on-rss = 100
> master = true
> harakiri = 20
> module = our_app.wsgi
> pidfile = /var/run/webapp/our_app.pid
> post-buffering = 4096
> logger = syslog:uwsgi.our_app
> disable-logging = true
> log-date = true
> log-slow = 1000
> log-5xx = true
> log-maxsize = 16777216
> ```
>
> 3. When the EBS drive started behaving erratically, we started receiving
> hundreds of harakiri notifications:
>
> `Sep 25 20:10:15 our_server uwsgi.our_app: Tue Sep 25 20:05:37 2018 - ***
> HARAKIRI ON WORKER 1 (pid: 3720, try: 382) **`
>
> 4. Then we started to see this log statement:
>
> `Sep 25 20:13:35 our_server uwsgi.our_app: Tue Sep 25 20:08:57 2018 - ***
> uWSGI listen queue of socket "127.0.0.1:8999" (fd: 6) full !!! (101/100)
> ***`
>
> 5. Once the hard drive recovered, new requests were still returning 504
> gateway timeout error for several minutes. Restarting uWSGI "fixed" the
> problem.
>
> My uninformed guess is that uWSGI was still trying to process the listen
> queue, but I wonder if there is some configuration parameter or some
> statistics emitted by uWSGI that we could use to automate a restart if this
> ever happens again.
>
> In any case, thanks for this wonderful piece of software!
Hi, technically this has nothing to do with uWSGI itself but with how sockets work: as long as the listening socket is open, the kernel will keep filling its backlog queue, whether or not the workers are able to accept connections.

In your case a full stop/restart (a graceful reload is not enough, because the listening socket is not closed) should probably have been triggered once the EBS volume resumed.

Note that you can attach an alarm to the "listen queue full" event:

https://uwsgi-docs.readthedocs.io/en/latest/AlarmSubsystem.html

and possibly use it to trigger the restart, but honestly your case is so "apocalyptic" that a manual procedure is the safest thing to do (imagine restarting uWSGI under a DoS: you would end up with both the network and the system load destroyed ;)

--
Roberto De Ioris
http://unbit.com
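
As a rough illustration of the alarm approach mentioned in the reply, something along these lines could be added to the application's configuration. This is only a sketch: the alarm name `lqfull` and the script path `/usr/local/bin/notify-ops.sh` are hypothetical, and the `alarm` / `alarm-backlog` options and the `cmd` alarm plugin are used as described in the Alarm subsystem documentation linked above, so they should be checked against the uWSGI version actually deployed (2.0.17 in the question).

```ini
[uwsgi]
; ... existing options from the configuration quoted in the question ...

; Declare an alarm named "lqfull" (hypothetical name). The "cmd" alarm
; plugin runs the given command and passes the alarm message on stdin.
; The script path is an assumption; point it at your own notifier.
alarm = lqfull cmd:/usr/local/bin/notify-ops.sh

; Raise that alarm whenever the listen queue (socket backlog) is full.
alarm-backlog = lqfull
```

Here the command only notifies an operator rather than restarting uWSGI, in line with the advice that a manual procedure is safer in this kind of failure. Whether the same hook should ever perform an automatic stop/restart is a judgment call that depends on the deployment.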