On 01/06/2014 05:31 PM, Chuck Weinstock wrote:
> Thanks!
>
> Yes to the stale lock problem. Regarding the other problem…the last time it
> shut down was January 1. Here are some of the qrunner log entries just prior
> to that:
>
>> Dec 30 18:17:20 2013 (8351) Master qrunner detected subprocess exit
>> (pid: 2209, sig: 9, sts: None, class: ArchRunner, slice: 1/1) [restarting]
>> Dec 30 18:17:23 2013 (16892) ArchRunner qrunner started.
>> Dec 31 00:21:05 2013 (8351) Master qrunner detected subprocess exit
>> (pid: 16892, sig: 9, sts: None, class: ArchRunner, slice: 1/1) [restarting]
>> Dec 31 00:21:10 2013 (31527) ArchRunner qrunner started.
>> Dec 31 06:25:01 2013 (8351) Master qrunner detected subprocess exit
>> (pid: 15347, sig: 9, sts: None, class: IncomingRunner, slice: 1/1)
>> [restarting]
>> Dec 31 06:25:04 2013 (13794) IncomingRunner qrunner started.
>> Dec 31 12:28:51 2013 (8351) Master qrunner detected subprocess exit
>> (pid: 13794, sig: 9, sts: None, class: IncomingRunner, slice: 1/1)
>> [restarting]
>> Dec 31 12:28:53 2013 (28877) IncomingRunner qrunner started.
>> Dec 31 18:32:44 2013 (8351) Master qrunner detected subprocess exit
>> (pid: 31527, sig: 9, sts: None, class: ArchRunner, slice: 1/1) [restarting]
>> Dec 31 18:32:46 2013 (10916) ArchRunner qrunner started.
>> Jan 01 00:36:02 2014 (8351) Master qrunner detected subprocess exit
>> (pid: 12268, sig: 9, sts: None, class: OutgoingRunner, slice: 1/1)
>> [restarting]
>> Jan 01 00:36:04 2014 (25317) OutgoingRunner qrunner started.
>> Jan 01 12:43:48 2014 (8351) Master qrunner detected subprocess exit
>> (pid: 10916, sig: 9, sts: None, class: ArchRunner, slice: 1/1) [restarting]
>> Jan 01 12:43:50 2014 (22804) ArchRunner qrunner started.
>> Jan 01 15:22:22 2014 (8351) Master qrunner detected subprocess exit
>> (pid: 28877, sig: 9, sts: None, class: IncomingRunner, slice: 1/1)
>> [restarting]
>> Jan 01 15:22:22 2014 (8351) Qrunner IncomingRunner reached maximum restart
>> limit of 10, not restarting.
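
For a longer qrunner log, a short script can tally the subprocess exits by runner class and signal instead of reading them by eye. This is only a minimal sketch: it assumes the entries have the "(pid: ..., sig: ..., sts: ..., class: ..., slice: ...)" shape quoted here, and the regex, the `tally_exits` name and the two-entry sample are mine for illustration, not part of Mailman.

```python
import re
from collections import Counter

# Matches the detail part of a "Master qrunner detected subprocess exit" entry.
# Assumption: the field layout is the one in the quoted log; other Mailman
# versions may format it differently.
EXIT_RE = re.compile(
    r"\(pid:\s*(?P<pid>\d+),\s*sig:\s*(?P<sig>\w+),\s*sts:\s*(?P<sts>\w+),"
    r"\s*class:\s*(?P<cls>\w+),\s*slice:\s*(?P<slice>\d+/\d+)\)"
)


def tally_exits(log_text):
    """Count subprocess exits keyed by (runner class, signal)."""
    # Entries can wrap across lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    counts = Counter()
    for match in EXIT_RE.finditer(flat):
        counts[(match.group("cls"), match.group("sig"))] += 1
    return counts


# Two entries copied from the quoted log, used as sample input.
sample = """\
Dec 30 18:17:20 2013 (8351) Master qrunner detected subprocess exit
(pid: 2209, sig: 9, sts: None, class: ArchRunner, slice: 1/1) [restarting]
Dec 31 06:25:01 2013 (8351) Master qrunner detected subprocess exit
(pid: 15347, sig: 9, sts: None, class: IncomingRunner, slice: 1/1)
[restarting]
"""

if __name__ == "__main__":
    for (cls, sig), n in sorted(tally_exits(sample).items()):
        print(f"{cls}: {n} exit(s) with signal {sig}")
```

Feeding it the whole log file rather than `sample` would show at a glance whether any exit was caused by something other than signal 9.
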

All of the above are signal 9 (SIGKILL). Do you have some cron or other
process that's SIGKILLing the qrunners in an attempt to keep them small
or for some other reason? See the FAQ at <http://wiki.list.org/x/94A9>.

> Also there are no errors in the error log around the same time. I am seeing a
> bunch of errors (now) like:
>
>> Jan 05 20:46:54 2014 (1522) Uncaught runner exception: [Errno 2] No such
>> file or directory:
>> '/usr/local/mailman/qfiles/in/1388971910.759851+b18e7af8cb0632a2d9f551c9e39053510b278e9a.pck'
>> Jan 05 20:46:54 2014 (1522) Traceback (most recent call last):
>> File "/usr/local/mailman/Mailman/Queue/Runner.py", line 99, in _oneloop
>> msg, msgdata = self._switchboard.dequeue(filebase)
>> File "/usr/local/mailman/Mailman/Queue/Switchboard.py", line 154, in
>> dequeue
>> fp = open(filename)
>> IOError: [Errno 2] No such file or directory:
>> '/usr/local/mailman/qfiles/in/1388971910.759851+b18e7af8cb0632a2d9f551c9e39053510b278e9a.pck'
>>
>> Jan 05 20:46:54 2014 (1522) Skipping and preserving unparseable message:
>> 1388971910.759851+b18e7af8cb0632a2d9f551c9e39053510b278e9a
>> Jan 05 20:46:54 2014 (1522) Failed to unlink/preserve backup file:
>> /usr/local/mailman/qfiles/in/1388971910.759851+b18e7af8cb0632a2d9f551c9e39053510b278e9a.bak
>> [Errno 2] No such file or directory
>
>
> I think these are related to some pck files that I hand deleted because I
> thought they were causing the stale lock problem.

I think these are because you have more than one qrunner processing the
same slice of the same queue. See the FAQ at
<http://wiki.list.org/x/_4A9>.

-- 
Mark Sapiro <m...@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B.
Dylan
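
P.S. On the point about more than one qrunner processing the same slice of the same queue: one way to check is to look at the process list for two qrunners started with the same runner spec. The sketch below assumes each qrunner command line carries a `--runner=Class:slice:range` argument and that the PID is the first column, as in `ps -eo pid,args` output. The `find_duplicate_runners` name, the sample lines and their PIDs are made up for illustration and are not taken from the log.

```python
import re
from collections import defaultdict

# Assumption: qrunner command lines include --runner=Class:slice:range.
RUNNER_RE = re.compile(r"--runner=(\w+):(\d+):(\d+)")


def find_duplicate_runners(ps_lines):
    """Return {(class, slice, range): [pids]} for specs seen more than once."""
    seen = defaultdict(list)
    for line in ps_lines:
        match = RUNNER_RE.search(line)
        if not match:
            continue  # not a qrunner process line
        pid = line.split()[0]  # assumes the PID is the first column
        spec = (match.group(1), int(match.group(2)), int(match.group(3)))
        seen[spec].append(pid)
    return {spec: pids for spec, pids in seen.items() if len(pids) > 1}


# Hypothetical process list; PIDs and interpreter path are illustrative only.
sample_ps = [
    "101 /usr/bin/python /usr/local/mailman/bin/qrunner --runner=IncomingRunner:0:1 -s",
    "102 /usr/bin/python /usr/local/mailman/bin/qrunner --runner=IncomingRunner:0:1 -s",
    "103 /usr/bin/python /usr/local/mailman/bin/qrunner --runner=ArchRunner:0:1 -s",
]

if __name__ == "__main__":
    for (cls, slice_no, total), pids in find_duplicate_runners(sample_ps).items():
        print(f"{cls} slice {slice_no} of {total} is run by PIDs {', '.join(pids)}")
```

Any spec reported here has two processes competing for the same queue entries, which would fit the "No such file or directory" errors on .pck files.
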