We have been running *without* num runs for over a year (and have never used
it). It is a very elusive issue that we have not been able to reproduce.

I would like more info on this, but it needs to be very detailed, even to the
point of access to a system exposing the behavior.

Bolke

Sent from my iPhone

> On 24 Mar 2017, at 16:04, Vijay Ramesh <vi...@change.org> wrote:
> 
> We literally have a cron job that restarts the scheduler every 30 min. Num
> runs didn't work consistently in rc4: sometimes it would restart itself and
> sometimes we'd end up with a few zombie scheduler processes and things
> would get stuck. We are also running locally, without celery.
> 
>> On Mar 24, 2017 16:02, <lro...@quartethealth.com> wrote:
>> 
>> We have max runs set and still hit this. Our solution is dumber:
>> monitor the log output and kill the scheduler if it stops emitting. Works
>> like a charm.
>> 
>>> On Mar 24, 2017, at 5:50 PM, F. Hakan Koklu <fhakan.ko...@gmail.com> wrote:
>>> 
>>> Some solutions to this problem are restarting the scheduler frequently or
>>> some sort of monitoring on the scheduler. We have set up a dag that pings
>>> cronitor <https://cronitor.io/> (a dead man's snitch type of service) every
>>> 10 minutes, and the snitch pages you when the scheduler dies and stops
>>> sending pings to it.
>>> 
>>> On Fri, Mar 24, 2017 at 1:49 PM, Andrew Phillips <aphill...@qrmedia.com>
>>> wrote:
>>> 
>>>>> We use celery and run into it from time to time.
>>>> 
>>>> Bang goes my theory ;-) At least, assuming it's the same underlying
>>>> cause...
>>>> 
>>>> Regards
>>>> 
>>>> ap
>>>> 
>> 
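
For reference, the log-monitoring workaround quoted above (kill the scheduler if it stops emitting) could be sketched roughly as follows. This is only an illustration, not what anyone in the thread actually runs: the function names, the 300-second silence threshold, and the use of the log file's modification time as the "still emitting" signal are all assumptions, and restarting the killed process is left to whatever supervisor is in place.

```python
import os
import signal
import time


def seconds_since_last_write(log_path, now=None):
    """Return how many seconds ago the log file was last modified."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(log_path)


def is_stalled(log_path, max_silence_s, now=None):
    """True if the log has been silent for longer than max_silence_s."""
    return seconds_since_last_write(log_path, now) > max_silence_s


def watch(log_path, pid, max_silence_s=300, poll_s=30):
    """Poll the log and send SIGTERM to pid once the log goes quiet.

    Hypothetical helper: the thresholds are placeholders, and the
    restart itself is assumed to be handled by a process supervisor.
    """
    while True:
        if is_stalled(log_path, max_silence_s):
            os.kill(pid, signal.SIGTERM)
            return
        time.sleep(poll_s)
```

Calling `watch()` with the scheduler's log path and PID would block until the log has been quiet for `max_silence_s` seconds and then terminate that process. Checking the modification time avoids parsing the log at all, at the cost of being fooled by anything else that touches the file.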
