Can you confirm that this happens with Celery? It sounds awfully like this: http://stackoverflow.com/questions/27737990/django-celery-queue-getting-stuck
Sent from my iPhone

> On 1 sep. 2016, at 21:59, Sergei Iakhnin <[email protected]> wrote:
>
> Alexandre talked about this being a known issue at least as far back as 10
> months ago.
>
>> On Thu, 1 Sep 2016, 21:46 Bolke de Bruin, <[email protected]> wrote:
>>
>> Again, please create a JIRA and add as much info as possible, including
>> debug logs, executor logs, and broker logs. If possible, a database dump.
>>
>> Note the Airflow version, Celery version, RabbitMQ/Redis etc., and provide
>> config details.
>>
>> We really need more info to hunt this down, as it has been quite elusive.
>> And I/we have not been able to replicate it.
>>
>> Bolke
>>
>> Sent from my iPhone
>>
>>> On 1 sep. 2016, at 20:45, [email protected] wrote:
>>>
>>> We face exactly the same issue...
>>> I tried to describe it here this week,
>>> but no one had a solution.
>>>
>>> On 1 Sep 2016, at 17:54, Sergei Iakhnin <[email protected]> wrote:
>>>
>>>> As far as I know even Airbnb themselves restart their schedulers every
>>>> 30 minutes because of this issue. I ended up doing it as well with a
>>>> cron job after giving up hope that it would be fixed in the short term.
>>>>
>>>>> On Thu, 1 Sep 2016, 16:03 Charalampos Paravalos, <[email protected]> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am writing to ask for advice on an issue that I have with Airflow
>>>>> and that I have not managed to resolve so far. I am wondering if
>>>>> someone else has had something similar in the past.
>>>>>
>>>>> We use Airflow to schedule DAGs that run some jobs periodically
>>>>> (every 30 min/1 hr). Jobs run as normal, but there are times when,
>>>>> after the DAGs have finished, the next scheduled jobs suddenly do not
>>>>> start at all. It seems like the server does not kick off the scheduled
>>>>> jobs for any of the DAGs defined (so no jobs are running on our
>>>>> server). When that happens I have to restart the scheduler, and the
>>>>> jobs are kicked off automatically after the restart. The jobs then run
>>>>> until this issue appears again (I have noticed it happening every 1 or
>>>>> 2 days, which is quite often).
>>>>>
>>>>> This is very strange. I tried upgrading to version 1.7.1.3, but the
>>>>> issue is still there. We use 32 concurrent jobs with Celery workers,
>>>>> and the server is able to manage the load well.
>>>>>
>>>>> I believe it has to do with the scheduler, but I can't understand why.
>>>>> Backfilled jobs, maybe? Can this be?
>>>>>
>>>>> I am looking forward to hearing back from anyone who has any ideas.
>>>>> Please let me know what information you might need about my setup
>>>>> anytime.
>>>>>
>>>>> Thanks for your help!
>>>>>
>>>>> Regards,
>>>>> Babis
>>>>
>>>> --
>>>>
>>>> Sergei
>
> --
>
> Sergei
