Re: Blog post: Upgrading & Scaling Airflow at Robinhood

2019-08-13 Thread Cong Zhu
Nice post Abhishek! Thanks Max for sharing! At Airbnb, we also have a cron job to restart the scheduler. Our cron job frequently checks some critical metrics (canary delay, latest scheduler heartbeat time, and latest dag_processor_manager_log modification time), and restarts the scheduler if
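
The message above is cut off before it states the restart condition. As a rough sketch only, assuming the condition is "any of the three metrics is older or larger than some limit", the check such a cron job performs might look like the Python below. The function names, the `Thresholds` class, and every limit value are hypothetical and do not come from the thread. How each metric is collected and how the scheduler is actually restarted are left out, because the message does not describe them.

```python
import os
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical limits in seconds; the thread does not give any values.
    max_canary_delay_s: float = 600.0
    max_heartbeat_age_s: float = 300.0
    max_log_age_s: float = 300.0


def log_mtime(path: str) -> float:
    """Return the last-modification time of a log file as a Unix timestamp."""
    return os.path.getmtime(path)


def stale_reasons(
    canary_delay_s: float,
    heartbeat_ts: float,
    log_mtime_ts: float,
    now: float,
    limits: Optional[Thresholds] = None,
) -> List[str]:
    """List which of the three metrics exceed their (assumed) limits."""
    limits = limits or Thresholds()
    reasons = []
    if canary_delay_s > limits.max_canary_delay_s:
        reasons.append("canary delay")
    if now - heartbeat_ts > limits.max_heartbeat_age_s:
        reasons.append("scheduler heartbeat")
    if now - log_mtime_ts > limits.max_log_age_s:
        reasons.append("dag_processor_manager log")
    return reasons
```

A cron wrapper would call `stale_reasons` with freshly collected values and trigger a restart through whatever process supervisor is in use when the returned list is non-empty. Returning the reasons rather than a bare boolean makes it easy to log why a restart happened.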

Re: Blog post: Upgrading & Scaling Airflow at Robinhood

2019-08-09 Thread Kevin Yang
Nice post Abhishek! Glad our discussion was helpful for you guys. To share more context with the community, Airbnb also had a problem with tasks stuck in the QUEUED state. Our issues were more on the executor side. Originally it was caused by a message-loss issue in early versions of Celery, which Alex Guziel

Re: Blog post: Upgrading & Scaling Airflow at Robinhood

2019-08-09 Thread Jarek Potiuk
+1

On Fri, Aug 9, 2019 at 10:54 PM Maxime Beauchemin <maximebeauche...@gmail.com> wrote:
> Thanks to Abhishek Ray @ Robinhood for this great post. I felt like I had
> to share it here
>
> https://robinhood.engineering/upgrading-scaling-airflow-at-robinhood-5b625dfaa2ee
>
> Max

-- Jarek

Blog post: Upgrading & Scaling Airflow at Robinhood

2019-08-09 Thread Maxime Beauchemin
Thanks to Abhishek Ray @ Robinhood for this great post. I felt like I had to share it here https://robinhood.engineering/upgrading-scaling-airflow-at-robinhood-5b625dfaa2ee Max