Re: Airflow DAG Serialisation

2019-07-31 Thread Kevin Yang
s/dagbag import exception/dag import timeout exception/ On Wed, Jul 31, 2019 at 11:17 PM Kevin Yang wrote: > Hi Jonathan, for your problem, aside from waiting for AIP-24 for the long term, > you can try setting the dagbag_import_timeout >

Re: Airflow DAG Serialisation

2019-07-31 Thread Kevin Yang
Hi Jonathan, for your problem, aside from waiting for AIP-24 for the long term, you can try setting the dagbag_import_timeout to a smaller value so that slow DAG file parsing ends sooner. Also I don't thi

Re: Airflow DAG Serialisation

2019-07-31 Thread Zhou Fang
I implemented the first version of the DAG serialization part of AIP-24: https://github.com/apache/airflow/pull/5701. Please take a look if you are interested @all. Thanks! It contains almost all fields of DAGs and tasks in the serialization (an example of serialized DAG here: https://github.com/apach

Re: Removal of "run_duration" and its impact on orphaned tasks

2019-07-31 Thread James Meickle
Yes, we use the Celery executor. To clarify, the tasks hadn't been running on workers for a long time, or even successfully submitted to Celery, so it's not a case where they got queued and then lost after some period of time. This happened shortly after UTC midnight, when we launch most of our t

Re: Removal of "run_duration" and its impact on orphaned tasks

2019-07-31 Thread Bolke de Bruin
Is this all with Celery? Afaik Lyft runs with Celery? Also, if I remember correctly, the Google guys had a fix for this but that hasn't been upstreamed yet? With Celery, tasks do get "lost" after a while with a certain setting (on a phone so don't have it handy, we do set a higher default) Can yo

Re: Removal of "run_duration" and its impact on orphaned tasks

2019-07-31 Thread James Meickle
Ash: We definitely don't run thousands of tasks. Looks like it's closer to 300 per execution date (and then retries), if I'm using the TI browser right. In my case, I found 21 tasks in "scheduled" state after 1 day of not restarting. One of our hourly "canary" DAGs got included in the pile-up - s

Re: Airflow DAG Serialisation

2019-07-31 Thread Tao Feng
Hey Zhou, great to see this happen and be made backward compatible. Persisting DAGs into the DB is definitely needed, and it will make migration easier with a lightweight approach. At Lyft we sometimes observe nondeterministic increased scheduling delay once users add some dynamically generated large

Re: Removal of "run_duration" and its impact on orphaned tasks

2019-07-31 Thread Tao Feng
Late to the game, as I wasn't aware that the `run_duration` option had been removed. I just want to point out that Lyft has a setup very similar to James': we run the scheduler for a fixed interval instead of in an infinite loop and let runit/supervisor restart the scheduler process. This is to solve:

Re: Removal of "run_duration" and its impact on orphaned tasks

2019-07-31 Thread Ash Berlin-Taylor
Thanks for testing this out James, shame to discover we still have problems in that area. Do you have an idea of how many tasks per day we are talking about here? Your cluster schedules quite a large number of tasks over the day (in the 1k-10k range?) right? I'd say whatever causes a task to b

Removal of "run_duration" and its impact on orphaned tasks

2019-07-31 Thread James Meickle
In my testing of 1.10.4rc3, I discovered that we were getting hit by a process leak bug (which Ash has since fixed in 1.10.4rc4). This process leak had minimal impact for most users, but was exacerbated in our case by using "run_duration" to restart the scheduler every 10 minutes. To mitigate that

Re: Request permission to modify existing AIP

2019-07-31 Thread Chen Tong
Thank you! On Wed, Jul 31, 2019 at 9:05 AM Driesprong, Fokko wrote: > Hi Chen, > > You should have permissions now. > > Cheers, Fokko > > On Wed, 31 Jul 2019 at 14:55, Chen Tong wrote: > > > Hi, > > I'd like to modify my votes for AIP-21: Changes in import paths. Could > you > > help grant me t

Re: Request permission to modify existing AIP

2019-07-31 Thread Driesprong, Fokko
Hi Chen, You should have permissions now. Cheers, Fokko On Wed, 31 Jul 2019 at 14:55, Chen Tong wrote: > Hi, > I'd like to modify my votes for AIP-21: Changes in import paths. Could you > help grant me the write permission? > Thanks! >

Request permission to modify existing AIP

2019-07-31 Thread Chen Tong
Hi, I'd like to modify my votes for AIP-21: Changes in import paths. Could you help grant me the write permission? Thanks!

Re: Airflow DAG Serialisation

2019-07-31 Thread Dan Davydov
An idea for serialization of dynamic DAGs is moving the serialization to the actual clients. This would require having a python Airflow API that the clients could call like dag.publish(). This enables a couple of things: 1) Clients can serialize as often as they like, and can even serialize in an e

Re: [VOTE] Release Apache Airflow 1.10.4 from RC4

2019-07-31 Thread Kaxil Naik
Thank You Ash & Jarek, well done. Will test the RCs on Friday after reaching UK. On Wed, Jul 31, 2019, 15:55 Ash Berlin-Taylor wrote: > And a shout-out to Jarek for his help with this release - both with his > work on speeding up the CI pipeline as much as we can on Travis, and with > some of t

Re: [VOTE] Release Apache Airflow 1.10.4 from RC4

2019-07-31 Thread Ash Berlin-Taylor
And a shout-out to Jarek for his help with this release - both with his work on speeding up the CI pipeline as much as we can on Travis, and with some of the cherry-picking! -a > On 31 Jul 2019, at 11:07, Jarek Potiuk wrote: > > 💟 > > On Wed, Jul 31, 2019 at 12:03 PM Ash Berlin-Taylor wrote

Re: [VOTE] Release Apache Airflow 1.10.4 from RC4

2019-07-31 Thread Jarek Potiuk
💟 On Wed, Jul 31, 2019 at 12:03 PM Ash Berlin-Taylor wrote: > Hi my fellow Airflow peeps, > > After the process leaking bug James found and the MySQL 8.0.16+ issue > Jarek reported (both of which have been fixed) we are now ready to try > again. I have just cut 1.10.4rc4. > > This email is calli

[VOTE] Release Apache Airflow 1.10.4 from RC4

2019-07-31 Thread Ash Berlin-Taylor
Hi my fellow Airflow peeps, After the process leaking bug James found and the MySQL 8.0.16+ issue Jarek reported (both of which have been fixed) we are now ready to try again. I have just cut 1.10.4rc4. This email is calling a vote on the release, which will last for 72 hours (2019-08-03 10:00

Re: [Discuss] AIP-23 Proposal "Migration out of Travis CI"

2019-07-31 Thread Jarek Potiuk
So GitLab is already working on automatically running builds for PRs :). Kamil got involved and will be our advocate on it: https://gitlab.com/gitlab-org/gitlab-ce/issues/65139 J. Principal Software Engineer Phone: +48660796129 On Fri, 26 Jul 2019, 18:12 Jarek Potiuk wrote: > Update:

Re: [VOTE] Changes in import paths

2019-07-31 Thread Jarek Potiuk
Yep. Agree with Ash on it. There are a number of 'action' operators specific to cloud providers and these should be our target. The transfer ones require another AIP (a lot of that was already discussed in AIP-8 https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=100827303). J. Principa

Re: [VOTE] Changes in import paths

2019-07-31 Thread Ash Berlin-Taylor
This is a good idea for now. I'm also not overly concerned about these few non-cloud examples - FTPtoS3Operator can stay where it is and doesn't have to move under 'aws.' to my mind. Longer term I'd like to go back to making the "transfer/copy/transform" operators "composable" so that we can ha