Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-24 Thread Ash Berlin-Taylor
James and I debugged a bit and we tracked down the problem to the multiprocessing.Manager process leaking. So I've removed the Manager entirely here https://github.com/apache/airflow/pull/5615 (whilst also not using a mp.Queue and the problems we had with unpredictable behaviour around it.) Onc

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-22 Thread Ash Berlin-Taylor
Given these two issues I'm changing my vote to -1. Ugh, MySQL :( The "orphan" process problem (they aren't zombies in the Linux process sense) is massively exacerbated by having a `--run-duration 600`, but we should try and fix this if we can. > On 20 Jul 2019, at 14:55, Jarek Potiuk wrote: > > I ha

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-20 Thread Jarek Potiuk
I have another issue that might be the nail in the coffin for this RC3. This is "external" - not related to changes in RC3 but to new MySQL features. There is an issue: https://issues.apache.org/jira/projects/AIRFLOW/issues/AIRFLOW-4995?filter=allopenissues (created yesterday) about not being able to

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-20 Thread Driesprong, Fokko
I've removed the run_duration and I'm in favor of removing num_runs as well. I've noticed that the service does not always exit cleanly, and leaves the database in an inconsistent state. After restarting, some of the DagRuns would not be picked up. This should not happen, but not sure if the root cause is

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-19 Thread Kaxil Naik
If you are using num_runs or run_duration, remove them. We have deprecated run_duration in master (we should probably cherry-pick that into the next release) - https://github.com/apache/airflow/blob/30defe130df1399d8ff41a350f714af820113161/UPDATING.md#remove-run_duration (Note in UPDATING.md) We should

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-18 Thread James Meickle
Yes, it's zombie processes owned by systemd. Here's a writeup with various logs: https://gist.github.com/Eronarn/c5bc2df607d168d6fda3a70700c941d9 It's reproducible (lmk if you want to co-pilot on debugging), and we're up to 3 zombies in the hour or two since our most recent reboot. So it won'

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-18 Thread Ash Berlin-Taylor
Errk, that's not good. When you say orphaned do you mean zombie processes owned by pid 1, or just "processes that are hanging around"? https://github.com/apache/airflow/pull/5605 might help if it's the latter (but I suspect it isn't) We've also had reports of the scheduler having a memory leak si

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-18 Thread James Meickle
Hi folks, Sorry to throw a wrench into this, but we've found that this release is leaving orphaned processes that eventually OOM the scheduler. 4 clusters have this problem post-upgrade, while 1 non-upgraded cluster doesn't have this problem. In our case we're using Supervisor to manage schedu

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-17 Thread Bolke de Bruin
+1, binding. Thanks Ash! Sent from my iPad > On 17 Jul 2019, at 20:45, Kaxil Naik wrote: > > +1 (binding) Ran DAGs in both UIs. LGTM > > Great Job Ash. Appreciate all the effort you put into this. > > Regards, > Kaxil > > On Wed, Jul 17, 2019 at 10:30 PM F

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-17 Thread Kaxil Naik
+1 (binding) Ran DAGs in both UIs. LGTM Great Job Ash. Appreciate all the effort you put into this. Regards, Kaxil On Wed, Jul 17, 2019 at 10:30 PM Felix Uellendall wrote: > +1 (non-binding) flawlessly passed all my tests via rbac and classic ui. > Also tested it on production-level dags. Grea

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-17 Thread Felix Uellendall
+1 (non-binding) flawlessly passed all my tests via rbac and classic ui. Also tested it on production-level dags. Great job Ash, thanks :) Kind regards, Felix Sent from ProtonMail mobile Original Message On Jul 17, 2019, 16:41, Ash Berlin-Taylor wrote: > Thanks Andrii - I mis

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-17 Thread Ash Berlin-Taylor
Thanks Andrii - I missed it too when reviewing the PR, so it's my fault as well. Make the PR against the v1-10-stable branch please. -ash > On 17 Jul 2019, at 14:25, Andrii Soldatenko > wrote: > > @Ash, I'll fix the wrong-section bug. > > Sorry about that. > > On Wed, Jul 17, 2019 at 4:11 PM Robin Ed

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-17 Thread Philippe Gagnon
+1 (non-binding) On Mon, Jul 15, 2019 at 10:17 AM Ash Berlin-Taylor wrote: > Hello Airflow community, > > This email is calling a vote on the release, which will last for 72 hours > (2019-07-08 15:15 Z), and until three binding votes have been cast. > Consider this my (binding) +1. > > Airflow 1

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-17 Thread Andrii Soldatenko
@Ash, I'll fix the wrong-section bug. Sorry about that. On Wed, Jul 17, 2019 at 4:11 PM Robin Edwards wrote: > +1 (non-binding) - been running in production since RC2. > > Thanks for all your hard work > > R > > On Tue, 16 Jul 2019 at 21:15, Ash Berlin-Taylor wrote: > > > > Thanks for testing. >

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-17 Thread Robin Edwards
+1 (non-binding) - been running in production since RC2. Thanks for all your hard work R On Tue, 16 Jul 2019 at 21:15, Ash Berlin-Taylor wrote: > > Thanks for testing. > > On 1) everyone should run upgradedb on every upgrade. The behaviour of not > running it wasn't great. > > 2) I thought we

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-16 Thread Ash Berlin-Taylor
Thanks for testing. On 1) everyone should run upgradedb on every upgrade. The behaviour of not running it wasn't great. 2) I thought we set deprecation on the ES logging config vars, except we put the deprecation under the wrong section: https://github.com/apache/airflow/blob/1.10.4rc3/airflo

Re: [VOTE] Release Airflow 1.10.4 from RC3

2019-07-16 Thread James Meickle
+1 (non-binding) to the release, it fixes a lot of UI issues we've been seeing lately. Though two notes: 1) Tasks were unschedulable until I ran an upgradedb due to the default pool change. 2) I got crash loops because I based our custom logging file off of the previous version's template. The chang

[VOTE] Release Airflow 1.10.4 from RC3

2019-07-15 Thread Ash Berlin-Taylor
Hello Airflow community, This email is calling a vote on the release, which will last for 72 hours (2019-07-08 15:15 Z), and until three binding votes have been cast. Consider this my (binding) +1. Airflow 1.10.4 RC3 is available at: https://dist.apache.org/repos/dist/dev/airflow/1.10.4rc3/ *a