Re: Why not mark inactive DAGs in the main scheduler loop?

2018-08-22 Thread Ruiqin Yang
I previously sent a proposal about scaling Airflow and created Jira tickets around that time. For this particular one, it is AIRFLOW-2760. We've finished testing it at Airbnb and plan to bake it for some time while I work on open-sourcing the

Re: Why not mark inactive DAGs in the main scheduler loop?

2018-08-22 Thread Taylor Edmiston
Kevin - Is there a Jira issue one can follow for this? On Wed, Aug 22, 2018 at 5:29 PM Ruiqin Yang wrote: > I'm working on splitting the DAG parsing manager into a subprocess and with > that we don't need to worry about scheduler doing non-supervisor stuff nor > prolong scheduler loop duration. I

Re: PR Review Dashboard?

2018-08-22 Thread Holden Karau
Thanks for the reminder, forgot to ask at coffee but I'll ask. On Wed, Aug 22, 2018, 1:52 AM Driesprong, Fokko wrote: > Hi Holden, > > Just curious if you got a hold of someone at the coffee machine :-) > > Cheers, Fokko > > On Tue, 7 Aug 2018 at 09:17, Holden Karau wrote: > > > The

Re: Why not mark inactive DAGs in the main scheduler loop?

2018-08-22 Thread Ruiqin Yang
I'm working on splitting the DAG parsing manager into a subprocess and with that we don't need to worry about scheduler doing non-supervisor stuff nor prolong scheduler loop duration. I can make a follow-up PR to address this once I have the original PR published if you guys don't have plans to work

Will redeploying webserver and scheduler in Kubernetes cluster kill running tasks

2018-08-22 Thread Kyle Hamlin
I'm about to make the switch to Kubernetes with Airflow, but am wondering what happens when my CI/CD pipeline redeploys the webserver and scheduler and there are still long-running tasks (pods). My intuition is that since the database holds all state and the tasks are in charge of updating their

Re: Why not mark inactive DAGs in the main scheduler loop?

2018-08-22 Thread Dan Davydov
Agreed on delegation to a subprocess but I think that can come as part of a larger redesign (maybe along with uploading DAG import errors etc). The query should be quite fast so it should not have a significant impact on the Scheduler times. On Wed, Aug 22, 2018 at 3:52 PM Maxime Beauchemin <

Re: Why not mark inactive DAGs in the main scheduler loop?

2018-08-22 Thread Maxime Beauchemin
I'd rather the scheduler delegate that to one of the minions (subprocess) if possible. We should keep everything we can off the main thread. BTW I've been speaking about renaming the scheduler to "supervisor" for a while now. While renaming may be a bit tricky (updating all references in the

Re: Why not mark inactive DAGs in the main scheduler loop?

2018-08-22 Thread Taylor Edmiston
I'm not super familiar with this part of the scheduler. What exactly are the implications of doing this mid-loop vs at scheduler termination? Is there a use case where DAGs hit this besides having been deleted? The deactivate_stale_dags call doesn't appear to be super expensive or anything like

Re: ExternalTaskSensor alternatives

2018-08-22 Thread Stefan Seelmann
On 08/22/2018 06:56 PM, Tao Feng wrote: > FYI, there is an existing PR and proposal for improving sensor efficiency ( > https://issues.apache.org/jira/browse/AIRFLOW-2747 and > https://github.com/apache/incubator-airflow/pull/3596/files) by the > community. And I hope I'll find some time next week

Re: Regarding airflow 1.10

2018-08-22 Thread Taylor Edmiston
Hemanth - To add to what Fokko mentioned... You can find the KubernetesExecutor implementation in KubernetesExecutor and AirflowKubernetesScheduler

Re: ExternalTaskSensor alternatives

2018-08-22 Thread Tao Feng
FYI, there is an existing PR and proposal for improving sensor efficiency (https://issues.apache.org/jira/browse/AIRFLOW-2747 and https://github.com/apache/incubator-airflow/pull/3596/files) by the community. For your idea, I am not sure if it is a good idea to add this DAG dependent

Re: [RESULT][VOTE] Release Airflow 1.10.0

2018-08-22 Thread Bolke de Bruin
@max Mine is "bolke" Cheers B. Sent from my iPhone > On 22 Aug 2018, at 16:13, Driesprong, Fokko wrote: > > Certainly: https://github.com/apache/incubator-airflow/releases/tag/1.10.0 > > Cheers, Fokko > > On Wed, 22 Aug 2018 at 15:18, Ash Berlin-Taylor wrote: > >> Could you push the git

Re: [RESULT][VOTE] Release Airflow 1.10.0

2018-08-22 Thread Driesprong, Fokko
Certainly: https://github.com/apache/incubator-airflow/releases/tag/1.10.0 Cheers, Fokko On Wed, 22 Aug 2018 at 15:18, Ash Berlin-Taylor wrote: > Could you push the git tag too please Fokko/Bolke? > > -ash > > > On 22 Aug 2018, at 08:16, Driesprong, Fokko > wrote: > > > > Thanks Max, > > > >

ExternalTaskSensor alternatives

2018-08-22 Thread Emmanuel Brard
Hi everyone, I've recently looked at the implementation of the ExternalTaskSensor sensor and I was wondering if it would be a good idea to actually implement this check (these checks) at the scheduler level. Basically the ExternalTaskSensor runs a query against the backend database at regular

Re: [RESULT][VOTE] Release Airflow 1.10.0

2018-08-22 Thread Ash Berlin-Taylor
Could you push the git tag too please Fokko/Bolke? -ash > On 22 Aug 2018, at 08:16, Driesprong, Fokko wrote: > > Thanks Max, > > My PyPI ID is Fokko > > Cheers, Fokko > > 2018-08-21 22:49 GMT+02:00 Maxime Beauchemin : > >> I can, what's your PyPI ID? >> >> Max >> >> On Mon, Aug 20, 2018

Re: PR Review Dashboard?

2018-08-22 Thread Driesprong, Fokko
Hi Holden, Just curious if you got a hold of someone at the coffee machine :-) Cheers, Fokko On Tue, 7 Aug 2018 at 09:17, Holden Karau wrote: > The JIRA/GitHub integration tooling I’m a little more fuzzy on but I’m > doing coffee with some of the folks who probably know the details this week

Re: Jira cleanup and triage

2018-08-22 Thread Driesprong, Fokko
Hi Gerardo, Thanks for bringing this up. This is actually a good point. Recently we've moved the Apache Airflow repo to GitBox (https://gitbox.apache.org/). Previously, the GitHub repo was just a mirror of the Apache one. Now we do everything on GitHub itself. We still

Jira cleanup and triage

2018-08-22 Thread Gerardo Curiel
Hi folks, Is there a recommended way for contributors to help close/triage Jira issues? I've been looking at issues to work on next, and I've found a few categories of issues: - Issues in need of triage: these might need to be checked against the latest version and then closed if they can't be

Re: [RESULT][VOTE] Release Airflow 1.10.0

2018-08-22 Thread Driesprong, Fokko
Thanks Max, My PyPI ID is Fokko Cheers, Fokko 2018-08-21 22:49 GMT+02:00 Maxime Beauchemin : > I can, what's your PyPI ID? > > Max > > On Mon, Aug 20, 2018 at 2:11 PM Driesprong, Fokko > wrote: > > > Thanks Bolke! > > > > I've just pushed the artifacts to Apache Dist: > > > >