Re: [AIP-35] Add Signal Based Scheduling To Airflow

2020-06-16 Thread Kevin Yang
I'm generally supportive of this idea of supporting streaming jobs. We at Airbnb have run stream jobs on Airflow for years, with some hacks of course. Yes, the stream jobs might not be idempotent or otherwise fit the Airflow paradigm. But I personally would love to see Airflow be expand

Re: [AIP-35] Add Signal Based Scheduling To Airflow

2020-06-16 Thread 蒋晓峰
Hi bharath, Could you please provide the permalink of your previous discussion about trigger based dag runs? In my opinion, compared with trigger based dag runs, signal based scheduling provides finer-grained triggers to determine whether to run the task instances on the Operator or not. Ext

Re: [AIP-35] Add Signal Based Scheduling To Airflow

2020-06-16 Thread 蒋晓峰
Hi Chris, Yes, we could have the scenario that the DagRun contains streaming task instances which run forever. Airflow currently focuses on batch task scheduling and assumes the tasks in a DAG could be completed. Only the existing mechanism doesn't support tasks intended to run indefinitely

Re: [UPDATE] AIP-31 .output update

2020-06-16 Thread 蒋晓峰
+1 (non-binding) On Wed, Jun 17, 2020 at 3:03 AM Gerard Casas Saez wrote: > Hi everyone, > > Sending an email here to consolidate an update to the AIP-31 that has > happened while we have been implementing this. > > When AIP-31 was proposed, the proposal mentioned that all operators would > have

Re: [UPDATE] AIP-31 .output update

2020-06-16 Thread Kaxil Naik
+1 (binding) On Tue, Jun 16, 2020 at 8:03 PM Gerard Casas Saez wrote: > Hi everyone, > > Sending an email here to consolidate an update to the AIP-31 that has > happened while we have been implementing this. > > When AIP-31 was proposed, the proposal mentioned that all operators would > have a _

[UPDATE] AIP-31 .output update

2020-06-16 Thread Gerard Casas Saez
Hi everyone, Sending an email here to consolidate an update to the AIP-31 that has happened while we have been implementing this. When AIP-31 was proposed, the proposal mentioned that all operators would have a __call__ method which would be used to define DAG in a functional manner. While imp

Re: [AIP-35] Add Signal Based Scheduling To Airflow

2020-06-16 Thread bharath palaksha
I had started a similar discussion earlier: trigger based dag runs, with a sensor instead of a cron expression which tells whether to trigger the dag run or not. This is similar to that. It is very useful when you have external systems outside of Airflow which can't be programmed to use REST API

Re: [AIP-35] Add Signal Based Scheduling To Airflow

2020-06-16 Thread Chris Palmer
Nicholas, Are you saying that you actually have tasks in Airflow that are intended to run indefinitely? That in and of itself seems to be a huge fundamental departure from many of the assumptions built into Airflow. Chris On Tue, Jun 16, 2020 at 12:00 PM Gerard Casas Saez wrote: > That looks inte

Re: [DISCUSS] Parametrized DAGs

2020-06-16 Thread Dan Davydov
I think AIP is borderline, but would probably err on the side of a tiny AIP since it's a fairly large change in a part of Airflow that is touching the user interface. I do not think we should support RunTimeParams to modify the topology (at > least at the beginning). I strongly agree and think we

Re: [DISCUSS] Parametrized DAGs

2020-06-16 Thread Gerard Casas Saez
How should we go about this? Is an AIP needed? GitHub issues? Given that most of the backend implementation seems to be done, it may be enough to open a few issues on GitHub and work on them. Gerard Casas Saez Twitter | Cortex | @casassaez On Jun 16, 2020, 2:07 AM -0600, Tomasz Urbaszek wrote:

Re: [AIP-35] Add Signal Based Scheduling To Airflow

2020-06-16 Thread Gerard Casas Saez
That looks interesting. My main worry is how to handle multiple executions of Signal based operators. If I follow your definition correctly, the idea is to run multiple evaluations of the online trained model (on a permanently running DAGRun). So what happens when you have triggered the downstre

[AIP-35] Add Signal Based Scheduling To Airflow

2020-06-16 Thread 蒋晓峰
Hello everyone, Sending a message to everyone to collect feedback on AIP-35 on adding signal-based scheduling. This was previously mentioned briefly in a discussion on the development Slack channel. The key motivation of this proposal is to support a mixture of batch and stream jobs in the s

Re: [DISCUSS] Parametrized DAGs

2020-06-16 Thread Tomasz Urbaszek
+1 for the idea Tomek On Tue, Jun 16, 2020 at 1:39 AM Kaxil Naik wrote: > Oh yes that sounds good, +1 to the idea as long as it can return a JSON > serializable object I am fine with it. > > On Tue, Jun 16, 2020 at 12:29 AM Gerard Casas Saez > wrote: > > > By XCom support before XComArg I mean