Re: [DISCUSS] AIP-12 Persist DAG into DB

2019-02-26 Thread Maxime Beauchemin
Related thoughts: * on the topic of serialization, let's be clear whether we're talking about unidirectional serialization and *not* deserialization back to the object. This works for making the web server stateless, but isn't a solution for how DAG definitions get shipped around on the cluster
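To make the distinction concrete, here is a minimal sketch of unidirectional serialization. It is not Airflow code: `serialize_dag`, `render_graph`, and the JSON layout are all hypothetical names chosen for illustration. The DAG structure is flattened to JSON once, and a stateless reader works from the plain document without ever rebuilding the original Python objects.

```python
import json


def serialize_dag(dag_id, edges):
    """One-way step: flatten a DAG's structure to a JSON string.

    `edges` is a list of (upstream_task_id, downstream_task_id) pairs.
    Nothing here supports turning the JSON back into operator objects.
    """
    tasks = sorted({task for edge in edges for task in edge})
    doc = {"dag_id": dag_id, "tasks": tasks, "edges": [list(e) for e in edges]}
    return json.dumps(doc, sort_keys=True)


def render_graph(blob):
    """What a stateless reader needs: plain data, no DAG file parsing."""
    doc = json.loads(blob)
    return [f"{up} -> {down}" for up, down in doc["edges"]]


blob = serialize_dag("example", [("extract", "transform"), ("transform", "load")])
for line in render_graph(blob):
    print(line)
```

Because the reader only ever sees task ids and edges, this approach says nothing about how executable DAG definitions reach the workers, which is the gap the message points out.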

Re: dag_run timeouts

2019-02-26 Thread Andrew Stahlman
+1 to renaming dagrun_timeout and to adding a true execution_timeout on the DAG. In the meantime, I've opened a PR [1] to clarify the current behavior in the relevant docstring. [1] https://github.com/apache/airflow/pull/4782

Re: [DISCUSS] AIP-12 Persist DAG into DB

2019-02-26 Thread Kevin Yang
My bad, I was misunderstanding a bit and mixing up two issues. I was thinking about the multiple runs for one DagRun issue (e.g. after we clear the DagRun). This is an orthogonal issue. So the current implementation can work in the long-term plan. Cheers, Kevin Y On Tue, Feb 26, 2019 at 2:34 AM

Re: dag_run timeouts

2019-02-26 Thread Ash Berlin-Taylor
Hmmm yeah, dag_run timeout is more closely a "cache eviction" sort of timeout, and the only time the dag run timeout comes into play is when a DAG has reached its maximum active runs, at which point one of the older ones that has exceeded its timeout will be "evicted" (i.e. failed). Having
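The "cache eviction" behaviour described above can be modelled in a few lines. This is a simplified sketch, not the scheduler's actual code: `Run` and `evict_timed_out` are hypothetical names, and failing every timed-out run (rather than just one) is an assumption made to keep the example short.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Run:
    run_id: str
    start: datetime
    state: str = "running"


def evict_timed_out(runs, max_active_runs, dagrun_timeout, now):
    """Fail timed-out runs, but only when the DAG is at its active-run limit."""
    active = [r for r in runs if r.state == "running"]
    if len(active) < max_active_runs:
        # Below the limit the timeout is never consulted, so a run can
        # exceed dagrun_timeout indefinitely without being failed.
        return []
    evicted = []
    for run in sorted(active, key=lambda r: r.start):  # oldest first
        if now - run.start > dagrun_timeout:
            run.state = "failed"
            evicted.append(run.run_id)
    return evicted


now = datetime(2019, 2, 26, 12, 0)
runs = [Run("old", now - timedelta(hours=3)), Run("new", now - timedelta(minutes=10))]
timeout = timedelta(hours=1)

print(evict_timed_out(runs, 3, timeout, now))  # limit not reached: nothing happens
print(evict_timed_out(runs, 2, timeout, now))  # limit reached: "old" is failed
```

The first call shows why the name is misleading: the run started three hours ago is well past the one-hour timeout, yet it keeps running until the active-run limit is hit.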

Re: [DISCUSS] AIP-12 Persist DAG into DB

2019-02-26 Thread Ash Berlin-Taylor
> On 26 Feb 2019, at 09:37, Kevin Yang wrote:
>
> Now since we're already trying to have multiple graphs for one
> execution_date, maybe we should just have multiple DagRun.

I thought that there is exactly 1 graph for a DAG run - dag_run has a "graph_id" column

Re: [DISCUSS] AIP-12 Persist DAG into DB

2019-02-26 Thread Kevin Yang
@Dan Davydov Continuing the discussion from the PR here--so that we discuss only code changes in the PR. For the AIP: I agree with what Max and Dan mentioned in the last thread discussing this AIP: for the long term we want DAG serialization and that should ser