Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-06-07 Thread Daniel Standish
I would like to add to this discussion a change to `TaskInstance.previous_ti`. It is a useful property in template context, but it has an identity crisis. Currently this property may return a different value depending on whether your dag is `catchup` and whether your dag has a schedule interval.

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-30 Thread Bas Harenslak
Nope. I never planned to have a discussion on execution_date, but it appeared more important than the other variables mentioned in my first email :) I already wrote a summary halfway through this thread. With the updates since then: * yesterday_ds, yesterday_ds_nodash, tomorrow_ds, tomorrow_ds_nodas

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-30 Thread Ash Berlin-Taylor
Did we come to any conclusion on this topic? > On 12 Apr 2019, at 10:26, Ash Berlin-Taylor wrote: > > Does anyone actually use end_date (of either spelling) given its value is > currently the same as `ds`: > https://github.com/apache/airflow/blob/3020d9733cb189f091489a62f58f0f586dc8d4a9/airf

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-12 Thread Bolke de Bruin
Okay. Yes, inlets and outlets will be picked up again. I expect sometime this quarter. Sent from my iPhone > On 12 Apr 2019, at 11:06, Bas Harenslak wrote: > > Reason for deprecating end_date is it is incorrect since the name does not > match the value, it’s currently the same as ds (the sta

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-12 Thread Ash Berlin-Taylor
Does anyone actually use end_date (of either spelling) given its value is currently the same as `ds`: https://github.com/apache/airflow/blob/3020d9733cb189f091489a62f58f0f586dc8d4a9/airflow/models/taskinstance.py#L1183-L1184 'END_DATE': ds, 'end_date': ds, That seems like it is just

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-12 Thread Bas Harenslak
Reason for deprecating end_date is it is incorrect since the name does not match the value, it’s currently the same as ds (the start datetime of the interval). Another option would be to keep it but set the value to the end date of the interval, although I think that would create even more conf
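
The mismatch described above can be illustrated with a small sketch. This is not Airflow code: `interval_bounds` is a hypothetical helper, and it assumes the schedule interval is a fixed `timedelta` rather than a cron expression. It only shows the difference between the start of the interval (what `ds` and the current `end_date` both hold) and the actual end of the interval.

```python
from datetime import datetime, timedelta


def interval_bounds(execution_date, schedule_interval):
    """Return (start, end) of the interval a run covers.

    Hypothetical helper for illustration only; assumes a fixed
    timedelta schedule interval.
    """
    start = execution_date                    # what `ds` is derived from
    end = execution_date + schedule_interval  # what the name `end_date` suggests
    return start, end


start, end = interval_bounds(datetime(2019, 4, 12), timedelta(days=1))
ds = start.strftime("%Y-%m-%d")
interval_end_ds = end.strftime("%Y-%m-%d")
```

With a daily interval, `ds` is the 12th while the interval actually ends on the 13th, which is why a variable named `end_date` holding the value of `ds` is misleading.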

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-12 Thread Bolke de Bruin
I don’t think guarding inlets and outlets makes sense, because if they are not defined the code won’t be executed anyway, i.e. the logic is already there, no need for a config flag. End-date is actually used by some beyond Airbnb as far as I know. What is the reason for deprecating it? Execution

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-11 Thread Maxime Beauchemin
It should be labelled as a "bridge" or "transition" release somehow, and tell the community to absolutely go there first and address all deprecation warnings prior to upgrading to 2.0. I wonder if semver has something for that. It might be the time to deprecate more things. Spring cleaning! Shou

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-11 Thread Ash Berlin-Taylor
I am happy to do another small(!) 1.10.x release. (There was a small bug I introduced where I broke the rendering of doc_md on dags) On 11 April 2019 22:52:13 BST, "Driesprong, Fokko" wrote: >I agree with Max here, we should be careful. > >Regarding the yesterday_ds, yesterday_ds_nodash, tomorr

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-11 Thread Maxime Beauchemin
`end_date` is useful, some DAGs / tasks may be scheduled to expire, say an A/B test in an A/B testing framework with an experiment duration, a backfill framework, or a framework/UI that allows users to schedule a task to run for, say, 30 days. I'd keep this one for sure, not all pipelines / ta

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-11 Thread Driesprong, Fokko
I agree with Max here, we should be careful. Regarding yesterday_ds, yesterday_ds_nodash, tomorrow_ds, tomorrow_ds_nodash: I'm not against having more readable shorthands; my concern is that having a tomorrow_ds doesn't make sense when we have an hourly (or weekly) job. I'm agains

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-10 Thread Bas Harenslak
Great discussion, let’s stay on track. If I can summarise:
* yesterday_ds, yesterday_ds_nodash, tomorrow_ds, tomorrow_ds_nodash
  * Arthur: some users use these for convenience
  * Bas/Fokko: these are values that can be easily derived in a one-liner
* tables
  * nobody?
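
To make the "derived in a one-liner" point concrete, here is a minimal sketch in plain Python. It assumes `execution_date` is an ordinary `datetime`; `derived_ds` is a hypothetical helper written for this illustration and is not part of Airflow.

```python
from datetime import datetime, timedelta


def derived_ds(execution_date):
    """Derive the yesterday/tomorrow date strings from execution_date.

    Hypothetical helper for illustration; each value is one line of
    date arithmetic plus formatting.
    """
    yesterday = execution_date - timedelta(days=1)
    tomorrow = execution_date + timedelta(days=1)
    return {
        "yesterday_ds": yesterday.strftime("%Y-%m-%d"),
        "yesterday_ds_nodash": yesterday.strftime("%Y%m%d"),
        "tomorrow_ds": tomorrow.strftime("%Y-%m-%d"),
        "tomorrow_ds_nodash": tomorrow.strftime("%Y%m%d"),
    }


values = derived_ds(datetime(2019, 4, 10))
```

Note that the one-day offset is hard-coded here, which is the concern raised in the thread: "yesterday" and "tomorrow" only line up with the schedule when the DAG runs daily.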

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-10 Thread Maxime Beauchemin
Making backwards incompatible changes that require altering the thousands (millions?!) of DAGs in the wild will alienate the community and prevent many from orchestrating an upgrade. Upgrading hundreds of DAGs and Airflow atomically would be hard and dangerous. To mitigate this, changes to the DAG

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-09 Thread James Meickle
I agree with Ash here. The naming of "execution_date" is incredibly confusing to people who are new to Airflow, who think it has something to do with... execution. However, I think that there's still room for improvement with "period_start" and "period_end". Think about manually triggered tasks -

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-09 Thread Driesprong, Fokko
Ash, Personally, I think this is quite clear, there is a list of reasons why the job isn't being scheduled: [image: image.png] Coming back to Bas's question, I believe that yesterday_ds does not make sense since we cannot assume that the schedule is daily. I don't see any usage of this varia

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-09 Thread Ash Berlin-Taylor
To (slightly) hijack this thread: On the subject of execution_date: as I'm sure we're all aware, the concept of execution_date is confusing to newcomers to Airflow (there are many questions about "why hasn't my DAG run yet?", "Why is my dag a day behind?" etc.) and although we mention this in

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-08 Thread Arthur Wiedmer
Hi Bas, 1) I am aware of a few places where those parameters are used in production in a few hundred jobs. I highly recommend we don't deprecate them unless we do it in a major version. 2) As James mentioned, inlets and outlets are a lineage annotation feature which is still under development. Le

Re: [DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-08 Thread James Meickle
I'm generally in favor of this idea. Several people on my team have been confused by the different date options and their meaning. For the dates, I think we should switch to providing alternate representations of dates exclusively via Jinja filters. So instead of "next_execution_date", you'd use "

[DISCUSS] AIRFLOW-4192 - remove duplicate/obsolete/derived task context variables

2019-04-08 Thread Bas Harenslak
Hi all, Following Tao Feng’s question to discuss this PR (AIRFLOW-4192), please discuss here if you agree/disagree/would change. --- The summary of the PR: I was confused by the task cont