Hi Bas,

1) I am aware of a few places where those parameters are used in production in a few hundred jobs. I highly recommend we don't deprecate them unless we do it in a major version.

2) As James mentioned, inlets and outlets are a lineage annotation feature which is still under development. Let's leave them in, but we can guard them behind a feature flag if you prefer.

3) The yesterday*/tomorrow* params are conveniences for anyone running a daily ETL. I agree with you that they are simple to compute, but not everyone using Apache Airflow is amazing with Python. Some users are only trying to get a query scheduled and rely on a couple of niceties like these to get by.

4) latest_date and end_date (I feel like there used to be a start_date, but maybe it got lost) were a blend of things used by a backfill framework used internally at Airbnb. latest_date was used if you needed to join to a dimension for which you only wanted the latest version of the attributes in your backfill. end_date was used for time ranges where several days were processed together to save on compute. I don't see an issue with removing them.

Best regards,
Arthur

On Mon, Apr 8, 2019 at 5:37 AM Bas Harenslak <basharens...@godatadriven.com> wrote:

> Hi all,
>
> Following Tao Feng’s question to discuss this PR<
> https://github.com/apache/airflow/pull/5010> (AIRFLOW-4192<
> https://issues.apache.org/jira/browse/AIRFLOW-4192>), please discuss here
> if you agree/disagree/would change.
>
> -----------
>
> The summary of the PR:
>
> I was confused by the task context values and suggest to clean up and
> clarify these variables. Some are derivations from other variables, some
> are undocumented and unused, some are wrong (name doesn’t match the value).
> Please discuss what you think of the removal of these variables:
>
>
> * Removed yesterday_ds, yesterday_ds_nodash, tomorrow_ds,
> tomorrow_ds_nodash. IMO the next_* and previous_* variables are useful
> since these require complex logic to compute the next execution date,
> however would leave computing the yesterday* and tomorrow* variables up to
> the user since they are simple one-liners and don't relate to the DAG
> interval.
> * Removed tables. This is a field in params, and is thus also
> accessible by the user ({{ params.tables }}). Also, it was undocumented.
> * Removed latest_date. It's the same as ds and was also undocumented.
> * Removed inlets and outlets. Also undocumented, and have the
> inlets/outlets ever worked/ever been used by anybody?
> * Removed end_date and END_DATE. Both have the same value, so it
> doesn't make sense to have both variables. Also, the value is ds which
> contains the start date of the interval, so the naming didn't make sense to
> me. However, if anybody argues in favour of adding "start_date" and
> "end_date" to provide the start and end datetime of task instance
> intervals, I'd be happy to add them.
>
> Cheers,
> Bas
>
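
For anyone following the thread who wants to see what the "simple one-liners" for the yesterday*/tomorrow* values might look like, here is a minimal sketch using only the Python standard library. It assumes ds is the usual YYYY-MM-DD string; the helper names shift_ds and nodash are made up for illustration and are not part of Airflow. Inside a template, {{ macros.ds_add(ds, -1) }} should give the same result as shift_ds(ds, -1).

```python
from datetime import datetime, timedelta

DS_FORMAT = "%Y-%m-%d"  # assumed format of the ds context value


def shift_ds(ds: str, days: int) -> str:
    """Return the YYYY-MM-DD string `days` days after `ds` (negative = before)."""
    shifted = datetime.strptime(ds, DS_FORMAT) + timedelta(days=days)
    return shifted.strftime(DS_FORMAT)


def nodash(ds: str) -> str:
    """Turn YYYY-MM-DD into YYYYMMDD, mirroring the *_nodash variants."""
    return ds.replace("-", "")


# Hypothetical execution date, used only for this example.
ds = "2019-04-08"

yesterday_ds = shift_ds(ds, -1)
tomorrow_ds = shift_ds(ds, 1)
yesterday_ds_nodash = nodash(yesterday_ds)
tomorrow_ds_nodash = nodash(tomorrow_ds)
```

Note that these values are plain calendar offsets from ds and do not depend on the DAG's schedule interval, which is the distinction the PR draws against next_* and previous_*.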