Ash,

Personally, I think this is quite clear; there is a list of reasons why the job isn't being scheduled: [image: image.png]
Coming back to Bas's question: I believe that yesterday_ds does not make sense, since we cannot assume that the schedule is daily. I don't see any usage of this variable.

Personally, I do use next_execution_date quite extensively. Suppose you have a job that runs daily, but you want to change it to an hourly job. In such a case you don't want to change {{ (execution_date + macros.timedelta(days=1)) }} to {{ (execution_date + macros.timedelta(hours=1)) }} everywhere.

I'm just not sure if the aggressive deprecation of these variables is really worth it. I don't see too much harm in letting them stay.

Cheers, Fokko

On Tue, 9 Apr 2019 at 12:17, Ash Berlin-Taylor <a...@apache.org> wrote:

> To (slightly) hijack this thread:
>
> On the subject of execution_date: as I'm sure we're all aware, the concept of execution_date is confusing to newcomers to Airflow (there are many questions about "why hasn't my DAG run yet?", "Why is my DAG a day behind?", etc.) and although we mention this in the docs, it's a confusing concept.
>
> What do people think about adding two new parameters, `period_start` and `period_end`, and making these the preferred terms in place of execution_date and next_execution_date?
>
> This hopefully avoids any ambiguous terms like "execution" or "run", which are understandably easy to conflate with the time the task is being run (i.e. `now()`).
>
> If people think this naming is better and less confusing, I would suggest we update all the docs and examples to use these terms (but still mention the old names somewhere, probably in the macros docs).
>
> Thoughts?
>
> -ash
>
> > On 8 Apr 2019, at 16:20, Arthur Wiedmer <arthur.wied...@gmail.com> wrote:
> >
> > Hi Bas,
> >
> > 1) I am aware of a few places where those parameters are used in production in a few hundred jobs. I highly recommend we don't deprecate them unless we do it in a major version.
> >
> > 2) As James mentioned, inlets and outlets are a lineage annotation feature which is still under development. Let's leave them in, but we can guard them behind a feature flag if you prefer.
> >
> > 3) The yesterday*/tomorrow* params are convenience ones if you use a daily ETL. I agree with you that they are simple to compute, but not everyone using Apache Airflow is amazing with Python. Some users are only trying to get a query scheduled and rely on a couple of niceties like these to get by.
> >
> > 4) latest_date and end_date (I feel like there used to be a start_date, but maybe it got lost) were a blend of things which were used by a backfill framework used internally at Airbnb. latest_date was used if you needed to join to a dimension for which you only wanted the latest version of the attributes in your backfill. end_date was used for time ranges where several days were processed together in a range to save on compute. I don't see an issue with removing them.
> >
> > Best regards,
> > Arthur
> >
> > On Mon, Apr 8, 2019 at 5:37 AM Bas Harenslak <basharens...@godatadriven.com> wrote:
> >
> >> Hi all,
> >>
> >> Following Tao Feng’s question to discuss this PR <https://github.com/apache/airflow/pull/5010> (AIRFLOW-4192 <https://issues.apache.org/jira/browse/AIRFLOW-4192>), please discuss here if you agree/disagree/would change.
> >>
> >> -----------
> >>
> >> The summary of the PR:
> >>
> >> I was confused by the task context values and suggest cleaning up and clarifying these variables. Some are derivations from other variables, some are undocumented and unused, some are wrong (name doesn’t match the value). Please discuss what you think of the removal of these variables:
> >>
> >> * Removed yesterday_ds, yesterday_ds_nodash, tomorrow_ds, tomorrow_ds_nodash.
> >>   IMO the next_* and previous_* variables are useful since these require complex logic to compute the next execution date; however, I would leave computing the yesterday* and tomorrow* variables up to the user, since they are simple one-liners and don't relate to the DAG interval.
> >> * Removed tables. This is a field in params, and is thus also accessible by the user ({{ params.tables }}). Also, it was undocumented.
> >> * Removed latest_date. It's the same as ds and was also undocumented.
> >> * Removed inlets and outlets. Also undocumented, and have the inlets/outlets ever worked or ever been used by anybody?
> >> * Removed end_date and END_DATE. Both have the same value, so it doesn't make sense to have both variables. Also, the value is ds, which contains the start date of the interval, so the naming didn't make sense to me. However, if anybody argues in favour of adding "start_date" and "end_date" to provide the start and end datetime of task instance intervals, I'd be happy to add them.
> >>
> >> Cheers,
> >> Bas
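
For readers following the thread, here is a rough sketch of the distinction being discussed: the yesterday*/tomorrow* values are fixed one-day offsets from the execution date, whereas a next_execution_date-style value depends on the schedule interval. This is plain Python using only the standard library, not Airflow's own code. The helper names `ds`, `ds_nodash` and `period` are hypothetical and chosen only to mirror the variable names in the thread, and the sketch assumes the schedule is a fixed `timedelta` (a cron schedule would need more logic, which is the "complex logic" Bas refers to).

```python
from datetime import datetime, timedelta


def ds(dt: datetime) -> str:
    """Format a datetime as YYYY-MM-DD, like the `ds` template variable."""
    return dt.strftime("%Y-%m-%d")


def ds_nodash(dt: datetime) -> str:
    """Format a datetime as YYYYMMDD, like the `ds_nodash` template variable."""
    return dt.strftime("%Y%m%d")


def period(execution_date: datetime, schedule_interval: timedelta):
    """Hypothetical helper: (period_start, period_end) for a fixed interval.

    Assumes the schedule is a plain timedelta; cron expressions are not handled.
    """
    return execution_date, execution_date + schedule_interval


execution_date = datetime(2019, 4, 9)

# The "simple one-liners": always one calendar day, whatever the schedule is.
yesterday_ds = ds(execution_date - timedelta(days=1))
tomorrow_ds = ds(execution_date + timedelta(days=1))
tomorrow_ds_nodash = ds_nodash(execution_date + timedelta(days=1))

# The interval-aware values: only the schedule changes, not every template.
daily_start, daily_end = period(execution_date, timedelta(days=1))
hourly_start, hourly_end = period(execution_date, timedelta(hours=1))

print(yesterday_ds, tomorrow_ds, tomorrow_ds_nodash)
print(daily_end.isoformat(), hourly_end.isoformat())
```

With a daily schedule the two approaches happen to agree, which is probably why the daily-offset variables feel natural; once the job becomes hourly, only the interval-aware value follows the change.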