I remember thinking about these issues in the past, and I thought that adding some sort of `should_task_be_skipped` callback as an argument to BaseOperator would be easy and useful. The method should probably just receive a reference to the task instance.
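
To make the idea concrete, here is a rough sketch in plain Python. Nothing here is existing Airflow API: `SketchOperator` and `TaskInstance` are stand-ins for BaseOperator and the real task instance, and the `run` wrapper is only my guess at where such a check could sit.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional


@dataclass
class TaskInstance:
    """Stand-in for Airflow's task instance; only what the sketch needs."""
    task_id: str
    execution_date: date


class SketchOperator:
    """Stand-in for BaseOperator with the proposed (hypothetical) argument."""

    def __init__(
        self,
        task_id: str,
        should_task_be_skipped: Optional[Callable[[TaskInstance], bool]] = None,
    ):
        self.task_id = task_id
        self.should_task_be_skipped = should_task_be_skipped

    def execute(self, ti: TaskInstance) -> str:
        return f"ran {ti.task_id} for {ti.execution_date}"

    def run(self, ti: TaskInstance) -> str:
        # Consult the callback first; with no callback, behavior is unchanged.
        if self.should_task_be_skipped is not None and self.should_task_be_skipped(ti):
            return "skipped"
        return self.execute(ti)


def skip_weekends(ti: TaskInstance) -> bool:
    # Deterministic: the answer depends only on the execution date.
    return ti.execution_date.weekday() >= 5


op = SketchOperator("load", should_task_be_skipped=skip_weekends)
friday = op.run(TaskInstance("load", date(2019, 8, 30)))
saturday = op.run(TaskInstance("load", date(2019, 8, 31)))
```

A callback like `skip_weekends` gives the same answer every time it is asked about the same task instance, which is the property worth recommending in the docs.
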

By the very nature of interfacing with a method, we cannot guarantee that it is deterministic (same input arguments to the method might lead to a different answer over time), but we can mitigate that by documenting that it's best practice to use deterministic code in that context.

I'm not quite sure what to do about `prev_ds` and `next_ds`, but it doesn't need to be handled for this proposal to be a step forward. Introduce `prev_unskipped_ds` or something like it?

I'm not sure what the latest is around branching and depends_on_past, but clearly it's a bit tricky to design something that works for everyone and is intuitive. In this area people want and expect different behaviors.

Max

On Fri, Aug 30, 2019 at 10:17 AM Shaw, Damian P. <[email protected]> wrote:

> Hi all,
>
> After discussion at the NY Meetup this week I've been pondering how
> Airflow could support custom schedules with very little change to core
> Airflow logic and keeping backwards compatibility.
>
> As I understand the common way to support custom schedules is through a
> BranchOperator. You provide logic that on a good date executes the "run"
> branch and on another date runs the "don't run" branch which usually is a
> dummy operator.
>
> There are 2 problems associated with it which would be useful to me (and I
> think the rest of the community) to solve:
>
> 1. depends_on_past does not play well with branching, because the
>    "run" branch tasks get marked as "skipped"
>
> 2. Template variables like "prev_ds" and "next_ds" represent the
>    underlying schedule and not the actual schedule you are working on
>
> I therefore propose a "schedule_filter_callback", a function which you
> provide at DAG creation time that takes in some arguments (execution date,
> timezone, DAG?), and returns a Truthy or Falsy result based on if this is a
> good date to execute on. If schedule_filter_callback is None then the
> current schedule logic is applied.
>
> I appreciate this is a fairly significant proposal, but it seems like
> because it would just be 1 extra argument on the DAG and make no change to
> the default behavior it doesn't quite rise to the level of AIP? Sorry if
> this has already been discussed before.
>
> Regards,
> Damian
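
For reference, the `schedule_filter_callback` idea and a `prev_unskipped_ds`-style value could fit together roughly as below. This is a sketch in plain Python, not Airflow code: it assumes a daily underlying schedule, a callback that takes only the execution date, and helper names (`filtered_dates`, `prev_unskipped_ds`, `max_lookback_days`) invented for illustration.

```python
from datetime import date, timedelta
from typing import Callable, List, Optional

ScheduleFilter = Callable[[date], bool]


def filtered_dates(
    start: date, end: date, schedule_filter_callback: Optional[ScheduleFilter] = None
) -> List[date]:
    """Walk a daily schedule and keep only the dates the callback accepts.

    With no callback, every date on the underlying schedule is kept,
    which mirrors the "None means current behavior" rule in the proposal.
    """
    kept = []
    day = start
    while day <= end:
        if schedule_filter_callback is None or schedule_filter_callback(day):
            kept.append(day)
        day += timedelta(days=1)
    return kept


def prev_unskipped_ds(
    execution_date: date,
    schedule_filter_callback: ScheduleFilter,
    max_lookback_days: int = 366,
) -> Optional[str]:
    """Return the most recent earlier date the callback accepts, as YYYY-MM-DD."""
    day = execution_date - timedelta(days=1)
    for _ in range(max_lookback_days):
        if schedule_filter_callback(day):
            return day.isoformat()
        day -= timedelta(days=1)
    return None  # nothing accepted within the lookback window


def weekdays_only(d: date) -> bool:
    return d.weekday() < 5


runs = filtered_dates(date(2019, 8, 30), date(2019, 9, 3), weekdays_only)
previous = prev_unskipped_ds(date(2019, 9, 2), weekdays_only)
```

Here the run for Monday, September 2 sees Friday, August 30 as its previous date, whereas a `prev_ds` derived from the underlying daily schedule would point at Sunday.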
