I think whatever approach we decide on, we should display
*next_execution_date* in the webserver for each DAG. This would help most
users.

Regards,
Kaxil

On Sat, Jan 23, 2021 at 10:25 PM Dmitri Khokhlov <dkhokh...@gmail.com>
wrote:

> Root problem:
> - existing Airflow schedule syntax defines only one interval pattern per
> DAG
> - there are use-cases that need multiple interval patterns per DAG (during
> a day etc)
>
> I vote for "crontab list" solution from Deng Xiaodong. Example:
>
> schedule_interval = ["* 0,22,23 * * *", "30 1-21 * * *"]
>
> Reasoning:
> - it is an additive change - it does not remove or break existing usage
> patterns (very important)
> - it is generic and has a compact definition - easy to read/print/present
> in the UI (a string). That is why it is better than the "function" approach.
> - it is a complete solution, as it allows defining interval-based schedules
> of any complexity.
> - it is relatively easy to implement by OR-ing the crontab times, choosing
> the earliest next run time, and following these instructions from Ash
> Berlin-Taylor <a...@apache.org>:
> "
> The way the scheduler works now it just looks at two columns on the dag
> (model) table called I think "next_dagrun_after" (which is the earliest
> date that the dag run can be created) and "next execution date" (which is
> the value to put in the execution date of the dag run when it's created).
>
> Both these values are set by the dag parser process, which has full access
> to run code. Whatever interface for defining new schedule expressions
> should run in the existing process, much like how James C did in a subclass.
> "
> --
> Dmitri
>
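The "OR-ing crontabs and choosing the earliest next run time" idea quoted above can be sketched in a few lines. This is only an illustration, not Airflow code: `parse_field`, `matches` and `next_run` are hypothetical helper names, and the sketch only understands the minute and hour fields (with `*`, lists and ranges), which is enough for Dmitri's example. A real implementation would more likely use a full cron parser.

```python
from datetime import datetime, timedelta


def parse_field(field, lo, hi):
    """Expand one cron field ("*", "5", "1-21", "0,22,23") into a set of ints."""
    values = set()
    for part in field.split(","):
        if part == "*":
            values.update(range(lo, hi + 1))
        elif "-" in part:
            start, end = part.split("-")
            values.update(range(int(start), int(end) + 1))
        else:
            values.add(int(part))
    return values


def matches(expr, moment):
    """True if `moment` satisfies the minute and hour fields of `expr`.

    Assumption of this sketch: day, month and weekday fields must be "*".
    """
    minute, hour, dom, month, dow = expr.split()
    if (dom, month, dow) != ("*", "*", "*"):
        raise ValueError("sketch only supports '*' for day/month/weekday")
    return (moment.minute in parse_field(minute, 0, 59)
            and moment.hour in parse_field(hour, 0, 23))


def next_run(exprs, after):
    """Earliest whole minute strictly after `after` matched by ANY expression."""
    candidate = after.replace(second=0, microsecond=0) + timedelta(minutes=1)
    # With only minute/hour fields the pattern repeats daily, so one day suffices.
    for _ in range(24 * 60):
        if any(matches(expr, candidate) for expr in exprs):
            return candidate
        candidate += timedelta(minutes=1)
    raise ValueError("no matching time found")


schedule = ["* 0,22,23 * * *", "30 1-21 * * *"]
print(next_run(schedule, datetime(2021, 1, 23, 21, 30)))  # 2021-01-23 22:00:00
print(next_run(schedule, datetime(2021, 1, 23, 0, 59)))   # 2021-01-23 01:30:00
```

Because the list is evaluated as a union, a single-element list behaves like today's single cron string, which is what keeps the change additive.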
>
> On 2021/01/21 19:12:06, Daniel Imberman <daniel.imber...@gmail.com>
> wrote:
> > My only concern with tying this to the dag_parsing process is that that
> process might miss SLAs because it takes too long to loop around. I could
> imagine a separate thread or component that can read either TimeTable
> objects or SmartSensor objects and run them might make sense.
> > Ultimately I don’t see anything about SmartSensors that specifically
> needs to run in a DAG. It could just as easily be a while loop or something
> embarrassingly parallel (as sensors/timetables shouldn’t depend on each
> other).
> >
> > On Thu, Jan 21, 2021 at 11:07 AM, Vikram Koka <vik...@astronomer.io>
> wrote:
> > Great discussion.
> > I generally agree with the "Custom scheduling class" / subclass approach
> which would run as part of the "scheduler" set of processes, rather than an
> internal DAG approach.
> > I do think it would be good to have boundaries on what information this
> class would operate on and at what frequency. This is primarily from a
> performance standpoint, though it could be argued that there are security
> concerns with that as well.
> > Specifically from the "what information would this have access to"
> perspective, I think that interface would be helpful in clarifying some of
> the use cases and making sure that those are covered. One example I was
> thinking about in the "sunset" example is location. I was originally
> thinking of a timezone, but this is more specific than that.
> >
> >
> > On Thu, Jan 21, 2021 at 10:35 AM Ash Berlin-Taylor <a...@apache.org>
> wrote:
> > It shouldn't need something as complex (or to my mind hacky) as an
> internal DAG.
> >
> > The way the scheduler works now it just looks at two columns on the dag
> (model) table called I think "next_dagrun_after" (which is the earliest
> date that the dag run can be created) and "next execution date" (which is
> the value to put in the execution date of the dag run when it's created).
> >
> > Both these values are set by the dag parser process, which has full
> access to run code. Whatever interface for defining new schedule
> expressions should run in the existing process, much like how James C did in
> a subclass.
> >
> > Ash
> >
> > On 21 January 2021 18:21:58 GMT, Daniel Imberman
> <daniel.imber...@gmail.com> wrote:
> > I think James's idea sounds like a pretty good idea. What would you all
> think of us doing something similar to how we handle smart sensors for how
> we implement this? Have an internal DAG that reads all custom timetables and
> triggers a DAG if the function returns True? Seems like a pretty
> simple/customizable solution.
> > On Wed, Jan 20, 2021 at 5:52 PM, James Timmins <ja...@astronomer.io>
> wrote:
> > Django provides a really good model for allowing users to customize the
> behavior of Class Based Views. It's in line w/ what Daniel/Kaxil and co are
> saying about a consistent backend class. It uses a standard base class as
> well as a default concrete implementation. Customization then only requires
> setting an explicit class if you're overriding the default.
> > Seems that the interface is more important than the backend mechanism to
> make this work. There are multiple ways to make this work internally, but
> the interface should be in line with future plans for hooks/extensible
> areas.
> > Just to make things concrete, here's my understanding of what that would
> look like / what they're suggesting.
> > BaseTimetable abstract class
> > - Defines a `get_next_execution_time` method. This method accepts one
> argument, an arbitrary datetime value. Based on that datetime, this method
> returns the next time the DAG should start. This makes it easy to schedule
> past events, and also makes it easy to print out a "dry run" of execution
> times for testing purposes.
> > - Defines a `_check_timetable_arguments` method that looks for any
> existing timetable args in the DAG and makes sure they're used by whatever
> Timetable class is selected. Error checking.
> > CronTimetable - Default Timetable class. Built on BaseTimetable.
> > If they want a different timetable, they can just extend BaseTimetable
> and define a custom `get_next_execution_time` method. Then pass the class
> into the DAG constructor under the `timetable_class` argument. So for
> `sunset` or `sunrise`, they could easily create a `SolarTimetable` class
> and pass that in.
> > `get_next_execution_time` can then be called whenever DAGs are parsed or
> whenever tasks run.
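A minimal sketch of the interface James Timmins describes above, to show how the "dry run" falls out of a single `get_next_execution_time` method. None of this is existing Airflow API: `BaseTimetable` and `get_next_execution_time` are the names proposed in the message, while `dry_run` and `FixedTimesTimetable` are hypothetical names invented for illustration (the latter stands in for something like `SolarTimetable`, with hard-coded clock times instead of computed sunrise/sunset). The `_check_timetable_arguments` hook and the `timetable_class` DAG argument are left out.

```python
from abc import ABC, abstractmethod
from datetime import datetime, time, timedelta


class BaseTimetable(ABC):
    """Proposed base class: subclasses only say when the next run is."""

    @abstractmethod
    def get_next_execution_time(self, after: datetime) -> datetime:
        """Return the first execution time strictly after `after`."""

    def dry_run(self, start: datetime, count: int) -> list:
        """Hypothetical helper: list the next `count` execution times."""
        times = []
        current = start
        for _ in range(count):
            current = self.get_next_execution_time(current)
            times.append(current)
        return times


class FixedTimesTimetable(BaseTimetable):
    """Hypothetical custom timetable: run at fixed clock times every day."""

    def __init__(self, times_of_day):
        self.times_of_day = sorted(times_of_day)

    def get_next_execution_time(self, after: datetime) -> datetime:
        # Look at the rest of today first, then at tomorrow.
        for day_offset in (0, 1):
            day = (after + timedelta(days=day_offset)).date()
            for clock_time in self.times_of_day:
                candidate = datetime.combine(day, clock_time)
                if candidate > after:
                    return candidate
        raise ValueError("times_of_day must not be empty")


timetable = FixedTimesTimetable([time(6, 15), time(18, 40)])
for run in timetable.dry_run(datetime(2021, 1, 20, 12, 0), 3):
    print(run)
```

Since the method takes an arbitrary datetime, the same call works for past dates (backfills) and for printing upcoming runs, as the message points out.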
> > On Wed, Jan 20, 2021 at 3:53 PM James Coder <jcode...@gmail.com> wrote:
> > Kaxil, you beat me to it. I actually have a dag where I achieve an
> irregular schedule by overriding DAG.next_dagrun_info(). If that method
> were swapped out for an object it may be a semi-easy way to make the
> schedule “pluggable”.
> >
> > James Coder
> > On Jan 20, 2021, at 6:37 PM, Kaxil Naik <kaxiln...@gmail.com> wrote:
> >
> > "CronBackend" / "ScheduleIntervalBackend" :D similar to Xcom and Secrets
> Backend
> > It would definitely be good to have custom schedule intervals using
> functions/classes that are serializable too.
> >
> > On Wed, Jan 20, 2021 at 11:02 PM QP Hou <q...@scribd.com.invalid> wrote:
> > On Wed, Jan 20, 2021 at 10:22 AM Daniel Imberman
> > <daniel.imber...@gmail.com> wrote:
> > >
> > > I love the idea of allowing users to create their own scheduling
> objects/scheduling python functions. They could either live in the
> scheduler or as a separate process that trips some value in the DB when it
> is “true”. Would be great from a “marketplace” standpoint as well, as users
> could post their custom scheduling objects for others to use.
> > >
> >
> > I like this idea as well, a quick escape hatch for custom and complex
> > scheduling behaviors without having to wait for upstream support.
>
