I think whatever approach we decide on, we should display *next_execution_date* in the webserver for each DAG. This would help most users.
Regards,
Kaxil

On Sat, Jan 23, 2021 at 10:25 PM Dmitri Khokhlov <dkhokh...@gmail.com> wrote:

> Root problem:
> - existing Airflow schedule syntax defines only one interval pattern per DAG
> - there are use-cases that need multiple interval patterns per DAG (during a day etc)
>
> I vote for the "crontab list" solution from Deng Xiaodong. Example:
>
> schedule_interval = ["* 0,22,23 * * *", "30 1-21 * * *"]
>
> Reasoning:
> - it is an additive change - it does not remove or break existing usage patterns (very important)
> - it is generic and has a compact definition - easy to read/print/present in the UI (a string). That is why it is better than the "function" approach.
> - it is a complete solution, as it allows defining interval-based schedules of any complexity.
> - it is relatively easy to implement by OR-ing the crontab times, choosing the earliest next run time, and following these instructions from Ash Berlin-Taylor <a...@apache.org>:
> "
> The way the scheduler works now it just looks at two columns on the dag (model) table called I think "next_dagrun_after" (which is the earliest date that the dag run can be created), and "next execution date" (which is the value to put in the execution date of the dag run when it's created).
>
> Both these values are set by the dag parser process, which has full access to run code. Whatever interface for defining new schedule expression should run in the existing process, much like how James C did in a subclass.
> "
> --
> Dmitri
>
>
> On 2021/01/21 19:12:06, Daniel Imberman <daniel.imber...@gmail.com> wrote:
> > My only concern with tying this to the dag_parsing process is that that process might miss SLAs because it takes too long to loop around. I could imagine a separate thread or component that can read either TimeTable objects or SmartSensor objects and run them might make sense.
> > Ultimately I don’t see anything about SmartSensors that specifically need to run in a DAG.
> > It could just as easily be a while loop or something embarrassingly parallel (as sensors/timetables shouldn’t depend on each other).
> >
> > On Thu, Jan 21, 2021 at 11:07 AM, Vikram Koka <vik...@astronomer.io> wrote:
> > Great discussion.
> > I generally agree with the "Custom scheduling class" / subclass approach which would run as part of the "scheduler" set of processes, rather than an internal DAG approach.
> > I do think it would be good to have boundaries on what information this class would operate on and at what frequency. This is primarily from a performance standpoint, though it could be argued that there are security concerns with that as well.
> > Specifically from the "what information would this have access to" perspective, I think that interface would be helpful in clarifying some of the use cases and making sure that those are covered. One example I was thinking about in the "sunset" example is location. I was originally thinking of a timezone, but this is more specific than that.
> >
> >
> > On Thu, Jan 21, 2021 at 10:35 AM Ash Berlin-Taylor <a...@apache.org> wrote:
> > It shouldn't need something that complex (or to my mind hacky) as an internal DAG.
> >
> > The way the scheduler works now it just looks at two columns on the dag (model) table called I think "next_dagrun_after" (which is the earliest date that the dag run can be created), and "next execution date" (which is the value to put in the execution date of the dag run when it's created).
> >
> > Both these values are set by the dag parser process, which has full access to run code. Whatever interface for defining new schedule expression should run in the existing process, much like how James C did in a subclass.
> >
> > Ash
> >
> > On 21 January 2021 18:21:58 GMT, Daniel Imberman <daniel.imber...@gmail.com> wrote:
> > I think James's idea sounds like a pretty good idea.
> > What would you all think of us doing something similar to how we handle smart sensors for how we implement this? Have an internal DAG that reads all custom timetables and triggers a DAG if the function returns True? Seems like a pretty simple/customizable solution.
> >
> > On Wed, Jan 20, 2021 at 5:52 PM, James Timmins <ja...@astronomer.io> wrote:
> > Django provides a really good model for allowing users to customize the behavior of Class Based Views. It's in line w/ what Daniel/Kaxil and co are saying about a consistent backend class. It uses a standard base class as well as a default concrete implementation. Customization then only requires setting an explicit class if you're overriding the default.
> > Seems that the interface is more important than the backend mechanism to make this work. There are multiple ways to make this work internally, but the interface should be in line with future plans for hooks/extensible areas.
> > Just to make things concrete, here's my understanding of what that would look like / what they're suggesting.
> >
> > BaseTimetable abstract class
> > - Defines a `get_next_execution_time` method. This method accepts one argument, an arbitrary datetime value. Based on that datetime, this method returns the next time the DAG should start. This makes it easy to schedule past events, and also makes it easy to print out a "dry run" of execution times for testing purposes.
> > - Defines a `_check_timetable_arguments` method that looks for any existing timetable args in the DAG and makes sure they're used by whatever Timetable class is selected. Error checking.
> >
> > CronTimetable
> > - Default TimetableClass. Built on BaseTimetable.
> >
> > If they want a different timetable, they can just extend BaseTimetable and define a custom `get_next_execution_time` method. Then pass the class into the DAG constructor under the `timetable_class` argument.
> > So for `sunset` or `sunrise`, they could easily create a `SolarTimetable` class and pass that in.
> >
> > `get_next_execution_time` can then be called whenever DAGs are parsed or whenever tasks run.
> >
> > On Wed, Jan 20, 2021 at 3:53 PM James Coder <jcode...@gmail.com> wrote:
> > Kaxil you beat me to it. I actually have a dag where I achieve an irregular schedule by overriding DAG.next_dagrun_info(). If that method were swapped out for an object it may be a semi-easy way to make the schedule “pluggable”.
> >
> > James Coder
> >
> > On Jan 20, 2021, at 6:37 PM, Kaxil Naik <kaxiln...@gmail.com> wrote:
> >
> > "CronBackend" / "ScheduleIntervalBackend" :D similar to Xcom and Secrets Backend
> > Would definitely be good to have Custom Schedule intervals using functions/classes that are Serializable too.
> >
> > On Wed, Jan 20, 2021 at 11:02 PM QP Hou <q...@scribd.com.invalid> wrote:
> > On Wed, Jan 20, 2021 at 10:22 AM Daniel Imberman <daniel.imber...@gmail.com> wrote:
> > >
> > > I love the idea of allowing users to create their own scheduling objects/scheduling python functions. They could either live in the scheduler or as a separate process that trips some value in the DB when it is “true”. Would be great from a “marketplace” standpoint as well as users could post their custom scheduling objects for others to use.
> > >
> >
> > I like this idea as well, a quick escape patch for custom and complex scheduling behaviors without having to wait for upstream support.
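
For anyone skimming the thread, the timetable interface James Timmins describes and the "OR the schedules and take the earliest next run" idea from Dmitri's mail can be sketched together in a few lines of Python. This is only an illustration under stated assumptions, not an existing Airflow API: `BaseTimetable` and `get_next_execution_time` are the names used in the thread, while `DailyTimesTimetable`, `UnionTimetable` and `dry_run` are hypothetical names made up here. A fixed list of times of day stands in for a real cron expression so the example needs only the standard library, and it assumes naive datetimes.

```python
from abc import ABC, abstractmethod
from datetime import datetime, time, timedelta
from typing import List, Sequence


class BaseTimetable(ABC):
    """Hypothetical base class following the interface described in the thread."""

    @abstractmethod
    def get_next_execution_time(self, after: datetime) -> datetime:
        """Return the first run time strictly later than `after`."""


class DailyTimesTimetable(BaseTimetable):
    """Stand-in for one cron expression: fixed times of day, every day.

    Assumption: naive datetimes only (no timezone handling).
    """

    def __init__(self, times: Sequence[time]) -> None:
        if not times:
            raise ValueError("at least one time of day is required")
        self.times = sorted(times)

    def get_next_execution_time(self, after: datetime) -> datetime:
        # Look at today's slots first, then fall through to tomorrow's.
        for day_offset in (0, 1):
            day = after.date() + timedelta(days=day_offset)
            for slot in self.times:
                candidate = datetime.combine(day, slot)
                if candidate > after:
                    return candidate
        raise AssertionError("unreachable: tomorrow always has a later slot")


class UnionTimetable(BaseTimetable):
    """OR several timetables together by taking the earliest next run."""

    def __init__(self, timetables: Sequence[BaseTimetable]) -> None:
        if not timetables:
            raise ValueError("at least one timetable is required")
        self.timetables = list(timetables)

    def get_next_execution_time(self, after: datetime) -> datetime:
        return min(t.get_next_execution_time(after) for t in self.timetables)


def dry_run(timetable: BaseTimetable, start: datetime, count: int) -> List[datetime]:
    """List the next `count` run times after `start`, for testing a schedule."""
    runs: List[datetime] = []
    current = start
    for _ in range(count):
        current = timetable.get_next_execution_time(current)
        runs.append(current)
    return runs


if __name__ == "__main__":
    # Two illustrative patterns combined into one schedule.
    night = DailyTimesTimetable([time(0, 0), time(22, 0)])
    day = DailyTimesTimetable([time(1, 30), time(12, 30)])
    for run in dry_run(UnionTimetable([night, day]), datetime(2021, 1, 23, 21, 0), 4):
        print(run.isoformat())
```

Because the union is itself a `BaseTimetable`, a list of schedules and a fully custom class (for example the `SolarTimetable` mentioned above) would go through the same single method, which is the point both proposals have in common.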