Is it possible to offer both (maybe across two releases)?

That would let users pick whichever approach best fits their scenario.

My scenario for example is easy with multiple crons:

Monday - Thursday run job A at 9pm
Friday - run job A at 8pm

This is easier in cron than writing a Python extension to handle it.
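
As a rough sketch of what the two crons above boil down to (plain Python, no
real cron parsing; the rule format and function names are made up for
illustration and are not anything in Airflow): evaluate each pattern on its
own and take the earliest hit.

```python
from datetime import datetime, timedelta

# Hypothetical rule format: (set of weekdays, hour), with Monday == 0.
# Roughly equivalent to the crons "0 21 * * 1-4" and "0 20 * * 5".
RULES = [
    ({0, 1, 2, 3}, 21),  # Monday - Thursday at 9pm
    ({4}, 20),           # Friday at 8pm
]


def next_run_for_rule(after, weekdays, hour):
    """Earliest time strictly after `after` that matches a single rule."""
    candidate = after.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= after:
        candidate += timedelta(days=1)
    # Walk forward one day at a time until the weekday matches.
    while candidate.weekday() not in weekdays:
        candidate += timedelta(days=1)
    return candidate


def next_run(after, rules=RULES):
    """OR the rules together: the next run is the earliest match of any rule."""
    return min(next_run_for_rule(after, days, hour) for days, hour in rules)
```

Calling next_run() repeatedly, feeding each result back in, walks through
Mon-Thu 9pm and Fri 8pm in order.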

But having the ability to write a custom method in language x would satisfy
those who need something more complex, such as the astronomy example in the
thread.

Whether it looks complicated is probably for the user to worry about: if it
looks too complicated, they've probably picked the wrong way of doing it (or
picked the simplest way and aren't worried about how it looks).
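
For the custom-method route, here is a rough sketch of the kind of thing I
mean. BaseTimetable and get_next_execution_time are the names proposed further
down the thread, not an existing Airflow API, and the schedule itself is a
simplified take on Daniel's "first trading day after the 15th" (weekends
skipped, holidays ignored, 09:00 start time picked arbitrarily).

```python
from abc import ABC, abstractmethod
from datetime import datetime, timedelta


class BaseTimetable(ABC):
    """Interface as proposed in the thread; not an existing Airflow class."""

    @abstractmethod
    def get_next_execution_time(self, after: datetime) -> datetime:
        """Return the next time the DAG should start after `after`."""


class FirstWeekdayAfter15th(BaseTimetable):
    """Hypothetical schedule: 09:00 on the first weekday after the 15th."""

    def get_next_execution_time(self, after: datetime) -> datetime:
        year, month = after.year, after.month
        while True:
            run = datetime(year, month, 16, 9, 0)
            while run.weekday() >= 5:  # skip Saturday and Sunday
                run += timedelta(days=1)
            if run > after:
                return run
            # This month's run has already passed: try the next month.
            year, month = (year + 1, 1) if month == 12 else (year, month + 1)
```

Because the method only depends on the datetime passed in, a "dry run" is just
a loop that feeds each result back in and prints it.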

Phil


On 2021/01/24 07:04:03, Jarek Potiuk <ja...@potiuk.com> wrote: 
> Yep. I agree with Daniel - multiple crons are very difficult to
> reason about. You can create an arbitrarily complex declarative way of
> defining a schedule that you will have a hard time understanding.  We are
> already entering the realm of programming the schedule, which IMHO is
> better to do in a "programming" language rather than cron declarations.
> 
> J.
> 
> On Sun, Jan 24, 2021 at 7:48 AM Daniel Imberman <daniel.imber...@gmail.com>
> wrote:
> 
> > I worry that multiple crons would become difficult to read for stranger
> > use-cases (for example "run on the first trading day after the 15th of the
> > month"). If we create a python function or class we can easily create a
> > "CronTimeTable" that does exactly what Dmitry is suggesting while still
> > leaving open the possibility of creating other custom schedules.
> >
> > On Sat, Jan 23, 2021, 2:32 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
> >
> >> I think whatever approach we decide on we should display
> >> *next_execution_date* in the webserver for each DAG. This would help
> >> most of the users.
> >>
> >> Regards,
> >> Kaxil
> >>
> >> On Sat, Jan 23, 2021 at 10:25 PM Dmitri Khokhlov <dkhokh...@gmail.com>
> >> wrote:
> >>
> >>> Root problem:
> >>> - existing Airflow schedule syntax defines only one interval pattern per
> >>> DAG
> >>> - there are use-cases that need multiple interval patterns per DAG
> >>> (during a day etc)
> >>>
> >>> I vote for "crontab list" solution from Deng Xiaodong. Example:
> >>>
> >>> schedule_interval = ["* 0,22,23 * * *", "30 1-21 * * *"]
> >>>
> >>> Reasoning:
> >>> - it is an additive change - does not remove or break existing usage
> >>> patterns (very important)
> >>> - it is generic and has a compact definition - easy to
> >>> read/print/present in UI (a string). That is why it is better than the
> >>> "function" approach.
> >>> - it is a complete solution, as it allows defining interval-based
> >>> schedules of any complexity.
> >>> - it is relatively easy to implement by OR-ing crontab times,
> >>> choosing the next earliest run time, and following these instructions from
> >>> Ash Berlin-Taylor <a...@apache.org>:
> >>> "
> >>> The way the scheduler works now it just looks at two columns on the dag
> >>> (model) table called I think "next_dagrun_after" (which is the earliest
> >>> date that the dag run can be created) and "next execution date" (which is
> >>> the value to put in the execution date of the dag run when it's created).
> >>>
> >>> Both these values are set by the dag parser process, which has full
> >>> access to run code. Whatever interface we use for defining new schedule
> >>> expressions should run in the existing process, much like how James C did in
> >>> a subclass.
> >>> "
> >>> --
> >>> Dmitri
> >>>
> >>>
> >>> On 2021/01/21 19:12:06, Daniel Imberman <daniel.imber...@gmail.com>
> >>> wrote:
> >>> > My only concern with tying this to the dag_parsing process is that
> >>> that process might miss SLAs because it takes too long to loop around. I
> >>> could imagine a separate thread or component that can read either 
> >>> TimeTable
> >>> objects or SmartSensor objects and run them might make sense.
> >>> > Ultimately I don’t see anything about SmartSensors that specifically
> >>> needs to run in a DAG. It could just as easily be a while loop or something
> >>> embarrassingly parallel (as sensors/timetables shouldn’t depend on each
> >>> other).
> >>> >
> >>> > On Thu, Jan 21, 2021 at 11:07 AM, Vikram Koka <vik...@astronomer.io>
> >>> wrote:
> >>> > Great discussion.
> >>> > I generally agree with the "Custom scheduling class" / subclass
> >>> approach which would run as part of the "scheduler" set of processes,
> >>> rather than an internal DAG approach.
> >>> > I do think it would be good to have boundaries on what information
> >>> this class would operate on and at what frequency. This is primarily from 
> >>> a
> >>> performance standpoint, though it could be argued that there are security
> >>> concerns with that as well.
> >>> > Specifically from the "what information would this have access to"
> >>> perspective, I think that interface would be helpful in clarifying some of
> >>> the use cases and making sure that those are covered. One example I was
> >>> thinking about in the "sunset" example is location. I was originally
> >>> thinking of a timezone, but this is more specific than that.
> >>> >
> >>> >
> >>> > On Thu, Jan 21, 2021 at 10:35 AM Ash Berlin-Taylor <a...@apache.org>
> >>> wrote:
> >>> > It shouldn't need something as complex (or to my mind hacky) as an
> >>> internal DAG.
> >>> >
> >>> > The way the scheduler works now it just looks at two columns on the
> >>> dag (model) table called I think "next_dagrun_after" (which is the earliest
> >>> date that the dag run can be created) and "next execution date" (which is
> >>> the value to put in the execution date of the dag run when it's created).
> >>> >
> >>> > Both these values are set by the dag parser process, which has full
> >>> access to run code. Whatever interface we use for defining new schedule
> >>> expressions should run in the existing process, much like how James C did in
> >>> a subclass.
> >>> >
> >>> > Ash
> >>> >
> >>> > On 21 January 2021 18:21:58 GMT, Daniel Imberman
> >>> <daniel.imber...@gmail.com> wrote:
> >>> > I think James's idea sounds pretty good. What would you all think of us
> >>> doing something similar to how we handle smart sensors for how we implement
> >>> this? Have an internal DAG that reads all custom timetables and triggers a
> >>> DAG if the function returns True? Seems like a pretty simple/customizable
> >>> solution.
> >>> > On Wed, Jan 20, 2021 at 5:52 PM, James Timmins <ja...@astronomer.io>
> >>> wrote:
> >>> > Django provides a really good model for allowing users to customize
> >>> the behavior of Class Based Views. It's in line w/ what Daniel/Kaxil and 
> >>> co
> >>> are saying about a consistent backend class. It uses a standard base class
> >>> as well as a default concrete implementation. Customization then only
> >>> requires setting an explicit class if you're overriding the default.
> >>> > Seems that the interface is more important than the backend mechanism
> >>> to make this work. There are multiple ways to make this work internally,
> >>> but the interface should be in line with future plans for hooks/extensible
> >>> areas.
> >>> > Just to make things concrete, here's my understanding of what that
> >>> would look like / what they're suggesting.
> >>> > BaseTimetable abstract class
> >>> > - Defines a `get_next_execution_time` method. This method accepts one
> >>> argument, an arbitrary datetime value. Based on that datetime, this method
> >>> returns the next time the DAG should start. This makes it easy to schedule
> >>> past events, and also makes it easy to print out a "dry run" of execution
> >>> times for testing purposes.
> >>> > - Defines a `_check_timetable_arguments` method that looks for any
> >>> existing timetable args in the DAG and makes sure they're used by whatever
> >>> Timetable class is selected. Error checking.
> >>> > CronTimetable - Default TimetableClass. Built on BaseTimetable.
> >>> > If they want a different timetable, they can just extend BaseTimetable
> >>> and define a custom `get_next_execution_time` method. Then pass the class
> >>> into the DAG constructor under the `timetable_class` argument. So for
> >>> `sunset` or `sunrise`, they could easily create a `SolarTimetable` class
> >>> and pass that in.
> >>> > `get_next_execution_time` can then be called whenever DAGs are parsed
> >>> or whenever tasks run.
> >>> > On Wed, Jan 20, 2021 at 3:53 PM James Coder <jcode...@gmail.com>
> >>> wrote:
> >>> > Kaxil you beat me to it. I actually have a dag where I achieve an
> >>> irregular schedule by overriding DAG.next_dagrun_info(). If that method
> >>> were swapped out for an object it may be a semi-easy way to make the
> >>> schedule “pluggable”.
> >>> >
> >>> > James Coder
> >>> > On Jan 20, 2021, at 6:37 PM, Kaxil Naik <kaxiln...@gmail.com>
> >>> wrote:
> >>> >
> >>> > "CronBackend" / "ScheduleIntervalBackend" :D similar to Xcom and
> >>> Secrets Backend
> >>> > Would be definitely good to have Custom Schedule intervals using
> >>> functions/class that is Serializable too.
> >>> >
> >>> > On Wed, Jan 20, 2021 at 11:02 PM QP Hou <q...@scribd.com.invalid>
> >>> wrote:
> >>> > On Wed, Jan 20, 2021 at 10:22 AM Daniel Imberman
> >>> > <daniel.imber...@gmail.com> wrote:
> >>> > >
> >>> > > I love the idea of allowing users to create their own scheduling
> >>> objects/scheduling python functions. They could either live in the
> >>> scheduler or as a separate process that trips some value in the DB when it
> >>> is “true”. Would be great from a “marketplace” standpoint as well, as users
> >>> could post their custom scheduling objects for others to use.
> >>> > >
> >>> >
> >>> > I like this idea as well, a quick escape hatch for custom and complex
> >>> > scheduling behaviors without having to wait for upstream support.
> >>>
> >>
> 
> -- 
> +48 660 796 129
> 
