Hi all,

Thanks for the opinions and ideas, really helpful.

I agree that being able to visualize intervals should be a priority. Just to
get a feel for how it could be done, I implemented something in the CLI:
https://github.com/apache/airflow/pull/9072 (thanks Tomek and Kamil for the
reviews). But I think something in the web UI would be more convenient and
consequently safer for users. I'm going to give it a try, starting with the
related PRs mentioned by Ash. Such a view can also help to explore corner
cases.

I created a WIP PR to see what support for multiple crons would look like:
https://github.com/apache/airflow/pull/9091. The main idea is to use a class
which is a composite of croniters. The first commit is what I was able to do
with minimal code changes. The second commit replaces croniter with the new
class in other parts of the code. I think it is a better encapsulation of the
crontab concept, but it has a larger potential to break things, particularly
because of the change in interval normalization.

It seems to me that any class providing get_next(), get_prev() and
is_fixed_schedule() could work as a drop-in replacement for this new class, so
it shouldn't be too hard to add support for something like the "Every Business
Day" example given by Damian later on. I'm just mentioning it to keep in mind
in case such flexibility becomes desirable at some point; I wouldn't implement
that now.

So the next step for me is to do some work on visualization. Meanwhile, please
let me know if you think this is the way to go or if I should be looking into
something different. Ideas are always welcome.

--
Mauricio

On Thu, May 28, 2020 at 9:31 AM Bas Harenslak <basharens...@godatadriven.com.invalid> wrote:
>
> I think this is a very valid use case, so +1 from me.
>
> The code logic is IMO pretty straight-forward: take the min(expr1, expr2,
> expr3, …) to compute the next interval.
> From the human point of view, it might become complex as Jarek says, but
> that’s up to the developer.
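
To make the min() idea above concrete, here is a minimal sketch. It is not the
code from the PR and it uses only the standard library: FixedHourSchedule is a
hypothetical stand-in for a single cron expression, and CompositeSchedule and
next_runs are illustrative names. Unlike croniter, whose get_next() advances
internal state, get_next() here takes the reference time as an argument to
keep the example short.

```python
from datetime import datetime, timedelta


class FixedHourSchedule:
    """Hypothetical stand-in for one cron expression such as
    "30,40,50 16 * * *": fires daily at the given minutes of one hour."""

    def __init__(self, hour, minutes):
        self.hour = hour
        self.minutes = sorted(minutes)

    def get_next(self, after):
        """Return the first fire time strictly after `after`."""
        day = after.replace(hour=0, minute=0, second=0, microsecond=0)
        while True:
            for minute in self.minutes:
                candidate = day.replace(hour=self.hour, minute=minute)
                if candidate > after:
                    return candidate
            day += timedelta(days=1)  # nothing left today, try tomorrow


class CompositeSchedule:
    """Illustrative composite: the next run is the earliest next run
    among all child schedules."""

    def __init__(self, schedules):
        self.schedules = list(schedules)

    def get_next(self, after):
        return min(s.get_next(after) for s in self.schedules)


def next_runs(schedule, start, count):
    """Collect the next `count` run times after `start`."""
    runs, current = [], start
    for _ in range(count):
        current = schedule.get_next(current)
        runs.append(current)
    return runs


# "Every 10 min between 16:30 and 18:10" expressed as three simple schedules.
schedule = CompositeSchedule([
    FixedHourSchedule(16, [30, 40, 50]),
    FixedHourSchedule(17, [0, 10, 20, 30, 40, 50]),
    FixedHourSchedule(18, [0, 10]),
])

if __name__ == "__main__":
    for run in next_runs(schedule, datetime(2020, 5, 28, 16, 0), 12):
        print(run)
```

Any object exposing the same get_next() could be passed to CompositeSchedule,
which is the kind of flexibility that would be needed for calendar-based
schedules.
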
>
> And also: a “helper” view in the UI visualizing the next schedules would be
> very helpful! E.g. a separate view to test scheduling expressions, or just a
> pop-up in the DAG to show the next X scheduled times.
>
> Bas
>
> > On 28 May 2020, at 14:12, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
> >
> > I am for it as long as we can thoroughly test the edge cases and maybe have
> > some way to visualize the resulting "data intervals" in this case.
> >
> > I think a big part of Airflow is that it is not just a "scheduled job"
> > engine, but that it works on data intervals that are defined by the
> > schedules.
> > With regular schedules, it's fairly easy to reason about what data
> > intervals Airflow works on, but with complex multi-cron job expressions it
> > will become much less obvious.
> >
> > In the example you mentioned above, Mauricio, "every 10 min between
> > 16:30 and 18:10", we will have 10x 10-minute data intervals ending at
> > 16:40, 16:50, 17:00, 17:10, 17:20, 17:30, 17:40, 17:50, 18:00, 18:10 and
> > one 22h 20min interval between 18:10 and 16:30 next day.
> >
> > This is a simple example of course and it is currently also possible for
> > fixed hours in cron, but I can imagine that if we introduce the capability
> > of multiple cron job expressions that allow arbitrarily complex schedules,
> > the schedules might be super-difficult to reason about if you start mixing
> > them.
> >
> > I think it is fine if the users want to do it, but for the convenience
> > of the users themselves, maybe there should be some way (Web UI? CLI?)
> > where you can take such a schedule and see the data intervals you can
> > expect to have?
> >
> > WDYT?
> >
> > J.
> >
> > On Thu, May 28, 2020 at 2:37 AM Shaw, Damian P. <
> > damian.sha...@credit-suisse.com> wrote:
> >
> >> Big +1 to anything that lifts the limitations of Airflow's current
> >> scheduling capability.
> >>
> >> For me the only drawback of this is that it doesn't go far enough and
> >> further additions would need to be made later; it would still be
> >> difficult to express things that require updatable calendars like
> >> "Every Business Day" or things which are hard to express even with
> >> composable crontabs, like "The first week day of the month".
> >>
> >> But if this is an easy win I hope it's taken seriously.
> >>
> >> Damian.
> >>
> >> -----Original Message-----
> >> From: Mauricio De Diana <mdedi...@gmail.com>
> >> Sent: Wednesday, May 27, 2020 14:27
> >> To: dev@airflow.apache.org
> >> Subject: Support for multiple cron expressions
> >>
> >> Hello all,
> >>
> >> At the moment some schedules are not possible in Airflow, for example,
> >> "every 10 min between 16:30 and 18:10". Such schedules would be possible if
> >> Airflow supported multiple cron expressions, as described in
> >> https://github.com/apache/airflow/issues/8649. In the issue, it was
> >> suggested that I bring the discussion here because this may not be a
> >> desirable feature.
> >>
> >> In terms of implementation, I gave the idea a try and I have something
> >> working. For that, besides str, timedelta and relativedelta, a schedule
> >> interval can also be a list of strings representing cron expressions. There
> >> is a class that is a composite of croniter objects and provides the same
> >> methods. It works seamlessly for one or many cron expressions, so changes
> >> in the scheduler code are mostly replacing croniter with this class.
> >>
> >> I can create a PR if there is interest in discussing the implementation,
> >> but first I would like to hear opinions about this feature. Is it an idea
> >> worth following?
> >>
> >> Thanks,
> >> Mauricio
> >>
> >
> > --
> >
> > Jarek Potiuk
> > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >
> > M: +48 660 796 129
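
For reference, the data intervals in the "every 10 min between 16:30 and
18:10" example from this thread can be listed with a few lines of
standard-library Python. This is only an illustration of the arithmetic: the
run times are generated directly rather than from cron expressions, and the
helper names daily_runs and data_intervals are made up for the example.

```python
from datetime import datetime, timedelta


def daily_runs(day):
    """Run times for one day: every 10 minutes from 16:30 to 18:10 inclusive."""
    first = day.replace(hour=16, minute=30, second=0, microsecond=0)
    return [first + timedelta(minutes=10 * i) for i in range(11)]


def data_intervals(runs):
    """Pair consecutive run times into (start, end) data intervals."""
    return list(zip(runs, runs[1:]))


day = datetime(2020, 5, 28)
# One full day of runs plus the first run of the following day,
# so that the overnight interval is included.
runs = daily_runs(day) + daily_runs(day + timedelta(days=1))[:1]

if __name__ == "__main__":
    for start, end in data_intervals(runs):
        print(f"{start:%Y-%m-%d %H:%M} -> {end:%Y-%m-%d %H:%M}  ({end - start})")
```

This gives ten 10-minute intervals followed by a single overnight interval
from 18:10 to 16:30 the next day, which is 22 hours and 20 minutes long.
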