I am for it as long as we can thoroughly test the edge cases and maybe have
some way to visualize resulting "data intervals" in this case.

I think a big part of Airflow is that it is not just a "scheduled job"
engine, but that it works on data intervals that are defined by the
schedules.
With regular schedules, it's fairly easy to reason about what data
intervals Airflow works on, but with complex multi-cron job expressions it
will become much less obvious.

In the example you mentioned above, Mauricio, "every 10 min between
16:30 and 18:10", we will have ten 10-minute data intervals ending at
16:40, 16:50, 17:00, 17:10, 17:20, 17:30, 17:40, 17:50, 18:00, 18:10 and
one ~22h20m interval between 18:10 and 16:30 the next day.
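
To make the arithmetic concrete, here is a rough sketch (plain Python
datetime, no Airflow or croniter involved) that lists the trigger times for
that schedule and pairs consecutive triggers into data intervals. The
function names `trigger_times` and `data_intervals` are made up for
illustration, and the start date is arbitrary.

```python
from datetime import datetime, timedelta


def trigger_times(day):
    """Yield trigger times for 'every 10 min between 16:30 and 18:10' on one day."""
    t = day.replace(hour=16, minute=30, second=0, microsecond=0)
    end = day.replace(hour=18, minute=10, second=0, microsecond=0)
    while t <= end:
        yield t
        t += timedelta(minutes=10)


def data_intervals(first_day, days):
    """Pair consecutive trigger times into (start, end) data intervals."""
    triggers = []
    for n in range(days):
        triggers.extend(trigger_times(first_day + timedelta(days=n)))
    return list(zip(triggers, triggers[1:]))


# Hypothetical start date, chosen only for the example.
intervals = data_intervals(datetime(2020, 5, 28), days=2)
for start, end in intervals[:11]:
    print(f"{start:%a %H:%M} -> {end:%a %H:%M}  (length {end - start})")
```

The first ten lines printed are the 10-minute intervals and the eleventh is
the long overnight one, from 18:10 to 16:30 on the following day.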

This is a simple example of course, and it is currently also possible for
fixed hours in cron, but I can imagine that if we introduce the capability
of multiple cron expressions that allow arbitrarily complex schedules,
the schedules might be super-difficult to reason about once you start
mixing them.

I think it is fine if the users want to do it, but for the convenience
of the users themselves, maybe there should be some way (Web UI? CLI?)
to take such a schedule and see the data intervals you can expect to
have?
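
Something along these lines is what I have in mind, as a very rough sketch
only. It does not use croniter or any Airflow API: each "cron expression" is
replaced by a plain generator of datetimes (`daily`, a hypothetical
stand-in), the generators are merged in time order, and `preview_intervals`
(also a made-up name) prints the first few resulting data intervals. The cron
strings in the comments are only my guess at how the example might be split.

```python
import heapq
from datetime import datetime, timedelta
from itertools import islice


def daily(hour, minutes, first_day):
    """Stand-in for one cron expression: fire at hour:minute, every day."""
    day = first_day
    while True:
        for m in minutes:
            yield day.replace(hour=hour, minute=m, second=0, microsecond=0)
        day += timedelta(days=1)


def dedupe(times):
    """Drop repeated trigger times when two schedules fire at the same moment."""
    last = None
    for t in times:
        if t != last:
            yield t
        last = t


def preview_intervals(schedules, count):
    """Merge sorted trigger streams and return the first `count` data intervals."""
    merged = dedupe(heapq.merge(*schedules))
    triggers = list(islice(merged, count + 1))
    return list(zip(triggers, triggers[1:]))


def example_schedules(day0):
    """'Every 10 min between 16:30 and 18:10' split into three sub-schedules."""
    return [
        daily(16, [30, 40, 50], day0),             # roughly "30-59/10 16 * * *"
        daily(17, [0, 10, 20, 30, 40, 50], day0),  # roughly "*/10 17 * * *"
        daily(18, [0, 10], day0),                  # roughly "0,10 18 * * *"
    ]


for start, end in preview_intervals(example_schedules(datetime(2020, 5, 28)), 11):
    print(f"{start:%Y-%m-%d %H:%M} -> {end:%Y-%m-%d %H:%M}  ({end - start})")
```

A real implementation would plug the actual cron iterators in where `daily`
is used here; the merge-and-pair step would stay the same.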

WDYT?

J.

On Thu, May 28, 2020 at 2:37 AM Shaw, Damian P. <
damian.sha...@credit-suisse.com> wrote:

> Big +1 to anything that extends the limitations of Airflow's current
> scheduling capability.
>
> For me the only drawback of this is it doesn't go far enough and further
> additions would need to be added later. It would still be difficult to
> express things that require updatable calendars like "Every Business Day"
> or things which are hard to express even with composable crontabs, like
> "The first weekday of the month".
>
> But if this is an easy win I hope it's taken seriously.
>
> Damian.
>
> -----Original Message-----
> From: Mauricio De Diana <mdedi...@gmail.com>
> Sent: Wednesday, May 27, 2020 14:27
> To: dev@airflow.apache.org
> Subject: Support for multiple cron expressions
>
> Hello all,
>
> At the moment some schedules are not possible in Airflow, for example,
> "every 10 min between 16:30 and 18:10". Such schedules would be possible if
> Airflow supported multiple cron expressions, as described in
> https://github.com/apache/airflow/issues/8649. In the issue, it was
> suggested that I bring the discussion here because this may not be a
> desirable feature.
>
> In terms of implementation, I gave the idea a try and I have something
> working. For that, besides str, timedelta and relativedelta, a schedule
> interval can also be a list of strings representing cron expressions. There
> is a class that is a composite of croniter objects and provides the same
> methods. It works seamlessly for one or many cron expressions, so changes
> in the scheduler code are mostly replacing croniter with this class.
>
> I can create a PR if there is interest in discussing the implementation,
> but first I would like to hear opinions about this feature. Is it an idea
> worth following?
>
> Thanks,
> Mauricio
>


-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
