Running exactly every two weeks can be done by setting `schedule_interval=timedelta(days=14), start_date=...`.
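
To make that concrete, here is a rough sketch (standard library only, so it runs without Airflow installed) of the intervals a fixed 14-day `timedelta` would produce. The start date is the one from Elad's message quoted below; `dag_kwargs` and `data_intervals` are names made up for this illustration, and the "run happens once the interval has ended" behaviour is my understanding of how the scheduler currently treats `timedelta` schedules, not something to take as authoritative.

```python
from datetime import datetime, timedelta
from itertools import islice

# Illustrative keyword arguments one might pass to an Airflow DAG.
# The start date is taken from the quoted message; the dict name is hypothetical.
dag_kwargs = {
    "schedule_interval": timedelta(days=14),
    "start_date": datetime(2021, 1, 19, 20, 5),
}


def data_intervals(start_date, schedule_interval):
    """Yield (interval_start, interval_end) pairs for a fixed-length schedule.

    Assumption: each run is labelled with the start of its interval
    (the execution_date) and is triggered once the interval has ended.
    """
    interval_start = start_date
    while True:
        interval_end = interval_start + schedule_interval
        yield interval_start, interval_end
        interval_start = interval_end


first_three = list(
    islice(
        data_intervals(dag_kwargs["start_date"], dag_kwargs["schedule_interval"]),
        3,
    )
)

for start, end in first_three:
    print(f"execution_date={start:%Y-%m-%d %H:%M}  runs at or after {end:%Y-%m-%d %H:%M}")
```

With Airflow installed, the same two arguments could be passed straight through, e.g. `DAG("biweekly_example", **dag_kwargs)` (the dag id is a placeholder). Because the interval is a plain 14-day offset from `start_date`, the runs stay on the same weekday and time of day, which is the "bi-weekly meeting" behaviour rather than "twice a month".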
Does this do what you need, Elad?

On 20 January 2021 18:12:36 GMT, Elad Kalif <elad...@gmail.com> wrote:
>>> In the example of a twice-a-month dag (not sure if you have this use
>>> case too?) what do you expect the "data interval" (i.e.
>>> execution_date) to be?
>
>Yes, we have this use case too. The execution date does matter because
>I want it to be bi-weekly, starting on a specific day and time, so with
>the current implementation I would expect to provide
>start_date=datetime(2021,1,19,20,5) & schedule_interval='2 weeks'
>
>Currently Airflow has 'hourly', 'daily', 'weekly', which doesn't allow
>us to set it.
>So a possible solution for this specific use case could be defining:
>repeat_every - integer that represents the frequency (1,2,3,... n)
>unit - str that provides the "gaps" (minutes, hours, days, weeks,
>months, years)
>Example: bi-weekly / twice a month can be: repeat_every = 2, unit =
>'weeks'
>To get the 'hourly', 'daily', 'weekly' functionality you just need to
>set repeat_every = 1.
>
>By the way, this is exactly what Google Calendar allows you to set if
>you click on custom scheduling for a meeting.
>
>I'm still in favor of the python function approach as it should cover
>all cases and provide full control for the users.
>
>On Wed, Jan 20, 2021 at 7:20 PM Deng Xiaodong <xd.den...@gmail.com> wrote:
>
>> A quick thought (*maybe not making sense*): if *schedule_interval*
>> accepts a list of values, we may support much higher complexity.
>>
>> For example, I may want to schedule my jobs every day at 04:05 AND
>> 02:31, which cannot be expressed by a single cron pattern. Then I may
>> want to have
>> *schedule_interval = ["5 4 * * *", "31 2 * * *"]*.
>>
>> Maybe I missed something or the idea doesn't make sense. Please let me
>> know.
>>
>>
>> XD
>>
>> On Wed, Jan 20, 2021 at 6:09 PM Ash Berlin-Taylor <a...@apache.org> wrote:
>>
>>> Yes, we quite possibly could do this -- I'm trying to work out what
>>> the needs are here.
>>>
>>> In the example of a twice-a-month dag (not sure if you have this use
>>> case too?) what do you expect the "data interval" (i.e.
>>> execution_date) to be?
>>>
>>> Or for this case does it not matter?
>>>
>>> -ash
>>>
>>>
>>> On Wed, 20 Jan, 2021 at 19:06, Elad Kalif <elad...@gmail.com> wrote:
>>>
>>> Another case that is mentioned in one of the issues is the ability to
>>> schedule a bi-weekly job (equivalent of a bi-weekly meeting that you
>>> can set in a calendar), which is very much needed.
>>>
>>> Maybe this is unrealistic, but I think the game changer would be
>>> letting users define their own logic and having Airflow use it to
>>> schedule DAGs.
>>> My thought here is: if I can define the logic in a python function
>>> (regardless of what this logic is), can't Airflow utilize it?
>>>
>>> On Wed, Jan 20, 2021 at 5:39 PM Ash Berlin-Taylor <a...@apache.org> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I'd like to (re)start the discussion about a new feature I'd like to
>>>> add for Airflow 2.1, that I am loosely calling "improving
>>>> schedule_interval" (catchy name I know!)
>>>>
>>>> I have two main high-level goals in mind here:
>>>>
>>>> 1. To reduce the confusion around execution_date (specifically the
>>>>    naming of the parameter!) - the whole start vs end discussion.
>>>> 2. To support more complex schedules.
>>>>
>>>> Previous thread on point 1 here:
>>>> https://lists.apache.org/thread.html/2b12ae265795ff2e655a5161c972f5c7bbe60722a12849a0e2c5c55f%40%3Cdev.airflow.apache.org%3E,
>>>> (but I'm taking a bit of a step back from that to think if there's a
>>>> bigger change we could make that encompasses this)
>>>>
>>>>
>>>> I don't yet have a concrete plan, nor implementation in mind, but I'd
>>>> like to start collecting people's "wish list" when it comes to
>>>> scheduling DAGs:
>>>>
>>>> - What do you wish you could express natively in terms of scheduling
>>>>   your DAGs? (I.e. without using "hacks" such as date sensor/skip
>>>>   tasks at start)
>>>> - What schedules do you wish you could express now, that you just
>>>>   can't?
>>>> - Do you have good example workflows that show where you want to
>>>>   schedule at start? Follow up question: do you also want this to be
>>>>   different for different DAGs in your Airflow install?
>>>>
>>>>
>>>> Existing issues:
>>>> https://github.com/apache/airflow/issues/8649 "Add support for more
>>>> than 1 cron exp per DAG"
>>>> https://github.com/apache/airflow/issues/10194 "Ability to better
>>>> support odd scheduling time"
>>>> https://github.com/apache/airflow/issues/10449 "Dynamic Schedule
>>>> Intervals"
>>>> https://github.com/apache/airflow/issues/10123 "Job Schedule Interval
>>>> on 2nd & 4th Tuesday"
>>>>
>>>> I'll start:
>>>>
>>>> Case1:
>>>>
>>>> One example that came up recently in slack was an actual astronomer
>>>> wanting a DAG to run with a schedule of "@sunset"! This also brings
>>>> up the subject of "running dags at interval start or end"
>>>>
>>>> Case2:
>>>>
>>>> I'd like to be able to run a daily process at the end of each week
>>>> day, i.e. to process data for Monday..Friday. The naive way of
>>>> expressing this would be "0 0 * * MON-FRI", but that means that the
>>>> dags would run Tuesday, Wednesday, Thursday, Friday, Monday --
>>>> meaning Friday's data isn't processed until Monday!
>>>>
>>>> My thought on this is that we need to separate the schedule interval
>>>> (when to run a task) from the period duration (i.e. look at one
>>>> day's worth of data).
>>>>
>>>> Thanks,
>>>> Ash