Alex, that's a good point regarding the need to run a DAG for the most
recent schedule interval right away. I hadn't thought of that scenario as I
haven't needed to build a DAG with that large of a scheduling gap. In that
case I agree with you - it seems like it would make more sense to make this
configurable.

Perhaps there could be an additional DAG-level parameter that is set
alongside "catchup" to control this behavior. Or there could be a new
parameter that eventually replaces "catchup" and supports three options:
"catchup", "run most recent interval only", and "run next interval only".
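
To make the three options concrete, here is a rough sketch in plain Python
(not Airflow code) of which data interval each option would run first for a
@monthly DAG. The FirstRunPolicy name, its values, and the first_interval
helper are all hypothetical, nothing like them exists in Airflow today, and
the sketch assumes the start_date is well before the time the DAG is enabled.

```python
from datetime import datetime
from enum import Enum


class FirstRunPolicy(Enum):
    """Hypothetical setting values; not an existing Airflow parameter."""
    CATCHUP = "catchup"
    MOST_RECENT_INTERVAL_ONLY = "run most recent interval only"
    NEXT_INTERVAL_ONLY = "run next interval only"


def month_start(dt):
    """Midnight on the first day of dt's month."""
    return datetime(dt.year, dt.month, 1)


def add_months(dt, n):
    """Shift a first-of-month datetime by n months (n may be negative)."""
    idx = dt.year * 12 + (dt.month - 1) + n
    return datetime(idx // 12, idx % 12 + 1, 1)


def first_interval(policy, start_date, enabled_at):
    """Return (interval_start, interval_end) of the first scheduled run.

    Assumes a @monthly schedule and a start_date well before enabled_at.
    """
    # Start of the interval that is still in progress when the DAG is enabled.
    current = month_start(enabled_at)
    if policy is FirstRunPolicy.CATCHUP:
        # Earliest full month at or after start_date; later runs would follow.
        first = month_start(start_date)
        if first < start_date:
            first = add_months(first, 1)
        return first, add_months(first, 1)
    if policy is FirstRunPolicy.MOST_RECENT_INTERVAL_ONLY:
        # The last interval that has already fully elapsed.
        return add_months(current, -1), current
    # NEXT_INTERVAL_ONLY: the interval in progress, which runs once it ends.
    return current, add_months(current, 1)
```

With a start_date of 2022-01-01 and the DAG enabled today (2022-03-19), the
first option would start with the January interval, the second would run
only the February interval right away, and the third would wait for the
March interval to finish on 2022-04-01.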

On Sat, Mar 19, 2022 at 1:02 PM Alex Begg <[email protected]> wrote:

> I would not consider it a bug to have the latest data interval run when
> you enable a DAG that is set to catchup=False.
>
> I have a legitimate use for that feature: my production environment has
> catchup_by_default=True, but my lower environments use
> catchup_by_default=False, meaning that if I want to test the DAG behavior
> *as scheduled* in a lower environment I can just enable the DAG.
>
> For example, in a staging environment, if I needed to test the
> functionality of a DAG scheduled as @monthly and there were no way to run
> the most recent data interval, then it could be many days, even weeks,
> before a true data interval of the DAG occurred.
>
> Triggering a DAG won’t run the latest data interval, it will use the
> current time as the logical_date, right? So that won’t let me test a
> single *as scheduled* data interval. So in that @monthly scenario it will
> be impossible for me to test the functionality of a single data interval
> unless I wait multiple weeks.
>
> I see there could be a desire to not run the latest data interval and just
> start with whatever full interval follows the DAG being turned on. However,
> I think that should be configurable, not fixed permanently.
>
> Alternatively it could be ideal to have a way to trigger a specific run
> for a catchup=False DAG that just got enabled, by adding a third option to
> the trigger button drop-down to trigger a past scheduled run. Then in that
> dialog the form can default to the most recent full data interval but then
> let you also specify a specific past interval based on the DAG's schedule.
> I often had to debug a DAG in production and I wanted to trigger a specific
> past data interval, not just the most recent.
>
> Alex Begg
>
> On Thu, Mar 17, 2022 at 4:58 PM Larry Komenda <
> [email protected]> wrote:
>
>> I agree with this. I'd much rather have to trigger a single manual run
>> the first time I enable a DAG than either wait to enable it until after I
>> want it to run or edit the start_date of the DAG itself.
>>
>> I'd be in favor of adjusting this behavior either permanently or by a
>> configuration.
>>
>> On Fri, Mar 4, 2022 at 3:00 PM Philippe Lanoe <[email protected]>
>> wrote:
>>
>>> Hello Daniel,
>>>
>>> Thank you for your answer. In your example, as I experienced, the first
>>> run would not be 2010-01-01 but 2022-03-03, 00:00:00 (it is currently March
>>> 4 - 21:00 here), which is the execution date corresponding to the start of
>>> the previous data interval, but the result is the same: an undesired dag
>>> run. (For instance, in case of cron schedule '00 22 * * *', one dagrun
>>> would be started immediately with execution date of 2022-03-02, 22:00:00)
>>>
>>> I also agree with you that it could be categorized as a bug and I would
>>> also vote for a fix.
>>>
>>> Would be great to have the feedback of others on this.
>>>
>>> On Fri, Mar 4, 2022 at 6:17 PM Daniel Standish
>>> <[email protected]> wrote:
>>>
>>>> You are saying, when you turn on for the first time a dag with
>>>> e.g. @daily schedule, and catchup = False, if start date is 2010-01-01,
>>>> then it would run first the 2010-01-01 run, then the current run (whatever
>>>> yesterday is)?  That sounds familiar.
>>>>
>>>> Yeah I don't like that behavior.  I agree that, as you say, it's not
>>>> the intuitive behavior.  Seems it could reasonably be categorized as a
>>>> bug.  I'd prefer we just "fix" it rather than making it configurable.  But
>>>> some might have concerns re backcompat.
>>>>
>>>> What do others think?
>>>>
>>>>
>>>>
