GitHub user akomisarek edited a comment on the discussion: Best way of
processing historical data without catchup
Answering my own question: we are considering implementing a custom timetable,
which seems to work well enough.
POC code:
```
from typing import Optional

import pendulum
from airflow.timetables.base import DagRunInfo, DataInterval, TimeRestriction

# Method of a Timetable subclass (class body omitted in this POC).
def next_dagrun_info(
    self,
    *,
    last_automated_data_interval: Optional[DataInterval],
    restriction: TimeRestriction,
) -> Optional[DagRunInfo]:
    now = pendulum.now("UTC")
    if last_automated_data_interval is None:
        if not restriction.catchup:
            # First run without catchup: cover today only.
            next_start = now.start_of("day")
            return DagRunInfo.interval(start=next_start, end=next_start.add(days=1))
        if restriction.earliest is None:
            return None  # No start_date, so there is nothing to backfill.
        next_start = restriction.earliest
    else:
        next_start = last_automated_data_interval.end
    # A year or more behind: process a whole year in one run; otherwise go daily.
    if (now - next_start).in_days() >= 365:
        next_end = next_start.add(years=1)
    else:
        next_end = next_start.add(days=1)
    return DagRunInfo.interval(start=next_start, end=next_end)
```
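To see which intervals this rule would produce, here is a minimal sketch that
replays the same decision with the standard library only, so it runs without
Airflow or pendulum. The helper names (`add_year`, `next_interval`, `simulate`)
are made up for illustration and are not part of the POC or of Airflow; the
sample dates are arbitrary.

```python
from datetime import datetime, timedelta, timezone


def add_year(moment: datetime) -> datetime:
    """Add one calendar year, mapping Feb 29 to Feb 28."""
    try:
        return moment.replace(year=moment.year + 1)
    except ValueError:
        return moment.replace(year=moment.year + 1, day=28)


def next_interval(start: datetime, now: datetime) -> tuple[datetime, datetime]:
    """Mirror the POC rule: a yearly chunk while 365+ days behind, else one day."""
    if (now - start).days >= 365:
        return start, add_year(start)
    return start, start + timedelta(days=1)


def simulate(earliest: datetime, now: datetime) -> list[tuple[datetime, datetime]]:
    """List the intervals the rule yields for every start that precedes `now`."""
    intervals = []
    start = earliest
    while start < now:
        interval = next_interval(start, now)
        intervals.append(interval)
        # The next interval begins where the previous one ended.
        start = interval[1]
    return intervals


if __name__ == "__main__":
    now = datetime(2025, 2, 5, tzinfo=timezone.utc)
    runs = simulate(datetime(2022, 1, 1, tzinfo=timezone.utc), now)
    yearly = [r for r in runs if (r[1] - r[0]).days >= 365]
    print(f"{len(yearly)} yearly runs, {len(runs) - len(yearly)} daily runs")
```

With these sample dates the backfill takes three yearly intervals (2022, 2023,
2024) and then switches to daily ones. One edge case worth checking in the real
timetable: because a leap year has 366 days, a start that is exactly 365 days
behind can get a yearly interval whose end lies slightly in the future, in
which case the run would presumably wait until that interval closes.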
GitHub link:
https://github.com/apache/airflow/discussions/46141#discussioncomment-12072398