Re: [D] Best way of processing historical data without catchup [airflow]

2025-02-05 Thread via GitHub


GitHub user akomisarek edited a comment on the discussion: Best way of 
processing historical data without catchup

Answering myself: we are thinking about implementing a custom timetable, and this seems to work well enough.
POC code:
```
import pendulum
from airflow.timetables.base import (
    DagRunInfo,
    DataInterval,
    TimeRestriction,
    Timetable,
)


class HistoricalChunkTimetable(Timetable):
    def next_dagrun_info(
        self,
        *,
        last_automated_data_interval: DataInterval | None,
        restriction: TimeRestriction,
    ) -> DagRunInfo | None:
        if last_automated_data_interval is None:
            if not restriction.catchup:
                # No catchup: just schedule today's daily interval.
                next_start = pendulum.now().start_of("day")
                return DagRunInfo.interval(
                    start=next_start, end=next_start.add(days=1)
                )
            # Catchup: start from the earliest allowed date.
            next_start = restriction.earliest
        else:
            next_start = last_automated_data_interval.end

        # Process data more than a year old in yearly chunks,
        # then switch to daily intervals for recent data.
        diff_days = (pendulum.now() - next_start).in_days()
        if diff_days >= 365:
            next_end = next_start.add(years=1)
        else:
            next_end = next_start.add(days=1)

        return DagRunInfo.interval(start=next_start, end=next_end)
```
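The chunking rule at the heart of the timetable (yearly intervals for data more than a year old, daily intervals otherwise) can be sketched with the standard library alone, no Airflow or pendulum required. `next_interval` here is a hypothetical helper that mirrors the `diff_days` logic above:

```python
from datetime import datetime, timedelta, timezone


def next_interval(start: datetime, now: datetime) -> tuple[datetime, datetime]:
    """Return the (start, end) of the next data interval beginning at `start`."""
    if (now - start).days >= 365:
        # Historical backfill: advance a whole year at a time.
        end = start.replace(year=start.year + 1)
    else:
        # Recent data: normal daily interval.
        end = start + timedelta(days=1)
    return (start, end)


now = datetime(2025, 2, 5, tzinfo=timezone.utc)
old = datetime(2020, 1, 1, tzinfo=timezone.utc)
recent = datetime(2025, 2, 1, tzinfo=timezone.utc)

print(next_interval(old, now)[1])     # yearly chunk: 2021-01-01
print(next_interval(recent, now)[1])  # daily chunk: 2025-02-02
```

Each run's interval end then becomes the next run's `next_start`, so a multi-year backlog is consumed in a handful of yearly runs instead of thousands of daily ones.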

GitHub link: 
https://github.com/apache/airflow/discussions/46141#discussioncomment-12072398


This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]
