GitHub user akomisarek edited a comment on the discussion: Best way of processing historical data without catchup

Answering my own question: we are considering implementing a custom timetable, and this seems to work well enough.
POC code:
```
from __future__ import annotations

import pendulum

from airflow.timetables.base import DagRunInfo, DataInterval, TimeRestriction, Timetable


class HistoricalBackfillTimetable(Timetable):
    # POC: yearly intervals while far behind, daily intervals once caught up.
    # (Class name is illustrative; only next_dagrun_info is shown here.)

    def next_dagrun_info(
        self,
        *,
        last_automated_data_interval: DataInterval | None,
        restriction: TimeRestriction,
    ) -> DagRunInfo | None:
        if last_automated_data_interval is None:
            # First ever run of the DAG.
            if not restriction.catchup:
                # No catchup: start today and run daily from here.
                next_start = pendulum.now().start_of("day")
                next_end = next_start.add(days=1)
            else:
                if restriction.earliest is None:
                    # No start_date, so nothing to schedule yet.
                    return None
                next_start = restriction.earliest
                diff_days = (pendulum.now() - next_start).in_days()
                # Far behind: cover a whole year per run, otherwise one day.
                if diff_days >= 365:
                    next_end = next_start.add(years=1)
                else:
                    next_end = next_start.add(days=1)
        else:
            # Continue from where the previous run's interval ended.
            next_start = last_automated_data_interval.end
            diff_days = (pendulum.now() - next_start).in_days()
            if diff_days >= 365:
                next_end = next_start.add(years=1)
            else:
                next_end = next_start.add(days=1)

        return DagRunInfo.interval(start=next_start, end=next_end)
```
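
For reference, a custom timetable also has to be registered through a plugin and referenced from the DAG (a complete timetable would additionally implement infer_manual_data_interval for manually triggered runs). A minimal wiring sketch, assuming the class above is importable; the plugin name and DAG id are illustrative, and on Airflow versions before 2.4 the timetable is passed via `timetable=` instead of `schedule=`:

```
# Minimal wiring sketch; plugin name and DAG id are illustrative.
import pendulum

from airflow import DAG
from airflow.plugins_manager import AirflowPlugin

# Assumes HistoricalBackfillTimetable from the POC above is importable here.


class HistoricalBackfillTimetablePlugin(AirflowPlugin):
    # Registering the timetable class lets the scheduler and
    # webserver deserialize it.
    name = "historical_backfill_timetable_plugin"
    timetables = [HistoricalBackfillTimetable]


with DAG(
    dag_id="historical_backfill_example",
    start_date=pendulum.datetime(2020, 1, 1, tz="UTC"),
    schedule=HistoricalBackfillTimetable(),
    catchup=True,
) as dag:
    ...  # tasks go here
```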

GitHub link: 
https://github.com/apache/airflow/discussions/46141#discussioncomment-12072398
