uranusjr commented on PR #37424: URL: https://github.com/apache/airflow/pull/37424#issuecomment-1953591955
Notes from talking to Ankit off-thread:

1. I think adding an association table shouldn’t affect `triggering_dataset_events`. SQLA loads relationships lazily (unless we tell it to load eagerly, which we don’t), so the new relationship shouldn’t be loaded at all unless the user accesses it. They shouldn’t (it’s unsupported), but if they do, they get an unavoidable performance penalty.
2. Right now we [pass in all triggered events collected by DDRQ during the prior trigger and the current trigger to the downstream timetable](https://github.com/apache/airflow/blob/011cd3debb4bb166908277c764d65eaf5985c7af/airflow/jobs/scheduler_job_runner.py#L1268-L1278), and let it [come up with an appropriate data interval](https://github.com/apache/airflow/blob/011cd3debb4bb166908277c764d65eaf5985c7af/airflow/timetables/simple.py#L178-L192) for the downstream DAG run. The logic is pretty obvious for ALL (the default and the current behavior), but less so for ANY or anything more complicated. We might need a way for users to override that timetable function to generate a more appropriate data interval, but that can be handled in the future when the need comes up.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
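To make point 2 above more concrete, here is a minimal, standalone sketch of how a data interval could be derived from a batch of triggering events in the ALL case. It is not the Airflow implementation linked above. `Interval`, `Event`, and `interval_for_events` are hypothetical names used only for illustration, and the "earliest start to latest end" rule is an assumption about what an "obvious" interval looks like.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class Interval:
    """Hypothetical stand-in for a data interval (start, end)."""
    start: datetime
    end: datetime


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for a dataset event.

    source_interval is the data interval of the upstream run that emitted
    the event, or None when no upstream run is known.
    """
    source_interval: Optional[Interval]


def interval_for_events(logical_date: datetime, events: Iterable[Event]) -> Interval:
    """Span from the earliest upstream start to the latest upstream end.

    Assumption: if no event carries an upstream interval, fall back to a
    zero-length interval at logical_date.
    """
    known = [e.source_interval for e in events if e.source_interval is not None]
    start = min((i.start for i in known), default=logical_date)
    end = max((i.end for i in known), default=logical_date)
    return Interval(start, end)


def day(n: int) -> datetime:
    # Helper for the example: a UTC datetime in January 2024.
    return datetime(2024, 1, n, tzinfo=timezone.utc)


events = [
    Event(Interval(day(1), day(2))),
    Event(Interval(day(3), day(4))),
    Event(None),  # event with no known upstream run
]
result = interval_for_events(day(5), events)
empty_result = interval_for_events(day(5), [])
```

With ALL, every upstream dataset has contributed at least one event, so spanning all of them is a natural choice. With ANY, only some datasets may have events in the batch, and whether the same span is still meaningful is exactly the open question raised above.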