uranusjr commented on PR #37424:
URL: https://github.com/apache/airflow/pull/37424#issuecomment-1953591955

   Notes from talking to Ankit off-thread:
   
   1. I think adding an association table shouldn’t affect 
`triggering_dataset_events`. SQLAlchemy loads relationships lazily by default 
(unless we configure eager loading, which we don’t), so the new relationship 
shouldn’t be loaded at all unless the user accesses it. They shouldn’t (it’s 
unsupported), but if they do, they pay an unavoidable performance penalty.
   2. Right now we [pass in all triggered events collected by DDRQ during the 
prior trigger and the current trigger to the downstream 
timetable](https://github.com/apache/airflow/blob/011cd3debb4bb166908277c764d65eaf5985c7af/airflow/jobs/scheduler_job_runner.py#L1268-L1278),
 and let it [come up with an appropriate data 
interval](https://github.com/apache/airflow/blob/011cd3debb4bb166908277c764d65eaf5985c7af/airflow/timetables/simple.py#L178-L192)
 for the downstream DAG run. The logic is fairly obvious for ALL (the default 
and current behavior), but less so for ANY or anything more complicated. We 
might need a way for users to override that timetable function to generate a 
more appropriate data interval, but that can be handled later, when the need 
comes up.
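
To illustrate point 1, here is a minimal sketch of SQLAlchemy's default lazy loading across an association table. The `Run`, `Event`, and `run_event` names are hypothetical stand-ins, not Airflow's actual models, and the example assumes SQLAlchemy 1.4 or later with an in-memory SQLite database.

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, create_engine, inspect
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Hypothetical association table linking runs to events (illustrative names).
run_event = Table(
    "run_event",
    Base.metadata,
    Column("run_id", ForeignKey("run.id"), primary_key=True),
    Column("event_id", ForeignKey("event.id"), primary_key=True),
)


class Event(Base):
    __tablename__ = "event"
    id = Column(Integer, primary_key=True)


class Run(Base):
    __tablename__ = "run"
    id = Column(Integer, primary_key=True)
    # No `lazy=` argument, so the default lazy="select" strategy applies.
    events = relationship("Event", secondary=run_event)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Run(id=1, events=[Event(id=1), Event(id=2)]))
    session.commit()

with Session(engine) as session:
    run = session.get(Run, 1)
    # Loading the Run row alone does not touch the association table.
    unloaded_before = "events" in inspect(run).unloaded
    # The first attribute access triggers the extra SELECT.
    event_ids = sorted(e.id for e in run.events)
    unloaded_after = "events" in inspect(run).unloaded
```

In this sketch the relationship is reported as unloaded until `run.events` is first read, which is the behavior point 1 relies on: code that never touches the new relationship should not pay for it.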
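
For point 2, a simplified standard-library sketch of one way a timetable could derive a data interval from the collected events. This is not Airflow's implementation (the linked function in `simple.py` is authoritative); `Event` and `interval_for_events` are hypothetical names, and the sketch assumes each event carries only a timestamp.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for a dataset event; not Airflow's model."""

    timestamp: datetime


def interval_for_events(
    logical_date: datetime, events: list[Event]
) -> tuple[datetime, datetime]:
    """Return a (start, end) interval spanning all given events.

    With no events, fall back to a zero-length interval at logical_date.
    """
    if not events:
        return (logical_date, logical_date)
    stamps = [event.timestamp for event in events]
    return (min(stamps), max(stamps))


now = datetime(2024, 2, 20, 12, 0, tzinfo=timezone.utc)
events = [
    Event(datetime(2024, 2, 20, 9, 0, tzinfo=timezone.utc)),
    Event(datetime(2024, 2, 20, 11, 30, tzinfo=timezone.utc)),
]
start, end = interval_for_events(now, events)
empty_interval = interval_for_events(now, [])
```

Under ALL, every upstream dataset has contributed at least one event, so a span like this covers all inputs. Under ANY, the span covers only the datasets that happened to fire, which is one reason a user-overridable hook might eventually be useful.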


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

