nehemiascr commented on issue #32375:
URL: https://github.com/apache/airflow/issues/32375#issuecomment-1634466429

   Jens, I added `finally:` statements to the code and the issue still 
persists. To answer your question: yes, it is a different MySQL database; 
Airflow is running on its own Postgres database from the Helm chart. The issue 
happens when more than one task is being updated. If I update just one 
task instance, it succeeds, even if I read from this MySQL database.
   
   We are indeed following your recommendation and trying to run our database 
calculations somewhere else, and we are studying what options we have. What we 
want is to dynamically annotate the 'pod_executor' configuration of the task 
instance so that it is scheduled on pods/nodes with variable sizes and 
resources (which depend on, e.g., the database size and contents). We 
understand that the Dag level is too early (we don't know the details of the 
task size yet) and that inside the task instance it is already too late (it is 
running in a pod already). The `task_instance_mutation_hook` seems like the 
only place where we can get the 'database size' information we want. We would 
also like to take other factors into account, such as the size of an input 
file to be processed.
   We understand now that the `task_instance_mutation_hook` may not be the best 
place to do the calculations that decide which node/resources the task should 
be scheduled on.
   
   We are testing doing the 'pod resources' calculations just before we trigger 
the Dag and passing the result as part of the Dag run configuration. Then, 
in the `task_instance_mutation_hook`, we just move this information from the 
`dag_run.conf` to the `pod_executor`. This way we won't do any 
calculations or database reads inside the `task_instance_mutation_hook`.
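
   A minimal sketch of what we have in mind, under some assumptions: the 
`pod_resources` key in `dag_run.conf`, the default values, and the 
`resources_from_conf` helper are hypothetical names used only for illustration, 
and we assume `task_instance.dag_run` is available inside the hook. With the 
KubernetesExecutor the values would still need to be turned into a 
`pod_override` (a `kubernetes.client.models.V1Pod`); that step is left out here.

```python
# Would live in airflow_local_settings.py; no Airflow import is needed for the sketch.

# Hypothetical fallback used when the trigger did not supply any resources.
DEFAULT_RESOURCES = {"cpu": "500m", "memory": "512Mi"}


def resources_from_conf(conf):
    """Return the pod resources that were computed before the Dag was triggered."""
    requested = (conf or {}).get("pod_resources", {})  # hypothetical conf key
    return {**DEFAULT_RESOURCES, **requested}


def task_instance_mutation_hook(task_instance):
    # No database reads or calculations here: only copy precomputed values.
    dag_run = getattr(task_instance, "dag_run", None)  # assumed to be loaded
    conf = getattr(dag_run, "conf", None)

    executor_config = dict(task_instance.executor_config or {})
    # Hypothetical key; a real setup would build a V1Pod under "pod_override".
    executor_config["pod_resources"] = resources_from_conf(conf)
    task_instance.executor_config = executor_config
```

   The Dag would then be triggered with something like 
`{"pod_resources": {"memory": "4Gi"}}` in its configuration, so the hook itself 
stays cheap and never touches the MySQL database.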


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
