SameerMesiah97 commented on code in PR #66815:
URL: https://github.com/apache/airflow/pull/66815#discussion_r3330528268
##########
providers/microsoft/azure/src/airflow/providers/microsoft/azure/operators/batch.py:
##########
@@ -296,6 +307,29 @@ def execute(self, context: Context) -> None:
)
# Add task to job
self.hook.add_single_task_to_job(job_id=self.batch_job_id, task=task)
+
+ if self.deferrable:
+ # Verify pool and nodes are in terminal state before deferral
+ pool = self.hook.connection.pool.get(self.batch_pool_id)
+ if pool.resize_errors:
+ raise RuntimeError(f"Pool resize errors: {pool.resize_errors}")
+
+ nodes = list(self.hook.connection.compute_node.list(self.batch_pool_id))
+ self.log.debug("Deferral pre-check: %d nodes present in pool %s", len(nodes), self.batch_pool_id)
+ end_time = time.time() + self.timeout
+
Review Comment:
The `end_time` computation has been adjusted as suggested. However, we
decided to use wall-clock time, i.e. `time.time()`, to prevent unintended
timeout behavior caused by the triggerer and worker running in separate
processes. This class of bug has already been reported in issue #67616.
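
To illustrate the reasoning, here is a minimal sketch, not the code in this
PR. The helper names `make_deadline` and `deadline_passed` are hypothetical.
It assumes the deadline is computed once on the worker as an absolute Unix
timestamp and later compared against `time.time()` in another process. A
clock such as `time.monotonic()` has an undefined reference point, so its
readings are only meaningful as differences and cannot safely be handed
across processes as an absolute deadline.

```python
import time
from typing import Optional


def make_deadline(timeout_seconds: float, now: Optional[float] = None) -> float:
    """Return an absolute wall-clock deadline in seconds since the Unix epoch.

    Hypothetical helper: `now` can be injected for testing; by default the
    current wall-clock time from time.time() is used.
    """
    current = time.time() if now is None else now
    return current + timeout_seconds


def deadline_passed(end_time: float, now: Optional[float] = None) -> bool:
    """Return True once the wall-clock time has reached the deadline.

    Because `end_time` is an epoch timestamp, this check gives the same
    answer in any process whose system clock is reasonably in sync.
    """
    current = time.time() if now is None else now
    return current >= end_time


# Worker side: compute the deadline once, before deferring.
end_time = make_deadline(timeout_seconds=30.0, now=1000.0)

# Triggerer side: compare the serialized deadline against its own clock.
still_waiting = not deadline_passed(end_time, now=1029.0)
timed_out = deadline_passed(end_time, now=1030.0)
```

The trade-off is that wall-clock time can jump if the system clock is
adjusted, so this approach relies on the hosts keeping their clocks roughly
synchronized.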