SameerMesiah97 commented on code in PR #61642:
URL: https://github.com/apache/airflow/pull/61642#discussion_r2779783705
##########
task-sdk/src/airflow/sdk/execution_time/supervisor.py:
##########
@@ -2080,17 +2080,26 @@ def supervise(
exit_code = process.wait()
end = time.monotonic()
- log.info(
- "Task finished",
- task_instance_id=str(ti.id),
- exit_code=exit_code,
- duration=end - start,
- final_state=process.final_state,
- )
+
+ if exit_code == -9:
+ log.critical(
+ "Task killed by OOM (exit_code=-9)!",
Review Comment:
I think this log message is a bit misleading. `exit_code == -9` only tells us
that the process was terminated by SIGKILL, which is often (but not
necessarily) caused by the OOM killer. SIGKILL can also be sent by a user
(e.g. `kill -9`) or by the system for other reasons.
Would it be more accurate to phrase this in terms of SIGKILL and mention OOM
as a likely cause rather than asserting it definitively? For example:
`"Task killed after receiving SIGKILL (exit_code=-9). OOM is a likely cause."`
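To illustrate the suggestion, here is a minimal sketch of how the exit code could be mapped to a hedged message. `describe_exit` is a hypothetical helper, not something that exists in Airflow or in this PR, and it assumes a POSIX platform where a negative exit code `-N` means the process was terminated by signal `N` (the same convention `subprocess.Popen.returncode` uses).

```python
import signal


def describe_exit(exit_code: int) -> tuple[str, str]:
    """Hypothetical helper: map an exit code to a (log level, message) pair.

    Assumes the POSIX convention that a negative exit code -N means the
    process was terminated by signal N.
    """
    if exit_code == -signal.SIGKILL:
        # SIGKILL has many possible senders, so OOM is only named as likely.
        return (
            "critical",
            f"Task killed after receiving SIGKILL (exit_code={exit_code}). "
            "OOM is a likely cause.",
        )
    if exit_code < 0:
        try:
            name = signal.Signals(-exit_code).name
        except ValueError:
            # Not a signal number known to this platform.
            name = f"signal {-exit_code}"
        return ("error", f"Task terminated by {name} (exit_code={exit_code}).")
    return ("info", f"Task finished (exit_code={exit_code}).")


level, message = describe_exit(-9)
print(level, message)
```

The caller would then pick the logger method from the returned level, so the SIGKILL case stays at critical severity without claiming a cause the exit code alone cannot prove.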