Copilot commented on code in PR #64995:
URL: https://github.com/apache/airflow/pull/64995#discussion_r3067136785


##########
task-sdk/tests/task_sdk/execution_time/test_supervisor.py:
##########
@@ -540,6 +540,38 @@ def subprocess_main():
         assert "Last chance exception handler" in captured.err
         assert "RuntimeError: Fake syntax error" in captured.err
 
+    def test_start_retries_psutil_lookup_on_no_such_process(self, 
client_with_ti_start, monkeypatch):
+        real_process = psutil.Process
+        call_count = 0
+
+        def flaky_process(pid=None):
+            nonlocal call_count
+            if pid is not None:
+                call_count += 1
+                if call_count == 1:
+                    raise psutil.NoSuchProcess(pid=pid, msg="process PID not 
found")
+                return real_process(pid)
+            return real_process()
+
+        
monkeypatch.setattr("airflow.sdk.execution_time.supervisor.psutil.Process", 
flaky_process)
+        
monkeypatch.setattr("airflow.sdk.execution_time.supervisor.time.sleep", lambda 
_: None)

Review Comment:
   The monkeypatch targets here use a dotted-string that looks like a nested 
attribute path ("airflow.sdk.execution_time.supervisor.psutil.Process" / 
".time.sleep"). With pytest's monkeypatch, the string form imports everything 
up to the last dot as a module, so this will try to import 
`airflow.sdk.execution_time.supervisor.psutil` and 
`airflow.sdk.execution_time.supervisor.time` (which don’t exist) and the test 
will error. Patch the attributes on the already-imported supervisor module 
objects instead (e.g. import `airflow.sdk.execution_time.supervisor` and set 
`supervisor.psutil.Process` / `supervisor.time.sleep`).



##########
task-sdk/src/airflow/sdk/execution_time/supervisor.py:
##########
@@ -441,6 +441,14 @@ def exit(n: int) -> NoReturn:
             exit(125)
 
 
+def _psutil_process(pid: int) -> psutil.Process:
+    try:
+        return psutil.Process(pid)
+    except psutil.NoSuchProcess:
+        time.sleep(0.1)
+        return psutil.Process(pid)

Review Comment:
   `time.sleep(0.1)` is a new hard-coded retry delay. Given this module already 
centralizes timing knobs as constants (e.g. `MIN_HEARTBEAT_INTERVAL`, 
`SOCKET_CLEANUP_TIMEOUT`), consider extracting this delay (and possibly retry 
count/timeout) into a named constant or config-backed value so it’s easier to 
tune and reason about if the race needs longer retries on some platforms.



##########
task-sdk/src/airflow/sdk/execution_time/supervisor.py:
##########
@@ -441,6 +441,14 @@ def exit(n: int) -> NoReturn:
             exit(125)
 
 
+def _psutil_process(pid: int) -> psutil.Process:
+    try:
+        return psutil.Process(pid)
+    except psutil.NoSuchProcess:
+        time.sleep(0.1)
+        return psutil.Process(pid)

Review Comment:
   The PR description says this change replaces the immediate psutil lookup 
with an `os.waitpid`/`os.kill`-backed lightweight handle, but the 
implementation here still wraps the child PID in `psutil.Process` (with a 
retry). Either update the PR description to match the actual approach, or 
implement the described non-psutil handle so reviewers/users aren’t misled 
about the behavior and dependencies.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to