potiuk commented on code in PR #40078:
URL: https://github.com/apache/airflow/pull/40078#discussion_r1635218635


##########
airflow/providers/openlineage/plugins/listener.py:
##########
@@ -319,10 +322,46 @@ def on_failure():
                 len(Serde.to_json(redacted_event).encode("utf-8")),
             )
 
-        on_failure()
+        self._execute(on_failure, "on_failure", use_fork=True)
+
+    def _execute(self, callable, callable_name: str, use_fork: bool = False):
+        if use_fork:
+            self._fork_execute(callable, callable_name)
+        else:
+            callable()
+
+    def _fork_execute(self, callable, callable_name: str):
+        self.log.debug("Will fork to execute OpenLineage process.")
+        pid = os.fork()
+        if pid:
+            process = psutil.Process(pid)
+            try:
+                self.log.debug("Waiting for process %s", pid)
+                process.wait(conf.execution_timeout())
+            except psutil.TimeoutExpired:
+                self.log.warning(
+                    "OpenLineage process %s expired. This should not affect process execution.", pid
+                )
+                process.kill()

Review Comment:
   I think we should also call process.terminate() first and wait for another 
(possibly hard-coded) timeout before we send kill. SIGKILL should generally be 
a last resort, because it might leave resources open, locks held, etc., so we 
should only send it if the earlier SIGTERM did not terminate the process in a 
reasonable time.
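
   A minimal sketch of the escalation I have in mind. To keep it self-contained 
it uses only `os` and `signal` rather than psutil, and the names `wait_for_exit`, 
`stop_child` and `term_grace` are made up for illustration, not taken from the 
PR. With psutil the same steps would be `process.terminate()`, a second 
`process.wait(timeout)`, and only then `process.kill()`.

```python
import os
import signal
import time


def wait_for_exit(pid: int, timeout: float, poll_interval: float = 0.05) -> bool:
    """Return True if child `pid` exited (and was reaped) within `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        # WNOHANG makes waitpid return (0, 0) immediately if the child is still running.
        reaped_pid, _status = os.waitpid(pid, os.WNOHANG)
        if reaped_pid == pid:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


def stop_child(pid: int, execution_timeout: float, term_grace: float = 5.0) -> str:
    """Wait for a forked child, then escalate SIGTERM -> SIGKILL.

    `term_grace` is the hypothetical (possibly hard-coded) second timeout.
    Returns "exited", "terminated" or "killed" depending on how the child ended.
    """
    if wait_for_exit(pid, execution_timeout):
        return "exited"

    # Polite request first: the child can still run handlers and release resources.
    os.kill(pid, signal.SIGTERM)
    if wait_for_exit(pid, term_grace):
        return "terminated"

    # Last resort: SIGKILL cannot be caught, so cleanup in the child is skipped.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return "killed"
```

   In `_fork_execute` the parent branch would call something like 
`stop_child(pid, conf.execution_timeout())` in place of the bare `wait` + `kill`.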



