kennknowles commented on code in PR #35298: URL: https://github.com/apache/beam/pull/35298#discussion_r2151002410
########## sdks/python/apache_beam/runners/worker/worker_status.py: ########## @@ -250,6 +263,13 @@ def _log_lull_in_bundle_processor(self, bundle_process_cache): if processor: info = processor.state_sampler.get_info() self._log_lull_sampler_info(info, instruction) + if self._restart_lull_timeout_ns and self._restart_lull(info): + sys.exit(1) Review Comment: On the design... we want to be very careful about adding `exit` calls (here and in Java). Because eventually code ends up getting used in unexpected contexts. It should be written without assumptions about the context it is called in, to the extent this is possible. So this could cause some other way of executing to exit unexpectedly. Can we raise an exception that bubbles up and causes the top level `main` to exit? That is really the only level where we should make any assumptions about being in charge of the process lifetime. ########## sdks/python/apache_beam/runners/worker/worker_status.py: ########## @@ -47,6 +48,8 @@ # 5 minutes * 60 seconds * 1000 millis * 1000 micros * 1000 nanoseconds DEFAULT_LOG_LULL_TIMEOUT_NS = 5 * 60 * 1000 * 1000 * 1000 +DEFAULT_RESTART_LULL_TIMEOUT_NS = 10 * 60 * 1000 * 1000 * 1000 Review Comment: This is actually an "exit/kill" timeout. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@beam.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org