kennknowles commented on code in PR #35120:
URL: https://github.com/apache/beam/pull/35120#discussion_r2150957290

##########
sdks/java/core/src/main/java/org/apache/beam/sdk/options/SdkHarnessOptions.java:
##########
@@ -427,4 +427,16 @@ public Duration create(PipelineOptions options) {
           : Duration.ofMinutes(1);
     }
   }
+
+  /**
+   * The time limit (in minute) that an SDK worker allows for a PTransform operation before
+   * signaling the runner harness to restart the SDK worker.
+   */
+  @Description(
+      "The time limit (minute) that an SDK worker allows for a PTransform operation "

Review Comment:
   A PTransform is an abstract concept that exists in the submitted pipeline. Even though we use the "PTransform" proto message during execution, there really isn't any such thing in how we think and talk about execution.

   An OK user-facing name for this would be `--bundleProcessingTimeout`

##########
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ExecutionStateSampler.java:
##########
@@ -169,7 +203,15 @@ private Void stateSampler() throws Exception {
         long millisSinceLastSample = currentTimeMillis - lastSampleTimeMillis;
         synchronized (activeStateTrackers) {
           for (ExecutionStateTracker activeTracker : activeStateTrackers) {
-            activeTracker.takeSample(currentTimeMillis, millisSinceLastSample);
+            try {
+              activeTracker.takeSample(currentTimeMillis, millisSinceLastSample);
+            } catch (RuntimeException e) {
+              LOG.error(
+                  String.format(
+                      "The SDK worker will restart because the lull time is longer than %d minutes",

Review Comment:
   Since right here all we know is that the SDK will terminate, the log message should say that. Then the runner's log message can say "SDK terminated; starting a new one".

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org