[ 
https://issues.apache.org/jira/browse/BEAM-9474?focusedWorklogId=401753&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-401753
 ]

ASF GitHub Bot logged work on BEAM-9474:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Mar/20 20:24
            Start Date: 11/Mar/20 20:24
    Worklog Time Spent: 10m 
      Work Description: mxm commented on pull request #11084: [BEAM-9474] 
Improve robustness of BundleFactory and ProcessEnvironment
URL: https://github.com/apache/beam/pull/11084#discussion_r391247459
 
 

 ##########
 File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/environment/ProcessManager.java
 ##########
 @@ -186,31 +186,28 @@ private void stopProcess(String id, Process process) {
       LOG.debug("Attempting to stop process with id {}", id);
       // first try to kill gracefully
       process.destroy();
-      long maxTimeToWait = 2000;
-      if (waitForProcessToDie(process, maxTimeToWait)) {
-        LOG.debug("Process for worker {} shut down gracefully.", id);
-      } else {
-        LOG.info("Process for worker {} still running. Killing.", id);
-        process.destroyForcibly();
+      long maxTimeToWait = 500;
 
 Review comment:
   The processes do not always shut down gracefully, but that is what the 
change is about: removing processes and ensuring a quick recovery time. It's a 
trade-off. Ideally we would allow more time, but if we wait up to 2 seconds per 
process with an SDK parallelism of 16, stopping them one after another can 
already take 16 x 2 s = 32 s, more than half a minute. We really want to do the 
process removal in parallel. I'll look into this.
   
   I'm not sure the ProcessManager is a good place to document the shutdown 
behavior. If you have any suggestions though, I'll add them here.
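
   To illustrate the parallel removal mentioned above, here is a minimal 
sketch, not the code in this PR. The class name `ParallelProcessStopper`, the 
method `stopAll`, and the one-thread-per-process pool are assumptions made for 
this example; only the destroy / wait / destroyForcibly sequence mirrors the 
diff. With this shape the total wait is bounded by roughly one graceful timeout 
instead of one timeout per process.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Beam: stops many processes concurrently.
final class ParallelProcessStopper {

  /** Stops one process: graceful destroy first, forcible kill if it outlives the timeout. */
  public static void stopProcess(Process process, long gracefulTimeoutMillis) {
    process.destroy();
    try {
      if (!process.waitFor(gracefulTimeoutMillis, TimeUnit.MILLISECONDS)) {
        // Still running after the grace period: kill it and wait for the exit.
        process.destroyForcibly().waitFor();
      }
    } catch (InterruptedException e) {
      // Preserve the interrupt flag and make sure the process does not leak.
      Thread.currentThread().interrupt();
      process.destroyForcibly();
    }
  }

  /** Stops all processes in parallel and returns once every stop attempt has finished. */
  public static void stopAll(List<Process> processes, long gracefulTimeoutMillis) {
    if (processes.isEmpty()) {
      return; // newFixedThreadPool rejects a size of 0
    }
    // Assumption: one thread per process, so no stop attempt queues behind another.
    ExecutorService pool = Executors.newFixedThreadPool(processes.size());
    try {
      List<Callable<Void>> tasks = new ArrayList<>();
      for (Process process : processes) {
        tasks.add(
            () -> {
              stopProcess(process, gracefulTimeoutMillis);
              return null;
            });
      }
      pool.invokeAll(tasks); // blocks until every task has completed
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } finally {
      pool.shutdown();
    }
  }
}
```

   A caller holding the worker processes in a list could invoke 
`ParallelProcessStopper.stopAll(processes, 500)`. A dedicated pool is used here 
rather than a shared one because each task blocks while waiting, which could 
starve a small shared pool; whether that fits the ProcessManager is an open 
question.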
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 401753)
    Time Spent: 6.5h  (was: 6h 20m)

> Environment cleanup is not robust enough and may leak resources
> ---------------------------------------------------------------
>
>                 Key: BEAM-9474
>                 URL: https://issues.apache.org/jira/browse/BEAM-9474
>             Project: Beam
>          Issue Type: Bug
>          Components: java-fn-execution
>            Reporter: Maximilian Michels
>            Assignee: Maximilian Michels
>            Priority: Major
>          Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> The cleanup code in {{DefaultJobBundleFactory}} and its 
> {{RemoteEnvironment}}s may leak resources. This is especially a concern when 
> the execution engine reuses the same JVM or underlying machines for multiple 
> runs of a pipeline.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
