Daryn Sharp created HADOOP-10146:
------------------------------------

             Summary: Workaround JDK7 Process fd close bug
                 Key: HADOOP-10146
                 URL: https://issues.apache.org/jira/browse/HADOOP-10146
             Project: Hadoop Common
          Issue Type: Bug
          Components: util
    Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0
            Reporter: Daryn Sharp
            Assignee: Daryn Sharp
            Priority: Critical

JDK7's {{Process}} output streams have an async fd-close race bug. This manifests as commands run via o.a.h.u.Shell causing threads to hang, OOM, or exhibit other bizarre behavior. The NM is likely to encounter the bug under heavy load.

Specifically, {{ProcessBuilder}}'s {{UNIXProcess}} starts a thread to reap the process and drain stdout/stderr to avoid a lingering zombie process. A race occurs if the thread using the stream closes it and the underlying fd is recycled/reopened while the reaper is still draining it. {{ProcessPipeInputStream.drainInputStream}} will OOM allocating an array if {{in.available()}} returns a huge number, or may wreak havoc by incorrectly draining the recycled fd.
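
To illustrate why sizing a buffer from {{in.available()}} is hazardous once the fd may belong to something else, here is a minimal sketch. It is neither the JDK's {{drainInputStream}} code nor the eventual Hadoop workaround: {{BoundedDrain}} and {{drain}} are hypothetical names, and the anonymous stream only simulates a descriptor that reports a bogus, huge byte count. The helper reads in fixed-size chunks, so its allocation never depends on what {{available()}} claims.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Illustrative only: a hypothetical helper, not JDK or Hadoop code. */
final class BoundedDrain {
    // Fixed chunk size (an arbitrary choice for this sketch).
    private static final int CHUNK = 8192;

    /** Reads to EOF in fixed-size chunks; never sizes a buffer from available(). */
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[CHUNK];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Simulates a stream whose available() reports a huge, wrong count.
        // Allocating new byte[in.available()] here would likely throw OutOfMemoryError.
        InputStream lying = new ByteArrayInputStream(new byte[] {1, 2, 3}) {
            @Override
            public synchronized int available() {
                return Integer.MAX_VALUE;
            }
        };
        System.out.println(drain(lying).length); // prints 3
    }
}
```

Note that this only sidesteps the oversized allocation. It does not address the race itself: a reader that drains a recycled fd in chunks would still consume data belonging to another stream.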