[ https://issues.apache.org/jira/browse/YARN-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334872#comment-16334872 ]
Jason Lowe commented on YARN-2185:
----------------------------------

Thanks for updating the patch!

Should a SuppressWarnings("deprecation") be added? I personally would rather see that, with a comment next to the call site explaining why we're using a deprecated method, than add yet another warning to the pile, but I'm curious what others think here.

There were checks for paths with embedded single quotes, and they are now missing. The code should escape single quotes in the filename so the shell does not mis-parse the command.

runCommandOnStream only creates a thread pool and reads the subprocess stdout and stderr if logging is enabled. If the subprocess produces too much output on either channel, this will deadlock: the child process will stop consuming input while it waits for its output stream to be consumed, but the parent process will be blocked waiting for the subprocess to consume more input. We need to consume the subprocess stdout and stderr even if we do not intend to log them. If the data is not being logged or otherwise acted upon, it can simply be thrown away.

Speaking of throwing away subprocess output, if the tar command fails there will be nothing but an exit code to help figure out what went wrong. The existing unTarUsingTar avoids this because the ShellCommandExecutor surfaces the error output on failure. I think runCommandOnStream should throw an exception (e.g., ExitCodeException or something similar) containing the error output if the subprocess does not return a zero exit code.
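To make the last three points concrete, here is a rough sketch of the shape I have in mind. It is not the patch code: the class name, the helper names (shellQuote, drain), the 4096-byte cap on retained stderr, the use of "sh -c", and the plain IOException standing in for ExitCodeException are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not the YARN-2185 patch.
final class StreamCommandSketch {

  /** Quote for a POSIX shell: wrap in single quotes, rewrite each ' as '\''. */
  static String shellQuote(String s) {
    return "'" + s.replace("'", "'\\''") + "'";
  }

  /** Read to EOF, keeping at most maxBytes and discarding everything else. */
  static String drain(InputStream in, int maxBytes) throws IOException {
    ByteArrayOutputStream kept = new ByteArrayOutputStream();
    byte[] buf = new byte[8192];
    int n;
    while ((n = in.read(buf)) != -1) {
      int room = maxBytes - kept.size();
      if (room > 0) {
        kept.write(buf, 0, Math.min(n, room));
      }
    }
    return new String(kept.toByteArray(), StandardCharsets.UTF_8);
  }

  /** Feed input to the command; always drain its output; fail on non-zero exit. */
  static void runCommandOnStream(InputStream input, String command)
      throws IOException, InterruptedException {
    // Assumption: a POSIX "sh" is on the PATH.
    Process process = new ProcessBuilder("sh", "-c", command).start();
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      // Both channels are consumed whether or not anything is logged.
      Callable<String> outTask = () -> drain(process.getInputStream(), 0);
      Callable<String> errTask = () -> drain(process.getErrorStream(), 4096);
      Future<String> out = pool.submit(outTask);
      Future<String> err = pool.submit(errTask);

      IOException copyError = null;
      try (OutputStream stdin = process.getOutputStream()) {
        byte[] buf = new byte[8192];
        int n;
        while ((n = input.read(buf)) != -1) {
          stdin.write(buf, 0, n);
        }
      } catch (IOException e) {
        // The child may have exited early; report it together with the exit code.
        copyError = e;
      }

      int exitCode = process.waitFor();
      String errText;
      try {
        out.get();
        errText = err.get();
      } catch (ExecutionException e) {
        throw new IOException("Failed reading subprocess output", e.getCause());
      }
      if (exitCode != 0) {
        throw new IOException(
            "Command exited with code " + exitCode + ": " + errText.trim(), copyError);
      }
      if (copyError != null) {
        throw copyError;
      }
    } finally {
      pool.shutdownNow();
    }
  }
}
```

A caller would build the command with something like "tar -x -C " + shellQuote(targetDir), so that a directory name containing a single quote cannot break out of the quoting. Capping the retained stderr keeps a runaway subprocess from exhausting memory while still leaving enough text to diagnose a failure.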
> Use pipes when localizing archives
> ----------------------------------
>
>                 Key: YARN-2185
>                 URL: https://issues.apache.org/jira/browse/YARN-2185
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.4.0
>            Reporter: Jason Lowe
>            Assignee: Miklos Szegedi
>            Priority: Major
>         Attachments: YARN-2185.000.patch, YARN-2185.001.patch,
> YARN-2185.002.patch, YARN-2185.003.patch, YARN-2185.004.patch,
> YARN-2185.005.patch, YARN-2185.006.patch, YARN-2185.007.patch,
> YARN-2185.008.patch
>
>
> Currently the nodemanager downloads an archive to a local file, unpacks it,
> and then removes it. It would be more efficient to stream the data as it's
> being unpacked to avoid both the extra disk space requirements and the
> additional disk activity from storing the archive.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)