[ 
https://issues.apache.org/jira/browse/YARN-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334872#comment-16334872
 ] 

Jason Lowe commented on YARN-2185:
----------------------------------

Thanks for updating the patch!

Should a SuppressWarnings("deprecation") be added?  I personally would rather 
see that with a comment next to the call site explaining why we're using a 
deprecated method rather than add yet another warning to the pile, but I'm 
curious what others think here.
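
A minimal sketch of the narrowly scoped form, for illustration only.
LegacyApi.oldTrim and CallSiteExample are hypothetical stand-ins, not names
from the patch.  The annotation sits on the single local variable
declaration, so only that one call is exempted and the reason is recorded
right beside it:

```java
// Hypothetical stand-in for whatever deprecated API the patch calls.
final class LegacyApi {
  @Deprecated
  static String oldTrim(String s) {
    return s.trim();
  }
}

final class CallSiteExample {
  static String normalize(String path) {
    // Deliberately using the deprecated method here: state the concrete
    // reason (e.g. no replacement is available yet) so reviewers know the
    // suppression is intentional.
    @SuppressWarnings("deprecation")
    String result = LegacyApi.oldTrim(path);
    return result;
  }
}
```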

The checks for paths with embedded single quotes that were there before are 
now missing.  The code should escape single quotes in the filename so the 
shell does not misparse the command.
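
One common way to do this for a POSIX shell, sketched here under the
assumption that the filename is passed inside a single-quoted string: close
the quote, emit an escaped quote, and reopen the quote.  ShellQuoting is a
hypothetical helper name, not something from the patch.

```java
final class ShellQuoting {
  /**
   * Wraps arg in single quotes for a POSIX shell.  Each embedded single
   * quote becomes the four characters '\'' (close quote, escaped quote,
   * reopen quote), so the shell sees one literal word.
   */
  static String quote(String arg) {
    return "'" + arg.replace("'", "'\\''") + "'";
  }
}
```

For example, the filename it's.tar would be emitted as 'it'\''s.tar'.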

runCommandOnStream only creates a thread pool and reads the subprocess 
stdout and stderr if logging is enabled.  If the subprocess produces too much 
output on either channel then this will deadlock.  The child process will 
stop consuming input while it waits for its output to be consumed, but the 
parent process will be blocked waiting for the subprocess to consume more 
input.  We need to consume the subprocess stdout and stderr even if we do not 
intend to log it.  If the data is not being logged or otherwise acted upon 
then it can simply be thrown away.
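
A minimal sketch of draining a stream unconditionally, assuming one
background thread per stream is acceptable.  StreamDrainer and its methods
are hypothetical names for illustration, not code from the patch.

```java
import java.io.IOException;
import java.io.InputStream;

final class StreamDrainer {
  /** Reads the stream to EOF, discarding the bytes; returns the count. */
  static long drain(InputStream in) throws IOException {
    byte[] buf = new byte[8192];
    long total = 0;
    int n;
    while ((n = in.read(buf)) != -1) {
      total += n; // the data itself is intentionally thrown away
    }
    return total;
  }

  /** Starts a daemon thread that drains the stream until EOF. */
  static Thread drainInBackground(InputStream in, String name) {
    Thread t = new Thread(() -> {
      try {
        drain(in);
      } catch (IOException e) {
        // The stream was closed underneath us; nothing left to drain.
      }
    }, name);
    t.setDaemon(true);
    t.start();
    return t;
  }
}
```

With a java.lang.Process p, the idea would be to call drainInBackground on
both p.getInputStream() and p.getErrorStream() before writing anything to
p.getOutputStream(), whether or not logging is enabled.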

Speaking of throwing away subprocess output, if the tar command fails there 
will be nothing but an exit code to help figure out what went wrong.  The 
existing unTarUsingTar reports the error output via the ShellCommandExecutor.  
I think runCommandOnStream should throw an exception (e.g. ExitCodeException 
or something similar) containing the error output if the subprocess returns a 
non-zero exit code.
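
A rough sketch of that check, assuming the stderr text has already been
captured into a string.  CommandFailedException and ExitCheck are
hypothetical names used to keep the example self-contained; the real code
could use ExitCodeException or something similar instead.

```java
import java.io.IOException;

final class ExitCheck {
  /** Hypothetical exception carrying the exit code and the error output. */
  static final class CommandFailedException extends IOException {
    private final int exitCode;

    CommandFailedException(int exitCode, String stderr) {
      super("Command failed with exit code " + exitCode + ": " + stderr);
      this.exitCode = exitCode;
    }

    int getExitCode() {
      return exitCode;
    }
  }

  /** Throws if the subprocess did not exit with code zero. */
  static void checkExit(int exitCode, String capturedStderr)
      throws CommandFailedException {
    if (exitCode != 0) {
      throw new CommandFailedException(exitCode, capturedStderr);
    }
  }
}
```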


> Use pipes when localizing archives
> ----------------------------------
>
>                 Key: YARN-2185
>                 URL: https://issues.apache.org/jira/browse/YARN-2185
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.4.0
>            Reporter: Jason Lowe
>            Assignee: Miklos Szegedi
>            Priority: Major
>         Attachments: YARN-2185.000.patch, YARN-2185.001.patch, 
> YARN-2185.002.patch, YARN-2185.003.patch, YARN-2185.004.patch, 
> YARN-2185.005.patch, YARN-2185.006.patch, YARN-2185.007.patch, 
> YARN-2185.008.patch
>
>
> Currently the nodemanager downloads an archive to a local file, unpacks it, 
> and then removes it.  It would be more efficient to stream the data as it's 
> being unpacked to avoid both the extra disk space requirements and the 
> additional disk activity from storing the archive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

