[ 
https://issues.apache.org/jira/browse/YARN-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16337856#comment-16337856
 ] 

Jason Lowe commented on YARN-2185:
----------------------------------

Thanks for updating the patch!  I think it's very close now.

bq. I verified and this scenario would use tar and gzip only and I made they do 
not have any output in the normal scenario, so that should not block the pipe.

The "not normal" scenarios are what I was worried about. ;-)  Thanks for fixing 
this in the latest patch.

makeShellPath is a pre-existing, public function, and the added escaping logic 
is only appropriate if the caller provides the single-quoting themselves.  If 
the caller is not quoting then this can mangle the path.  For example, TaskLog 
sometimes calls this function without single-quoting the result.  This function 
could also be called on Windows, and I do not know if it supports the same 
quoting/escaping behavior.  Therefore I think this patch could add a new method 
that quotes a path as a shell command-line argument.  Then the callers don't 
have to do the quoting themselves, we don't surprise anyone who was calling the 
existing method without single-quoting, and it could do the correct quoting 
thing on Windows. The Windows support could be a followup JIRA.

Attempting to get the futures from the executor could result in an 
ExecutionException if somehow the runnable threw an exception.  Would it make 
sense to wait on the process before attempting to get the stderr and stdout 
output?  That way we won't leak the child process even if those exceptions 
occurred.

Nit: There's a double-period in this exception message:
{code}
      if (exitCode != 0) {
        throw new IOException(
            String.format(
                "Error executing command. %s %s %s." +
                    ". Process exited with exit code %d.",
                shell, cmdSwitch, command, exitCode));
      }
{code}


> Use pipes when localizing archives
> ----------------------------------
>
>                 Key: YARN-2185
>                 URL: https://issues.apache.org/jira/browse/YARN-2185
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.4.0
>            Reporter: Jason Lowe
>            Assignee: Miklos Szegedi
>            Priority: Major
>         Attachments: YARN-2185.000.patch, YARN-2185.001.patch, 
> YARN-2185.002.patch, YARN-2185.003.patch, YARN-2185.004.patch, 
> YARN-2185.005.patch, YARN-2185.006.patch, YARN-2185.007.patch, 
> YARN-2185.008.patch, YARN-2185.009.patch
>
>
> Currently the nodemanager downloads an archive to a local file, unpacks it, 
> and then removes it.  It would be more efficient to stream the data as it's 
> being unpacked to avoid both the extra disk space requirements and the 
> additional disk activity from storing the archive.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to