[jira] [Comment Edited] (YARN-2185) Use pipes when localizing archives

Miklos Szegedi (JIRA) Fri, 22 Dec 2017 16:11:53 -0800

    [ https://issues.apache.org/jira/browse/YARN-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16302123#comment-16302123 ]


Miklos Szegedi edited comment on YARN-2185 at 12/23/17 12:10 AM:
-----------------------------------------------------------------

Attaching my suggestion for how to solve this. The code streams the data from
HDFS to the standard input of the tar and gzip commands. It handles Windows as
well. In addition, I create the temporary directory with permissions 700
instead of 755. I do not create any additional temporary directories for
extraction; one is enough. One difference is that I use the jar command for
zip files as well, so that Windows is handled properly. I also added a switch
that disables the modification time check when -1 is given as the timestamp.
Finally, directory localization copies in parallel to take advantage of the
distributed storage in HDFS.
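
To illustrate the streaming idea (this is not the code in the attached patch),
here is a minimal Python sketch. It swaps in Python's tarfile stream mode for
the external tar and gzip commands, and it assumes the archive is a gzipped
tar. The names unpack_stream, ForwardOnlyStream and make_sample_archive are
hypothetical and exist only for this example. The archive is unpacked while it
is being read, so no copy of it is stored locally, and the single destination
directory is created with owner-only permissions.

```python
import io
import os
import stat
import tarfile
import tempfile


def unpack_stream(stream, parent_dir):
    """Unpack a gzip-compressed tar read from `stream` into a new directory.

    The archive is consumed as it arrives; no copy of it is written to disk.
    """
    # mkdtemp creates the directory accessible only by the current user
    # (mode 0700 on POSIX systems).
    dest = tempfile.mkdtemp(prefix="localized-", dir=parent_dir)
    # "r|gz" is tarfile's stream mode: forward-only reads, no seeking.
    with tarfile.open(fileobj=stream, mode="r|gz") as tar:
        # Use the safer "data" extraction filter where this Python has it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest, **kwargs)
    return dest


class ForwardOnlyStream:
    """Hypothetical stand-in for a remote input stream: read() only, no seek()."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size=-1):
        return self._buf.read(size)


def make_sample_archive():
    """Build a tiny .tar.gz in memory (demo data only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        payload = b"hi"
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


if __name__ == "__main__":
    source = ForwardOnlyStream(make_sample_archive())
    dest = unpack_stream(source, tempfile.gettempdir())
    print(dest, sorted(os.listdir(dest)))
    print(oct(stat.S_IMODE(os.stat(dest).st_mode)))
```

In a real localizer the stream would presumably be the open HDFS input stream
rather than an in-memory buffer; that part is left out here because it depends
on the client library in use.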


was (Author: miklos.szeg...@cloudera.com):
Attaching my suggestion for how to solve this. The code streams the data from
HDFS to the standard input of the tar and gzip commands. It handles Windows as
well. In addition, I create temporary files with permissions 700 instead of
755. I do not create any additional temporary directories for extraction; one
is enough. One difference is that I use the jar command for zip files as well,
so that Windows is handled properly. I also added a switch that disables the
modification time check when -1 is given as the timestamp. Finally, directory
localization copies in parallel to take advantage of the distributed storage
in HDFS.

> Use pipes when localizing archives
> ----------------------------------
>
>                 Key: YARN-2185
>                 URL: https://issues.apache.org/jira/browse/YARN-2185
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 2.4.0
>            Reporter: Jason Lowe
>            Assignee: Miklos Szegedi
>         Attachments: YARN-2185.000.patch
>
>
> Currently the nodemanager downloads an archive to a local file, unpacks it, 
> and then removes it.  It would be more efficient to stream the data as it's 
> being unpacked to avoid both the extra disk space requirements and the 
> additional disk activity from storing the archive.



