[ 
https://issues.apache.org/jira/browse/OOZIE-2402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027681#comment-15027681
 ] 

Illya Yalovyy commented on OOZIE-2402:
--------------------------------------

[~rkanter],

Thank you for the prompt review.

Please see my notes below:
1. Will fix it.
2. I'll update related documentation.
3. Will fix it.
4. Will fix it.
5. {{IOUtils.copyBytes(in, out, fs.getConf(), true);}} closes both streams 
internally. The {{close()}} call in the catch block is needed only for the case 
where {{out = fs.create(new Path(dstPath, file.getName()));}} fails with an 
exception, because {{copyBytes}} is never reached then.
6. Will fix it.
7. I wanted to avoid the overhead of the Hadoop FS implementation, but I will 
run some tests to actually measure the difference. If it is not significant, I 
will use {{fs.copyFromLocalFile}} to copy individual files.
8. Will add a unit test.
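
To illustrate point 5, here is a minimal sketch of the stream handling using only {{java.io}}, not the code from the patch. {{copyAndClose}} and {{OutputOpener}} are hypothetical stand-ins: the former mimics {{IOUtils.copyBytes(in, out, conf, true)}} closing both streams, the latter stands in for {{fs.create(...)}}, which may throw before any copy starts.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class CopyCloseSketch {

    // Hypothetical stand-in for fs.create(...): opening the target may fail.
    interface OutputOpener {
        OutputStream open() throws IOException;
    }

    // Hypothetical stand-in for IOUtils.copyBytes(in, out, conf, true):
    // copies all bytes and closes both streams, even if the copy fails.
    static void copyAndClose(InputStream in, OutputStream out) throws IOException {
        try (InputStream i = in; OutputStream o = out) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = i.read(buf)) != -1) {
                o.write(buf, 0, n);
            }
        }
    }

    static void copy(InputStream in, OutputOpener opener) throws IOException {
        OutputStream out = null;
        try {
            out = opener.open();      // may throw before the copy starts
            copyAndClose(in, out);    // closes both streams on its own
        } catch (IOException e) {
            // Only matters when open() threw: copyAndClose never ran,
            // so nothing has closed the input stream yet.
            in.close();
            if (out != null) {
                out.close();
            }
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        copy(new ByteArrayInputStream("sharelib".getBytes()), () -> target);
        System.out.println(target.size() + " bytes copied");
    }
}
```

If the opener succeeds, the catch block only repeats a close that has already happened, which is harmless for well-behaved streams.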



> oozie-setup.sh sharelib create takes a long time on large clusters
> ------------------------------------------------------------------
>
>                 Key: OOZIE-2402
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2402
>             Project: Oozie
>          Issue Type: Improvement
>          Components: tools
>    Affects Versions: 4.2.0
>            Reporter: Illya Yalovyy
>            Assignee: Illya Yalovyy
>         Attachments: OOZIE-2402-1.patch
>
>
> When a cluster has 256+ nodes, it can take up to 5 minutes to create a sharelib. 
> Copying the tarball itself takes only around 10 seconds. It seems that 
> performance could be improved by loading files concurrently in many threads.
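
The idea in the description can be sketched as below. This is an illustration only, not the attached patch: it copies between local directories with {{java.nio.file.Files.copy}} as a stand-in for a Hadoop {{FileSystem}} upload, and the class name, {{copyAll}}, and the thread count are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ConcurrentCopySketch {

    // Copies every regular file in srcDir to dstDir using a fixed-size
    // thread pool and returns the number of files copied.
    static int copyAll(Path srcDir, Path dstDir, int threads)
            throws IOException, InterruptedException, ExecutionException {
        Files.createDirectories(dstDir);

        List<Path> files;
        try (Stream<Path> listing = Files.list(srcDir)) {
            files = listing.filter(Files::isRegularFile).collect(Collectors.toList());
        }

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Path>> pending = new ArrayList<>();
            for (Path file : files) {
                // One task per file; a real upload would go here instead.
                Callable<Path> task = () -> Files.copy(file, dstDir.resolve(file.getFileName()));
                pending.add(pool.submit(task));
            }
            // Wait for every copy; get() rethrows the first failure
            // wrapped in an ExecutionException.
            for (Future<Path> result : pending) {
                result.get();
            }
            return pending.size();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path src = Files.createTempDirectory("sharelib-src");
        Path dst = Files.createTempDirectory("sharelib-dst").resolve("lib");
        Files.write(src.resolve("example.jar"), new byte[] {1, 2, 3});
        System.out.println(copyAll(src, dst, 4) + " file(s) copied");
    }
}
```

Whether more threads actually help depends on where the time goes; the numbers quoted in the description are the only measurements given here.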



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
