[ 
https://issues.apache.org/jira/browse/HADOOP-11785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391095#comment-14391095
 ] 

Hadoop QA commented on HADOOP-11785:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12708737/distcp-liststatus.patch
  against trunk revision 4922394.

    {color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
                        Please justify why no new tests are needed for this 
patch.
                        Also please list what manual steps were performed to 
verify this patch.

    {color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-tools/hadoop-distcp.

Test results: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/6041//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HADOOP-Build/6041//console

This message is automatically generated.

> Reduce number of listStatus operation in distcp buildListing()
> --------------------------------------------------------------
>
>                 Key: HADOOP-11785
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11785
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: tools/distcp
>    Affects Versions: 3.0.0
>            Reporter: Zoran Dimitrijevic
>            Assignee: Zoran Dimitrijevic
>            Priority: Minor
>         Attachments: distcp-liststatus.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Distcp was taking a long time in copyListing.buildListing() for large source 
> trees (I was using a source of 1.5M files in a tree of about 50K directories). 
> For input on S3, buildListing was taking more than one hour. I've noticed a 
> performance bug in the current code: it calls listStatus twice for each 
> directory, which doubles the number of RPCs in some cases (if most directories 
> do not contain >1000 files).
>  
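
To illustrate the kind of double listing the description refers to, here is a minimal, self-contained sketch. It does not use the Hadoop API and is not taken from the attached patch: CountingFs, walkListingTwice, walkListingOnce and sampleTree are hypothetical names invented for this example, and the "check for emptiness, then traverse" shape is only one assumed way a directory could end up being listed twice.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ListingSketch {

    /** Hypothetical in-memory stand-in for a remote file system that counts list calls. */
    static class CountingFs {
        private final Map<String, List<String>> dirs = new HashMap<>();
        int listCalls = 0;

        void addDir(String path, String... childPaths) {
            dirs.put(path, Arrays.asList(childPaths));
        }

        boolean isDirectory(String path) {
            return dirs.containsKey(path);
        }

        /** Each call models one listing RPC. */
        List<String> listStatus(String dir) {
            listCalls++;
            return dirs.get(dir);
        }
    }

    /** Assumed problematic shape: a subdirectory is listed once to test for
     *  emptiness and listed again when it is actually traversed. */
    static int walkListingTwice(CountingFs fs, String root) {
        int entries = 0;
        Deque<String> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            for (String child : fs.listStatus(pending.pop())) {
                entries++;
                if (fs.isDirectory(child) && !fs.listStatus(child).isEmpty()) {
                    pending.push(child);
                }
            }
        }
        return entries;
    }

    /** Single-listing shape: every directory is queued and listed exactly once. */
    static int walkListingOnce(CountingFs fs, String root) {
        int entries = 0;
        Deque<String> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            for (String child : fs.listStatus(pending.pop())) {
                entries++;
                if (fs.isDirectory(child)) {
                    pending.push(child);
                }
            }
        }
        return entries;
    }

    /** Small sample tree: one root with two non-empty subdirectories and one file. */
    static CountingFs sampleTree() {
        CountingFs fs = new CountingFs();
        fs.addDir("/src", "/src/a", "/src/b", "/src/f0");
        fs.addDir("/src/a", "/src/a/f1", "/src/a/f2");
        fs.addDir("/src/b", "/src/b/f3");
        return fs;
    }

    public static void main(String[] args) {
        CountingFs twice = sampleTree();
        CountingFs once = sampleTree();
        int entriesTwice = walkListingTwice(twice, "/src");
        int entriesOnce = walkListingOnce(once, "/src");
        System.out.println("entries: " + entriesTwice + " vs " + entriesOnce);
        System.out.println("list calls: " + twice.listCalls + " vs " + once.listCalls);
    }
}
```

On this sample tree both walks visit the same 6 entries, but the first issues 5 list calls and the second 3. With N non-empty subdirectories the first shape makes 2N + 1 calls against N + 1, which approaches the doubling described above as the tree grows.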



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
