[ 
https://issues.apache.org/jira/browse/HDFS-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152333#comment-14152333
 ] 

Hadoop QA commented on HDFS-7122:
---------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12671824/hdfs-7122.001.patch
  against trunk revision 9e985e6.

    {color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
                        Please justify why no new tests are needed for this 
patch.
                        Also please list what manual steps were performed to 
verify this patch.

    {color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

    {color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
12 warning messages.
        See 
https://builds.apache.org/job/PreCommit-HDFS-Build/8253//artifact/patchprocess/diffJavadocWarnings.txt
 for details.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

    {color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

    {color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

                  org.apache.hadoop.crypto.random.TestOsSecureRandom
                  org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS
                  
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover

                                      The following test timeouts occurred in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.server.namenode.TestCheckpoint

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8253//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/8253//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8253//console

This message is automatically generated.

> Very poor distribution of replication copies
> --------------------------------------------
>
>                 Key: HDFS-7122
>                 URL: https://issues.apache.org/jira/browse/HDFS-7122
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.3.0
>         Environment: medium-large environments with 100's to 1000's of DNs 
> will be most affected, but potentially all environments.
>            Reporter: Jeff Buell
>            Assignee: Andrew Wang
>            Priority: Blocker
>              Labels: performance
>         Attachments: copies_per_slave.jpg, 
> hdfs-7122-cdh5.1.2-testing.001.patch, hdfs-7122.001.patch
>
>
> Summary:
> Since HDFS-6268, the distribution of replica block copies across the 
> DataNodes (replicas 2,3,... as distinguished from the first "primary" 
> replica) is extremely poor, to the point that TeraGen slows down by as much 
> as 3X for certain configurations.  This is almost certainly due to the 
> introduction of Thread Local Random in HDFS-6268.  The mechanism appears to 
> be that this change causes all the random numbers in the threads to be 
> correlated, thus preventing a truly random choice of DN for each replica copy.
> Testing details:
> 1 TB TeraGen on 638 slave nodes (virtual machines on 32 physical hosts), 
> 256MB block size.  This results in 6 "primary" blocks on each DN.  With 
> replication=3, there will be on average 12 more copies on each DN that are 
> copies of blocks from other DNs.  Because of the random selection of DNs, 
> exactly 12 copies are not expected, but I found that about 160 DNs (1/4 of 
> all DNs!) received absolutely no copies, while one DN received over 100 
> copies, and the elapsed time increased by about 3X from a pre-HDFS-6268 
> distro.  There was no pattern to which DNs didn't receive copies, nor was the 
> set of such DNs repeatable run-to-run. In addition to the performance 
> problem, there could be capacity problems due to one or a few DNs running out 
> of space. Testing was done on CDH 5.0.0 (before) and CDH 5.1.2 (after), but I 
> don't see a significant difference from the Apache Hadoop source in this 
> regard. The workaround to recover the previous behavior is to set 
> dfs.namenode.handler.count=1 but of course this has scaling implications for 
> large clusters.
> I recommend that the ThreadLocal Random part of HDFS-6268 be reverted until a 
> better algorithm can be implemented and tested.  Testing should include a 
> case with many DNs and a small number of blocks on each.
> It should also be noted that even pre-HDFS-6268, the random choice of DN 
> algorithm produces a rather non-uniform distribution of copies.  This is not 
> due to any bug, but purely a case of random distributions being much less 
> uniform than one might intuitively expect. In the above case, pre-HDFS-6268 
> yields something like a range of 3 to 25 block copies on each DN. 
> Surprisingly, the performance penalty of this non-uniformity is not as big as 
> might be expected (maybe only 10-20%), but HDFS should do better, and in any 
> case the capacity issue remains.  Round-robin choice of DN?  Better awareness 
> of which DNs currently store fewer blocks? It's not sufficient that the total 
> number of blocks is similar on each DN at the end, but that at each point in 
> time no individual DN receives a disproportionate number of blocks at once 
> (which could be a danger of a RR algorithm).
> Probably should limit this jira to tracking the ThreadLocal issue, and track 
> the random choice issue in another one.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to