[ https://issues.apache.org/jira/browse/MAPREDUCE-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ramkumar Vadali updated MAPREDUCE-2167:
---------------------------------------

    Attachment: MAPREDUCE-2167.4.patch

Fixed a broken test.

TEST RESULTS:


ant test-patch reports the same number of failures as a clean checkout:

{code}
     [exec] -1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included.  The patch appears to include 4 new or modified tests.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     -1 findbugs.  The patch appears to introduce 13 new Findbugs warnings.
     [exec]
     [exec]     -1 release audit.  The applied patch generated 2 release audit warnings (more than the trunk's current 1 warnings).
     [exec]
     [exec]     +1 system test framework.  The patch passed system test framework compile.
     [exec]
     [exec]
     [exec]
     [exec]
     [exec] ======================================================================
     [exec] ======================================================================
     [exec]     Finished build.
     [exec] ======================================================================
     [exec] ======================================================================
     [exec]
     [exec]
{code}

ant test succeeds:

{code}


test-junit:
    [junit] WARNING: multiple versions of ant detected in path for junit
    [junit]          jar:file:/home/rvadali/local/external/ant/lib/ant.jar!/org/apache/tools/ant/Project.class
    [junit]      and jar:file:/home/rvadali/.ivy2/cache/ant/ant/jars/ant-1.6.5.jar!/org/apache/tools/ant/Project.class
    [junit] Running org.apache.hadoop.hdfs.TestRaidDfs
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 47.071 sec
    [junit] Running org.apache.hadoop.raid.TestBlockFixer
    [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 124.583 sec
    [junit] Running org.apache.hadoop.raid.TestDirectoryTraversal
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 9.337 sec
    [junit] Running org.apache.hadoop.raid.TestErasureCodes
    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 24.481 sec
    [junit] Running org.apache.hadoop.raid.TestGaloisField
    [junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 0.392 sec
    [junit] Running org.apache.hadoop.raid.TestHarIndexParser
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.052 sec
    [junit] Running org.apache.hadoop.raid.TestRaidFilter
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 4.485 sec
    [junit] Running org.apache.hadoop.raid.TestRaidHar
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 71.136 sec
    [junit] Running org.apache.hadoop.raid.TestRaidNode
    [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 471.072 sec
    [junit] Running org.apache.hadoop.raid.TestRaidPurge
    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 107.828 sec
    [junit] Running org.apache.hadoop.raid.TestRaidShell
    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 25.714 sec

test:

BUILD SUCCESSFUL
Total time: 15 minutes 6 seconds
{code}


> Faster directory traversal for raid node
> ----------------------------------------
>
>                 Key: MAPREDUCE-2167
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2167
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/raid
>            Reporter: Ramkumar Vadali
>            Assignee: Ramkumar Vadali
>         Attachments: MAPREDUCE-2167.2.patch, MAPREDUCE-2167.3.patch, 
> MAPREDUCE-2167.4.patch, MAPREDUCE-2167.patch
>
>
> The RaidNode currently iterates over the directory structure to figure out 
> which files to RAID. With millions of files, this can take a long time, 
> especially if some files are already RAIDed and the RaidNode needs to look at 
> parity files / parity file HARs to determine if the file needs to be RAIDed.
> The directory traversal is encapsulated inside the class DirectoryTraversal, 
> which examines one file at a time, using the caller's thread.
> My proposal is to make this multi-threaded as follows:
>  * Use a pool of threads inside DirectoryTraversal.
>  * The caller's thread is used to retrieve directories, and each new 
> directory is assigned to a thread in the pool. The worker thread examines all 
> the files in the directory.
>  * If there are sub-directories, those are added back as work items to the pool.
> Comments?
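
For discussion, the proposal quoted above could be sketched roughly as follows. This is not the code in the attached patch. It is a minimal, self-contained illustration that assumes a local file system (java.nio.file) instead of the Hadoop FileSystem API, and the names ParallelTraversalSketch, traverse, and selector are hypothetical. It also simplifies one point: workers hand sub-directories straight back to the pool rather than routing them through the caller's thread.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical sketch of a pooled directory traversal; not the patch itself.
class ParallelTraversalSketch {
    private final ExecutorService pool;
    private final Predicate<Path> selector;   // stands in for "does this file need RAIDing?"
    private final Queue<Path> selected = new ConcurrentLinkedQueue<>();
    private final AtomicInteger pending = new AtomicInteger(); // directories queued or in progress
    private final CountDownLatch done = new CountDownLatch(1);

    private ParallelTraversalSketch(int numThreads, Predicate<Path> selector) {
        this.pool = Executors.newFixedThreadPool(numThreads);
        this.selector = selector;
    }

    // Queue one directory as a work item. The counter is incremented before
    // the task is handed to the pool, so it cannot reach zero while a parent
    // directory is still adding children.
    private void submit(Path dir) {
        pending.incrementAndGet();
        pool.execute(() -> {
            try {
                scan(dir);
            } finally {
                if (pending.decrementAndGet() == 0) {
                    done.countDown();
                }
            }
        });
    }

    // A worker examines every entry of a single directory.
    private void scan(Path dir) {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                // NOFOLLOW_LINKS avoids looping through symlinked directories.
                if (Files.isDirectory(entry, LinkOption.NOFOLLOW_LINKS)) {
                    submit(entry);            // sub-directory goes back to the pool
                } else if (selector.test(entry)) {
                    selected.add(entry);
                }
            }
        } catch (IOException e) {
            // This sketch silently skips unreadable directories.
        }
    }

    // Walks the tree under root with numThreads workers and returns the
    // files accepted by selector, in no particular order.
    static List<Path> traverse(Path root, int numThreads, Predicate<Path> selector)
            throws InterruptedException {
        ParallelTraversalSketch t = new ParallelTraversalSketch(numThreads, selector);
        t.submit(root);
        t.done.await();
        t.pool.shutdown();
        return new ArrayList<>(t.selected);
    }
}
```

The pending counter is what lets the caller know the walk is complete: every directory is counted when it is queued and uncounted when its scan finishes, so the latch opens only after the last directory has been examined.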

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
