[ 
https://issues.apache.org/jira/browse/HADOOP-9287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663101#comment-13663101
 ] 

Chris Nauroth commented on HADOOP-9287:
---------------------------------------

{quote}
As part of this effort, it would be good to enumerate patterns that can cause 
concurrent tests to fail
{quote}

This sounds like good material for the code review checklist.  
http://wiki.apache.org/hadoop/CodeReviewChecklist

{quote}
Instead of changing individual tests to use unique test folder paths, couldn't 
we just reconfigure test.build.data from the outside (from maven)?
{quote}

This would be very convenient, but unfortunately, I can't think of a way to 
make it work.  The problem is that our pom.xml code hands over control to 
maven-surefire-plugin, which then iterates through the test suite classes and 
executes them.  When execution enters maven-surefire-plugin, the Maven 
properties are frozen at a specific state.  I don't believe there is any way 
for our pom.xml code to take back control from maven-surefire-plugin between 
test suite iterations to generate a different unique ID.  Maybe a custom JUnit 
runner could do it?  At that point, it might be more trouble than it's worth.  
Does anyone else have ideas on this?  I'm also not aware of any built-in unique 
ID property or external plugins that generate unique IDs, so we might end up 
needing to code another custom plugin of our own.
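
For comparison, the per-test alternative mentioned in the quote (each test picking its own unique folder under test.build.data) can be sketched roughly as follows.  This is only an illustration, not code from the patch: the class name UniqueTestDir, the method create, and the fallback root target/test/data are all hypothetical, and it assumes Java 8 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration only; not part of the HADOOP-9287 patch. */
final class UniqueTestDir {
    private UniqueTestDir() {}

    /**
     * Creates a fresh directory for the given test class under the root named
     * by the test.build.data system property. The fallback "target/test/data"
     * is an assumed default, used only when the property is not set.
     */
    static Path create(Class<?> testClass) {
        Path base = Paths.get(System.getProperty("test.build.data", "target/test/data"));
        try {
            Files.createDirectories(base);
            // createTempDirectory chooses a name that is not already in use,
            // so concurrently running tests each get their own directory.
            return Files.createTempDirectory(base, testClass.getSimpleName() + "-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Two calls for the same class yield two distinct directories.
        System.out.println(create(UniqueTestDir.class));
        System.out.println(create(UniqueTestDir.class));
    }
}
```

A test would call UniqueTestDir.create(getClass()) in its setup and keep all of its files below the returned path, so nothing needs to change on the Maven side.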

{quote}
Chris Nauroth, have you had a chance to kick the tires on this patch for 
Windows?
{quote}

Results look good so far.  First, I ran the tests without the parallel-tests 
profile enabled.  As expected, this caused no harm to the test results on 
Windows.  That's a great sign!

Next, I enabled parallel-tests with the default thread count of 4.  Performance 
improvement was similar to what is reported here: from ~15 minutes down to ~8 
minutes, and this is on a fairly wimpy VM.  I did see some new failures though:

# There were failures due to test timeouts in TestCopyPreserveFlag 
(testPutWithP, testPutWithoutP, testGetWithP, testGetWithoutP) and 
TestLocalFileSystem (testWorkingDirectory, testCopy).  These all have very 
short timeouts (1s).  I suspect that multi-threaded execution introduced a bit 
of context-switching overhead that just barely pushed them over the timeout.  I 
recommend increasing these timeouts to 10s.  Unfortunately, this suggests that 
the combination of timeout settings and parallel execution could be another 
source of flaky test results in the future.
# {{TestTFileNoneCodecsByteArrays#testFailureNegativeLength_3}} failed with an 
EOFException, which makes me think that two tests tried to share a file or 
directory and saw unexpected data.  This test class inherits from a base class, 
and I see that the code changes in the base class should have prevented a 
sharing problem, but perhaps we missed something.  I think we ought to 
investigate this one before committing.  It's probably not a Windows problem, 
but rather just a coincidence that the problem manifested on a Windows machine.

[~aklochkov], thanks again for sticking with this issue and responding to the 
feedback.  This is going to be a big help for developer productivity.  I got 
pretty excited when the common tests finished so quickly on my machine!  :-)
                
> Parallel testing hadoop-common
> ------------------------------
>
>                 Key: HADOOP-9287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9287
>             Project: Hadoop Common
>          Issue Type: Test
>          Components: test
>    Affects Versions: 3.0.0
>            Reporter: Tsuyoshi OZAWA
>            Assignee: Andrey Klochkov
>         Attachments: HADOOP-9287.1.patch, HADOOP-9287--N3.patch, 
> HADOOP-9287--N3.patch, HADOOP-9287--N4.patch, HADOOP-9287--N5.patch, 
> HADOOP-9287.patch, HADOOP-9287.patch
>
>
> The Maven Surefire plugin supports a parallel testing feature. By using it, the 
> tests can run faster.

