[ 
https://issues.apache.org/jira/browse/HDFS-12711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251809#comment-16251809
 ] 

Allen Wittenauer commented on HDFS-12711:
-----------------------------------------

With the kill code in place, I'm seeing wild fluctuations in the HDFS and MR 
unit tests: lots of unreaped processes, which is probably a hint that they are 
paused for some reason.  I have a hunch that we're pretty much bottlenecked on 
IO.  Tests run on a single disk that is shared among all the executors on that 
Jenkins node.  If, say, two HDFS test runs are going at once, that could easily 
be thousands of threads doing IO to the same disk.

It might be smart to decrease the number of parallel tests, at least in HDFS. 
This obviously impacts runtime (which is already out of control) but will 
probably improve the accuracy of the results.  Or, we could attempt to split up 
the tests so that the compute-heavy ones run in parallel and the IO-heavy ones 
run serially.
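
To make the split idea concrete, here is a rough sketch in Python. It is not 
how the Hadoop build actually schedules tests; the (name, kind, fn) tuple 
layout, the "compute"/"io" labels, and the run_split helper are all 
hypothetical and only illustrate the scheduling shape.

```python
from concurrent.futures import ThreadPoolExecutor


def run_split(tests, parallelism=4):
    """Run compute-heavy tests concurrently, then IO-heavy tests one at a time.

    `tests` is a list of (name, kind, fn) tuples where kind is "compute" or
    "io". Both the tuple layout and the labels are hypothetical.
    """
    compute = [(name, fn) for name, kind, fn in tests if kind == "compute"]
    io_bound = [(name, fn) for name, kind, fn in tests if kind == "io"]

    results = {}

    # Compute-heavy tests share a bounded pool, so at most `parallelism`
    # of them run at once.
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        futures = {name: pool.submit(fn) for name, fn in compute}
        for name, future in futures.items():
            results[name] = future.result()

    # IO-heavy tests run strictly one after another, so they never compete
    # with each other for the shared disk.
    for name, fn in io_bound:
        results[name] = fn()

    return results
```

Lowering `parallelism` corresponds to the first option (fewer parallel tests 
overall); the serial loop corresponds to the second.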

Of course, if no one is paying attention to the tests anyway, we could just 
disable them altogether I guess.

> deadly hdfs test
> ----------------
>
>                 Key: HDFS-12711
>                 URL: https://issues.apache.org/jira/browse/HDFS-12711
>             Project: Hadoop HDFS
>          Issue Type: Test
>    Affects Versions: 2.9.0, 2.8.2
>            Reporter: Allen Wittenauer
>            Priority: Critical
>         Attachments: HDFS-12711.branch-2.00.patch, fakepatch.branch-2.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

