[ https://issues.apache.org/jira/browse/HBASE-23779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033372#comment-17033372 ]
Bharath Vissapragada commented on HBASE-23779:
----------------------------------------------

{quote}Average for file count is reported by yetus and its usually around the 5k. Perhaps the -T has us aggregate file counts?
{quote}
I don't think it is always around 5k. In fact, I suspected ulimits because the failed jobs I looked at were running dangerously close to the proc limit of 10k enforced by yetus (example: 9901 vs. a ulimit of 10000). But I do agree that it could be a memory issue too, like Mark mentioned.

Looks like yetus gets this data by polling the OS in a loop [1], so I'd assume it is accurate. For some reason this report shows up only in the precommits and not in the nightly builds (am I wrong?).

[1] [https://github.com/apache/yetus/blob/b3a402b012773c94e2ade0797e893d9a14e9f0ed/precommit/src/main/shell/coprocs.d/process_counter.sh#L34]

> Up the default fork count to make builds complete faster; make count relative
> to CPU count
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-23779
>                 URL: https://issues.apache.org/jira/browse/HBASE-23779
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0, 2.3.0
>
>         Attachments: addendum2.patch, test_yetus_934.0.patch
>
>
> Tests take a long time. Our fork counts when running all tests are conservative --
> 1 (small) for the first part and 5 for the second part (medium and large). Rather
> than hardcoding, we should set the fork count relative to machine size.
> The suggestion here is 0.75C, where C is the CPU count. This ups the CPU use on my box.
> Looking at jenkins, it seems like the boxes are 24 cores... at least going
> by my random survey. The load reported on a few seems low, though this is not
> representative (looking at machine/uptime).
> More parallelism will probably mean more test failures. Let me take a look
> see.
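
For illustration only, the 0.75C suggestion in the description can be sketched as below. Maven Surefire accepts a forkCount value with a "C" suffix and multiplies it by the number of cores; the function name `fork_count`, the truncation toward zero, and the floor of one fork are assumptions made for this sketch, not something taken from the HBase build or the attached patches.

```python
import os


def fork_count(cpu_count: int, multiplier: float = 0.75) -> int:
    """Return a fork count scaled to the machine's CPU count.

    Hypothetical helper: truncates the product and never returns
    fewer than one fork (both are assumptions of this sketch).
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return max(1, int(cpu_count * multiplier))


if __name__ == "__main__":
    # os.cpu_count() may return None on some platforms; fall back to 1.
    cpus = os.cpu_count() or 1
    print(f"{cpus} CPUs -> {fork_count(cpus)} forks")
```

Under these assumptions a 24-core box like the ones mentioned in the description would run 18 forks, which is also why the process count can climb much closer to the 10000 limit than it does with the hardcoded values of 1 and 5.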
-- This message was sent by Atlassian Jira (v8.3.4#803005)