[ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14985983#comment-14985983
 ] 

stack edited comment on HBASE-14420 at 11/2/15 8:45 PM:
--------------------------------------------------------

Here are the longest running tests:
{code}
$ grep -h "<testcase" `find . -iname "TEST-*.xml"` | sed 's/<testcase name="\(.*\)" classname="\(.*\)" time="\(.*\)".*/\3\t\2 \1/' |sort -rn |head -100
  177.358       org.apache.hadoop.hbase.client.TestReplicasClient testSmallScanWithReplicas
  158.826       org.apache.hadoop.hbase.regionserver.TestRemoveRegionMetrics testMoveRegion
  146.995       org.apache.hadoop.hbase.filter.TestFuzzyRowFilterEndToEnd testEndToEnd
  106.28        org.apache.hadoop.hbase.regionserver.wal.TestLogRolling testLogRolling
  103.126       org.apache.hadoop.hbase.mapred.TestTableSnapshotInputFormat testWithMapReduceMultiRegion
  100.889       org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat testMRIncrementalLoadWithSplit
  97.4  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat testExcludeMinorCompaction
...
{code}
Tests with most threads:

{code}
[stack@c2021 hbase.git]$ grep -h "after: " `find . -iname "*-output.txt"` | sed 's/.*: after: \(.*\) Thread=\([0-9]*\).*/\2\t\1/'|sort -rn|head -100
1010    replication.TestReplicationKillSlaveRS#killOneSlaveRS
964     replication.multiwal.TestReplicationKillMasterRSCompressedWithMultipleWAL#killOneMasterRS
942     replication.TestReplicationKillMasterRS#killOneMasterRS
930     replication.TestReplicationKillMasterRSCompressed#killOneMasterRS
834     snapshot.TestSecureExportSnapshot#testExportRetry
834     snapshot.TestSecureExportSnapshot#testExportFailure
832     snapshot.TestSecureExportSnapshot#testExportFileSystemStateWithSkipTmp
830     snapshot.TestSecureExportSnapshot#testSnapshotWithRefsExportFileSystemState
830     snapshot.TestExportSnapshot#testExportRetry
826     snapshot.TestSecureExportSnapshot#testEmptyExportFileSystemState
826     snapshot.TestExportSnapshot#testExportFileSystemStateWithSkipTmp
826     snapshot.TestExportSnapshot#testExportFailure
823     snapshot.TestExportSnapshot#testSnapshotWithRefsExportFileSystemState
820     snapshot.TestSecureExportSnapshot#testConsecutiveExports
818     snapshot.TestExportSnapshot#testEmptyExportFileSystemState
818     replication.TestReplicationSmallTests#testSmallBatch
815     snapshot.TestExportSnapshot#testConsecutiveExports
811     snapshot.TestSecureExportSnapshot#testExportFileSystemState
800     snapshot.TestExportSnapshot#testExportFileSystemState
800     mapreduce.TestCopyTable#testCopyTableWithBulkload
798     snapshot.TestSecureExportSnapshot#testExportWithTargetName
791     snapshot.TestExportSnapshot#testExportWithTargetName
788     mapreduce.TestCopyTable#testMainMethod
787     snapshot.TestSecureExportSnapshot#testBalanceSplit
787     mapreduce.TestCopyTable#testStartStopRow
785     mapred.TestMultiTableSnapshotInputFormat#testScanEmptyToAPP
784     mapreduce.TestMultiTableSnapshotInputFormat#testScanEmptyToAPP
783     mapreduce.TestCopyTable#testRenameFamily
781     snapshot.TestExportSnapshot#testBalanceSplit
779     mapreduce.TestMultiTableSnapshotInputFormat#testScanEmptyToEmpty
778     replication.TestReplicationSmallTests#testReplicationStatus
776     mapreduce.TestTableInputFormatScan1#testGetSplits
774     replication.TestReplicationSmallTests#testSimplePutDelete
774     mapreduce.TestMultiTableSnapshotInputFormat#testScanYZYToEmpty
773     mapreduce.TestTableInputFormatScan2#testScanYZYToEmpty
773     mapreduce.TestMultiTableInputFormat#testScanEmptyToAPP
773     mapred.TestMultiTableSnapshotInputFormat#testScanYZYToEmpty
772     mapreduce.TestTableInputFormatScan2#testScanYYXToEmpty
772     mapreduce.TestMultiTableSnapshotInputFormat#testScanOBBToOPP
770     mapreduce.TestMultiTableInputFormat#testScanEmptyToEmpty
770     mapreduce.TestCopyTable#testCopyTable
769     mapreduce.TestTableInputFormatScan1#testScanEmptyToAPP
768     mapreduce.TestTableInputFormatScan2#testScanYYYToEmpty
767     mapreduce.TestTableInputFormatScan2#testScanOPPToEmpty
767     mapreduce.TestMultiTableInputFormat#testScanYZYToEmpty
767     mapred.TestMultiTableSnapshotInputFormat#testScanOBBToOPP
767     mapred.TestMultiTableSnapshotInputFormat#testScanEmptyToEmpty
765     mapreduce.TestTableInputFormatScan1#testScanEmptyToEmpty
765     mapreduce.TestTableInputFormatScan1#testGetSplitsPoint
764     mapreduce.TestTableInputFormatScan2#testScanOBBToOPP
764     mapreduce.TestTableInputFormatScan2#testScanFromConfiguration
763     mapreduce.TestTableInputFormatScan2#testScanOBBToQPP
762     mapreduce.TestTableInputFormatScan1#testScanEmptyToOPP
762     mapreduce.TestMultiTableInputFormat#testScanOBBToOPP
761     mapreduce.TestTableInputFormatScan1#testScanEmptyToBBB
759     replication.TestReplicationSmallTests#testLoading
757     mapreduce.TestTableInputFormatScan1#testScanEmptyToBBA
720     regionserver.TestRegionFavoredNodes#testFavoredNodes
717     replication.TestReplicationSmallTests#testCompactionWALEdits
...
{code}
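
To make the extraction step easier to follow, here is a minimal sketch that runs the same sed expression over a single made-up log line. The sample line, its timestamp and logger prefix, the test name example.TestFoo#testBar, and the helper name extract_threads are all assumptions for illustration and are not taken from a real build. The `\t` in the replacement is a GNU sed extension, so other sed implementations may print a literal `t` instead of a tab.

```shell
#!/usr/bin/env bash
# Hypothetical "after: " line, shaped like the ones the pipeline above greps for.
# Every value in it is invented for illustration.
sample='2015-11-02 20:00:00,000 INFO  [main] hbase.ResourceChecker(172): after: example.TestFoo#testBar Thread=42 (was 40), OpenFileDescriptor=300 (was 290)'

# Same sed expression as in the comment: capture the test name and the
# Thread= count, then emit "<count><TAB><test name>" so sort -rn can rank it.
extract_threads() {
  sed 's/.*: after: \(.*\) Thread=\([0-9]*\).*/\2\t\1/'
}

printf '%s\n' "$sample" | extract_threads
```

Putting the count first is what lets the plain `sort -rn | head -100` that follows rank tests by thread count without any further field handling.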

Tests using lots of file descriptors:

{code}
[stack@c2021 hbase.git]$ grep -h "after: " `find . -iname "*-output.txt"` | sed 's/.*: after: \([^ ]*\).*OpenFileDescriptor=\([0-9]*\).*/\2\t\1/'|grep -v 'after: '|sort -rn|head -100
1010    ipc.TestAsyncIPC#testRTEDuringAsyncConnectionSetup[3]
1010    ipc.TestAsyncIPC#testRpcScheduler[2]
978     ipc.TestAsyncIPC#testCompressCellBlock[2]
947     mapred.TestMultiTableSnapshotInputFormat#testScanYZYToEmpty
946     ipc.TestAsyncIPC#testNoCodec[2]
943     mapreduce.TestMultiTableSnapshotInputFormat#testScanEmptyToAPP
943     mapred.TestMultiTableSnapshotInputFormat#testScanEmptyToAPP
942     mapreduce.TestMultiTableSnapshotInputFormat#testScanYZYToEmpty
935     snapshot.TestSecureExportSnapshot#testExportFailure
933     mapreduce.TestCopyTable#testRenameFamily
931     client.replication.TestReplicationAdminWithClusters#testEnableReplicationWhenSlaveClusterDoesntHaveTable
926     mapreduce.TestTableInputFormatScan2#testScanOPPToEmpty
926     mapreduce.TestMultiTableInputFormat#testScanYZYToEmpty
923     snapshot.TestExportSnapshot#testExportFailure
921     mapreduce.TestCopyTable#testMainMethod
921     ipc.TestAsyncIPC#testAsyncConnectionSetup[3]
920     mapreduce.TestTableInputFormatScan1#testScanEmptyToBBA
919     mapred.TestMultiTableSnapshotInputFormat#testScanEmptyToEmpty
917     mapreduce.TestMultiTableSnapshotInputFormat#testScanOBBToOPP
916     client.TestMultiParallel#testBatchWithMixedActions
914     mapred.TestMultiTableSnapshotInputFormat#testScanOBBToOPP
912     client.TestMultiParallel#testNonceCollision
909     client.TestMultiParallel#testBatchWithDelete
904     mapreduce.TestMultiTableInputFormat#testScanOBBToOPP
904     client.TestMultiParallel#testHTableDeleteWithList
904     client.TestMultiParallel#testBadFam
903     mapreduce.TestTableInputFormatScan2#testScanOBBToQPP
903     mapreduce.TestTableInputFormatScan1#testGetSplitsPoint
902     client.TestMultiParallel#testFlushCommitsNoAbort
900     mapreduce.TestTableInputFormatScan2#testScanOBBToOPP
893     client.TestMultiParallel#testBatchWithManyColsInOneRowGetAndPut
891     master.TestRegionPlacement2#testFavoredNodesPresentForRoundRobinAssignment
891     master.TestRegionPlacement2#testFavoredNodesPresentForRandomAssignment
880     client.TestFromClientSide#testJiraTest861
878     client.TestFromClientSideWithCoprocessor#testJiraTest861
861     client.replication.TestReplicationAdminWithClusters#testEnableReplicationForNonExistingTable
860     replication.TestReplicationSmallTests#testSmallBatch
860     replication.TestReplicationSmallTests#testSimplePutDelete
860     replication.TestReplicationSmallTests#testDisableEnable
860     replication.TestReplicationSmallTests#testCompactionWALEdits
855     replication.TestReplicationSmallTests#testVerifyRepJob
855     client.TestFromClientSide#testNullWithReverseScan
854     client.TestFromClientSideWithCoprocessor#testSimpleMissing
852     client.TestFromClientSide#testJiraTest867
851     client.TestFromClientSide#testMultiRowMutation
850     client.TestFromClientSideWithCoprocessor#testNullWithReverseScan
850     client.TestFromClientSideWithCoprocessor#testJiraTest867
849     client.TestFromClientSideWithCoprocessor#testMultiRowMutation
849     client.TestFromClientSide#testSimpleMissing
835     client.TestMultiParallel#testBatchWithIncrementAndAppend
835     client.TestMultiParallel#testBatchWithGet
834     replication.TestReplicationSmallTests#testReplicationStatus
834     client.TestFromClientSideWithCoprocessor#testGetStartEndKeysWithRegionReplicas
832     client.TestFromClientSide#testGetStartEndKeysWithRegionReplicas
830     client.TestFromClientSideWithCoprocessor#testFilterAllRecords
828     client.TestFromClientSideWithCoprocessor#testScan_NullQualifier
....
{code}



> Zombie Stomping Session
> -----------------------
>
>                 Key: HBASE-14420
>                 URL: https://issues.apache.org/jira/browse/HBASE-14420
>             Project: HBase
>          Issue Type: Umbrella
>          Components: test
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>         Attachments: hangers.txt, none_fix (1).txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, none_fix.txt, 
> none_fix.txt, none_fix.txt
>
>
> Patch builds are now failing most of the time because we are dropping zombies. 
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build or making (arbitrary) rulings that the zombies are 
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues of this one. Am running builds back-to-back 
> on little cluster to turn out the monsters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
