[ https://issues.apache.org/jira/browse/HBASE-16894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16186916#comment-16186916 ]
Hadoop QA commented on HBASE-16894:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 3s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} branch-1 passed with JDK v1.8.0_144 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} branch-1 passed with JDK v1.7.0_151 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 0s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 52s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 59s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} branch-1 passed with JDK v1.8.0_144 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green} branch-1 passed with JDK v1.7.0_151 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s{color} | {color:green} the patch passed with JDK v1.8.0_144 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green} the patch passed with JDK v1.7.0_151 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 2m 25s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 18m 43s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed with JDK v1.8.0_144 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s{color} | {color:green} the patch passed with JDK v1.7.0_151 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}424m 27s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 3m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}464m 28s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.balancer.TestStochasticLoadBalancer2 |
| | hadoop.hbase.regionserver.TestRecoveredEdits |
| | hadoop.hbase.mapreduce.TestLoadIncrementalHFilesUseSecurityEndPoint |
| | hadoop.hbase.client.TestAdmin2 |
| Timed out junit tests | org.apache.hadoop.hbase.replication.TestReplicationDisableInactivePeer |
| | org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat |
| | org.apache.hadoop.hbase.mapreduce.TestTableInputFormat |
| | org.apache.hadoop.hbase.regionserver.compactions.TestFIFOCompactionPolicy |
| | org.apache.hadoop.hbase.master.TestMasterFailover |
| | org.apache.hadoop.hbase.snapshot.TestSnapshotClientRetries |
| | org.apache.hadoop.hbase.mapred.TestTableInputFormat |
| | org.apache.hadoop.hbase.TestHBaseTestingUtility |
| | org.apache.hadoop.hbase.mapred.TestTableMapReduceUtil |
| | org.apache.hadoop.hbase.master.TestGetInfoPort |
| | org.apache.hadoop.hbase.regionserver.TestRegionReplicas |
| | org.apache.hadoop.hbase.regionserver.TestRegionServerAbort |
| | org.apache.hadoop.hbase.client.TestMetaWithReplicas |
| | org.apache.hadoop.hbase.client.TestClientTimeouts |
| | org.apache.hadoop.hbase.regionserver.wal.TestLogRollAbort |
| | org.apache.hadoop.hbase.master.TestTableLockManager |
| | org.apache.hadoop.hbase.master.procedure.TestWALProcedureStoreOnHDFS |
| | org.apache.hadoop.hbase.TestClusterBootOrder |
| | org.apache.hadoop.hbase.TestJMXConnectorServer |
| | org.apache.hadoop.hbase.util.TestMiniClusterLoadEncoded |
| | org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential |
| | org.apache.hadoop.hbase.regionserver.wal.TestLogRolling |
| | org.apache.hadoop.hbase.master.TestMasterShutdown |
| | org.apache.hadoop.hbase.regionserver.throttle.TestCompactionWithThroughputController |
| | org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster |
| | org.apache.hadoop.hbase.regionserver.TestPerColumnFamilyFlush |
| | org.apache.hadoop.hbase.master.TestMasterBalanceThrottling |
| | org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController |
| | org.apache.hadoop.hbase.snapshot.TestSecureExportSnapshot |
| | org.apache.hadoop.hbase.replication.TestReplicationEndpoint |
| | org.apache.hadoop.hbase.TestAcidGuarantees |
| | org.apache.hadoop.hbase.client.TestSnapshotMetadata |
| | org.apache.hadoop.hbase.TestGlobalMemStoreSize |
| | org.apache.hadoop.hbase.TestNamespace |
| | org.apache.hadoop.hbase.client.TestClientPushback |
| | org.apache.hadoop.hbase.regionserver.TestHRegionOnCluster |
| | org.apache.hadoop.hbase.regionserver.TestCompoundBloomFilter |
| | org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion |
| | org.apache.hadoop.hbase.snapshot.TestExportSnapshot |
| | org.apache.hadoop.hbase.TestRegionRebalancing |
| | org.apache.hadoop.hbase.mapreduce.TestCopyTable |
| | org.apache.hadoop.hbase.replication.regionserver.TestGlobalThrottler |
| | org.apache.hadoop.hbase.TestInfoServers |
| | org.apache.hadoop.hbase.client.replication.TestReplicationAdminWithTwoDifferentZKClusters |
| | org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase |
| | org.apache.hadoop.hbase.wal.TestWALOpenAfterDNRollingStart |
| | org.apache.hadoop.hbase.mapreduce.TestSecureLoadIncrementalHFiles |
| | org.apache.hadoop.hbase.TestIOFencing |
| | org.apache.hadoop.hbase.util.TestMiniClusterLoadParallel |
| | org.apache.hadoop.hbase.replication.TestReplicationKillMasterRS |
| | org.apache.hadoop.hbase.replication.TestReplicationKillMasterRSCompressed |
| | org.apache.hadoop.hbase.regionserver.wal.TestProtobufLog |
| | org.apache.hadoop.hbase.mapreduce.TestLoadIncrementalHFiles |
| | org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientWithRegionReplicas |
| | org.apache.hadoop.hbase.coprocessor.TestAggregateProtocol |
| | org.apache.hadoop.hbase.regionserver.TestRegionServerHostname |
| | org.apache.hadoop.hbase.regionserver.TestClusterId |
| | org.apache.hadoop.hbase.TestHColumnDescriptorDefaultVersions |
| | org.apache.hadoop.hbase.fs.TestBlockReorder |
| | org.apache.hadoop.hbase.client.TestTableSnapshotScanner |
| | org.apache.hadoop.hbase.TestMetaTableAccessor |
| | org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort |
| | org.apache.hadoop.hbase.master.TestRegionPlacement2 |
| | org.apache.hadoop.hbase.replication.TestMultiSlaveReplication |
| | org.apache.hadoop.hbase.replication.TestPerTableCFReplication |
| | org.apache.hadoop.hbase.client.replication.TestReplicationAdminWithClusters |
| | org.apache.hadoop.hbase.regionserver.TestRegionReplicaFailover |
| | org.apache.hadoop.hbase.master.handler.TestEnableTableHandler |
| | org.apache.hadoop.hbase.master.handler.TestCreateTableHandler |
| | org.apache.hadoop.hbase.wal.TestWALFactory |
| | org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan1 |
| | org.apache.hadoop.hbase.mapreduce.TestTableSnapshotInputFormat |
| | org.apache.hadoop.hbase.regionserver.TestHRegionServerBulkLoad |
| | org.apache.hadoop.hbase.replication.TestMasterReplication |
| | org.apache.hadoop.hbase.client.TestScannersFromClientSide |
| | org.apache.hadoop.hbase.mapreduce.TestMultiTableInputFormat |
| | org.apache.hadoop.hbase.tool.TestCanaryTool |
| | org.apache.hadoop.hbase.mapred.TestTableSnapshotInputFormat |
| | org.apache.hadoop.hbase.master.procedure.TestAddColumnFamilyProcedure |
| | org.apache.hadoop.hbase.io.TestFileLink |
| | org.apache.hadoop.hbase.util.TestFSUtils |
| | org.apache.hadoop.hbase.mapreduce.TestImportTsv |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f1cc2c |
| JIRA Issue | HBASE-16894 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12889781/HBASE-16894.branch-1.patch |
| Optional Tests | asflicense shadedjars javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 66af319e7784 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-1 / aa86657 |
| Default Java | 1.7.0_151 |
| Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_144 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_151 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/8872/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/8872/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/8872/console |
| Powered by | Apache Yetus 0.4.0 http://yetus.apache.org |

This message was automatically generated.

> Create more than 1 split per region, generalize HBASE-12590
> -----------------------------------------------------------
>
> Key: HBASE-16894
> URL: https://issues.apache.org/jira/browse/HBASE-16894
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0, 2.0.0-alpha-2
> Reporter: Enis Soztutar
> Assignee: Yi Liang
> Fix For: 2.0.0
>
> Attachments: HBASE-16894.branch-1.patch, HBASE-16894.master.patch,
> HBASE-16894-V2-master.patch, HBASE-16894-V3-master.patch,
> ImplementaionAndSomeQuestion.docx
>
>
> A common request from users is to be able to better control how many map
> tasks are created per region. Right now, it is always 1 region = 1 input
> split = 1 map task. The same goes for Spark, since it uses the TIF
> (TableInputFormat). With region sizes as large as 50 GB, it is desirable to
> be able to create more than 1 split per region.
> HBASE-12590 adds a config property for MR jobs to be able to handle skew in
> region sizes. The algorithm is roughly:
> {code}
> If (region size >= average size * ratio): cut the region into two MR input
> splits
> If (average size <= region size < average size * ratio): one region as one MR
> input split
> If (sum of several contiguous regions' sizes < average size * ratio): combine
> these regions into one MR input split.
> {code}
> Although we can set the data skew ratio to 0.5 or so to abuse HBASE-12590
> into creating more than 1 split per region, this is not ideal, and the patch
> as it is can never create more than 2 tasks per region.
> If we want to fix this properly, we should extend the approach in
> HBASE-12590 so that the client can specify the desired number of mappers, or
> the desired split size, and the TIF generates the splits based on the current
> region sizes, much like the algorithm in HBASE-12590 but in a more generic
> way. This would also eliminate the hand tuning of the data skew ratio.
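To make the proposal above concrete, here is a minimal Java sketch of planning splits from a desired split size instead of a skew ratio. It is an illustration only, not code from any of the attached patches: the class `SplitPlanSketch` and its methods `splitsForRegion` and `plan` are hypothetical names, region sizes are assumed to be already known, and the rule "cut a region into ceil(size / target) pieces, merge neighbouring small regions while they fit" is one possible reading of the description.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not HBase code: plans input splits from region sizes. */
final class SplitPlanSketch {

    /** Number of input splits for one region, given a desired split size. */
    static int splitsForRegion(long regionSize, long targetSplitSize) {
        if (targetSplitSize <= 0) {
            throw new IllegalArgumentException("targetSplitSize must be positive");
        }
        // Ceiling division; an empty region still gets one split.
        long pieces = (regionSize + targetSplitSize - 1) / targetSplitSize;
        return (int) Math.max(1L, pieces);
    }

    /**
     * Walks regions in key order. Each plan entry is
     * {firstRegionIndex, lastRegionIndex, numberOfSplits}.
     */
    static List<int[]> plan(long[] regionSizes, long targetSplitSize) {
        List<int[]> plan = new ArrayList<>();
        int i = 0;
        while (i < regionSizes.length) {
            if (regionSizes[i] > targetSplitSize) {
                // Large region: cut it into several splits.
                plan.add(new int[] {i, i, splitsForRegion(regionSizes[i], targetSplitSize)});
            } else {
                // Small region: absorb following neighbours while the total still fits.
                int first = i;
                long total = regionSizes[i];
                while (i + 1 < regionSizes.length
                        && total + regionSizes[i + 1] <= targetSplitSize) {
                    i++;
                    total += regionSizes[i];
                }
                plan.add(new int[] {first, i, 1});
            }
            i++;
        }
        return plan;
    }
}
```

For regions of 50, 3, 4, 10 and 25 GB and a 10 GB target, `plan` returns four entries worth 5 + 1 + 1 + 3 = 10 splits. The sketch only decides how many splits each region gets; choosing the split keys inside a region is a separate problem.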
> We can also think about the guidepost approach that Phoenix has in the stats
> table, which is used for exactly this purpose. Right now, the region can be
> split into powers of two, assuming a uniform distribution within the region.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)