[ https://issues.apache.org/jira/browse/HBASE-22618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909134#comment-16909134 ]
HBase QA commented on HBASE-22618:
----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 9s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 5m 32s | master passed |
| +1 | compile | 0m 49s | master passed |
| +1 | checkstyle | 1m 10s | master passed |
| +1 | shadedjars | 4m 20s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 3m 30s | master passed |
| +1 | javadoc | 0m 29s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 4m 51s | the patch passed |
| +1 | compile | 0m 52s | the patch passed |
| +1 | javac | 0m 52s | the patch passed |
| +1 | checkstyle | 1m 10s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 39s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 15m 51s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2. |
| +1 | findbugs | 4m 2s | the patch passed |
| +1 | javadoc | 0m 29s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 292m 19s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 35s | The patch does not generate ASF License warnings. |
| | | 347m 35s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.TestMasterShutdown |
| | hadoop.hbase.replication.TestReplicationSmallTestsSync |
| | hadoop.hbase.client.TestSnapshotTemporaryDirectory |
| | hadoop.hbase.client.TestFromClientSide |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/PreCommit-HBASE-Build/786/artifact/patchprocess/Dockerfile |
| JIRA Issue | HBASE-22618 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12977773/HBASE-22618.master.001.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux c7e9d7a1c47e 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 3eb602c7f7 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.11 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/786/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/786/testReport/ |
| Max. process+thread count | 4818 (vs. ulimit of 10000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/786/console |
| Powered by | Apache Yetus 0.9.0 http://yetus.apache.org |

This message was automatically generated.

> Provide a way to have Heterogeneous deployment
> ----------------------------------------------
>
>                 Key: HBASE-22618
>                 URL: https://issues.apache.org/jira/browse/HBASE-22618
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0, 2.2.0, 2.2.1, 2.1.6, 1.4.11, 2.1.7
>            Reporter: Pierre Zemb
>            Assignee: Pierre Zemb
>            Priority: Major
>         Attachments: HBASE-22618.master.001.patch
>
>
> Hi,
> We would like to open the discussion about making it possible to deploy
> regions on a heterogeneous deployment, i.e. an HBase cluster running on
> different kinds of hardware.
> h2. Why?
>  * Cloud deployments mean that we may not be able to keep the same hardware
> throughout the years.
>  * Some tables may have special requirements, such as SSDs, whereas others
> should use hard drives.
>  * *In our use case* (single table, dedicated HBase and Hadoop tuned for our
> use case, good key distribution), *the number of regions per RS was the real
> limit for us*.
> h2. Our use case
> We found out that *in our use case* (single table, dedicated HBase and Hadoop
> tuned for our use case, good key distribution), *the number of regions per RS
> was the real limit for us*.
> Over the years, for historical reasons and also because we needed to
> benchmark new machines, we ended up with different groups of hardware: some
> servers can handle only 180 regions, whereas the biggest can handle more than
> 900. Because of such a difference, we had to disable load balancing to avoid
> {{roundRobinAssignment}}. We developed internal tooling that is responsible
> for balancing regions across RegionServers. That was 1.5 years ago.
> h2. Our Proof-of-concept
> We worked on a proof-of-concept
> [here|https://github.com/PierreZ/hbase/blob/dev/hbase14/balancer/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/HeterogeneousBalancer.java],
> and some early tests
> [here|https://github.com/PierreZ/hbase/blob/dev/hbase14/balancer/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestHeterogeneousBalancerBalance.java]
> and
> [here|https://github.com/PierreZ/hbase/blob/dev/hbase14/balancer/hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestHeterogeneousBalancerRules.java].
> We wrote the balancer for our use case, which means that:
>  * there is one table
>  * there are no region replicas
>  * keys are well dispersed
>  * there are no regions on the master
> A rule file is loaded before balancing. Each line of the file is a rule,
> composed of a regexp matched against the hostname and a region limit. For
> example, we could have:
>
> {quote}rs[0-9] 200
> rs1[0-9] 50
> {quote}
>
> RegionServers whose hostname matches the first rule have a limit of 200
> regions, and those matching the second rule have a limit of 50. If no rule
> matches, a default limit is used.
> Thanks to the rules, we know two things: the maximum number of regions for
> the cluster, and the limit for each server. {{HeterogeneousBalancer}} will
> try to balance regions according to each server's capacity.
> Let's take an example. Let's say that we have 20 RS:
>  * 10 RS, named {{rs0}} through {{rs9}}, loaded with 60 regions each, and
> each can handle 200 regions.
>  * 10 RS, named {{rs10}} through {{rs19}}, loaded with 60 regions each, and
> each can handle 50 regions.
> Based on the following rules:
>
> {quote}rs[0-9] 200
> rs1[0-9] 50
> {quote}
>
> The second group is overloaded, whereas the first group has plenty of space.
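
The rule lookup described above can be sketched in Java as follows. This is only an illustration, not the code from the linked {{HeterogeneousBalancer}}: the class name {{RuleSketch}}, its method names, the first-match-wins order, the whole-hostname match, and the default limit of 100 are all assumptions made for the example, since the description only says that a default is set when no rule matches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch of the rule file described above; all names are illustrative. */
final class RuleSketch {

    /** One line of the rule file: a hostname regexp and a region limit. */
    static final class Rule {
        final Pattern hostname;
        final int limit;

        Rule(Pattern hostname, int limit) {
            this.hostname = hostname;
            this.limit = limit;
        }
    }

    /** Parses lines such as "rs[0-9] 200"; blank lines are skipped. */
    static List<Rule> parse(List<String> lines) {
        List<Rule> rules = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Malformed rule: " + line);
            }
            rules.add(new Rule(Pattern.compile(parts[0]), Integer.parseInt(parts[1])));
        }
        return rules;
    }

    /**
     * Returns the limit of the first rule whose regexp matches the whole
     * hostname (an assumption), or defaultLimit when no rule matches.
     */
    static int limitFor(String host, List<Rule> rules, int defaultLimit) {
        for (Rule rule : rules) {
            if (rule.hostname.matcher(host).matches()) {
                return rule.limit;
            }
        }
        return defaultLimit;
    }

    /** Load of one server as a fraction of its limit (1.0 means full). */
    static double load(int regions, int limit) {
        return (double) regions / limit;
    }

    public static void main(String[] args) {
        List<Rule> rules = parse(Arrays.asList("rs[0-9] 200", "rs1[0-9] 50"));
        int assumedDefaultLimit = 100; // not from the description; chosen for the example
        for (String host : new String[] {"rs0", "rs10"}) {
            int limit = limitFor(host, rules, assumedDefaultLimit);
            System.out.println(host + ": limit=" + limit + ", load=" + load(60, limit));
        }
    }
}
```

With 60 regions per server, a server matching the first rule sits at 60/200 of its limit while a server matching the second sits at 60/50, which is why the second group counts as overloaded. Matching the whole hostname matters here: a substring search would let {{rs10}} match {{rs[0-9]}} as well.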
> We know that we can handle at most *2500 regions* (200*10 + 50*10) and we
> currently have *1200 regions* (60*20). {{HeterogeneousBalancer}} will
> understand that the cluster is *full at 48.0%* (1200/2500). Based on this
> information, we will then *try to bring all the RegionServers to ~48% of load
> according to the rules.* In this case, it will move regions from the second
> group to the first.
> The balancer will:
>  * compute how many regions need to be moved. In our example, by moving 36
> regions off rs10, we could go from 120.0% (60/50) to 48.0% (24/50)
>  * select the regions with the lowest data locality
>  * try to find an appropriate RS for each region. We will take the available
> RS with the lowest load.
> h2. Other implementations and ideas
> Clay Baenziger proposed this idea on the dev ML:
> {quote}Could it work to have the stochastic load balancer use
> [pluggable cost functions instead of this static list of cost
> functions|https://github.com/apache/hbase/blob/baf3ae80f5588ee848176adefc9f56818458a387/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java#L198]?
> Then, could this type of a load balancer be implemented simply as a new cost
> function which folks could choose to load and mix with the others?
> {quote}
> I think this could be an interesting way to include user functions in the
> mix. As you know your hardware and your access pattern, you can easily tell
> which metrics are important for balancing. For us, it will only be the number
> of regions, but we could mix it with the incoming writes!
>
> bhupendra.jain also proposed the idea of "labels":
>
> {quote}Internally, we are also having discussion to develop
> similar solution. In our approach, We were also thinking of adding "RS Label"
> Feature similar to Hadoop Node Label feature.
> Each RS can have a label to denote its capabilities / resources. When user
> create table, there can be extra attributes with its descriptor. The balancer
> can decide to host region of table based on RS label and these attributes
> further.
> With RS label feature, Balancer can be more intelligent. Example tables with
> high read load needs more cache backed by SSDs, So such table regions should
> be hosted on RS having SSDs ...
> {quote}
> I love the idea, but I think Clay's idea would allow a better and faster
> first set of commits on the subject! What do you think?

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)