[jira] [Commented] (HBASE-24152) Add ServerSsdLocalityCostFunction to StochasticLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17109784#comment-17109784 ]

Zheng Wang commented on HBASE-24152:

{quote}Why can't the ssd aspect be factored into the general server weight; e.g. why a replica that is on ssd can't be considered 'more' local than a replica on hdd.
{quote}
I think the locality should be the actual ratio, or else users will be confused by it.

Anyway, I will come back if I find a better way.

Thanks a lot.

> Add ServerSsdLocalityCostFunction to StochasticLoadBalancer
> ---
>
> Key: HBASE-24152
> URL: https://issues.apache.org/jira/browse/HBASE-24152
> Project: HBase
> Issue Type: New Feature
> Components: Balancer
> Reporter: Zheng Wang
> Assignee: Zheng Wang
> Priority: Major
>
> When using the ONE_SSD storage policy, or ALL_SSD without enough SSD
> capacity, some HDFS blocks will be on DISK and others on SSD, so it is
> reasonable to consider SSD locality in the StochasticLoadBalancer.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24152) Add ServerSsdLocalityCostFunction to StochasticLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105944#comment-17105944 ]

Michael Stack commented on HBASE-24152:

Pardon me [~filtertip]. I think this is a mistaken direction. The balancer is complicated enough w/o the extra factors -- especially when the cluster is large with many regions and we are making stochastic calculations. The balancer factors in macro attributes such as rack locality and then replica locality. Teaching the balancer to factor in a new attribute, ssd-ism, a sub-attribute of regionservers, seems like the wrong direction.

Why can't the ssd aspect be factored into the general server weight; e.g. why a replica that is on ssd can't be considered 'more' local than a replica on hdd.
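One way to read the suggestion above is as a storage-weighted locality, where local bytes on SSD count for more than local bytes on HDD. The sketch below is only an illustration of that idea in Python, not HBase code; the function name `weighted_locality` and the weights are hypothetical. It also shows the concern raised in the reply: once weights are applied, the reported number is no longer the actual fraction of local bytes.

```python
def weighted_locality(local_bytes_by_storage, total_bytes, weights):
    """Storage-weighted locality for one region on one server.

    local_bytes_by_storage maps a storage type ("SSD" or "DISK") to the
    number of the region's bytes that are local on that storage type.
    The weights are hypothetical; nothing here comes from HBase itself.
    """
    if total_bytes == 0:
        return 0.0
    weighted = sum(weights[storage] * size
                   for storage, size in local_bytes_by_storage.items())
    return weighted / total_bytes


# Hypothetical weights: a local SSD byte counts fully, a local HDD byte half.
WEIGHTS = {"SSD": 1.0, "DISK": 0.5}

# Region fully local, but only on HDD.
all_local_on_hdd = weighted_locality({"DISK": 100}, 100, WEIGHTS)

# Region only half local, but entirely on SSD.
half_local_on_ssd = weighted_locality({"SSD": 50}, 100, WEIGHTS)

print(all_local_on_hdd, half_local_on_ssd)
```

With these assumed weights, two quite different placements report the same value, and a fully local region no longer reads as 100% local. A separate SSD-locality measure keeps each number interpretable as a plain ratio.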
[jira] [Commented] (HBASE-24152) Add ServerSsdLocalityCostFunction to StochasticLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105888#comment-17105888 ]

Zheng Wang commented on HBASE-24152:

{quote}Thanks. So, one replica goes to SSD and you want the RS w/ that replica to be favored over the others?
{quote}
Yeah.

{quote}Please explain why we need to carry around a dedicated weight just for this ONE_SSD placement policy; why we can't just use the existing measure?
{quote}
To be exact, the newly added weight is for the SSD storage type; both the ONE_SSD and ALL_SSD storage policies are related to it.

The existing weight does not take storage type into account, so we cannot reach the goal you mentioned above with it.
[jira] [Commented] (HBASE-24152) Add ServerSsdLocalityCostFunction to StochasticLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104667#comment-17104667 ]

Zheng Wang commented on HBASE-24152:

The explanation of the One_SSD storage policy:

"One_SSD - for storing one of the replicas in SSD. The remaining replicas are stored in DISK."

Since the replicas are on different hosts, and HDFS tries to read the local replica first, my starting point is that we should move the region to follow the replica on SSD. The existing cost functions do not account for this, so I added a new one in this issue.

Thanks very much. :)
[jira] [Commented] (HBASE-24152) Add ServerSsdLocalityCostFunction to StochasticLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104665#comment-17104665 ]

Michael Stack commented on HBASE-24152:

Thanks. So, one replica goes to SSD and you want the RS w/ that replica to be favored over the others?

Please explain why we need to carry around a dedicated weight just for this ONE_SSD placement policy; why we can't just use the existing measure?
[jira] [Commented] (HBASE-24152) Add ServerSsdLocalityCostFunction to StochasticLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104645#comment-17104645 ]

Zheng Wang commented on HBASE-24152:

Here is the doc: [https://hadoop.apache.org/docs/r2.10.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html].
[jira] [Commented] (HBASE-24152) Add ServerSsdLocalityCostFunction to StochasticLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104609#comment-17104609 ]

Michael Stack commented on HBASE-24152:

Sorry [~filtertip], I'm having trouble following what's going on here. I went looking for doc on ONE_SSD but there doesn't seem to be any (grepping about in HDFS source). I see we set this placement policy per column family. How we make the jump from a per-column-family storage policy to the balancer, which is regions-to-server, is giving me trouble. Pardon my being 'thick'.
[jira] [Commented] (HBASE-24152) Add ServerSsdLocalityCostFunction to StochasticLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104054#comment-17104054 ]

Zheng Wang commented on HBASE-24152:

{quote}Why not just give host3 a more attractive score? Up the weights for host1 and host2 because they are 'only' hdd. Starting a new scoring that is exclusively about SSD doesn't seem like the right direction. Better if when scoring a host for the balancer, there is only one scoring, which takes into account all factors -- count of regions already assigned, as well as whether the host has SSD or not.
{quote}
Your proposal aims to move more regions to the hosts that have SSD, but it has three disadvantages:

1. We need to get the dfs.datanode.data.dir config from the namenode to judge whether a host has SSD or not.
2. The hosts may have mixed storage, with both HDD and SSD, so we cannot easily specify a consistent weight for them.
3. It only takes effect after compaction.

The proposal in this issue is about locality; it aims to make the local replica also the SSD replica. It has two advantages:

1. No need to get config from the namenode.
2. It takes effect immediately after the move.

{quote}
Is this feature for the case where only some hosts in the cluster have SSD?
{quote}
Yeah, and we do not need to worry that these hosts will end up with too many regions; that is restrained by the effect of RegionCountSkewCostFunction.

Thanks.
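The last point, that region-count skew keeps SSD hosts from collecting every region, can be illustrated with a toy weighted sum of two costs. This is a minimal Python sketch, not the StochasticLoadBalancer implementation: the cost formulas, the function names and the weights (500 and 15) are all assumptions chosen for illustration.

```python
from collections import Counter


def region_count_skew_cost(plan, hosts):
    """Toy skew cost: spread between the fullest and emptiest host, in [0, 1]."""
    counts = Counter(plan.values())
    per_host = [counts.get(host, 0) for host in hosts]
    total = sum(per_host)
    return (max(per_host) - min(per_host)) / total if total else 0.0


def ssd_locality_cost(plan, ssd_host_of):
    """Toy SSD cost: fraction of regions NOT served from their SSD replica's host."""
    if not plan:
        return 0.0
    hits = sum(1 for region, host in plan.items() if ssd_host_of[region] == host)
    return 1.0 - hits / len(plan)


def total_cost(plan, hosts, ssd_host_of, w_skew, w_ssd):
    """Weighted sum of the two toy costs; lower is better."""
    return (w_skew * region_count_skew_cost(plan, hosts)
            + w_ssd * ssd_locality_cost(plan, ssd_host_of))


hosts = ["host-1", "host-2", "host-3"]
regions = ["r1", "r2", "r3", "r4", "r5", "r6"]

# Extreme case: every region's SSD replica happens to live on host-3.
ssd_host_of = {region: "host-3" for region in regions}

# Plan A chases SSD locality and piles every region onto host-3.
plan_a = {region: "host-3" for region in regions}

# Plan B keeps two regions per host.
plan_b = {"r1": "host-1", "r2": "host-1", "r3": "host-2",
          "r4": "host-2", "r5": "host-3", "r6": "host-3"}

# Hypothetical weights: skew counts far more than SSD locality.
cost_a = total_cost(plan_a, hosts, ssd_host_of, w_skew=500, w_ssd=15)
cost_b = total_cost(plan_b, hosts, ssd_host_of, w_skew=500, w_ssd=15)
print(cost_a, cost_b)
```

Under these assumed weights the even plan has the lower total cost, so SSD locality only tips the choice between plans that are otherwise similarly balanced.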
[jira] [Commented] (HBASE-24152) Add ServerSsdLocalityCostFunction to StochasticLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103887#comment-17103887 ]

Michael Stack commented on HBASE-24152:

Why not just give host3 a more attractive score? Up the weights for host1 and host2 because they are 'only' hdd. Starting a new scoring that is exclusively about SSD doesn't seem like the right direction. Better if when scoring a host for the balancer, there is only one scoring, which takes into account all factors -- count of regions already assigned, as well as whether the host has SSD or not.

Is this feature for the case where only some hosts in the cluster have SSD?

Thanks.
[jira] [Commented] (HBASE-24152) Add ServerSsdLocalityCostFunction to StochasticLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103071#comment-17103071 ]

Zheng Wang commented on HBASE-24152:

{quote}Was reviewing the PR but figured I don't have enough understanding of what is going on so let me ask here. I don't follow why the PR is special-casing SSD weight. Why is it not just factored into the host general weight? Balancer works at the RS level, not at type of storage? A local replica should be favored whether on ssd or not? Help me out [~filtertip] Thank you.
{quote}
[~stack]

Consider this case when setting ONE_SSD as the STORAGE_POLICY.

Region-1 is opened by host-1 and includes one HDFS block, which has three replicas stored as below:

replica-1 on host-1 (hdd)
replica-2 on host-2 (hdd)
replica-3 on host-3 (ssd)

Then the hfile reader will read from HDD, because host-1 is local and has higher priority.

If we move region-1 to host-3, the reader will read from SSD, and this cost function could increase the probability of that move when making plans.
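The three-replica case above can be put into numbers. The following Python sketch is an illustration only and is not taken from the PR; the `Replica` type, the function names and the 128 MB block size are assumptions. It computes, per host, the ordinary locality (local bytes on any storage) and an SSD locality (local bytes on SSD only) for region-1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    host: str
    storage: str  # "SSD" or "DISK"
    size: int     # bytes held by this replica


def locality(replicas, host, region_bytes):
    """Fraction of the region's bytes that are local to `host` on any storage."""
    if region_bytes == 0:
        return 0.0
    local = sum(r.size for r in replicas if r.host == host)
    return local / region_bytes


def ssd_locality(replicas, host, region_bytes):
    """Fraction of the region's bytes that are local to `host` AND on SSD."""
    if region_bytes == 0:
        return 0.0
    local_ssd = sum(r.size for r in replicas
                    if r.host == host and r.storage == "SSD")
    return local_ssd / region_bytes


BLOCK = 128 * 1024 * 1024  # assumed block size; region-1 has a single block

# The ONE_SSD layout from the comment: two replicas on HDD, one on SSD.
replicas = [
    Replica("host-1", "DISK", BLOCK),
    Replica("host-2", "DISK", BLOCK),
    Replica("host-3", "SSD", BLOCK),
]

for host in ("host-1", "host-2", "host-3"):
    print(host, locality(replicas, host, BLOCK), ssd_locality(replicas, host, BLOCK))
```

In this sketch all three hosts have the same ordinary locality for region-1, so a locality measure that ignores storage type gives no reason to prefer host-3. Only the SSD-specific measure separates host-3 from the other two.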
[jira] [Commented] (HBASE-24152) Add ServerSsdLocalityCostFunction to StochasticLoadBalancer
[ https://issues.apache.org/jira/browse/HBASE-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17102849#comment-17102849 ]

Michael Stack commented on HBASE-24152:

Was reviewing the PR but figured I don't have enough understanding of what is going on so let me ask here. I don't follow why the PR is special-casing SSD weight. Why is it not just factored into the host general weight? Balancer works at the RS level, not at type of storage? A local replica should be favored whether on ssd or not? Help me out [~filtertip] Thank you.