[jira] [Created] (HBASE-21014) Improve Stochastic Balancer to write HDFS favoured node hints for region primary blocks to avoid destroying data locality if needing to use HDFS Balancer
Hari Sekhon created HBASE-21014:
---

Summary: Improve Stochastic Balancer to write HDFS favoured node hints for region primary blocks to avoid destroying data locality if needing to use HDFS Balancer
Key: HBASE-21014
URL: https://issues.apache.org/jira/browse/HBASE-21014
Project: HBase
Issue Type: Improvement
Components: Balancer
Affects Versions: 1.1.2
Reporter: Hari Sekhon

Improve the Stochastic Balancer to include the HDFS region location hints, so that running the HDFS Balancer does not destroy data locality.

Right now, according to a mix of docs, JIRAs and mailing list info, it appears that one must change {code:java} hbase.master.loadbalancer.class{code} to org.apache.hadoop.hbase.favored.FavoredNodeLoadBalancer, as it looks like this functionality exists only in the FavoredNodeLoadBalancer and not in the standard Stochastic Balancer. [http://hbase.apache.org/book.html#_hbase_and_hdfs]

This is not ideal because we'd still like to use all the heuristics and work that have gone into the Stochastic Balancer, which I believe is currently the best and most mature HBase balancer.

See also the linked JIRAs and this discussion: [http://apache-hbase.679495.n3.nabble.com/HDFS-Balancer-td4086607.html]

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
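For readers who want to try the workaround described in the issue, here is a minimal sketch of setting that property in an hbase-site.xml document. The property name and class name are the ones quoted in the issue; the helper `set_property` and the constants are hypothetical names made up for this illustration, and the sketch assumes the usual Hadoop-style `<configuration>/<property>/<name>/<value>` layout.

```python
import xml.etree.ElementTree as ET

# Both strings are taken from the issue text above.
BALANCER_KEY = "hbase.master.loadbalancer.class"
FAVORED_BALANCER = "org.apache.hadoop.hbase.favored.FavoredNodeLoadBalancer"


def set_property(site_xml: str, name: str, value: str) -> str:
    """Return site_xml with the named property set to value (hypothetical helper).

    An existing <property> with the same <name> is updated in place;
    otherwise a new <property> element is appended.
    """
    root = ET.fromstring(site_xml)
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # No matching property was found, so add one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(set_property("<configuration/>", BALANCER_KEY, FAVORED_BALANCER))
```

Note that this only switches the balancer class, which is exactly the trade-off the issue complains about: the Stochastic Balancer's heuristics are no longer in use.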
Re: HDFS Balancer
Yeah, my thoughts exactly... though thanks Harsh for taking action to clean up the documentation! Good on you.

On Thu, Mar 9, 2017 at 11:01 AM, Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:
> So there is no way to use the pinning feature without having to use the
> favored nodes option? :(
Re: HDFS Balancer
So there is no way to use the pinning feature without having to use the favored nodes option? :(

On 2017-03-08 6:13 AM, "Harsh J" <ha...@cloudera.com> wrote:
> Hey Lars!,
>
> I was on a similar line of investigation today, and I've filed
> https://issues.apache.org/jira/browse/HBASE-17760 to change the text. The
> pinning part of the text is relevant, but the command part isn't. In
> addition, you'd need to manually use the FavoredNodeLoadBalancer work to
> actually get HBase to apply pinning to its writes by passing around proper
> favored-node hint hostnames. I've also linked past and future relevant work
> JIRAs to that one.
>
> Stumbled on this email when searching some follow-throughs, thought I'd
> drop a note.
Re: HDFS Balancer
Hey Lars!,

I was on a similar line of investigation today, and I've filed https://issues.apache.org/jira/browse/HBASE-17760 to change the text. The pinning part of the text is relevant, but the command part isn't. In addition, you'd need to manually use the FavoredNodeLoadBalancer work to actually get HBase to apply pinning to its writes by passing around proper favored-node hint hostnames. I've also linked past and future relevant work JIRAs to that one.

Stumbled on this email when searching some follow-throughs, thought I'd drop a note.

On Tue, 7 Mar 2017 at 20:18 Ted Yu <yuzhih...@gmail.com> wrote:
> bq. how that - apparently wrong - information came about
>
> Maybe Sean / Misty can give some context.
>
> Cheers
[jira] [Created] (HBASE-17760) HDFS Balancer doc is misleading
Harsh J created HBASE-17760:
---

Summary: HDFS Balancer doc is misleading
Key: HBASE-17760
URL: https://issues.apache.org/jira/browse/HBASE-17760
Project: HBase
Issue Type: Bug
Components: documentation
Reporter: Harsh J
Assignee: Harsh J
Priority: Minor

HBASE-15332 added a doc note about how to use HDFS-6133, but the steps it adds are incorrect.

The specific balancer command provided in the doc note is incorrect and not required. Since HBase uses favored nodes features internally (HBASE-7932), and HBASE-7942 extended that information to cover HDFS hinting too, the only step required in the doc note is to enable the pinning feature DN-side.
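As a rough illustration of the one step the issue says is required, the sketch below checks whether a DataNode's hdfs-site.xml turns on block pinning. The property name `dfs.datanode.block-pinning.enabled` is the one quoted in this thread; the function name is hypothetical, and treating a missing property as disabled is an assumption that should be checked against hdfs-default.xml for the Hadoop version in use.

```python
import xml.etree.ElementTree as ET

# Property name as quoted from the HBase book in this thread.
PINNING_KEY = "dfs.datanode.block-pinning.enabled"


def block_pinning_enabled(hdfs_site_xml: str) -> bool:
    """Report whether the given hdfs-site.xml text enables block pinning.

    Hypothetical helper: it looks only at this one document and ignores
    other configuration sources. A missing property is treated as
    disabled, which is an assumption of this sketch.
    """
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == PINNING_KEY:
            return (prop.findtext("value") or "").strip().lower() == "true"
    return False


if __name__ == "__main__":
    sample = (
        "<configuration><property>"
        "<name>dfs.datanode.block-pinning.enabled</name>"
        "<value>true</value>"
        "</property></configuration>"
    )
    print(block_pinning_enabled(sample))
```

A check like this only confirms the DataNode-side setting. Whether HBase actually passes favored-node hints on its writes is the separate question discussed in the rest of the thread.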
Re: HDFS Balancer
bq. how that - apparently wrong - information came about

Maybe Sean / Misty can give some context.

Cheers

On Tue, Mar 7, 2017 at 6:37 AM, Lars George <lars.geo...@gmail.com> wrote:
> Hey Ted,
>
> Thanks Cpt. Obvious :)
>
> I know how to use "blame" or git log how to find the JIRA, but what I was
> after is how that - apparently wrong - information came about. And if it is
> wrong, what _is_ the current status of this feature.
>
> I do believe this is an important operational piece as it helps with
> rearranging clusters. Since it seems to still be missing, I am wondering
> what needs to be done here.
>
> Makes sense?
>
> Lars
Re: HDFS Balancer
Hey Ted,

Thanks Cpt. Obvious :)

I know how to use "blame" or git log how to find the JIRA, but what I was after is how that - apparently wrong - information came about. And if it is wrong, what _is_ the current status of this feature.

I do believe this is an important operational piece as it helps with rearranging clusters. Since it seems to still be missing, I am wondering what needs to be done here.

Makes sense?

Lars

> On 6 Mar 2017, at 19:50, Ted Yu <yuzhih...@gmail.com> wrote:
>
> w.r.t. the first question, the quoted paragraph came from:
>
> HBASE-15332 Document how to take advantage of HDFS-6133 in HBase
Re: HDFS Balancer
w.r.t. the first question, the quoted paragraph came from:

HBASE-15332 Document how to take advantage of HDFS-6133 in HBase

On Mon, Mar 6, 2017 at 6:38 PM, Lars George <lars.geo...@gmail.com> wrote:
> Hi,
>
> I am trying to grok what came out of all these issues about the HDFS
> balancer and being able to avoid it destroying HBase locality. There
> is this https://issues.apache.org/jira/browse/HBASE-13021 from JM, and
> the book http://hbase.apache.org/book.html#_hbase_and_hdfs refers to
> https://issues.apache.org/jira/browse/HDFS-6133, stating:
>
> "HDFS-6133 provides the ability to exclude a given directory from the
> HDFS load balancer, by setting the dfs.datanode.block-pinning.enabled
> property to true in your HDFS configuration and running the following
> hdfs command:
>
> $ sudo -u hdfs hdfs balancer -exclude /hbase"
>
> I checked the Balancer class in 2.7.2 and it does not have that
> support, i.e. being able to exclude a path, it can only exclude hosts.
> That is also clear from HDFS-6133, which adds favoured nodes, but not
> being able to exclude paths (which would be nice).
>
> HBASE-13021 mentions that this works in tandem with the HBase favored
> node feature, but that makes it much more complicated since you have
> to pin individual regions to nodes, instead of doing that wholesale.
>
> Where does the above in the HBase book come from, and what is the
> current state as far as you know?
>
> Cheers,
> Lars
HDFS Balancer
Hi,

I am trying to grok what came out of all these issues about the HDFS balancer and being able to avoid it destroying HBase locality. There is this https://issues.apache.org/jira/browse/HBASE-13021 from JM, and the book http://hbase.apache.org/book.html#_hbase_and_hdfs refers to https://issues.apache.org/jira/browse/HDFS-6133, stating:

"HDFS-6133 provides the ability to exclude a given directory from the HDFS load balancer, by setting the dfs.datanode.block-pinning.enabled property to true in your HDFS configuration and running the following hdfs command:

$ sudo -u hdfs hdfs balancer -exclude /hbase"

I checked the Balancer class in 2.7.2 and it does not have that support, i.e. being able to exclude a path, it can only exclude hosts. That is also clear from HDFS-6133, which adds favoured nodes, but not being able to exclude paths (which would be nice).

HBASE-13021 mentions that this works in tandem with the HBase favored node feature, but that makes it much more complicated since you have to pin individual regions to nodes, instead of doing that wholesale.

Where does the above in the HBase book come from, and what is the current state as far as you know?

Cheers,
Lars
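To make the host-versus-path distinction concrete, here is a small sketch that assembles a balancer command line from a list of DataNode hosts to exclude. It relies only on the point made above, that `-exclude` takes hosts rather than paths. The function name and the example hostnames are hypothetical, and the rejection of path-like values is a guard added for this illustration, not something the Balancer itself is claimed to do.

```python
def balancer_command(exclude_hosts):
    """Build an `hdfs balancer` argument list that excludes the given hosts.

    Hypothetical helper for illustration. Values containing "/" are
    rejected because -exclude expects DataNode hosts, not HDFS paths
    such as /hbase.
    """
    hosts = [h.strip() for h in exclude_hosts if h.strip()]
    for host in hosts:
        if "/" in host:
            raise ValueError(f"{host!r} looks like a path; -exclude takes hosts")
    cmd = ["hdfs", "balancer"]
    if hosts:
        # Hosts are passed as a single comma-separated argument.
        cmd += ["-exclude", ",".join(hosts)]
    return cmd


if __name__ == "__main__":
    # Example hostnames are made up.
    print(" ".join(balancer_command(["dn1.example.com", "dn2.example.com"])))
```

Excluding whole hosts keeps the balancer away from those DataNodes entirely, which is a much blunter tool than excluding a directory would be.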