[jira] [Commented] (HBASE-21014) Improve Stochastic Balancer to write HDFS favoured node hints for region primary blocks to avoid destroying data locality if needing to use HDFS Balancer

2018-08-17 Thread Hari Sekhon (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583615#comment-16583615
 ] 

Hari Sekhon commented on HBASE-21014:
-------------------------------------

I'm inclined to leave this open rather than close it as a duplicate, because
this is a really important improvement and people are more likely to find one
of these two tickets if both are left open.

> Improve Stochastic Balancer to write HDFS favoured node hints for region 
> primary blocks to avoid destroying data locality if needing to use HDFS 
> Balancer
> ------------------------------------------------------------------------
>
> Key: HBASE-21014
> URL: https://issues.apache.org/jira/browse/HBASE-21014
> Project: HBase
> Issue Type: Improvement
> Components: Balancer
> Affects Versions: 1.1.2
> Reporter: Hari Sekhon
> Priority: Major
>
> Improve the Stochastic Balancer to write the HDFS region location hints, so
> that the HDFS Balancer does not destroy data locality.
> Right now, according to a mix of docs, JIRAs and mailing list info, it
> appears that one must change
> {code:java}
> hbase.master.loadbalancer.class{code}
> to org.apache.hadoop.hbase.favored.FavoredNodeLoadBalancer, as it looks like
> this functionality exists only in the FavoredNodeLoadBalancer and not in the
> standard Stochastic Balancer.
> [http://hbase.apache.org/book.html#_hbase_and_hdfs]
> This is not ideal because we'd still like to use all the heuristics and work
> that have gone into the Stochastic Balancer, which I believe is currently the
> best and most mature HBase balancer.
> See also the linked Jiras and this discussion:
> [http://apache-hbase.679495.n3.nabble.com/HDFS-Balancer-td4086607.html]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21014) Improve Stochastic Balancer to write HDFS favoured node hints for region primary blocks to avoid destroying data locality if needing to use HDFS Balancer

2018-08-17 Thread Hari Sekhon (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583614#comment-16583614
 ] 

Hari Sekhon commented on HBASE-21014:
-------------------------------------

Yes, it looks like the FavoredStochasticLoadBalancer was supposed to become a
thing, but it didn't get finished and there hasn't been any movement on it in
nearly 2 years :(



[jira] [Commented] (HBASE-21014) Improve Stochastic Balancer to write HDFS favoured node hints for region primary blocks to avoid destroying data locality if needing to use HDFS Balancer

2018-08-16 Thread Toshihiro Suzuki (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16582857#comment-16582857
 ] 

Toshihiro Suzuki commented on HBASE-21014:
------------------------------------------

[~harisekhon] This sounds similar to HBASE-15531. However, it looks like that
issue is not complete yet.



[jira] [Commented] (HBASE-21014) Improve Stochastic Balancer to write HDFS favoured node hints for region primary blocks to avoid destroying data locality if needing to use HDFS Balancer

2018-08-10 Thread Hari Sekhon (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576290#comment-16576290
 ] 

Hari Sekhon commented on HBASE-21014:
-------------------------------------

I thought this was really the crux of it:

Write the HDFS location preference hints the same way the FavoredNodeBalancer
does, while applying all the usual Stochastic Balancer balancing heuristics to
make sure regions and load are evenly spread.

That way, if I need to rebalance HDFS blocks, the HDFS Balancer won't move the
region blocks out of their primary active region locations when HDFS block
pinning is enabled.



[jira] [Commented] (HBASE-21014) Improve Stochastic Balancer to write HDFS favoured node hints for region primary blocks to avoid destroying data locality if needing to use HDFS Balancer

2018-08-09 Thread Toshihiro Suzuki (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575249#comment-16575249
 ] 

Toshihiro Suzuki commented on HBASE-21014:
------------------------------------------

{quote}
So really what I'm asking for is for the Stochastic Balancer to include the 
hint writes like the FavoredNodeBalancer.
{quote}
Could you please tell us the details of your idea?



[jira] [Commented] (HBASE-21014) Improve Stochastic Balancer to write HDFS favoured node hints for region primary blocks to avoid destroying data locality if needing to use HDFS Balancer

2018-08-09 Thread Hari Sekhon (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574532#comment-16574532
 ] 

Hari Sekhon commented on HBASE-21014:
-------------------------------------

Yes, this is what I thought, hence I'd already linked HBASE-7932 and the book
reference, as well as a discussion on the mailing list involving some of my
ex-colleagues from Cloudera who really know their stuff, like Harsh J and Lars
George.

So really what I'm asking for is for the Stochastic Balancer to include the 
hint writes like the FavoredNodeBalancer.

I already have dfs.datanode.block-pinning.enabled = true; it's just not much
use until the Stochastic Balancer gets this support, as I don't want to lose
the better balancing, which is used more often than an HDFS rebalance.



[jira] [Commented] (HBASE-21014) Improve Stochastic Balancer to write HDFS favoured node hints for region primary blocks to avoid destroying data locality if needing to use HDFS Balancer

2018-08-08 Thread Toshihiro Suzuki (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573414#comment-16573414
 ] 

Toshihiro Suzuki commented on HBASE-21014:
------------------------------------------

[~harisekhon]
{quote}
This is the same even when you use FavoredNodeLoadBalancer.
{quote}
Sorry, I was wrong. If you use FavoredNodeLoadBalancer, you need to set the
"dfs.datanode.block-pinning.enabled" property to true in the HDFS service
configuration. In this case, the HDFS balancer won't move the HDFS blocks of
the HBase data, so you can run the HDFS balancer without losing data locality.
Please see:
https://issues.apache.org/jira/browse/HBASE-7932
https://issues.apache.org/jira/browse/HDFS-6133
https://hbase.apache.org/book.html#_hbase_and_hdfs
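
For reference, the two settings named in this thread live in different files: the balancer class goes in hbase-site.xml and block pinning goes in hdfs-site.xml. The following is a minimal sketch in plain Java that only prints the corresponding property fragments. The class and method names are hypothetical, and the balancer class name is copied from the issue description, so its package may differ between HBase versions.

```java
// Illustrative only: renders Hadoop-style <property> fragments for the two
// settings discussed in this thread. Class and method names are hypothetical.
final class FavoredNodeConfigSketch {

    // Renders one Hadoop-style <property> element.
    static String property(String name, String value) {
        return "<property>\n"
             + "  <name>" + name + "</name>\n"
             + "  <value>" + value + "</value>\n"
             + "</property>\n";
    }

    // hbase-site.xml: select the favored-node balancer on the HBase master.
    // Assumption: class name taken verbatim from the issue description.
    static String hbaseSite() {
        return property("hbase.master.loadbalancer.class",
                "org.apache.hadoop.hbase.favored.FavoredNodeLoadBalancer");
    }

    // hdfs-site.xml: enable block pinning on the DataNodes.
    static String hdfsSite() {
        return property("dfs.datanode.block-pinning.enabled", "true");
    }

    public static void main(String[] args) {
        System.out.println("<!-- hbase-site.xml -->");
        System.out.print(hbaseSite());
        System.out.println("<!-- hdfs-site.xml -->");
        System.out.print(hdfsSite());
    }
}
```

Both services would presumably need a restart to pick up the change; check the documentation for your versions before relying on this.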




[jira] [Commented] (HBASE-21014) Improve Stochastic Balancer to write HDFS favoured node hints for region primary blocks to avoid destroying data locality if needing to use HDFS Balancer

2018-08-08 Thread Hari Sekhon (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16573153#comment-16573153
 ] 

Hari Sekhon commented on HBASE-21014:
-------------------------------------

I thought that was the whole point of the FavoredNodeLoadBalancer - so that
the HBase balancer can write those HDFS hints based on the knowledge it has of
region locations, so that the HDFS Balancer does not move them and lose data
locality?



[jira] [Commented] (HBASE-21014) Improve Stochastic Balancer to write HDFS favoured node hints for region primary blocks to avoid destroying data locality if needing to use HDFS Balancer

2018-08-08 Thread Toshihiro Suzuki (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16572945#comment-16572945
 ] 

Toshihiro Suzuki commented on HBASE-21014:
------------------------------------------

Hi [~harisekhon],

I think we can't avoid losing data locality for HBase when you run the HDFS
balancer. This is because HDFS doesn't know the region locations and doesn't
take them into account when balancing blocks. This is the same even when you
use FavoredNodeLoadBalancer. If you need to run the HDFS balancer, you can run
a major compaction afterwards to recover data locality.
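
To make the locality loss concrete, here is a toy model for the case where blocks are not pinned. It is a sketch under stated assumptions and not HBase's actual metric: locality is taken as the fraction of a region's blocks that have a replica on the region server's host, with every block counted equally. The host names, the class name, and the "after" layout are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: counts blocks with a local replica, ignoring block sizes.
final class LocalitySketch {

    // Convenience helper to build the set of hosts holding one block's replicas.
    static Set<String> hosts(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    // Fraction of blocks with at least one replica on the region server's host.
    static double locality(List<Set<String>> blockReplicaHosts, String regionServerHost) {
        if (blockReplicaHosts.isEmpty()) {
            return 0.0;
        }
        long local = 0;
        for (Set<String> replicaHosts : blockReplicaHosts) {
            if (replicaHosts.contains(regionServerHost)) {
                local++;
            }
        }
        return (double) local / blockReplicaHosts.size();
    }

    public static void main(String[] args) {
        String regionServer = "host1";

        // Before: every block of the region has a replica on host1.
        List<Set<String>> before = Arrays.asList(
                hosts("host1", "host2", "host3"),
                hosts("host1", "host3", "host4"),
                hosts("host1", "host2", "host4"),
                hosts("host1", "host4", "host5"));

        // After: a hypothetical rebalance moved the host1 replica of two
        // blocks to host6, because it had no knowledge of region placement.
        List<Set<String>> after = Arrays.asList(
                hosts("host1", "host2", "host3"),
                hosts("host6", "host3", "host4"),
                hosts("host1", "host2", "host4"),
                hosts("host6", "host4", "host5"));

        System.out.println("before: " + locality(before, regionServer)); // before: 1.0
        System.out.println("after: " + locality(after, regionServer));   // after: 0.5
    }
}
```

In this model a major compaction corresponds to rewriting the region's files from host1, which would place a replica of each new block on host1 again.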

Thanks.
