[jira] [Commented] (HBASE-26304) Reflect out-of-band locality improvements in served requests

2022-03-07 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502334#comment-17502334
 ] 

Hudson commented on HBASE-26304:


Results for branch master
[build #528 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/528/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/528/General_20Nightly_20Build_20Report/]








(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/528/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Reflect out-of-band locality improvements in served requests
> 
>
> Key: HBASE-26304
> URL: https://issues.apache.org/jira/browse/HBASE-26304
> Project: HBase
> Issue Type: Sub-task
> Reporter: Bryan Beaudreault
> Assignee: Bryan Beaudreault
> Priority: Major
> Fix For: 2.5.0, 3.0.0-alpha-2
>
>
> Edit: Description updated to avoid needing to read the full investigation 
> laid out in the comments.
> Once the LocalityHealer has improved locality of a StoreFile (by moving 
> blocks onto the correct host), the Reader's DFSInputStream and Region's 
> localityIndex metric must be refreshed. Without refreshing the 
> DFSInputStream, the improved locality will not improve latencies. In fact, 
> the DFSInputStream may try to fetch blocks that have moved, resulting in a 
> ReplicaNotFoundException. This is automatically retried, but each retry 
> temporarily increases long-tail latencies by an amount that depends on the 
> configured backoff strategy.
> In the original LocalityHealer design, I created a new 
> RefreshHDFSBlockDistribution RPC on the RegionServer. This RPC accepts a list 
> of region names and, for each region store, re-opens the underlying StoreFile 
> if the locality has changed. This implementation was complicated both in 
> integrating callbacks into the HDFS Dispatcher and in terms of safely 
> re-opening StoreFiles without impacting reads or caches. 
> In working to port the LocalityHealer to the Apache projects, I'm taking a 
> different approach:
>  * The part of the LocalityHealer that moves blocks will be an HDFS project 
> contribution
>  * As such, the DFSClient should be able to more gracefully recover from 
> block moves.
>  * Additionally, HBase has some caches of block locations for locality 
> reporting and the balancer. Those need to be kept up-to-date.
> The DFSClient improvements are covered in HDFS-16261 and HDFS-16262. As such, 
> this issue becomes about updating HBase's block location caches.
> I considered a few different approaches, but the most elegant one I could 
> come up with was to tie the HDFSBlockDistribution metrics directly to the 
> underlying DFSInputStream of each StoreFile's initialReader. That way, our 
> locality metrics exactly reflect the block allocations that our reads are 
> going through. This also means that our locality metrics will 
> naturally adjust as the DFSInputStream adjusts to block moves.
> Once we have accurate locality metrics on the regionserver, the Balancer's 
> cache can easily be invalidated via our usual heartbeat methods. 
> RegionServers report to the HMaster periodically, which keeps a 
> ClusterMetrics object up to date. Right before each balancer invocation, the 
> balancer is updated with the latest ClusterMetrics. At this time, we compare 
> the old ClusterMetrics to the new, and invalidate the caches for any regions 
> whose locality has changed.
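
The snapshot comparison described at the end of the description can be sketched roughly as follows. This is a minimal illustration, not the actual HBase balancer code: the class name, the `EPSILON` tolerance, and the use of plain maps from region name to locality in place of real ClusterMetrics objects are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class LocalityCacheInvalidationSketch {
  // Hypothetical tolerance: smaller differences are treated as noise.
  static final float EPSILON = 0.0001f;

  /** Returns the regions whose reported locality differs between two snapshots. */
  static Set<String> changedRegions(Map<String, Float> oldLocality,
      Map<String, Float> newLocality) {
    Set<String> changed = new HashSet<>();
    for (Map.Entry<String, Float> entry : newLocality.entrySet()) {
      Float before = oldLocality.get(entry.getKey());
      // A region absent from the old snapshot has nothing cached yet, so skip it.
      if (before != null && Math.abs(before - entry.getValue()) > EPSILON) {
        changed.add(entry.getKey());
      }
    }
    return changed;
  }

  public static void main(String[] args) {
    Map<String, Float> oldSnapshot = new HashMap<>();
    oldSnapshot.put("region-a", 0.40f);
    oldSnapshot.put("region-b", 1.00f);

    Map<String, Float> newSnapshot = new HashMap<>();
    newSnapshot.put("region-a", 0.95f); // locality improved out of band
    newSnapshot.put("region-b", 1.00f); // unchanged
    newSnapshot.put("region-c", 0.50f); // newly reported region

    // Stand-in for the balancer's per-region block location cache.
    Map<String, String> blockLocationCache = new HashMap<>();
    blockLocationCache.put("region-a", "cached locations");
    blockLocationCache.put("region-b", "cached locations");

    // Invalidate only the entries whose locality changed.
    for (String region : changedRegions(oldSnapshot, newSnapshot)) {
      blockLocationCache.remove(region);
    }
    System.out.println(blockLocationCache.keySet()); // prints [region-b]
  }
}
```

Only regions whose locality moved are dropped from the cache, so unchanged regions keep their cached block locations across balancer runs.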



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-26304) Reflect out-of-band locality improvements in served requests

2021-12-04 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453495#comment-17453495
 ] 

Hudson commented on HBASE-26304:


Results for branch branch-2
[build #409 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/409/]:
 (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/409/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/409/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/409/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/branch-2/409/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}




[jira] [Commented] (HBASE-26304) Reflect out-of-band locality improvements in served requests

2021-11-27 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17449916#comment-17449916
 ] 

Hudson commented on HBASE-26304:


Results for branch master
[build #453 on 
builds.a.o|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/453/]:
 (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/453/General_20Nightly_20Build_20Report/]






(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/453/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hadoop.apache.org/job/HBase/job/HBase%20Nightly/job/master/453/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}




[jira] [Commented] (HBASE-26304) Reflect out-of-band locality improvements in served requests

2021-11-27 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17449828#comment-17449828
 ] 

Bryan Beaudreault commented on HBASE-26304:
---

Will do on Monday, thanks for reviewing!



[jira] [Commented] (HBASE-26304) Reflect out-of-band locality improvements in served requests

2021-11-26 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17449725#comment-17449725
 ] 

Duo Zhang commented on HBASE-26304:
---

[~bbeaudreault] Could you please open a PR for branch-2? The master patch 
cannot be applied to branch-2 cleanly.

Thanks.



[jira] [Commented] (HBASE-26304) Reflect out-of-band locality improvements in served requests

2021-11-08 Thread Huaxiang Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440687#comment-17440687
 ] 

Huaxiang Sun commented on HBASE-26304:
--

Thanks [~bbeaudreault]. I will try to look at the patch in the coming days 
and try it out on my test clusters as well.

> Reflect out-of-band locality improvements in served requests
> 
>
> Key: HBASE-26304
> URL: https://issues.apache.org/jira/browse/HBASE-26304
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Bryan Beaudreault
>Assignee: Bryan Beaudreault
>Priority: Major
>
> Once the LocalityHealer has improved locality of a StoreFile (by moving 
> blocks onto the correct host), the Reader's DFSInputStream and Region's 
> localityIndex metric must be refreshed. Without refreshing the 
> DFSInputStream, the improved locality will not improve latencies. In fact, 
> the DFSInputStream may try to fetch blocks that have moved, resulting in a 
> ReplicaNotFoundException. This is automatically retried, but the retry will 
> temporarily increase long tail latencies relative to configured backoff 
> strategy.
> In the original LocalityHealer design, I created a new 
> RefreshHDFSBlockDistribution RPC on the RegionServer. This RPC accepts a list 
> of region names and, for each region store, re-opens the underlying StoreFile 
> if the locality has changed. This implementation was complicated both in 
> integrating callbacks into the HDFS Dispatcher and in terms of safely 
> re-opening StoreFiles without impacting reads or caches. 
> In working to port the LocalityHealer to the Apache projects, I'm taking a 
> different approach:
>  * The part of the LocalityHealer that moves blocks will be an HDFS project 
> contribution
>  * As such, the DFSClient should be able to more gracefully recover from 
> block moves.
>  * Additionally, HBase has some caches of block locations for locality 
> reporting and the balancer. Those need to be kept up-to-date.
> The DFSClient improvements are covered in 
> https://issues.apache.org/jira/browse/HDFS-16261. As such, this issue becomes 
> about updating HBase's block location caches.
> I considered a few different approaches, but the most elegant one I could 
> come up with was to tie the HDFSBlockDistribution metrics directly to the 
> underlying DFSInputStream of each StoreFile's initialReader. That way, our 
> locality metrics are identically representing the block allocations that our 
> reads are going through. This also means that our locality metrics will 
> naturally adjust as the DFSInputStream adjusts to block moves.
> Once we have accurate locality metrics on the regionserver, the Balancer's 
> cache can easily be invalidated via our usual heartbeat methods. 
> RegionServers report to the HMaster periodically, which keeps a 
> ClusterMetrics method up to date. Right before each balancer invocation, the 
> balancer is updated with the latest ClusterMetrics. At this time, we compare 
> the old ClusterMetrics to the new, and invalidate the caches for any regions 
> whose locality has changed.





[jira] [Commented] (HBASE-26304) Reflect out-of-band locality improvements in served requests

2021-11-08 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440580#comment-17440580
 ] 

Bryan Beaudreault commented on HBASE-26304:
---

This has been rolled out to ~80 clusters in our QA environment. Will likely 
start doing prod rollouts in the near future.



[jira] [Commented] (HBASE-26304) Reflect out-of-band locality improvements in served requests

2021-10-28 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17435597#comment-17435597
 ] 

Bryan Beaudreault commented on HBASE-26304:
---

I was able to get the 3rd (arguably ideal) option to work. I've been running it 
in one of our internal test clusters and it's been working great.

I updated the original description and pushed an updated PR based on the final 
approach taken, so that people don't need to read the above wall of text :)




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-26304) Reflect out-of-band locality improvements in served requests

2021-10-22 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17433138#comment-17433138
 ] 

Bryan Beaudreault commented on HBASE-26304:
---

Unfortunately there isn't an easy way to achieve the 3rd option. I could 
potentially add something to DFSInputStream, but we'd be stuck waiting for it 
to get backported to all supported Hadoop versions. I also realized that blocks 
can very well move even without the locality healer, and we don't really 
reflect that at all today. I ended up going with option 1 above, as it is the 
easiest option and also the one furthest off the critical path.

I created a LocalityMetricsRefreshChore which periodically refreshes the 
HDFSBlockDistribution for all stores on the server. I thought about limiting it 
to only non-100% locality stores, but decided against it so that we can also 
reflect locality _regressions_ due to datanodes dying or someone running 
Balancer or Mover. For anyone monitoring or alerting on locality, it's just as 
important to know that something totally tanked locality as it is to know 
that it was improved by something like the healer.
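
A periodic refresh of this kind might look roughly like the sketch below. It is not the LocalityMetricsRefreshChore from the PR: the class name, the `localityLookup` callback standing in for a fresh HDFS block-location query, and the 100 ms period are all hypothetical, and a plain ScheduledExecutorService is used where HBase would use its own chore scheduling.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.ToDoubleFunction;

public class LocalityRefreshChoreSketch implements Runnable {
  private final List<String> storeFiles;
  // Hypothetical callback that asks the filesystem for a file's current locality.
  private final ToDoubleFunction<String> localityLookup;
  private final Map<String, Double> cachedLocality = new ConcurrentHashMap<>();

  LocalityRefreshChoreSketch(List<String> storeFiles, ToDoubleFunction<String> localityLookup) {
    this.storeFiles = storeFiles;
    this.localityLookup = localityLookup;
  }

  /** Refreshes every store file, including ones already at 100% locality. */
  @Override
  public void run() {
    for (String file : storeFiles) {
      cachedLocality.put(file, localityLookup.applyAsDouble(file));
    }
  }

  /** Returns the locality seen at the last refresh, or 0.0 if never refreshed. */
  double cached(String file) {
    return cachedLocality.getOrDefault(file, 0.0);
  }

  public static void main(String[] args) throws InterruptedException {
    // Stand-in for the real locality of each file as HDFS would report it.
    Map<String, Double> hdfsView = new ConcurrentHashMap<>();
    hdfsView.put("hfile-1", 1.0);

    LocalityRefreshChoreSketch chore =
        new LocalityRefreshChoreSketch(Arrays.asList("hfile-1"), f -> hdfsView.get(f));

    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleAtFixedRate(chore, 0, 100, TimeUnit.MILLISECONDS);

    // Simulate a regression (for example a datanode dying) between refreshes.
    hdfsView.put("hfile-1", 0.3);
    Thread.sleep(300);

    System.out.println("locality after refresh: " + chore.cached("hfile-1"));
    scheduler.shutdownNow();
  }
}
```

Because every file is refreshed on each run, a drop in locality becomes visible on the next cycle just as an improvement does, which matches the reasoning given above for not skipping fully local stores.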

I'm doing some testing, but will push a PR with this new chore probably next 
week.

> Reflect out-of-band locality improvements in served requests
> 
>
> Key: HBASE-26304
> URL: https://issues.apache.org/jira/browse/HBASE-26304
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Bryan Beaudreault
>Assignee: Bryan Beaudreault
>Priority: Major
>
> Once the LocalityHealer has improved locality of a StoreFile (by moving 
> blocks onto the correct host), the Reader's DFSInputStream and Region's 
> localityIndex metric must be refreshed. Without refreshing the 
> DFSInputStream, the improved locality will not improve latencies. In fact, 
> the DFSInputStream may try to fetch blocks that have moved, resulting in a 
> ReplicaNotFoundException. This is automatically retried, but the retry will 
> increase long tail latencies relative to configured backoff strategy.
> See https://issues.apache.org/jira/browse/HDFS-16155 for an improvement in 
> backoff strategy which can greatly mitigate latency impact of the missing 
> block retry.
> Even with that mitigation, a StoreFile is often made up of many blocks. 
> Without some sort of intervention, we will continue to hit 
> ReplicaNotFoundException over time as clients naturally request data from 
> moved blocks.
> In the original LocalityHealer design, I created a new 
> RefreshHDFSBlockDistribution RPC on the RegionServer. This RPC accepts a list 
> of region names and, for each region store, re-opens the underlying StoreFile 
> if the locality has changed.
> I will submit a PR with that implementation, but I am also investigating 
> other avenues. For example, I noticed 
> https://issues.apache.org/jira/browse/HDFS-15119 which doesn't seem ideal but 
> maybe can be improved as an automatic lower-level handling of block moves.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-26304) Reflect out-of-band locality improvements in served requests

2021-10-21 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17432456#comment-17432456
 ] 

Bryan Beaudreault commented on HBASE-26304:
---

As mentioned above, I have implementations for the above 2 HDFS issues and they 
work great for ensuring HBase is able to take advantage of new locality 
improvements without any DFSClient warnings. Before pushing PRs for those, I'm 
now taking a look at the localityIndex reporting issue in case that affects the 
strategy. The core problem is that when a StoreFile is opened, a StoreFileInfo 
object is created. Initializing that StoreFileInfo calls 
computeHDFSBlocksDistribution and caches the result for the lifetime of the 
StoreFileInfo. The resulting value is available via the 
getHDFSBlockDistribution method.

The getHDFSBlockDistribution method has three usages:
 * RatioBasedCompactionPolicy and DateTieredCompactionPolicy use it to force a 
major compaction on files whose BlockLocalityIndex is less than a threshold
 * The value is aggregated for all StoreFiles in an HRegion, and used to create 
RegionLoad objects. RegionLoads are created in a few ways:
 ** On demand, when loading RegionServer UI "Regions" section
 ** On demand, through HBaseAdmin.getRegionLoad(ServerName, TableName)
 ** Periodically, in reporting heartbeat to HMaster, by default 3s. The HMaster 
uses these in a few ways:
 *** Available to query via HBaseAdmin
 *** Used in HMaster UI, where you can see localityIndex when viewing table page
 *** Used in various load balancer functions (though not localityIndex, since 
the balancer computes that separately)
 * The value is aggregated for all StoreFiles in an HRegion, and used to report 
localityIndex metrics.
 ** This happens in a thread which executes on an interval, by default 5s. The 
resulting metrics are available in JMX, hbtop, and the "Server Metrics" section 
at the top of RegionServer UIs.
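
To make the aggregation step concrete, here is a minimal sketch of how 
per-file block distributions could be summed into a region-level localityIndex. 
BlockDistributionSketch is a simplified, hypothetical stand-in and not the 
HBase HDFSBlocksDistribution class. It assumes the index for a host is the 
fraction of bytes that have a replica on that host.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified, hypothetical stand-in for a per-file block distribution. */
class BlockDistributionSketch {
  private final Map<String, Long> bytesByHost = new HashMap<>();
  private long totalBytes;

  /** Records one block of the given size and the hosts holding a replica of it. */
  void addBlock(long size, String... hosts) {
    totalBytes += size;
    for (String host : hosts) {
      bytesByHost.merge(host, size, Long::sum);
    }
  }

  /** Folds another file's distribution into this one (region-level aggregation). */
  void add(BlockDistributionSketch other) {
    totalBytes += other.totalBytes;
    other.bytesByHost.forEach((host, bytes) -> bytesByHost.merge(host, bytes, Long::sum));
  }

  /** Fraction of bytes with a replica on the host; 0 when there is no data. */
  double localityIndex(String host) {
    if (totalBytes == 0) {
      return 0.0;
    }
    return (double) bytesByHost.getOrDefault(host, 0L) / totalBytes;
  }
}
```

Because the region-level value is only a sum of per-file values, a stale 
per-file value stays in every aggregate until that file's value is recomputed.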

None of these usages is time sensitive, i.e. none is in a core read path or 
anything. As such I think we could consider the StoreFileInfo 
hdfsBlockDistribution a cache which must be cleared. Previously it was a cache 
of a value that rarely changed, and now we need more control over clearing. I 
can think of 3 options for this:
 * We could create a periodic chore which reloads the cached value for all 
store files. This could be filtered to only clear values which are not fully 
local.
 * We could add a TTL on the cached value, which gets enforced at read time. In 
other words, when getHDFSBlockDistribution is called, re-compute if TTL is 
expired. We could similarly limit this to only files which are not fully local.
 * We could use some trigger from the DFSInputStream to intelligently refresh 
the HDFSBlockDistribution only if the underlying stream has been updated. I 
think this would have to happen at the HStoreFile level, which has a similar 
getHDFSBlockDistribution method and is the only caller of the StoreFileInfo method. 
The HStoreFile has access to the initialReader object which can access the 
underlying FSDataInputStreamWrapper. We'd need to expose something in 
DFSInputStream that can be used to trigger the logic.
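
For comparison, the TTL option (the 2nd one above) could look roughly like the 
following sketch. TtlCachedValue is a hypothetical class, the loader stands in 
for recomputing the block distribution, and the injectable clock is only there 
to make the behavior easy to test.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical read-time TTL cache for a computed value such as a block distribution. */
class TtlCachedValue<T> {
  private final Supplier<T> loader; // assumed: recomputes the value from the filesystem
  private final LongSupplier clock; // injectable clock, in milliseconds
  private final long ttlMs;
  private T value;
  private long loadedAt;

  TtlCachedValue(Supplier<T> loader, long ttlMs, LongSupplier clock) {
    this.loader = loader;
    this.ttlMs = ttlMs;
    this.clock = clock;
    this.value = loader.get();
    this.loadedAt = clock.getAsLong();
  }

  /** Returns the cached value, recomputing it first if the TTL has expired. */
  synchronized T get() {
    long now = clock.getAsLong();
    if (now - loadedAt >= ttlMs) {
      value = loader.get();
      loadedAt = now;
    }
    return value;
  }
}
```

With this shape the reload cost is paid by whichever caller happens to read 
after expiry, which is why it only suits callers that are off the read path.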

Of the options, I think the last one is most appealing because we could avoid 
yet another config (the refresh TTL/period). It is also the most involved 
and requires some investigation. My second preference would be the 2nd option 
above, because I'd like to avoid another chore. I don't think the minor latency 
hit of fetching block locations should be an issue for any of the use cases 
mentioned above.

I'm going to do a little more investigation into what the 3rd option could look 
like.

[jira] [Commented] (HBASE-26304) Reflect out-of-band locality improvements in served requests

2021-10-07 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425868#comment-17425868
 ] 

Bryan Beaudreault commented on HBASE-26304:
---

I have a proof of concept working with the above 2 HDFS issues in a test 
cluster. It works great, though as mentioned above I still need to figure out 
how to update localityIndex, i.e. how to trigger computeHDFSBlocksDistribution 
in StoreFileInfo.


[jira] [Commented] (HBASE-26304) Reflect out-of-band locality improvements in served requests

2021-10-06 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425263#comment-17425263
 ] 

Bryan Beaudreault commented on HBASE-26304:
---

Note: the above should drastically reduce the impact of block moves, but will 
not fix our localityIndex metrics. I'm going to need to find some other way to 
expose on the DFSInputStream that locality has improved, which could trigger a 
recalculation of localityIndex. Alternatively, we could just do it as part of a 
chore in the regionserver.


[jira] [Commented] (HBASE-26304) Reflect out-of-band locality improvements in served requests

2021-10-06 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425216#comment-17425216
 ] 

Bryan Beaudreault commented on HBASE-26304:
---

I submitted the above PR, which is a straight port of the approach I used 
internally for refreshing store files after locality has been healed. After 
thinking about it some more, I've decided to take this in a different direction:

The LocalityHealer (which moves blocks to requested hosts) itself will end up 
being an HDFS project contribution. In the narrow scope of HBase, this issue is 
about ensuring a RegionServer can gracefully recover after blocks have been 
moved from under it. Given the LocalityHealer will be an HDFS project 
contribution, I think ideally the DFSClient itself can gracefully recover from 
such an event.

With that in mind, I'm going to try to take a somewhat different approach:
 * HDFS-15119 added a basic invalidation of DFSInputStream cached 
LocatedBlocks. I'm going to expand upon that so that we can safely and reliably 
refresh block locations for DFSInputStreams lacking a local replica: 
https://issues.apache.org/jira/browse/HDFS-16262
 * Additionally, I'm going to try to add a grace period to block invalidations 
in https://issues.apache.org/jira/browse/HDFS-16261. When a block is moved with 
REPLACE_BLOCK, the block is invalidated on the old host and asynchronously 
deleted. Adding a configurable grace period on the deletion will give the 
above refresh enough time to update cached locations and totally skip any pain 
related to moving blocks around.
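
A minimal model of the "refresh only when a local replica is missing" idea, in 
plain Java. This is not DFSInputStream code and not the HDFS-16262 
implementation. CachedLocationsSketch and its fetcher (standing in for a 
NameNode lookup) are hypothetical names used only for illustration.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical stand-in for a stream's cached block locations. */
class CachedLocationsSketch {
  private final String localHost;
  private final Supplier<List<Set<String>>> fetcher; // assumed: asks the NameNode
  private List<Set<String>> hostsPerBlock;

  CachedLocationsSketch(String localHost, Supplier<List<Set<String>>> fetcher) {
    this.localHost = localHost;
    this.fetcher = fetcher;
    this.hostsPerBlock = fetcher.get();
  }

  /** True when at least one cached block has no replica on the local host. */
  boolean lacksLocalReplica() {
    return hostsPerBlock.stream().anyMatch(hosts -> !hosts.contains(localHost));
  }

  /** Re-fetches locations only when something is non-local; returns whether it refreshed. */
  boolean refreshIfNotLocal() {
    if (!lacksLocalReplica()) {
      return false;
    }
    hostsPerBlock = fetcher.get();
    return true;
  }
}
```

In this model a fully local stream never pays for a refresh. The grace period 
on deletion covers the window between a block move and the next refresh, so it 
would need to be at least as long as the refresh interval.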