[
https://issues.apache.org/jira/browse/HBASE-26304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Duo Zhang resolved HBASE-26304.
-------------------------------
Hadoop Flags: Reviewed
Resolution: Fixed
Merged to master and branch-2.
Thanks [~bbeaudreault]!
> Reflect out-of-band locality improvements in served requests
> ------------------------------------------------------------
>
> Key: HBASE-26304
> URL: https://issues.apache.org/jira/browse/HBASE-26304
> Project: HBase
> Issue Type: Sub-task
> Reporter: Bryan Beaudreault
> Assignee: Bryan Beaudreault
> Priority: Major
>
> Edit: Description updated to avoid needing to read the full investigation
> laid out in the comments.
> Once the LocalityHealer has improved locality of a StoreFile (by moving
> blocks onto the correct host), the Reader's DFSInputStream and Region's
> localityIndex metric must be refreshed. Without refreshing the
> DFSInputStream, the improved locality will not improve latencies. In fact,
> the DFSInputStream may try to fetch blocks that have moved, resulting in a
> ReplicaNotFoundException. This is automatically retried, but the retry will
> temporarily increase long tail latencies relative to configured backoff
> strategy.
> In the original LocalityHealer design, I created a new
> RefreshHDFSBlockDistribution RPC on the RegionServer. This RPC accepts a list
> of region names and, for each region store, re-opens the underlying StoreFile
> if the locality has changed. This implementation was complicated both in
> integrating callbacks into the HDFS Dispatcher and in terms of safely
> re-opening StoreFiles without impacting reads or caches.
> In working to port the LocalityHealer to the Apache projects, I'm taking a
> different approach:
> * The part of the LocalityHealer that moves blocks will be an HDFS project
> contribution
> * As such, the DFSClient should be able to more gracefully recover from
> block moves.
> * Additionally, HBase has some caches of block locations for locality
> reporting and the balancer. Those need to be kept up-to-date.
> The DFSClient improvements are covered in HDFS-16261 and HDFS-16262. As such,
> this issue becomes about updating HBase's block location caches.
> I considered a few different approaches, but the most elegant one I could
> come up with was to tie the HDFSBlockDistribution metrics directly to the
> underlying DFSInputStream of each StoreFile's initialReader. That way, our
> locality metrics are identically representing the block allocations that our
> reads are going through. This also means that our locality metrics will
> naturally adjust as the DFSInputStream adjusts to block moves.
> Once we have accurate locality metrics on the regionserver, the Balancer's
> cache can easily be invalidated via our usual heartbeat methods.
> RegionServers report to the HMaster periodically, which keeps a
> ClusterMetrics method up to date. Right before each balancer invocation, the
> balancer is updated with the latest ClusterMetrics. At this time, we compare
> the old ClusterMetrics to the new, and invalidate the caches for any regions
> whose locality has changed.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)