wchevreuil opened a new pull request, #5471: URL: https://github.com/apache/hbase/pull/5471
On restart, once we read the backing map from the persisted file, we compare the cache's last modification time recorded in that file against the actual last modification time of the cache. If they differ, the cache was updated after the backing map was persisted, so the backing map might not be accurate. We then iterate through the backing map entries and compare each entry's cached time against that of the related block in the cache; if those differ, we remove the entry from the map.

Currently this validation is done at RS initialisation time, but with caches as large as 1.6TB/30M+ blocks it can take up to an hour, during which the RS is unusable. This PR changes the validation to run in the background, while direct accesses to a block in the cache also perform the "cached time" comparison.

This PR also moves the "cached time" to the beginning of the block in the cache, instead of the end. We noticed that with the "cached time" at the end we can fail to ensure consistency under some conditions. See the UT added in TestRecoveryPersistentBucketCache for further reference.
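The background validation combined with the per-access "cached time" check could be sketched roughly as below. This is a simplified illustration, not the BucketCache code from this PR: the class, the `Entry` type, the `long[]` standing in for the cache storage, and all method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names come from HBase.
class BackingMapValidationSketch {

    // Stand-in for a backing map entry: where the block lives and when it was cached.
    static final class Entry {
        final int slot;
        final long cachedTime;

        Entry(int slot, long cachedTime) {
            this.slot = slot;
            this.cachedTime = cachedTime;
        }
    }

    // Stand-in for the cache storage: index = slot, value = the "cached time"
    // stored at the beginning of the block in that slot.
    private final long[] cachedTimeInCache;
    private final Map<String, Entry> backingMap = new ConcurrentHashMap<>();

    BackingMapValidationSketch(long[] cachedTimeInCache, Map<String, Entry> persisted) {
        this.cachedTimeInCache = cachedTimeInCache;
        this.backingMap.putAll(persisted);
    }

    private boolean isConsistent(Entry e) {
        return cachedTimeInCache[e.slot] == e.cachedTime;
    }

    // Read path: validate the single entry being accessed, so reads stay
    // correct even while the background pass has not reached it yet.
    Entry getBlock(String key) {
        Entry e = backingMap.get(key);
        if (e == null) {
            return null;
        }
        if (!isConsistent(e)) {
            backingMap.remove(key, e); // stale entry: treat as a cache miss
            return null;
        }
        return e;
    }

    // Full pass over the backing map, dropping every inconsistent entry.
    void validateAll() {
        backingMap.forEach((key, e) -> {
            if (!isConsistent(e)) {
                backingMap.remove(key, e);
            }
        });
    }

    // Runs the full pass on a daemon thread so start-up is not blocked.
    Thread startBackgroundValidation() {
        Thread t = new Thread(this::validateAll, "backing-map-validator");
        t.setDaemon(true);
        t.start();
        return t;
    }

    int size() {
        return backingMap.size();
    }

    public static void main(String[] args) throws InterruptedException {
        long[] cache = {100L, 200L};
        Map<String, Entry> persisted = new HashMap<>();
        persisted.put("fresh", new Entry(0, 100L));
        persisted.put("stale", new Entry(1, 999L)); // does not match slot 1
        BackingMapValidationSketch sketch = new BackingMapValidationSketch(cache, persisted);
        sketch.startBackgroundValidation().join();
        System.out.println("entries kept: " + sketch.size());
    }
}
```

In this sketch the read path never waits for the background pass: an entry is either already validated, or it is checked (and dropped if stale) at the moment it is accessed.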