[ 
https://issues.apache.org/jira/browse/HBASE-29857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18054898#comment-18054898
 ] 

Wellington Chevreuil commented on HBASE-29857:
----------------------------------------------

Yeah, HBASE-28839 was only recently backported to branch-2.6, so it's indeed 
missing from all current 2.6 releases. And yeah, I guess the "while 
(in.available() > 0)" check it introduced would avoid the original scenario 
you hit.
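
For illustration only, here is a minimal sketch of how an "available() > 0" loop guard avoids reading a chunk that was never written. It does not use the real BucketCache file layout or protobuf: the class name, the length-prefixed format, and both helper methods are hypothetical stand-ins built on java.io.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not HBase code.
class AvailableGuardSketch {

  // Writes a chunk count followed by length-prefixed chunks.
  // With an empty list, only the count (0) is written.
  static byte[] writeChunks(List<byte[]> chunks) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(chunks.size());
      for (byte[] chunk : chunks) {
        out.writeInt(chunk.length);
        out.write(chunk);
      }
    }
    return bytes.toByteArray();
  }

  // Reads chunks only while the stream reports remaining bytes, so an
  // empty file yields an empty list instead of a read past the end.
  static List<byte[]> readChunks(InputStream raw) throws IOException {
    DataInputStream in = new DataInputStream(raw);
    in.readInt(); // skip the chunk count header; the loop below does not rely on it
    List<byte[]> chunks = new ArrayList<>();
    while (in.available() > 0) {
      byte[] chunk = new byte[in.readInt()];
      in.readFully(chunk);
      chunks.add(chunk);
    }
    return chunks;
  }

  public static void main(String[] args) throws IOException {
    byte[] emptyFile = writeChunks(new ArrayList<>());
    List<byte[]> restored = readChunks(new ByteArrayInputStream(emptyFile));
    System.out.println("chunks restored from empty file: " + restored.size());
  }
}
```

Note that available() is only an estimate for some stream types, so this sketch assumes an in-memory or local file stream where it reflects the remaining bytes.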

 
{quote}However, it would be good to add an NPE check for 
parseDelimitedFrom(in); currently the NPE is caught by a generic exception 
handler introduced in HBASE-28839.
{quote}
Yeah, 
[this|https://github.com/apache/hbase/pull/7579/changes#diff-b75abcdb76c582e16144df3a9bf2ddbc8fd0814c06190c33503a2c1cb365273cR397]
 was added because there were other sorts of errors we could get when reading 
the persistence file and initialising the bucket allocator. I'm fine with your 
proposal too. Please open a GitHub PR targeting the master branch and we can 
continue the discussion there. 
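
To make the proposal concrete, here is a rough sketch of an explicit null check after a delimited read. It is not the actual BucketCache code: readDelimitedOrNull is a hypothetical stand-in that, like protobuf's parseDelimitedFrom, returns null when the stream is already at EOF, and the single-byte length prefix is a simplification. Whether the real code should treat a null first chunk as an empty cache or as a corrupt file is something to settle in the PR; the sketch assumes the former.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch; not HBase code.
class NullGuardSketch {

  // Returns null at EOF, otherwise one record with a single-byte length prefix.
  static byte[] readDelimitedOrNull(InputStream raw) throws IOException {
    int length = raw.read();
    if (length == -1) {
      return null; // nothing left in the stream
    }
    byte[] record = new byte[length];
    new DataInputStream(raw).readFully(record);
    return record;
  }

  // Checks the first chunk explicitly instead of dereferencing it and
  // relying on a generic handler to catch the resulting NPE.
  static int restoredByteCount(InputStream in) throws IOException {
    byte[] firstChunk = readDelimitedOrNull(in);
    if (firstChunk == null) {
      // Assumption: no chunk means the cache was empty at shutdown.
      return 0;
    }
    return firstChunk.length;
  }

  public static void main(String[] args) throws IOException {
    int restored = restoredByteCount(new ByteArrayInputStream(new byte[0]));
    System.out.println("bytes restored from empty stream: " + restored);
  }
}
```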

 

> BucketCache fails to start when persistence file was written with empty cache
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-29857
>                 URL: https://issues.apache.org/jira/browse/HBASE-29857
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 2.6.3
>            Reporter: rstest
>            Priority: Critical
>
> When a RegionServer with BucketCache persistence enabled is restarted, if the 
> BucketCache was empty at shutdown time, the new RegionServer fails to start 
> with a `NullPointerException` in `BucketCache.parsePB()`.
>  
> The bug is in the interaction between `BucketProtoUtils.serializeAsPB()` and 
> `BucketCache.retrieveChunkedBackingMap()`:
> 1. During shutdown with an empty cache: When `backingMap.size() == 0`, 
> `serializeAsPB()` writes `numChunks = 0` to the persistence file, but the 
> loop that writes `BucketCacheEntry` objects never executes (because there are 
> no entries to iterate). This means no `BucketCacheEntry` is written to the file.
> 2. During startup: `retrieveChunkedBackingMap()` reads `numChunks = 0` from 
> the file, but still attempts to read the first chunk using 
> `parseDelimitedFrom()`. Since no `BucketCacheEntry` was written, 
> `parseDelimitedFrom()` returns `null`.
> 3. NPE occurs: The null `firstChunk` is passed to `parsePB()`, which calls 
> `firstChunk.getDeserializersMap()` on the null object, causing an NPE.
>  
> This bug prevents the region server from being restarted.
> I will provide a fix in a PR, along with a unit test that reproduces the bug 
> (when the fix is not applied).
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
