[ 
https://issues.apache.org/jira/browse/CASSANDRA-17049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-17049:
------------------------------------------
     Bug Category: Parent values: Degradation(12984); Level 1 values: Other 
Exception(12998)
       Complexity: Normal
    Discovered By: Code Inspection
    Fix Version/s: 4.x
                   4.0.x
                   3.11.x
                   3.0.x
         Severity: Low
      Description: 
Batchlog replay process collects addresses of the hosts that have been hinted 
to, so it can flush hints for them to disk before confirming deletion of the 
replayed batches. If a node has been decommissioned during replay, however, 
when the time comes to flush the hints at the very end of replay, 
{{StorageService.getHostIdForEndpoint()}} will return {{null}} for its address, 
which will, down the line, cause {{HintsCatalog::get()}} to be invoked with a 
{{null}} host id argument, causing an NPE.

The simple fix is to null-check the host ids returned for addresses, and to 
collect hinted host ids instead of hinted addresses.

  was:TBD

          Summary: Fix rare NPE caused by batchlog replay / node decommission 
races  (was: TBD)

> Fix rare NPE caused by batchlog replay / node decommission races
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-17049
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17049
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Batch Log, Consistency/Hints
>            Reporter: Aleksey Yeschenko
>            Assignee: Aleksey Yeschenko
>            Priority: Low
>             Fix For: 3.0.x, 3.11.x, 4.0.x, 4.x
>
>
> Batchlog replay process collects addresses of the hosts that have been hinted 
> to, so it can flush hints for them to disk before confirming deletion of the 
> replayed batches. If a node has been decommissioned during replay, however, 
> when the time comes to flush the hints at the very end of replay, 
> {{StorageService.getHostIdForEndpoint()}} will return {{null}} for its 
> address, which will, down the line, cause {{HintsCatalog::get()}} to be 
> invoked with a {{null}} host id argument, causing an NPE.
> The simple fix is to null-check the host ids returned for addresses, and to 
> collect hinted host ids instead of hinted addresses.
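
The fix described above can be illustrated with a minimal, self-contained 
sketch. This is not the actual Cassandra patch: the class HintedHostCollector, 
its methods, and the in-memory map standing in for the cluster's endpoint to 
host id lookup are all hypothetical names introduced for illustration. It only 
shows the idea of resolving the host id at the time a hint is written, 
skipping null results, and keeping host ids rather than addresses for the 
final flush.

```java
import java.net.InetSocketAddress;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical illustration only; none of these names come from Cassandra.
final class HintedHostCollector
{
    // Stand-in for the cluster metadata that maps an endpoint to its host id.
    private final Map<InetSocketAddress, UUID> hostIdsByEndpoint = new HashMap<>();

    // Host ids (not addresses) of the nodes that were hinted to during replay.
    private final Set<UUID> hintedHostIds = new HashSet<>();

    void register(InetSocketAddress endpoint, UUID hostId)
    {
        hostIdsByEndpoint.put(endpoint, hostId);
    }

    // Simulates a node leaving the ring: its address no longer maps to a host id.
    void decommission(InetSocketAddress endpoint)
    {
        hostIdsByEndpoint.remove(endpoint);
    }

    // Returns null for unknown (e.g. decommissioned) endpoints.
    UUID getHostIdForEndpoint(InetSocketAddress endpoint)
    {
        return hostIdsByEndpoint.get(endpoint);
    }

    // Resolve the host id immediately and skip endpoints that no longer have one.
    void recordHinted(InetSocketAddress endpoint)
    {
        UUID hostId = getHostIdForEndpoint(endpoint);
        if (hostId != null)
            hintedHostIds.add(hostId);
    }

    // The set used for the final flush; it can never contain null.
    Set<UUID> hostIdsToFlush()
    {
        return Collections.unmodifiableSet(hintedHostIds);
    }

    public static void main(String[] args)
    {
        HintedHostCollector collector = new HintedHostCollector();
        InetSocketAddress node = InetSocketAddress.createUnresolved("10.0.0.1", 7000);
        collector.register(node, UUID.randomUUID());
        collector.recordHinted(node);
        collector.decommission(node); // node leaves before the final flush
        System.out.println("host ids to flush: " + collector.hostIdsToFlush().size());
    }
}
```

Because the lookup happens when the hint is recorded, a node that is 
decommissioned later in the replay is still represented by a valid host id, 
and a node that was already gone is simply skipped.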



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

