[jira] [Resolved] (HBASE-28260) Possible data loss in WAL after RegionServer crash

Bryan Beaudreault (Jira) Tue, 12 Mar 2024 05:52:12 -0700


     [ 
https://issues.apache.org/jira/browse/HBASE-28260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Bryan Beaudreault resolved HBASE-28260.
---------------------------------------
    Fix Version/s: 2.6.0
                   3.0.0-beta-2
       Resolution: Fixed

Pushed to branch-2.6+. Note that NO_LOCAL_WRITE was added back in 2016 
specifically for HBase's use, but was apparently never used. So this Jira 
finally closes the loop on HDFS-3702. Thanks [~charlesconnell] for the 
contribution!

> Possible data loss in WAL after RegionServer crash
> --------------------------------------------------
>
>                 Key: HBASE-28260
>                 URL: https://issues.apache.org/jira/browse/HBASE-28260
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Bryan Beaudreault
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.6.0, 3.0.0-beta-2
>
>
> We recently had a production incident:
>  # RegionServer crashes, but local DataNode lives on
>  # WAL lease recovery kicks in
>  # NameNode reconstructs the block during lease recovery (which results in a 
> new genstamp). It chooses the replica on the local DataNode as the primary.
>  # Local DataNode reconstructs the block, so NameNode registers the new 
> genstamp.
>  # Local DataNode and the underlying host die before the new block could be 
> replicated to other DataNodes.
> This leaves us with a missing block, because the new genstamp block has no 
> replicas. The old replicas still remain, but are considered corrupt due to 
> GENSTAMP_MISMATCH.
> Thankfully we were able to confirm that the lengths of the corrupt blocks 
> were identical to that of the newly constructed and lost block. Further, the 
> file in question was only 1 block. So we downloaded one of those corrupt 
> block files and used {{hdfs dfs -put -f}} to force that block to replace the 
> file in HDFS. So in this case we had no actual data loss, but it could easily 
> have happened if the file had been more than 1 block or the replicas hadn't 
> been fully in sync prior to reconstruction.
> In order to avoid this issue, we should avoid writing WAL blocks to the 
> local DataNode. We can use CreateFlag.NO_LOCAL_WRITE for this. Hat tip to 
> [~weichiu] for pointing this out.
> During reading of WALs we already reorder blocks so as to avoid reading from 
> the local DataNode, but avoiding writing there altogether would be better.
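
The failure sequence quoted above can be sketched as a toy model. This is 
not HDFS or HBase code: the class, method, and host names below are all 
hypothetical, and the model assumes three replicas, a single block, and that 
the NameNode picks the local replica as recovery primary whenever one 
exists (as happened in the incident). It only counts how many live replicas 
still carry the genstamp the NameNode expects after the RegionServer's host 
dies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of the incident; hypothetical names, not real HDFS classes. */
final class WalLeaseRecoverySketch {
    /** Hypothetical name for the DataNode co-located with the RegionServer. */
    static final String LOCAL_DN = "local-dn";

    /**
     * Replays the five steps and returns how many live replicas still
     * match the genstamp recorded by the NameNode.
     */
    static int liveValidReplicasAfterIncident(boolean writeToLocalDataNode) {
        // Replica placement at WAL write time: host -> genstamp of its replica.
        long genstamp = 1L;
        String[] hosts = writeToLocalDataNode
                ? new String[] {LOCAL_DN, "dn-2", "dn-3"}
                : new String[] {"dn-2", "dn-3", "dn-4"}; // local DataNode avoided
        Map<String, Long> replicaGenstamps = new LinkedHashMap<>();
        for (String host : hosts) {
            replicaGenstamps.put(host, genstamp);
        }

        // Steps 1-3: RegionServer crashes, lease recovery starts, and a
        // primary is chosen. Assumption: the local replica wins if present.
        String primary = replicaGenstamps.containsKey(LOCAL_DN) ? LOCAL_DN : hosts[0];

        // Step 4: the primary reconstructs the block under a new genstamp,
        // which the NameNode registers as the expected one.
        long expectedGenstamp = genstamp + 1;
        replicaGenstamps.put(primary, expectedGenstamp);

        // Step 5: the RegionServer's host dies before the other replicas
        // are brought up to the new genstamp.
        String deadHost = LOCAL_DN;

        int liveValid = 0;
        for (Map.Entry<String, Long> e : replicaGenstamps.entrySet()) {
            boolean alive = !e.getKey().equals(deadHost);
            boolean current = e.getValue() == expectedGenstamp;
            if (alive && current) {
                liveValid++;
            }
        }
        return liveValid;
    }

    public static void main(String[] args) {
        System.out.println("local replica:    " + liveValidReplicasAfterIncident(true));
        System.out.println("no local replica: " + liveValidReplicasAfterIncident(false));
    }
}
```

In this model the first case leaves zero usable replicas (the two survivors 
are stale, which corresponds to GENSTAMP_MISMATCH), while the second leaves 
one. Keeping the WAL off the local DataNode does not close the recovery 
window in general; under these assumptions it only stops the recovery 
primary from sharing a host with the crashed RegionServer.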



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
