This is actually an HDFS consistency question, not HBase. If the HDFS write
succeeded while you had only one DN available, then the replica on the
offline DN is now invalid. What you have is an under-replicated block, and
if your only available DN goes offline before the block can be
re-replicated, the file that block belongs to is now corrupt. If you bring
the previously offline DN back up, the file is still corrupt, as the
replica that DN holds is no longer valid (the NN knows which is the last
valid version of the replica). So unless you can bring back the DN that has
the only valid replica, your HFile is corrupt and your data is lost.
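To make the "NN knows which is the last valid version of the replica" point concrete, here is a toy sketch (plain Python, not real HDFS code, names invented for illustration) of the generation-stamp bookkeeping: every successful write pipeline bumps the block's stamp on the NameNode, so a replica carrying an older stamp is stale and cannot be used to bring the block back.

```python
# Toy model (NOT real HDFS code) of generation-stamp bookkeeping.
# The NameNode tracks the latest valid generation stamp per block;
# a replica with an older stamp is stale and is rejected.

class NameNode:
    def __init__(self):
        self.gen_stamp = {}  # block id -> latest valid generation stamp

    def record_write(self, block):
        # each successful write pipeline bumps the block's stamp
        self.gen_stamp[block] = self.gen_stamp.get(block, 0) + 1

    def replica_is_valid(self, block, replica_stamp):
        # a replica is usable only if it carries the latest stamp
        return replica_stamp == self.gen_stamp.get(block)

nn = NameNode()
nn.record_write("blk_1")   # write replicated to DN1 and DN2 (stamp 1)
dn1_stamp = dn2_stamp = 1

# DN2 goes offline; a later write succeeds on DN1 alone (stamp 2)
nn.record_write("blk_1")
dn1_stamp = 2

# DN1 goes offline, DN2 returns: its replica is stale and is rejected
print(nn.replica_is_valid("blk_1", dn2_stamp))  # False -> block stays missing
print(nn.replica_is_valid("blk_1", dn1_stamp))  # True  -> only DN1 can recover it
```

In this model, bringing DN2 back online never resurrects the block: only the DN holding the stamp-2 replica can.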

On Fri, 3 Jul 2020, 09:12 Paul Carey, <paul.p.ca...@gmail.com> wrote:

> Hi
>
> I'd like to understand how HBase deals with the situation where the
> only available DataNodes for a given offline Region contain stale
> data. Will HBase allow the Region to be brought online again,
> effectively making the inconsistency permanent, or will it refuse to
> do so?
>
> My question is motivated from seeing how Kafka and Elasticsearch
> handle this scenario. They both allow the inconsistency to become
> permanent, Kafka via unclean leader election, and Elasticsearch via
> the allocate_stale_primary command.
>
> To better understand my question, please consider the following example:
>
> - HDFS is configured with `dfs.replication=2` and
> `dfs.namenode.replication.min=1`
> - DataNodes DN1 and DN2 contain the blocks for Region R1
> - DN2 goes offline
> - R1 receives a write, which succeeds as it can be written successfully to
> DN1
> - DN1 goes offline before the NameNode can replicate the
> under-replicated block containing the write to another DataNode
> - At this point R1 is offline
> - DN2 comes back online, but it does not contain the missed write
>
> There are now two options:
>
> - R1 is brought back online, violating consistency
> - R1 remains offline, indefinitely, until DN1 is brought back online
>
> How does HBase deal with this situation?
>
> Many thanks
>
> Paul
>
