Violating strong consistency after the fact

Paul Carey Fri, 03 Jul 2020 01:13:14 -0700

Hi

I'd like to understand how HBase deals with the situation where the
only available DataNodes for a given offline Region contain stale
data. Will HBase allow the Region to be brought online again,
effectively making the inconsistency permanent, or will it refuse to
do so?


My question is motivated from seeing how Kafka and Elasticsearch
handle this scenario. They both allow the inconsistency to become
permanent, Kafka via unclean leader election, and Elasticsearch via
the allocate_stale_primary command.

To better understand my question, please consider the following example:

- HDFS is configured with `dfs.replication=2` and
`dfs.namenode.replication.min=1`
- DataNodes DN1 and DN2 contain the blocks for Region R1
- DN2 goes offline
- R1 receives a writes which succeeds as it can be written successfully to DN1
- DN1 goes offline before the NameNode can replicate the
under-replicated block containing the write to another DataNode
- At this point the R1 is offline
- DN2 comes back online, but it does not contain the missed write

There are now two options:

- R1 is brought back online, violating consistency
- R1 remains offline, indefinitely, until DN1 is brought back online

How does HBase deal with this situation?

Many thanks

Paul

Violating strong consistency after the fact

Reply via email to