[
https://issues.apache.org/jira/browse/HDDS-15271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ivan Andika updated HDDS-15271:
-------------------------------
Summary: Client should prioritize replicas with BCSID that cover the blocks
(was: Client should prioritize replicas with BCSID covering the blocks.)
> Client should prioritize replicas with BCSID that cover the blocks
> ------------------------------------------------------------------
>
> Key: HDDS-15271
> URL: https://issues.apache.org/jira/browse/HDDS-15271
> Project: Apache Ozone
> Issue Type: Improvement
> Reporter: Ivan Andika
> Assignee: Ivan Andika
> Priority: Major
>
> Currently, client reads prioritize replicas based on locality, datanode
> status (maintenance and decommission), etc. However, the client does not
> check whether the replica's BCSID covers the block the client is trying to
> read. This can cause BCSID_MISMATCH, which triggers failover and increases
> read latency.
> The idea of this patch is to also consider the BCSID as a hint (not a
> requirement) when the client picks a datanode. If a client requests a block
> with BCSID N, any replicas with BCSID >= N should be prioritized over those
> with BCSID < N.
> However, we need to note a few things:
> * We should not exclude the replicas with BCSID < N, since the container
> replica BCSID might be stale (either the container location cache is stale
> or the container replica heartbeat has not yet been recorded by SCM). This
> means that we will still read from replicas with BCSID < N, but only after
> the replicas with BCSID >= N have been tried.
> * We need to consider all BCSIDs >= N as being equal. So replica 1 with
> BCSID N + 1 and replica 2 with BCSID N + 2 are treated the same, even though
> replica 2 is more up-to-date. This should prevent hotspots.
> We can include BCSID as one of the sorting criteria for client reads.
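
The ordering described above could be sketched roughly as follows. This is only an illustration, not the actual patch: the `Replica` class, its fields, and the `order` method are hypothetical names and do not correspond to existing Ozone classes. It also assumes that BCSID coverage is the primary sort key and that the existing ordering (locality, datanode status) is kept as the tie-breaker. Coverage is reduced to a two-valued key so that all replicas with BCSID >= N compare as equal, and replicas with BCSID < N are moved to the back rather than dropped.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class BcsidAwareOrdering {

  /** Hypothetical client-side view of a replica; not an Ozone class. */
  public static final class Replica {
    public final String datanode;
    // Position in the existing ordering (locality, status); lower is better.
    public final int existingRank;
    // BCSID last known to the client; may be stale.
    public final long bcsid;

    public Replica(String datanode, int existingRank, long bcsid) {
      this.datanode = datanode;
      this.existingRank = existingRank;
      this.bcsid = bcsid;
    }
  }

  /**
   * Returns a new list where replicas whose known BCSID covers the block
   * come first. No replica is removed, because the known BCSID is a hint.
   */
  public static List<Replica> order(List<Replica> replicas, long blockBcsid) {
    // Two-valued key: 0 if the replica covers the block, 1 otherwise.
    // Every BCSID >= blockBcsid maps to the same key, so a more
    // up-to-date replica is not preferred over another covering one.
    Comparator<Replica> byCoverage =
        Comparator.comparingInt((Replica r) -> r.bcsid >= blockBcsid ? 0 : 1);
    // Tie-breaker: keep the existing ordering within each group.
    Comparator<Replica> byExistingRank =
        Comparator.comparingInt((Replica r) -> r.existingRank);

    List<Replica> sorted = new ArrayList<>(replicas);
    sorted.sort(byCoverage.thenComparing(byExistingRank));
    return sorted;
  }
}
```

For example, with a block BCSID of 10 and replicas dn1 (rank 0, BCSID 5), dn2 (rank 1, BCSID 11) and dn3 (rank 2, BCSID 12), this sketch would try dn2, then dn3, then dn1. dn3 does not jump ahead of dn2 despite its higher BCSID, and dn1 remains available as a last resort in case its recorded BCSID is stale.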
--
This message was sent by Atlassian Jira
(v8.20.10#820010)