[ 
https://issues.apache.org/jira/browse/HDDS-15271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Andika updated HDDS-15271:
-------------------------------
    Summary: Client should prioritize replicas with BCSID that cover the blocks 
 (was: Client should prioritize replicas with BCSID covering the blocks.)

> Client should prioritize replicas with BCSID that cover the blocks
> ------------------------------------------------------------------
>
>                 Key: HDDS-15271
>                 URL: https://issues.apache.org/jira/browse/HDDS-15271
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major
>
> Currently, client reads prioritize replicas based on locality, datanode 
> status (maintenance & decommission), etc. However, the client does not check 
> whether the replica's BCSID covers the block it is trying to read. This causes 
> BCSID_MISMATCH, which triggers failover and increases read latency.
> The idea of this patch is to also consider the BCSID as a hint (not a 
> requirement) when the client picks a datanode. If a client requests a block 
> with BCSID N, any datanode whose replica has BCSID >= N should be prioritized 
> over those whose replica has BCSID < N.
> However, we need to note a few things:
> * We should not exclude the replicas with BCSID < N since the container 
> replica BCSID might be stale (either the container location cache is stale or 
> the container replica heartbeat has not been recorded by SCM). This means that 
> we will still read from replicas with BCSID < N, but only after the replicas 
> with BCSID >= N have been tried.
> * We need to consider all BCSID >= N as being equal. So replica 1 with 
> BCSID N + 1 and replica 2 with BCSID N + 2 are ranked the same even though 
> replica 2 is more up to date. This should prevent hotspots.
> We can include BCSID as one of the sorting criteria for client reads.
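
The ordering rule described above could be sketched roughly as follows. This is a minimal illustration, not the actual Ozone client code: the class BcsidReplicaOrdering, the Replica type, and the prioritize method are hypothetical names, and the input list is assumed to be already ordered by the existing criteria (locality, datanode status, etc.). Replicas are split into two groups only (BCSID >= N and BCSID < N), nothing is excluded, and the existing order is kept inside each group.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class BcsidReplicaOrdering {

  /** Hypothetical view of a replica: datanode name plus last known BCSID. */
  public static final class Replica {
    public final String datanode;
    public final long bcsid;

    public Replica(String datanode, long bcsid) {
      this.datanode = datanode;
      this.bcsid = bcsid;
    }
  }

  /**
   * Returns a copy of the list with replicas whose BCSID >= blockBcsid first.
   * The BCSID is only a hint: replicas with BCSID < blockBcsid stay in the
   * list (their reported BCSID may be stale) and are simply tried later.
   */
  public static List<Replica> prioritize(List<Replica> ordered, long blockBcsid) {
    List<Replica> result = new ArrayList<>(ordered);
    // The sort key is "replica does NOT cover the block". false sorts before
    // true, so covering replicas come first. All covering replicas share the
    // same key, so N + 1 and N + 2 are treated as equal. List.sort is stable,
    // so the incoming order is preserved within each group.
    result.sort(Comparator.comparing((Replica r) -> r.bcsid < blockBcsid));
    return result;
  }

  public static void main(String[] args) {
    List<Replica> replicas = List.of(
        new Replica("dn1", 5), new Replica("dn2", 12),
        new Replica("dn3", 11), new Replica("dn4", 9));
    for (Replica r : prioritize(replicas, 10)) {
      System.out.println(r.datanode + " bcsid=" + r.bcsid);
    }
  }
}
```

With N = 10, the sketch yields the order dn2, dn3, dn1, dn4: dn2 and dn3 keep their relative order even though dn2 has the higher BCSID, and dn1 and dn4 remain available as fallbacks.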



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
