[ https://issues.apache.org/jira/browse/HDDS-9651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Attila Doroszlai resolved HDDS-9651.
------------------------------------
Fix Version/s: 1.4.0
Resolution: Implemented
> Restore random selection of datanode when reading RATIS objects
> ---------------------------------------------------------------
>
> Key: HDDS-9651
> URL: https://issues.apache.org/jira/browse/HDDS-9651
> Project: Apache Ozone
> Issue Type: Task
> Reporter: Kirill Sizov
> Assignee: Ivan Brusentsev
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.4.0
>
>
> *Observations*
> During performance testing, an issue was identified where data written as
> RATIS/THREE was primarily read from a single datanode, typically the pipeline
> leader. The experiment involved the following steps:
> * Upload a set of files to a cluster using RATIS/THREE pipelines.
> * Attempt to read all those files.
> * Monitor datanode metrics.
> Notably, all reads were performed from the leader datanodes.
> *Motivation*
> Reading all the data from a single datanode creates an uneven load on the
> cluster and reduces the potential total read throughput.
> *Reasons*
> Upon researching this issue, two root causes were identified:
> # RATIS objects are currently read through a gRPC client: the
> {{BlockInputStream.getChunkInfos()}} method switches the pipeline type to
> {{STANDALONE}}, and the {{XceiverClientManager}} then selects a gRPC client.
> In this configuration the Ozone Manager cannot determine the client's host
> address, because the {{OmMetadataReader.getClientAddress()}} method only
> works for Hadoop RPC interactions. Without the client's address, the
> response received by the client has an empty {{nodesInOrder}} field, so the
> datanodes are read in the default pipeline order, with the leader node
> being read first.
> # As the SCM cannot locate the client relative to the cluster topology (see
> {{ScmBlockProtocolServer.sortDatanodes}}), the client receives a Pipeline
> with a null {{nodesInOrder}} field, resulting in reads being done primarily
> from the leader datanode.
> Once these issues are resolved, it is expected that read throughput will
> improve, and the cluster's load when reading RATIS/THREE objects will become
> more balanced.
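
The effect of an empty {{nodesInOrder}} field can be illustrated with a minimal Java sketch. This is not the code from the actual fix: the class ReadOrderSketch, the methods chooseReadOrder and countFirstChoices, and the datanode names are all hypothetical. The sketch assumes the client falls back to a random order whenever no topology-sorted list is supplied, so that first reads spread across the replicas instead of always going to the leader.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class ReadOrderSketch {

  /**
   * Hypothetical helper (not an Ozone API): returns the order in which
   * replicas would be tried for a read.
   * If a topology-sorted list is available it is used as-is; otherwise the
   * pipeline nodes are shuffled so that no single node is always first.
   */
  static List<String> chooseReadOrder(List<String> pipelineNodes,
                                      List<String> nodesInOrder,
                                      Random random) {
    if (nodesInOrder != null && !nodesInOrder.isEmpty()) {
      return new ArrayList<>(nodesInOrder);
    }
    List<String> shuffled = new ArrayList<>(pipelineNodes);
    Collections.shuffle(shuffled, random);
    return shuffled;
  }

  /**
   * Simulates a number of reads without a sorted list and counts how often
   * each node is picked first.
   */
  static Map<String, Integer> countFirstChoices(List<String> pipelineNodes,
                                                int reads,
                                                Random random) {
    Map<String, Integer> counts = new HashMap<>();
    for (String node : pipelineNodes) {
      counts.put(node, 0);
    }
    for (int i = 0; i < reads; i++) {
      String first = chooseReadOrder(pipelineNodes, null, random).get(0);
      counts.merge(first, 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    // Assumed RATIS/THREE pipeline; the first entry plays the leader.
    List<String> nodes = List.of("dn1-leader", "dn2", "dn3");
    System.out.println(countFirstChoices(nodes, 3000, new Random()));
  }
}
```

With the random fallback, each of the three nodes should be picked first in roughly one third of the simulated reads. Without it, the default pipeline order would put the leader first every time, which matches the behaviour described in the observations.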