[ https://issues.apache.org/jira/browse/RATIS-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18078471#comment-18078471 ]

Tsz-wo Sze commented on RATIS-2509:
-----------------------------------

[~ivanandika] , thanks for running the benchmark!  It is very insightful.  
Could you also post the flamegraph for the "OM leader-read that does not go 
through Ratis" case?

As you mentioned, there are ByteString conversions.  In Ozone, since OM uses 
Hadoop RPC to receive requests from the network, the Ratis request is just a 
local RaftServer call (i.e. no RaftClient).  So, we may add an OmMessage class 
in Ozone to avoid the ByteString conversion:
{code}
  class OmMessage implements Message {
    private final OMRequest omRequest;
    private final RaftClientRequest raftClientRequest;

    public OmMessage(OMRequest omRequest, boolean isWrite) {
      this.omRequest = omRequest;

      // server, raftGroupId, getClientId() and getCallId() come from
      // the enclosing OM Ratis server context.
      this.raftClientRequest = RaftClientRequest.newBuilder()
          .setClientId(getClientId())
          .setServerId(server.getId())
          .setGroupId(raftGroupId)
          .setCallId(getCallId())
          .setMessage(this) // <----------- no ByteString conversion
          .setType(isWrite ? RaftClientRequest.writeRequestType()
              : getRaftReadRequestType(omRequest))
          .build();
    }

    public OMRequest getOmRequest() {
      return omRequest;
    }

    public RaftClientRequest getRaftClientRequest() {
      return raftClientRequest;
    }

    @Override
    public ByteString getContent() {
      // Only needed for remote transport; the local path never calls this.
      throw new UnsupportedOperationException("Not supported yet.");
    }
  }
{code}
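On the state-machine side, the same trick would let OM recover the in-memory request without parsing bytes. Below is a minimal, self-contained sketch of that zero-copy idea; the Message, OMRequest and unwrap() here are simplified stand-ins for illustration, not the real Ratis/Ozone classes:

```java
// Stand-in for the Ratis Message interface (real one returns ByteString).
interface Message {
  byte[] getContent();
}

// Stand-in for the OM request proto; normally built via serde.
final class OMRequest {
  final String cmdType;
  OMRequest(String cmdType) { this.cmdType = cmdType; }
}

// Wrapper carrying the live OMRequest object, so the local RaftServer
// path can hand it straight to the state machine.
final class OmMessage implements Message {
  private final OMRequest omRequest;
  OmMessage(OMRequest omRequest) { this.omRequest = omRequest; }
  OMRequest getOmRequest() { return omRequest; }
  @Override
  public byte[] getContent() {
    // Never needed locally; a remote request would serialize here.
    throw new UnsupportedOperationException("local-only message");
  }
}

public class LocalReadSketch {
  // Hypothetical state-machine hook: unwrap the local message directly,
  // falling back to byte parsing only for remote (serialized) messages.
  static OMRequest unwrap(Message msg) {
    if (msg instanceof OmMessage) {
      return ((OmMessage) msg).getOmRequest(); // no deserialization
    }
    throw new UnsupportedOperationException("would parse getContent()");
  }

  public static void main(String[] args) {
    OMRequest req = new OMRequest("LookupKey");
    Message wrapped = new OmMessage(req);
    // The exact same object comes back: no ByteString round trip.
    System.out.println(unwrap(wrapped) == req);
  }
}
```

The key point is that equality is object identity, not a re-parsed copy, so both the request serialization and the response parsing drop out of the local read path.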

> Introduce local read API to reduce serde cost
> ---------------------------------------------
>
>                 Key: RATIS-2509
>                 URL: https://issues.apache.org/jira/browse/RATIS-2509
>             Project: Ratis
>          Issue Type: Improvement
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major
>         Attachments: om-benchmark-leader-read-all-ratis.html
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Recently, we benchmarked OM leader reads that do not go through Ratis 
> against OM leader reads that go through Ratis (via 
> submitClientRequestAsync). We saw up to a 25% decrease in read throughput 
> even though we set raft.server.read.option to DEFAULT, which should return 
> immediately (235020 QPS -> 180433 QPS, a 24% reduction in throughput for 
> pure reads with 100 threads).
> The overhead seems to come from request/response proto conversion, 
> RaftClientRequest construction, future chaining, .get() blocking, Ratis 
> metrics/reply building, and parsing the Ratis response back into OMResponse. 
> See [^om-benchmark-leader-read-all-ratis.html] for the flamegraph.
> This means that if we submit a linearizable read to a follower, we incur 
> this overhead on top of the linearizable read overhead (e.g. ReadIndex, etc.).
> We can try to find a way to reduce this overhead. We might need to implement 
> another read flow without the overhead (unlike writes, which require 
> AppendEntries requests to the followers and hence serde, reads can be 
> served locally). 
> One idea is that if we are submitting to a RaftServer, we can use 
> submitServerRequestAsync, which does not require RaftClientRequest and 
> RaftClientReply serde.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
