[ 
https://issues.apache.org/jira/browse/KUDU-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893246#comment-16893246
 ] 

Andrew Wong commented on KUDU-1973:
-----------------------------------

I'll note that on a very dense cluster (5000-7000 replicas per node), the
consensus traffic between servers made it very difficult to get ksck output,
since ksck tries to fetch consensus state from the tablet servers. The overall
state of the cluster was known to be quite poor, so it's very possible that
many elections were going on in the background. We ended up restarting the
entire cluster with a higher Raft heartbeat interval to let things settle
before we could get usable ksck output.

> Coalesce RPCs destined for the same server
> ------------------------------------------
>
>                 Key: KUDU-1973
>                 URL: https://issues.apache.org/jira/browse/KUDU-1973
>             Project: Kudu
>          Issue Type: Sub-task
>          Components: rpc, tserver
>    Affects Versions: 1.4.0
>            Reporter: Adar Dembo
>            Priority: Major
>              Labels: data-scalability
>
> The krpc subsystem ensures that only one _connection_ exists between any pair 
> of nodes, but it doesn't coalesce the _RPCs_ themselves. In clusters with 
> dense nodes (especially with a lot of tablets), there's often a great number 
> of RPCs sent between pairs of nodes.
> We should explore ways of coalescing those RPCs. I don't know whether that 
> would happen within the krpc system itself (i.e. in a payload-agnostic way), 
> or whether we'd only coalesce RPCs known to be "hot" (like UpdateConsensus).
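
To make the two options in the description concrete, here is a minimal
sketch of coalescing at the send queue. It is not Kudu's krpc API: the
PendingRpc type, the coalesce() helper, and the host names are hypothetical
and exist only for illustration. With hot_methods=None it batches every RPC
bound for the same server (the payload-agnostic option); with a set of
method names it batches only those "hot" RPCs and sends the rest one by one.

```python
from dataclasses import dataclass


@dataclass
class PendingRpc:
    """Hypothetical stand-in for one queued outbound call."""
    dest: str      # destination server, e.g. "ts-2:7050"
    method: str    # RPC method name, e.g. "UpdateConsensus"
    payload: bytes


def coalesce(pending, hot_methods=None):
    """Group queued RPCs into per-destination batches.

    Returns a list of batches (lists of PendingRpc). Batches are ordered by
    the first RPC they contain, and order within a batch follows the queue.
    If hot_methods is given, only RPCs whose method is in that set are
    batched; every other RPC becomes its own single-element batch.
    """
    open_batches = {}  # dest -> batch currently collecting RPCs for it
    out = []
    for rpc in pending:
        if hot_methods is not None and rpc.method not in hot_methods:
            out.append([rpc])  # not a hot method: send it on its own
            continue
        if rpc.dest not in open_batches:
            open_batches[rpc.dest] = []
            out.append(open_batches[rpc.dest])
        open_batches[rpc.dest].append(rpc)
    return out


# Four queued calls: three for ts-2, one for ts-3.
pending = [
    PendingRpc("ts-2:7050", "UpdateConsensus", b"tablet-a"),
    PendingRpc("ts-3:7050", "UpdateConsensus", b"tablet-b"),
    PendingRpc("ts-2:7050", "UpdateConsensus", b"tablet-c"),
    PendingRpc("ts-2:7050", "Write", b"row"),
]

all_batched = coalesce(pending)                      # payload-agnostic
hot_only = coalesce(pending, {"UpdateConsensus"})    # hot methods only
```

In this toy queue, the payload-agnostic mode turns four sends into two
batches, while the hot-only mode produces three, since the Write call stays
separate. A real implementation would also have to decide how long to hold a
batch open and how to map a batched response back to individual callers,
neither of which is modeled here.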



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
