[ 
https://issues.apache.org/jira/browse/RATIS-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duong updated RATIS-2129:
-------------------------
    Description: 
Today, the GrpcLogAppender thread makes many calls that require RaftLog's 
readLock. In an active environment, RaftLog is constantly appending 
transactions from clients, so the writeLock is frequently held. This slows 
down replication. 

See [^dn_echo_leader_profile.html] or the picture below; the purple areas show 
the time spent acquiring RaftLog's readLock.

!Screenshot 2024-07-22 at 3.12.03 PM.png|width=926,height=469!

So far, I'm not sure if this is a regression from a recent change in 
3.1.0/3.0.0, or if it has always been the case. 

A few early considerations:
 # The rate of calls to RaftLog per GrpcLogAppender seems to be too high. 
Instead of calling RaftLog multiple times, maybe the log appender can call 
once to obtain all the required information?
 # Can RaftLog expose that data without requiring a read lock? 
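
To make the two considerations concrete, here is a minimal sketch. It is not Ratis code: SketchLog, Snapshot, readAllOnce and readLockFree are hypothetical names, and the two fields only stand in for whatever the appender actually reads from RaftLog. The first method takes the read lock once and returns everything together; the second returns an immutable snapshot that the writer republishes through a volatile field, so the reader never touches the lock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a log guarded by a read/write lock; not the Ratis RaftLog API.
final class SketchLog {
  /** Immutable bundle of the values an appender needs for one round. */
  static final class Snapshot {
    final long nextIndex;
    final long lastCommitted;

    Snapshot(long nextIndex, long lastCommitted) {
      this.nextIndex = nextIndex;
      this.lastCommitted = lastCommitted;
    }
  }

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private long nextIndex = 0;        // guarded by lock
  private long lastCommitted = -1;   // guarded by lock
  // Republished by the writer after every change; readers need no lock.
  private volatile Snapshot published = new Snapshot(0, -1);

  /** Writer path: appends one entry and republishes the snapshot. */
  long append() {
    lock.writeLock().lock();
    try {
      final long index = nextIndex++;
      published = new Snapshot(nextIndex, lastCommitted);
      return index;
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Writer path: advances the commit index and republishes the snapshot. */
  void commit(long index) {
    lock.writeLock().lock();
    try {
      lastCommitted = Math.max(lastCommitted, index);
      published = new Snapshot(nextIndex, lastCommitted);
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Consideration 1: a single read-lock acquisition returns all values together. */
  Snapshot readAllOnce() {
    lock.readLock().lock();
    try {
      return new Snapshot(nextIndex, lastCommitted);
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Consideration 2: lock-free read of the most recently published snapshot. */
  Snapshot readLockFree() {
    return published;
  }
}
```

The lock-free variant trades freshness for latency: a reader may see a snapshot that is one write behind, which is probably acceptable for an appender that loops anyway, but that is an assumption that would need checking against the real call sites.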

  was:
Today, the GrpcLogAppender thread makes a lot of calls that need RaftLog's 
readLock. In an active environment, RaftLog is always busy appending 
transactions from clients, thus writeLock is frequently busy. This makes the 
replication performance slow. 

See the [^dn_echo_leader_profile.html], or in the picture below, the purple is 
the time taken to acquire readLock from RaftLog.

!Screenshot 2024-07-22 at 3.12.03 PM.png|width=926,height=469!

So far, I'm not sure if this is a regression from a recent change in 
3.1.0/3.0.0, or if it's been always the case. 

A few early considerations:
 # The rate of calling RaftLog per GrpcLogAppender seems to be too high. 
Instead of calling RaftLog multiple, maybe the log appended can call once to 
obtain all the required information?
 # Can RaftLog expose those information without requiring a read lock? 


> Low replication performance because GrpcLogAppender is often blocked by 
> RaftLog's readLock
> ----------------------------------------------------------------------------------------------
>
>                 Key: RATIS-2129
>                 URL: https://issues.apache.org/jira/browse/RATIS-2129
>             Project: Ratis
>          Issue Type: Bug
>    Affects Versions: 3.1.0
>            Reporter: Duong
>            Priority: Major
>         Attachments: Screenshot 2024-07-22 at 3.12.03 PM.png, 
> dn_echo_leader_profile.html
>
>
> Today, the GrpcLogAppender thread makes many calls that require RaftLog's 
> readLock. In an active environment, RaftLog is constantly appending 
> transactions from clients, so the writeLock is frequently held. This slows 
> down replication. 
> See [^dn_echo_leader_profile.html] or the picture below; the purple areas 
> show the time spent acquiring RaftLog's readLock.
> !Screenshot 2024-07-22 at 3.12.03 PM.png|width=926,height=469!
> So far, I'm not sure if this is a regression from a recent change in 
> 3.1.0/3.0.0, or if it has always been the case. 
> A few early considerations:
>  # The rate of calls to RaftLog per GrpcLogAppender seems to be too high. 
> Instead of calling RaftLog multiple times, maybe the log appender can call 
> once to obtain all the required information?
>  # Can RaftLog expose that data without requiring a read lock? 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
