[ https://issues.apache.org/jira/browse/RATIS-726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16957616#comment-16957616 ]
Shashikant Banerjee commented on RATIS-726:
-------------------------------------------

Thanks [~szetszwo]. The changes look good to me. I am +1 on this change. I will commit this shortly. Filed RATIS-728 to address the similar issue on the server side.

> TimeoutScheduler holds on to the RaftClientRequest till it times out even
> though the request succeeds
> -------------------------------------------------------------------------------------------------
>
>                 Key: RATIS-726
>                 URL: https://issues.apache.org/jira/browse/RATIS-726
>             Project: Ratis
>          Issue Type: Bug
>          Components: client
>            Reporter: Shashikant Banerjee
>            Assignee: Tsz-wo Sze
>            Priority: Major
>         Attachments: r726_20191022.patch
>
>
> While running freon with a 1-node Ratis pipeline, it was observed that the TimeoutScheduler holds on to the RaftClientRequest object for at least 3s (the default requestTimeoutDuration) even though the request is processed successfully and acknowledged. This creates memory pressure, causing the Ozone client to go OOM.
> In the heap dump analysis for HDDS-2331, the timeout scheduler was holding onto a total of 176 requests (88 WriteChunk requests containing actual data and 88 PutBlock requests), even though the data write happens sequentially, key by key, in Ozone.
> Thanks [~adoroszlai] for helping discover this.
> cc [~ljain] [~msingh] [~szetszwo] [~jnpandey]
> A similar fix may be required in GrpcLogAppender as well, since it uses the same TimeoutScheduler.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
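The retention pattern described in the issue can be sketched with plain `java.util.concurrent` primitives. This is a minimal illustration, not Ratis code: the `Request` class and sizes are hypothetical stand-ins, and the cancel-on-success step only shows the general direction of such a fix, not the actual r726_20191022.patch.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TimeoutLeakSketch {
    // Hypothetical stand-in for a large client request payload
    // (e.g. a WriteChunk carrying actual chunk data).
    static class Request {
        final byte[] payload = new byte[4 * 1024 * 1024];
    }

    public static void main(String[] args) {
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
        // Without this policy, a cancelled timeout task (and the request it
        // captures) stays in the scheduler's queue until its deadline passes.
        scheduler.setRemoveOnCancelPolicy(true);

        Request request = new Request();
        // The timeout task's lambda captures the whole request, so the
        // scheduler pins it for the full 3s timeout even if the request
        // completes successfully long before that.
        ScheduledFuture<?> timeout = scheduler.schedule(
                () -> System.err.println("timed out, payload bytes: "
                        + request.payload.length),
                3, TimeUnit.SECONDS);

        // Direction of the fix: when the request succeeds, cancel the timeout
        // task so the captured request becomes garbage-collectible immediately
        // instead of being held for the remaining timeout duration.
        boolean cancelled = timeout.cancel(false);
        System.out.println("timeout cancelled after success: " + cancelled);

        scheduler.shutdown();
    }
}
```

With many in-flight requests (176 in the heap dump above), each pinned for the full timeout window, the captured payloads accumulate; releasing them at completion time is what removes the memory pressure.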