[ https://issues.apache.org/jira/browse/RATIS-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duong updated RATIS-2141:
-------------------------
    Description:
In 3.1.0, with stateMachineCache enabled, the RaftLogCache entries contain a reference to the original RaftClientRequest. This is not supposed to happen, as RaftLogCache entries should refer only to LogEntries whose data has been truncated.

This problem impacts Apache Ozone. The references from RaftLogCache entries prevent the original RaftClientRequest (which contains a large data chunk) from being garbage-collected promptly. As a result, Ozone datanodes quickly run out of heap memory.

This is not the case with the latest master branch, only with the 3.1.0 release.

The fix for this issue in 3.1.0 is as simple as [6a141544c567a6325b05e2972cd426cdc14060cb|https://github.com/duongkame/ratis/commit/bcff74af0a5fa4b68af2267ce8dfa01f65a5445c].

  was:
In 3.1.0, with stateMachineCache enabled, the RaftLogCache entries contain a reference to the original RaftClientRequest. This is not supposed to happen, as RaftLogCache entries should refer only to LogEntries whose data has been truncated.

This problem impacts Apache Ozone. The references from RaftLogCache entries prevent the original RaftClientRequest (which contains a large data chunk) from being garbage-collected promptly. As a result, Ozone datanodes quickly run out of heap memory.

This is not the case with the latest master branch, only with the 3.1.0 release.

> Memory leak for stateMachineCache use cases
> -------------------------------------------
>
>                 Key: RATIS-2141
>                 URL: https://issues.apache.org/jira/browse/RATIS-2141
>             Project: Ratis
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.1.0
>            Reporter: Duong
>            Priority: Major
>
> In 3.1.0, with stateMachineCache enabled, the RaftLogCache entries contain a
> reference to the original RaftClientRequest. This is not supposed to happen,
> as RaftLogCache entries should refer only to LogEntries whose data has been
> truncated.
> This problem impacts Apache Ozone. The references from RaftLogCache entries
> prevent the original RaftClientRequest (which contains a large data chunk)
> from being garbage-collected promptly. As a result, Ozone datanodes quickly
> run out of heap memory.
> This is not the case with the latest master branch, only with the 3.1.0
> release.
> The fix for this issue in 3.1.0 is as simple as
> [6a141544c567a6325b05e2972cd426cdc14060cb|https://github.com/duongkame/ratis/commit/bcff74af0a5fa4b68af2267ce8dfa01f65a5445c].
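
The retention pattern described in this issue can be illustrated with a minimal Java sketch. Everything in it is hypothetical: `LeakSketch`, `Request`, `Entry`, `leaky`, `truncated`, and `pinnedBytes` are stand-ins invented for illustration and are not the Ratis classes or the actual fix. It only shows why a cache entry that keeps a strong reference to the originating request keeps the whole payload reachable, while an entry holding metadata only does not.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not the Ratis classes.
final class LeakSketch {
    /** Models a client request that carries a large data chunk. */
    static final class Request {
        final byte[] data;
        Request(byte[] data) { this.data = data; }
    }

    /** Models a cached log entry; 'source' is the back-reference that should not exist. */
    static final class Entry {
        final long index;
        final Request source; // null when the entry holds metadata only
        Entry(long index, Request source) {
            this.index = index;
            this.source = source;
        }
    }

    /** Problematic shape: the entry keeps a strong reference to the full request. */
    static Entry leaky(long index, Request request) {
        return new Entry(index, request);
    }

    /** Intended shape: the entry keeps only the index, so the request can be collected. */
    static Entry truncated(long index) {
        return new Entry(index, null);
    }

    /** Sums the payload bytes that remain strongly reachable through the cache. */
    static long pinnedBytes(List<Entry> cache) {
        long total = 0;
        for (Entry e : cache) {
            if (e.source != null) {
                total += e.source.data.length;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Entry> leakyCache = new ArrayList<>();
        List<Entry> truncatedCache = new ArrayList<>();
        for (long i = 0; i < 4; i++) {
            Request request = new Request(new byte[1 << 20]); // 1 MiB payload
            leakyCache.add(leaky(i, request));
            truncatedCache.add(truncated(i));
        }
        System.out.println("leaky cache pins " + pinnedBytes(leakyCache) + " bytes");
        System.out.println("truncated cache pins " + pinnedBytes(truncatedCache) + " bytes");
    }
}
```

In this sketch the payload stays reachable for as long as the leaky entry sits in the cache, regardless of whether the request itself has finished, which is the kind of heap growth the report attributes to the 3.1.0 behavior.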