[
https://issues.apache.org/jira/browse/RATIS-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ivan Andika updated RATIS-2426:
-------------------------------
Description:
We encountered issues where Ozone datanode heap memory increased suddenly,
causing high GC pressure and leaving the DN stuck. After analyzing the memory
dump, we suspect memory leaks caused by these three issues:
# StreamObservers.stop() doesn't close the gRPC stream
# onCompleted()/onError() don't clear previousOnNext and requestFuture (Edit:
Strictly speaking this is not necessary, since once issue 1 is resolved the
follower StreamObserver becomes GC-eligible, including previousOnNext and
requestFuture)
# StreamObservers#onCompleted is not called if the GrpcLogAppender is
interrupted
AI analysis for 1
{quote}Root cause confirmed: `GrpcLogAppender.StreamObservers.stop()` does not
close gRPC streams.
The leak path is:
1. `resetClient()` (line 203) is called on error/timeout/inconsistency with a
follower
2. It calls `appendLogRequestObserver.stop()` — which only sets running =
false
3. It then sets appendLogRequestObserver = null — dropping the Java reference
4. But the underlying gRPC `CallStreamObserver` is never closed — no
onCompleted(), no onError(), no RST_STREAM sent
5. On the server side (follower datanode), the ServerRequestStreamObserver
stays alive, holding AppendEntriesRequestProto → LogEntryProto → ByteString
(4MB chunk data) via the previousOnNext reference
6. The HTTP/2 stream stays open in DefaultHttp2Connection$DefaultStream —
this is exactly what MAT showed retaining 99.08% of the heap
Compare with the clean shutdown at line 267, which correctly calls
StreamObservers.onCompleted() → appendLog.onCompleted() → properly closes both
streams.
Each resetClient() leaks 1-2 HTTP/2 streams. With frequent leader changes,
timeouts, and retries across many pipelines, this accumulated to 112K leaked
streams / 52.2 GB.
The fix is to modify StreamObservers.stop() to also call
onError(Status.CANCELLED) on the CallStreamObserver to send RST_STREAM and
release server-side resources immediately.
{quote}
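The stop() fix described above can be sketched with a minimal, self-contained model. The StreamObserver interface below is a stand-in for io.grpc.stub.StreamObserver, and the cancellation is represented by a plain RuntimeException rather than the real Status.CANCELLED; field and class names follow the analysis, not the actual Ratis sources.

```java
// Minimal stand-in for io.grpc.stub.StreamObserver (not the real interface).
interface StreamObserver<T> {
  void onNext(T value);
  void onError(Throwable t);
  void onCompleted();
}

// Records what the follower ("server side") would observe.
class RecordingObserver implements StreamObserver<String> {
  boolean errorSeen;
  boolean completed;
  public void onNext(String value) { }
  public void onError(Throwable t) { errorSeen = true; }
  public void onCompleted() { completed = true; }
}

// Sketch of GrpcLogAppender.StreamObservers with the proposed fix applied.
class StreamObservers {
  private final StreamObserver<String> appendLog;
  private volatile boolean running = true;

  StreamObservers(StreamObserver<String> appendLog) {
    this.appendLog = appendLog;
  }

  void onNext(String request) {
    if (running) {
      appendLog.onNext(request);
    }
  }

  // Before the fix, stop() only set running = false: the underlying call was
  // never terminated, so no RST_STREAM was sent and the follower-side
  // observer stayed reachable.
  void stop() {
    running = false;
    // The fix: actively terminate the call. In the real code this would be
    // onError(Status.CANCELLED.asRuntimeException()) on the CallStreamObserver.
    appendLog.onError(new RuntimeException("CANCELLED"));
  }
}
```

With this change, stop() produces an observable termination event on the follower side, which is what allows the server to release the ServerRequestStreamObserver and close the HTTP/2 stream instead of leaking it.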
AI analysis for 2
{quote}Neither onCompleted() nor onError() clears previousOnNext. This is a
secondary bug that makes the leak worse even when streams are technically
"closed":
1. previousOnNext holds a PendingServerRequest<AppendEntriesRequestProto>,
which holds the last AppendEntriesRequestProto — containing LogEntryProto with
4MB ByteString chunk data.
2. Even after onCompleted()/onError() sets isClosed = true, the
previousOnNext reference keeps the last request's payload alive for as long as
the ServerRequestStreamObserver object exists.
3. In onCompleted(), requestFuture.get().thenAccept(...) creates a
CompletableFuture chain. If the chain hasn't completed yet, the lambda captures
the observer, preventing GC until all chained futures resolve. During this
time, the 4MB payload is still pinned.
4. Similarly, requestFuture itself chains all processing via composeRequest().
{quote}
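The secondary fix can be illustrated with a small model of the follower-side observer. The class below is a hypothetical stand-in for Ratis's ServerRequestStreamObserver (field and method names are taken from the analysis above, not from the real class), with the last request's payload modeled as a plain byte[] instead of a PendingServerRequest<AppendEntriesRequestProto>. It shows onCompleted()/onError() eagerly dropping previousOnNext so the payload stops being pinned as soon as the stream closes.

```java
// Hypothetical stand-in for the follower-side ServerRequestStreamObserver.
class ServerRequestStreamObserver {
  private volatile byte[] previousOnNext; // pins the last request's payload
  private volatile boolean isClosed;

  void onNext(byte[] request) {
    // Remember the in-flight request, as the real observer does.
    previousOnNext = request;
  }

  void onCompleted() {
    isClosed = true;
    // The fix: release the payload as soon as the stream closes, rather than
    // keeping it alive for the lifetime of the observer object.
    previousOnNext = null;
  }

  void onError(Throwable t) {
    isClosed = true;
    previousOnNext = null; // same eager release on the error path
  }

  boolean isClosed() { return isClosed; }
  boolean holdsPayload() { return previousOnNext != null; }
}
```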
AI analysis for 3
{quote}Root cause of the test failure: The run() method's stream cleanup on
line 284 was not in a finally block. When the appender thread was interrupted
during onNext() (visible in the logs as InterruptedIOException: Interrupted
appendLog), the exception propagated past the cleanup code, leaving the
server-side observer open forever. This is the exact leak we were trying to
detect.
The fix: Wrapping the loop in a try \{ ... } finally \{ onCompleted(); }
block ensures the server-side stream observer is always closed, even when the
appender exits via exception (e.g., InterruptedIOException from thread
interruption
during a restart).
{quote}
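The try/finally fix for issue 3 can be sketched as follows. AppenderLoop and its Runnable cleanup hook are invented names for illustration, standing in for GrpcLogAppender.run() and StreamObservers.onCompleted() respectively.

```java
// Sketch of the issue-3 fix: run the appender loop inside try/finally so the
// stream observer is closed even when the loop exits via an exception
// (e.g. an InterruptedIOException raised inside onNext() during a restart).
class AppenderLoop {
  private final Runnable closeObserver; // stands in for StreamObservers.onCompleted()

  AppenderLoop(Runnable closeObserver) {
    this.closeObserver = closeObserver;
  }

  void run(Runnable body) {
    try {
      body.run(); // may throw when the appender thread is interrupted
    } finally {
      closeObserver.run(); // always reached: the stream can no longer leak
    }
  }
}
```

Because the cleanup sits in finally, the server-side observer is closed on both the normal exit path and the exception path, which is exactly the guarantee the pre-fix code lacked.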
> Fix memory leak in ServerRequestStreamObserver
> ----------------------------------------------
>
> Key: RATIS-2426
> URL: https://issues.apache.org/jira/browse/RATIS-2426
> Project: Ratis
> Issue Type: Bug
> Components: gRPC
> Reporter: Ivan Andika
> Assignee: Ivan Andika
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)