[
https://issues.apache.org/jira/browse/RATIS-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ivan Andika updated RATIS-2426:
-------------------------------
Description:
We encountered an issue where Ozone datanode heap memory increased suddenly,
causing high GC pressure and leaving the DN stuck. Analysis of the memory dump
suggests memory leaks caused by these two issues:
# StreamObservers.stop() doesn't close the gRPC stream
# onCompleted()/onError() don't clear previousOnNext and requestFuture (Edit:
this is not strictly necessary, since once issue 1 is resolved the follower
StreamObserver becomes GC-eligible, along with the previousOnNext and
requestFuture it holds)
AI analysis for 1
{quote}Root cause confirmed: `GrpcLogAppender.StreamObservers.stop()` does not
close gRPC streams.
The leak path is:
1. `resetClient()` (line 203) is called on error/timeout/inconsistency with a
follower
2. It calls `appendLogRequestObserver.stop()` — which only sets running =
false
3. It then sets appendLogRequestObserver = null — dropping the Java reference
4. But the underlying gRPC `CallStreamObserver` is never closed — no
onCompleted(), no onError(), no RST_STREAM sent
5. On the server side (follower datanode), the ServerRequestStreamObserver
stays alive, holding AppendEntriesRequestProto → LogEntryProto → ByteString
(4MB chunk data) via the previousOnNext reference
6. The HTTP/2 stream stays open in DefaultHttp2Connection$DefaultStream —
this is exactly what MAT showed retaining 99.08% of the heap
Compare with the clean shutdown at line 267, which correctly calls
StreamObservers.onCompleted() → appendLog.onCompleted() → properly closes both
streams.
Each resetClient() leaks 1-2 HTTP/2 streams. With frequent leader changes,
timeouts, and retries across many pipelines, this accumulated to 112K leaked
streams / 52.2 GB.
The fix is to modify StreamObservers.stop() to also call
onError(Status.CANCELLED) on the CallStreamObserver to send RST_STREAM and
release server-side resources immediately. I've updated the markdown file with
the detailed fix code.
{quote}
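The proposed stop() fix can be sketched as follows. This is a minimal model, not the actual Ratis code: StreamObserverSketch and StreamObserversSketch are simplified stand-ins for the gRPC and GrpcLogAppender types, and the real fix would call onError(Status.CANCELLED.asRuntimeException()) on the CallStreamObserver.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.atomic.AtomicBoolean;

// Simplified stand-in for io.grpc.stub.StreamObserver (assumption).
interface StreamObserverSketch<T> {
    void onNext(T value);
    void onError(Throwable t);
    void onCompleted();
}

// Sketch of the StreamObservers fix: stop() must terminate the call,
// not just flip the running flag, so the server can release the stream.
class StreamObserversSketch {
    private final StreamObserverSketch<String> appendLog;
    private volatile boolean running = true;

    StreamObserversSketch(StreamObserverSketch<String> appendLog) {
        this.appendLog = appendLog;
    }

    void stop() {
        running = false;
        // New: cancel the underlying call. In real gRPC this would be
        // onError(Status.CANCELLED.asRuntimeException()), which sends
        // RST_STREAM and frees server-side resources immediately.
        appendLog.onError(new CancellationException("appender reset"));
    }

    boolean isRunning() {
        return running;
    }
}
```

With this change, resetClient() dropping the Java reference is no longer the only teardown path; the follower is told explicitly that the stream is dead.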
AI analysis for 2
{quote}Neither onCompleted() nor onError() clears previousOnNext. This is a
secondary bug that makes the leak worse even when streams are technically
"closed":
1. previousOnNext holds a PendingServerRequest<AppendEntriesRequestProto>,
which holds the last AppendEntriesRequestProto — containing LogEntryProto with
4MB ByteString chunk data.
2. Even after onCompleted()/onError() sets isClosed = true, the
previousOnNext reference keeps the last request's payload alive for as long as
the ServerRequestStreamObserver object exists.
3. In onCompleted(), requestFuture.get().thenAccept(...) creates a
CompletableFuture chain. If the chain hasn't completed yet, the lambda captures
the observer, preventing GC until all chained futures resolve. During this
time, the 4MB payload is still pinned.
4. Similarly, requestFuture itself chains all processing via composeRequest():
{quote}
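The secondary fix (clearing previousOnNext on close) can be illustrated with a reduced model of ServerRequestStreamObserver. The names below are illustrative stand-ins, not the actual Ratis signatures:

```java
import java.util.concurrent.atomic.AtomicReference;

// Reduced model: the observer keeps the last request for bookkeeping.
class ServerObserverSketch<T> {
    private final AtomicReference<T> previousOnNext = new AtomicReference<>();
    private volatile boolean closed;

    void onNext(T request) {
        previousOnNext.set(request); // holds the (potentially 4MB) payload
    }

    void onCompleted() {
        closed = true;
        // Fix: drop the last request so its payload becomes GC-eligible
        // even while the observer object itself is still reachable.
        previousOnNext.set(null);
    }

    boolean isClosed() {
        return closed;
    }

    boolean holdsPayload() {
        return previousOnNext.get() != null;
    }
}
```

As noted in the edit above, this becomes less critical once issue 1 makes the whole observer GC-eligible, but it bounds how long a closed observer can pin its last payload.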
> Fix memory leak in ServerRequestStreamObserver
> ----------------------------------------------
>
> Key: RATIS-2426
> URL: https://issues.apache.org/jira/browse/RATIS-2426
> Project: Ratis
> Issue Type: Bug
> Components: gRPC
> Reporter: Ivan Andika
> Assignee: Ivan Andika
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)