[ 
https://issues.apache.org/jira/browse/RATIS-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Andika updated RATIS-2426:
-------------------------------
    Description: 
We encountered an issue where Ozone datanode heap memory increased suddenly, 
causing high GC pressure and leaving the DN stuck. Analysis of the memory dump 
suggests memory leaks caused by these three issues:
 # StreamObservers.stop() doesn't close the gRPC stream
 # onCompleted()/onError() don't clear previousOnNext and requestFuture (Edit: 
this is not strictly necessary, since once issue 1 is resolved the follower's 
StreamObserver becomes GC-eligible, including the previousOnNext and 
requestFuture it holds)
 # StreamObservers#onCompleted is not called if the GrpcLogAppender is 
interrupted

AI analysis for 1
{quote}Root cause confirmed: `GrpcLogAppender.StreamObservers.stop()` does not 
close gRPC streams.
  The leak path is:
  1. `resetClient()` (line 203) is called on error/timeout/inconsistency with a 
follower
  2. It calls `appendLogRequestObserver.stop()` — which only sets running = 
false
  3. It then sets appendLogRequestObserver = null — dropping the Java reference
  4. But the underlying gRPC `CallStreamObserver` is never closed — no 
onCompleted(), no onError(), no RST_STREAM sent
  5. On the server side (follower datanode), the ServerRequestStreamObserver 
stays alive, holding AppendEntriesRequestProto → LogEntryProto → ByteString 
(4MB chunk data) via the previousOnNext reference
  6. The HTTP/2 stream stays open in DefaultHttp2Connection$DefaultStream — 
this is exactly what MAT showed retaining 99.08% of the heap

  Compare with the clean shutdown at line 267, which correctly calls 
StreamObservers.onCompleted() → appendLog.onCompleted() → properly closes both 
streams.
  Each resetClient() leaks 1-2 HTTP/2 streams. With frequent leader changes, 
timeouts, and retries across many pipelines, this accumulated to 112K leaked 
streams / 52.2 GB.
  The fix is to modify StreamObservers.stop() to also call 
onError(Status.CANCELLED) on the CallStreamObserver to send RST_STREAM and 
release server-side resources immediately.
{quote}
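The direction of that fix can be sketched with stub types (the StreamObserver interface below is a stand-in for io.grpc's; in the real Ratis code the cancellation would be onError(Status.CANCELLED.asRuntimeException()) on the CallStreamObserver, which is assumed here rather than shown):

```java
// Minimal stand-in for io.grpc.stub.StreamObserver (assumption: the real
// code closes the call through this interface).
interface StreamObserver<T> {
  void onNext(T value);
  void onError(Throwable t);
  void onCompleted();
}

// Sketch of GrpcLogAppender.StreamObservers, reduced to the stop() behavior.
class StreamObservers {
  private final StreamObserver<String> appendLog; // request stream to the follower
  private volatile boolean running = true;

  StreamObservers(StreamObserver<String> appendLog) {
    this.appendLog = appendLog;
  }

  /** Before the fix: only flips the flag; the HTTP/2 stream stays open. */
  void stopWithoutClose() {
    running = false;
  }

  /** Sketch of the fix: also cancel the call so a RST_STREAM is sent and the
   *  follower's ServerRequestStreamObserver becomes GC-eligible. */
  void stop() {
    running = false;
    try {
      // Real code: appendLog.onError(Status.CANCELLED.asRuntimeException());
      appendLog.onError(new java.util.concurrent.CancellationException(
          "stream cancelled by stop()"));
    } catch (RuntimeException ignored) {
      // The call may already be closed; stop() must not throw.
    }
  }
}
```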
AI analysis for 2
{quote}Neither onCompleted() nor onError() clears previousOnNext. This is a 
secondary bug that makes the leak worse even when streams are technically 
"closed":
  1. previousOnNext holds a PendingServerRequest<AppendEntriesRequestProto>, 
which holds the last AppendEntriesRequestProto — containing LogEntryProto with 
4MB ByteString chunk data.
  2. Even after onCompleted()/onError() sets isClosed = true, the 
previousOnNext reference keeps the last request's payload alive for as long as 
the ServerRequestStreamObserver object exists.
  3. In onCompleted(), requestFuture.get().thenAccept(...) creates a 
CompletableFuture chain. If the chain hasn't completed yet, the lambda captures 
the observer, preventing GC until all chained futures resolve. During this 
time, the 4MB payload is still pinned.
  4. Similarly, requestFuture itself chains all processing via composeRequest():
{quote}
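Even though resolving issue 1 makes this largely moot, the defensive fix is to null out previousOnNext when the stream closes. A minimal sketch, assuming a stub class in place of the real ServerRequestStreamObserver (field names follow the analysis above):

```java
import java.util.concurrent.atomic.AtomicReference;

// Stub modeling the relevant state of ServerRequestStreamObserver.
class ServerStreamState<REQ> {
  // Holds the last request, and therefore its ~4MB payload, between onNext calls.
  private final AtomicReference<REQ> previousOnNext = new AtomicReference<>();
  private volatile boolean isClosed = false;

  void onNext(REQ request) {
    previousOnNext.set(request);
    // ... process the request ...
  }

  void onCompleted() {
    isClosed = true;
    previousOnNext.set(null); // release the last payload promptly
  }

  void onError(Throwable t) {
    isClosed = true;
    previousOnNext.set(null); // same release on the error path
  }

  boolean isClosed() {
    return isClosed;
  }

  boolean holdsPayload() {
    return previousOnNext.get() != null;
  }
}
```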
AI analysis for 3
{quote}Root cause of the test failure: The run() method's stream cleanup on 
line 284 was not in a finally block. When the appender thread was interrupted 
during onNext() (visible in the logs as "InterruptedIOException: Interrupted 
appendLog"), the exception propagated past the cleanup code, leaving the 
server-side observer open forever. This is the exact leak we were trying to 
detect.
  The fix: Wrapping the loop in a try \{ ... } finally \{ onCompleted(); } 
block ensures the server-side stream observer is always closed, even when the 
appender exits via exception (e.g., InterruptedIOException from thread 
interruption during a restart).
{quote}
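The try/finally pattern in isolation (the loop body and onCompleted() below are simplified stand-ins for GrpcLogAppender.run(), not the actual Ratis code):

```java
// Sketch of the fixed appender loop: cleanup runs even if the body throws,
// e.g. an InterruptedIOException raised inside onNext() during a restart.
class AppenderLoop {
  private final Runnable body;
  private boolean streamClosed = false;

  AppenderLoop(Runnable body) {
    this.body = body;
  }

  void run() {
    try {
      body.run(); // the appendLog loop
    } finally {
      onCompleted(); // always close the server-side stream observer
    }
  }

  private void onCompleted() {
    streamClosed = true;
  }

  boolean isStreamClosed() {
    return streamClosed;
  }
}
```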



> Fix memory leak in ServerRequestStreamObserver
> ----------------------------------------------
>
>                 Key: RATIS-2426
>                 URL: https://issues.apache.org/jira/browse/RATIS-2426
>             Project: Ratis
>          Issue Type: Bug
>          Components: gRPC
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
