[ 
https://issues.apache.org/jira/browse/RATIS-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Andika updated RATIS-2426:
-------------------------------
    Description: 
We encountered an issue where Ozone datanode heap memory increased suddenly, 
causing high GC pressure and leaving the DN stuck. Analysis of the memory dump 
suggests memory leaks caused by two issues:
 # StreamObservers.stop() doesn't close the gRPC stream
 # onCompleted()/onError() don't clear previousOnNext and requestFuture (Edit: 
technically this is not strictly necessary, since once issue 1 is resolved the 
follower StreamObserver becomes GC-eligible, including the previousOnNext and 
requestFuture references)

AI analysis for 1
{quote}Root cause confirmed: `GrpcLogAppender.StreamObservers.stop()` does not 
close gRPC streams.
  The leak path is:
  1. `resetClient()` (line 203) is called on error/timeout/inconsistency with a 
follower
  2. It calls `appendLogRequestObserver.stop()` — which only sets running = 
false
  3. It then sets appendLogRequestObserver = null — dropping the Java reference
  4. But the underlying gRPC `CallStreamObserver` is never closed — no 
onCompleted(), no onError(), no RST_STREAM sent
  5. On the server side (follower datanode), the ServerRequestStreamObserver 
stays alive, holding AppendEntriesRequestProto → LogEntryProto → ByteString 
(4MB chunk data) via the previousOnNext reference
  6. The HTTP/2 stream stays open in DefaultHttp2Connection$DefaultStream — 
this is exactly what MAT showed retaining 99.08% of the heap

  Compare with the clean shutdown at line 267, which calls 
StreamObservers.onCompleted() → appendLog.onCompleted(), properly closing both 
streams.
  Each resetClient() leaks 1-2 HTTP/2 streams. With frequent leader changes, 
timeouts, and retries across many pipelines, this accumulated to 112K leaked 
streams / 52.2 GB.
  The fix is to modify StreamObservers.stop() to also call 
onError(Status.CANCELLED) on the CallStreamObserver to send RST_STREAM and 
release server-side resources immediately. I've updated the markdown file with 
the detailed fix code.
{quote}
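The fix proposed above can be sketched with minimal stand-ins for the gRPC types (the StreamObserver interface and class names below are simplified placeholders, not the real io.grpc or Ratis classes): stop() must both halt the send loop and close the underlying call, so the server side receives a cancellation and can release the stream and everything it retains.

```java
// Minimal stand-in for io.grpc.stub.StreamObserver (simplified for illustration).
interface StreamObserver<T> {
  void onNext(T value);
  void onError(Throwable t);
  void onCompleted();
}

// Records what the follower-side observer saw, so the sketch can be checked.
class RecordingObserver implements StreamObserver<String> {
  Throwable error;
  boolean completed;
  public void onNext(String value) { /* follower would process the request */ }
  public void onError(Throwable t) { error = t; }
  public void onCompleted() { completed = true; }
}

/** Sketch of the StreamObservers wrapper with the proposed fix applied. */
class StreamObservers {
  private final StreamObserver<String> appendLog;
  private volatile boolean running = true;

  StreamObservers(StreamObserver<String> appendLog) { this.appendLog = appendLog; }

  void onNext(String request) {
    if (running) { appendLog.onNext(request); }
  }

  /** Before the fix this only set running = false, leaving the HTTP/2 stream open. */
  void stop() {
    running = false;
    // Proposed fix: cancel the call (Status.CANCELLED in real gRPC) so the server
    // sends RST_STREAM and releases previousOnNext, requestFuture, and the payload.
    appendLog.onError(new RuntimeException("CANCELLED"));
  }
}
```

In the real code the cancellation would go through the gRPC status API rather than a bare RuntimeException; the point of the sketch is that stop() must terminate the call, not merely stop feeding it.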
AI analysis for 2
{quote}Neither onCompleted() nor onError() clears previousOnNext. This is a 
secondary bug that makes the leak worse even when streams are technically 
"closed":
  1. previousOnNext holds a PendingServerRequest<AppendEntriesRequestProto>, 
which holds the last AppendEntriesRequestProto — containing LogEntryProto with 
4MB ByteString chunk data.
  2. Even after onCompleted()/onError() sets isClosed = true, the 
previousOnNext reference keeps the last request's payload alive for as long as 
the ServerRequestStreamObserver object exists.
  3. In onCompleted(), requestFuture.get().thenAccept(...) creates a 
CompletableFuture chain. If the chain hasn't completed yet, the lambda captures 
the observer, preventing GC until all chained futures resolve. During this 
time, the 4MB payload is still pinned.
  4. Similarly, requestFuture itself chains all processing via composeRequest():
{quote}
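The secondary issue admits a similar sketch: a server-side observer that drops its previousOnNext and requestFuture references as soon as the stream closes, so the last payload becomes GC-eligible immediately instead of living as long as the observer object. Class and field names mirror the description above but are simplified stand-ins, not the actual Ratis implementation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

/** Simplified stand-in for ServerRequestStreamObserver (illustration only). */
class ServerRequestStreamObserver {
  // Holds the last request for reply bookkeeping; this is what pins the payload.
  private final AtomicReference<byte[]> previousOnNext = new AtomicReference<>();
  private volatile CompletableFuture<Void> requestFuture =
      CompletableFuture.completedFuture(null);
  private volatile boolean isClosed;

  void onNext(byte[] request) {
    if (isClosed) {
      return;
    }
    previousOnNext.set(request);
    // Chain processing, as composeRequest() does in the description above.
    requestFuture = requestFuture.thenRun(() -> { /* process request */ });
  }

  void onCompleted() { close(); }
  void onError(Throwable t) { close(); }

  /** Proposed cleanup: release references instead of only setting isClosed. */
  private void close() {
    isClosed = true;
    previousOnNext.set(null);  // payload becomes GC-eligible now
    requestFuture = null;      // break the future chain's hold on the observer
  }

  boolean holdsPayload() { return previousOnNext.get() != null; }
}
```

As the edit in the description notes, this cleanup is mostly defensive once issue 1 is fixed, but it bounds the lifetime of the last payload to the stream rather than to the observer object.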



> Fix memory leak in ServerRequestStreamObserver
> ----------------------------------------------
>
>                 Key: RATIS-2426
>                 URL: https://issues.apache.org/jira/browse/RATIS-2426
>             Project: Ratis
>          Issue Type: Bug
>          Components: gRPC
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
