gortiz commented on code in PR #18519:
URL: https://github.com/apache/pinot/pull/18519#discussion_r3274327204


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/mailbox/channel/GrpcMailboxServer.java:
##########
@@ -166,9 +187,26 @@ public void shutdown() {
     }
   }
 
+  /// Bytes of direct (off-heap) memory currently pinned by the gRPC server
+  /// allocator backing this mailbox server. This is the same allocator whose
+  /// values are exported as `MAILBOX_SERVER_USED_DIRECT_MEMORY` gauges.
+  public long usedDirectMemoryBytes() {
+    return _bufAllocatorMetric.usedDirectMemory();
+  }
+
+  /// Bytes of heap memory currently pinned by the gRPC server allocator 
backing
+  /// this mailbox server. Exported as `MAILBOX_SERVER_USED_HEAP_MEMORY` 
gauges.
+  public long usedHeapMemoryBytes() {
+    return _bufAllocatorMetric.usedHeapMemory();
+  }
+
   @Override
   public StreamObserver<Mailbox.MailboxContent> 
open(StreamObserver<Mailbox.MailboxStatus> responseObserver) {
     String mailboxId = ChannelUtils.MAILBOX_ID_CTX_KEY.get();
-    return new MailboxContentObserver(_mailboxService, mailboxId, 
responseObserver);
+    ServerCallStreamObserver<Mailbox.MailboxStatus> serverCallObserver =
+        (ServerCallStreamObserver<Mailbox.MailboxStatus>) responseObserver;
+    serverCallObserver.disableAutoInboundFlowControl();
+    serverCallObserver.request(_inboundMessageCredit);

Review Comment:
   Two-part response landed in `acdd1e3266`:
   
   **1. Rollback knob 
(`pinot.query.runner.grpc.manual.inbound.flow.control.enabled`, default 
`true`).**
   Lets operators flip the receiver back to gRPC's auto-inbound-flow-control (1 
in-flight message at a time, no `disableAutoInboundFlowControl()` call, no 
manual `request(1)` in `onNext`) without rebuilding. The rollback path matches 
the pre-PR receiver behaviour verbatim and is exercised in 
`MailboxContentObserverTest` (new 
`testOnNextDoesNotReplenishCreditWhenManualFlowControlDisabled` asserts 
`request(anyInt())` is never called when the flag is `false`). This shrinks the 
worst-case cancel-latency window back to where it was before this PR, at the 
cost of the throughput improvement the manual flow control buys.
   
   **2. GH issue #18541 — [MSE: out-of-band cancel propagation for stuck 
receivers](https://github.com/apache/pinot/issues/18541).**
   Documents the underlying hang (in-band EOS gated by a parked dispatch thread 
on a full application queue) and **explicitly notes it's pre-existing** — the 
rollback path above doesn't fix it either, just shrinks the window. Two design 
options written up in the issue:
   
      - **Option 1**: `setOnCancelHandler` + sender `stream.cancel()`. Faster, 
but the in-band error EOS can be dropped → loss of specific `QueryErrorCode` on 
the receiver.
      - **Option 2** (preferred): dedicated `Mailbox.Cancel(mailboxId, 
errorPayload)` unary RPC. Out-of-band, preserves error code fidelity, slightly 
more work (proto + handler + sender wiring). This is the route the user 
suggested when we discussed.
   
   Also included: a concrete test recipe 
(cancel-while-receiver-parked-on-full-queue, assert termination within ~100 ms) 
that would fail today and pass after either option.
   
   Going with documentation + rollback + issue in this PR rather than 
implementing the out-of-band cancel here: the proper fix is a non-trivial 
behavioural change to the cancel contract, and the issue Yash flagged is 
bounded (the rollback gives operators a release valve, and the hang it would 
re-expose predates this PR). Issue #18541 carries the design for the proper fix 
as a focused follow-up.
   
   `KEY_OF_GRPC_INBOUND_MESSAGE_CREDIT` Javadoc now spells out the 
cancel-latency tradeoff (`min(credit messages, flowControlWindow bytes)` of 
buffered inbound to drain before the EOS reaches the application) and links 
#18541. `GrpcSendingMailbox.cancel(Throwable)` Javadoc carries the same link.
   
   Tests: 41/41 mailbox tests still pass, including the new 
manual-flow-disabled test.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to