gortiz commented on code in PR #18519:
URL: https://github.com/apache/pinot/pull/18519#discussion_r3274449091
##########
pinot-spi/src/main/java/org/apache/pinot/spi/utils/CommonConstants.java:
##########
@@ -2167,6 +2167,73 @@ public static class MultiStageQueryRunner {
public static final String KEY_OF_MAX_INBOUND_QUERY_DATA_BLOCK_SIZE_BYTES
= "pinot.query.runner.max.msg.size.bytes";
public static final int DEFAULT_MAX_INBOUND_QUERY_DATA_BLOCK_SIZE_BYTES =
16 * 1024 * 1024;
+ /**
+ * Whether the sender side of every {@code GrpcSendingMailbox} respects
gRPC client-side flow control by waiting
+ * on {@code ClientCallStreamObserver.isReady()} before pushing each chunk.
+ *
+ * <p>Default {@code true}. Set to {@code false} to restore the pre-1.6
behaviour where the sender pushes
+ * unconditionally; useful as a production kill-switch if the gate causes
an unexpected regression, and as an
+ * A/B knob for benchmarks (see {@code BenchmarkGrpcMailboxSend}).
+ *
+ * <p>Disabling this flag is what re-introduces the {@code
OutOfDirectMemoryError} failure mode the gate exists
+ * to prevent. It is here as a safety valve, not as a recommended setting.
+ */
+ public static final String KEY_OF_GRPC_SENDER_BACKPRESSURE_ENABLED =
+ "pinot.query.runner.grpc.sender.backpressure.enabled";
+ public static final boolean DEFAULT_GRPC_SENDER_BACKPRESSURE_ENABLED =
true;
+
+ /**
+ * Per-stream HTTP/2 flow control window, in bytes. The receiver
advertises this value to the sender as
+ * the number of bytes it will accept before requiring a `WINDOW_UPDATE`
frame. Wider windows let the
+ * sender push a whole `MseBlock` without {@link
io.grpc.stub.ClientCallStreamObserver#isReady} flipping
+ * mid-block. Applied via `NettyServerBuilder.flowControlWindow` in
`GrpcMailboxServer`.
+ *
+ * <p>This is per HTTP/2 stream, so total inbound buffering at the
receiver scales as
+ * {@code value × #concurrent streams to this server}.
+ */
+ public static final String KEY_OF_GRPC_FLOW_CONTROL_WINDOW_BYTES =
+ "pinot.query.runner.grpc.flow.control.window.bytes";
+ public static final int DEFAULT_GRPC_FLOW_CONTROL_WINDOW_BYTES = 64 * 1024
* 1024;
Review Comment:
A better explaination from Claude:
- flowControlWindow (64 MB): per-stream HTTP/2 inbound byte window the
receiver advertises. Caps bytes in flight per stream
before the sender has to wait for WINDOW_UPDATE.
- writeBufferHighWaterMark (64 MB): per-channel Netty outbound queue size
on the sender. Once the queue grows past this,
Channel.isWritable() flips false and the application gate parks.
- writeBufferLowWaterMark (32 MB): the drain threshold that flips
isWritable() back to true.
flowControlWindow and writeBufferHighWaterMark are aligned at 64 MB by
design — they cap roughly the same conceptual thing (one peer's worth of
in-flight bytes) from the two ends of the wire. We could collapse them into a
single config, but I prefer keeping them separate so operators can tune them
independently if their workload calls for it (e.g. an asymmetric one-way
fan-in).
writeBufferLowWaterMark intentionally is not equal to high — they're a
hysteresis pair, not a single threshold. Netty marks the channel unwritable
when the outbound queue grows above high, and writable again only when it
drains below low. If we set them equal, the channel would oscillate between
writable/unwritable on every byte once the queue is around that size, forcing
constant park/wake on the application gate under sustained pressure.
Half-of-high is the standard hysteresis ratio.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]