xiangfu0 commented on code in PR #18519:
URL: https://github.com/apache/pinot/pull/18519#discussion_r3273796304
##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/mailbox/GrpcSendingMailbox.java:
##########
@@ -208,17 +261,84 @@ public boolean isTerminated() {
return _senderSideClosed || _statusObserver.isFinished();
}
- private StreamObserver<MailboxContent> getContentObserver() {
+ private ClientCallStreamObserver<MailboxContent> getContentObserver() {
Metadata metadata = new Metadata();
metadata.put(ChannelUtils.MAILBOX_ID_METADATA_KEY, _id);
- return PinotMailboxGrpc.newStub(_channelManager.getChannel(_hostname, _port))
+ // We wrap `_statusObserver` in a ClientResponseObserver so we can register the on-ready handler through
+ // `beforeStart` — gRPC rejects setOnReadyHandler() if it is called after open() returns. Wrapping (rather than
+ // making MailboxStatusObserver itself a ClientResponseObserver) keeps the back-pressure plumbing local to this
+ // class. The wrapper delegates the data callbacks unchanged, and signals our `_readyCond` on stream close so a
+ // blocked sender wakes up to observe `_statusObserver.isFinished()` becoming true.
+ ClientResponseObserver<MailboxContent, MailboxStatus> responseObserver =
+ new ClientResponseObserver<MailboxContent, MailboxStatus>() {
+ @Override
+ public void beforeStart(ClientCallStreamObserver<MailboxContent> requestStream) {
+ // Fires on a gRPC channel/Netty thread whenever isReady() transitions false -> true. Just signal; the
+ // sender re-checks the predicate after waking.
+ requestStream.setOnReadyHandler(GrpcSendingMailbox.this::wakeWaiters);
+ }
+
+ @Override
+ public void onNext(MailboxStatus value) {
+ _statusObserver.onNext(value);
+ // Only wake on receiver early-terminate. Transport-level isReady() transitions reach a parked
+ // sender through setOnReadyHandler (registered in beforeStart above); normal buffer-size ACKs
+ // do not change any predicate awaitReady() actually waits on, so signalling them would force a
+ // spurious park/unpark cycle on every receiver ACK. Early-terminate is the one status-only
+ // change (the stream stays open) that awaitReady() must observe promptly, so we still signal
+ // here when its metadata is set.
+ if (Boolean.parseBoolean(
+ value.getMetadataMap().get(ChannelUtils.MAILBOX_METADATA_REQUEST_EARLY_TERMINATE))) {
+ wakeWaiters();
+ }
+ }
+
+ @Override
+ public void onError(Throwable t) {
+ try {
+ _statusObserver.onError(t);
+ } finally {
+ wakeWaiters();
+ }
+ }
+
+ @Override
+ public void onCompleted() {
+ try {
+ _statusObserver.onCompleted();
+ } finally {
+ wakeWaiters();
+ }
+ }
+ };
+
+ return (ClientCallStreamObserver<MailboxContent>) PinotMailboxGrpc.newStub(
+ _channelManager.getChannel(_hostname, _port))
.withInterceptors(MetadataUtils.newAttachHeadersInterceptor(metadata))
.withDeadlineAfter(_deadlineMs - System.currentTimeMillis(), TimeUnit.MILLISECONDS)
- .open(_statusObserver);
+ .open(responseObserver);
}
protected void sendContent(ByteString byteString, boolean waitForMore) {
+ sendContent(byteString, waitForMore, false);
+ }
+
+ protected void sendContent(ByteString byteString, boolean waitForMore, boolean bypassReady) {
+ if (!awaitReady(bypassReady)) {
+ // Either the mailbox was cancelled while we were waiting (normal path) or the gRPC stream is already dead
+ // (bypass path). Either way, skip the send.
+ return;
+ }
+ // Narrow-window race mitigation: a concurrent cancel() may have run between awaitReady() returning true and
+ // here, setting _senderSideClosed and pushing its own error EOS. If we proceed, both threads would call
+ // onNext() on the same non-thread-safe ClientCallStreamObserver. Re-checking after the gate reduces (but
+ // does not fully eliminate) that window; fully eliminating it would require serializing all onNext() calls
+ // under _readyLock, which is more invasive. The bypass path (cancel/close) must push through regardless,
+ // so this guard only applies when bypassReady == false.
+ if (!bypassReady && isTerminated()) {
Review Comment:
This post-`awaitReady()` guard still leaves the `ClientCallStreamObserver`
exposed to concurrent use. If `cancel()` or `close()` runs after this check
passes but before the sender thread reaches `_contentObserver.onNext(...)`, the
cancel path can call `onNext()`/`onCompleted()` on the same observer while the
sender thread is inside `onNext()`. The comment above already acknowledges that
the window is only narrowed, not closed, and gRPC stream observers are not
thread-safe: callers are expected to synchronize concurrent writes themselves.
This needs a fully serialized send/close path (or a single-thread handoff);
otherwise the back-pressure fix still leaves a correctness race in cancellation
handling.
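
A minimal sketch of what a serialized send/close path could look like, to make
the suggestion concrete. It is not taken from the PR: `Sink`,
`SerializedSender`, and `RecordingSink` are hypothetical names, and `Sink` is
only a stand-in for the non-thread-safe request observer. The idea is that the
"closed" check and the call into the observer happen under one lock, so a
concurrent cancel can never interleave between them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for the non-thread-safe request observer. */
interface Sink<T> {
  void onNext(T value);
  void onError(Throwable t);
  void onCompleted();
}

/** Hypothetical wrapper: every call that touches the sink holds the same lock. */
final class SerializedSender<T> {
  private final ReentrantLock _lock = new ReentrantLock();
  private final Sink<T> _sink;
  private boolean _closed; // guarded by _lock

  SerializedSender(Sink<T> sink) {
    _sink = sink;
  }

  /** Sends a value unless the stream was already closed. Returns whether it was sent. */
  boolean send(T value) {
    _lock.lock();
    try {
      if (_closed) {
        return false; // a cancel/complete won the race; skip the send
      }
      _sink.onNext(value); // check and send form one critical section
      return true;
    } finally {
      _lock.unlock();
    }
  }

  /** Completes the stream once. Returns false if it was already closed. */
  boolean complete() {
    _lock.lock();
    try {
      if (_closed) {
        return false;
      }
      _closed = true;
      _sink.onCompleted();
      return true;
    } finally {
      _lock.unlock();
    }
  }

  /** Fails the stream once. Returns false if it was already closed. */
  boolean cancel(Throwable cause) {
    _lock.lock();
    try {
      if (_closed) {
        return false;
      }
      _closed = true;
      _sink.onError(cause);
      return true;
    } finally {
      _lock.unlock();
    }
  }
}

/** Test double that records the order of calls it receives. */
final class RecordingSink implements Sink<String> {
  final List<String> _events = new ArrayList<>();

  @Override
  public void onNext(String value) {
    _events.add("next:" + value);
  }

  @Override
  public void onError(Throwable t) {
    _events.add("error");
  }

  @Override
  public void onCompleted() {
    _events.add("completed");
  }
}
```

In the mailbox itself, the readiness wait would presumably need to use a
`Condition` created from the same lock (so waiting releases it) and re-check
the closed flag after waking; that part is an assumption about how
`awaitReady()` could be adapted, not something this sketch implements.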
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]