Github user sudheeshkatkam commented on a diff in the pull request:
https://github.com/apache/drill/pull/442#discussion_r58285085
--- Diff: exec/rpc/src/main/java/org/apache/drill/exec/rpc/RpcBus.java ---
@@ -159,19 +159,15 @@ public ChannelClosedHandler(C clientConnection,
Channel channel) {
@Override
public void operationComplete(ChannelFuture future) throws Exception {
String msg;
- if(local!=null) {
+ if(local != null) {
msg = String.format("Channel closed %s <--> %s.", local, remote);
}else{
msg = String.format("Channel closed %s <--> %s.",
future.channel().localAddress(), future.channel().remoteAddress());
}
- if (RpcBus.this.isClient()) {
- if(local != null) {
- logger.info(String.format(msg));
- }
- } else {
- queue.channelClosed(new ChannelClosedException(msg));
- }
+ logger.info(msg); // should we leave this at info level ?
+
+ queue.channelClosed(new ChannelClosedException(msg));
--- End diff --
@adeneche @jacques-n correct me if I am wrong.
Per my understanding, this logic is incomplete, with or without this
change. Let's looks at bit-to-bit comm.
There is one CoordinationQueue **for each instance** of RpcBus (\*Server,
\*Client classes inherit from RpcBus). Also, this queue is used by requestors
to listen to outcomes of requests.
1. DataClient <--> DataServer. DataClient is always the requestor, and
there can be at most two data connections between two bits (A --> B and B -->
A).
a. Since a DataClient is created per connection, the client's queue
contains outcomes of requests to one DataServer. When this connection closes,
failing all RPC outcomes in the queue makes sense.
b. There is only one instance of DataServer per Drillbit, and so **one
server queue**. Since DataServer never makes requests, this queue should be
empty. `queue.channelClosed(...)` should be a noop.
2. ControlClient <--> ControlServer. This communication is peer-to-peer
i.e. ControlServer and ControlClient can make requests and handle requests. The
bit initiating a connection is the ControlClient, and its peer is the
ControlServer for lifetime of this connection. There can be at most one
connection between two bits (A <--> B), and messages are sent both ways. Now,
assume a connection is made.
a. On the ControlClient side, the queue contains outcomes of requests to
one ControlServer. When this connection closes, failing all RPC outcomes in the
queue makes sense.
b. There is only one instance of ControlServer per Drillbit, and so **one
server queue**. However, ControlServer can make requests to other bits, and to
multiple clients! So this queue can contain outcomes from multiple connections
and `queue.channelClosed(...)` fails outcomes of requests from **all**
connections??
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---