rymarm commented on pull request #2333:
URL: https://github.com/apache/drill/pull/2333#issuecomment-949656576


   Hi @paul-rogers, thank you for your review!
   
   By answering the main questions:   
   > ...So, the question is, does the isValid() method need to know the state 
at this exact instant? Or, is it good enough to know that the connection was OK 
when we last checked 30 seconds (or whatever) ago?   
   
   I think the answer is pretty easy and very predictable — yes, the method 
needs to know the state at this exact instant, as it is required by JDBC API 
and is described in [java 
documentation](https://docs.oracle.com/javase/8/docs/api/java/sql/Connection.html#isValid-int-).
 As we implement some interface we should comply with it, so as users may 
simply use Drill JDBC driver without deep knowledge of its implementation and 
the method result will be predictable.
   
   ------------------------
   
   And let me comment on other parts of your message.
   
   > ...So, the client always knows that the connection is still valid (or was 
at the time of the last heartbeat. We can check how frequently they occur; I 
think it is on the order of 30 seconds or so, but I may be wrong.   
   
   The current implementation does not correspond to what you said. For a now, 
only the server always knows that the connection is still valid, because it 
receives a heartbeat from the client, while the client doesn't check, whether 
the server (drillbit) has answered on the PING with a PONG. Heartbeat mechanism 
was implemented in [this 
commit](https://github.com/apache/drill/commit/960f876a1945eee4eeb0d6a98c5c58dfe2eea1a9),
 and there were missed functions that on the client side checks whether the 
server answers for requests.   
   
   There are 
[`IdleStateHandler`](https://github.com/apache/drill/blob/960f876a1945eee4eeb0d6a98c5c58dfe2eea1a9/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/BasicClient.java#L119)
 that sends PING (heartbeat) request if the client doesn't send any request for 
a while. But this handler, [doesn't check, whether the client received a PONG 
answer](https://github.com/apache/drill/blob/960f876a1945eee4eeb0d6a98c5c58dfe2eea1a9/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/RpcBus.java#L266)
 from the server. I may suppose, that author of this logic (heartbeat) 
believed, that netty's 
[`Future#isSuccess`](https://netty.io/4.1/api/io/netty/util/concurrent/Future.html#isSuccess--)
 will return `false` if the server [didn't receive PING 
message](https://github.com/apache/drill/blob/960f876a1945eee4eeb0d6a98c5c58dfe2eea1a9/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/BasicClient.java#L123),
 but it doesn't: `Future#isSuccess` mean only successful flush to a socket and 
it d
 oesn't mean that any data was delivered 
([source](https://github.com/netty/netty/issues/10343#issuecomment-641365702)). 
   
   
   Therefore, the client doesn't have any mechanism, that checks whether the 
server (drillbit) still answers. And in cases when drillbit is not accessible 
now due to network issues, drillbit failing, etc — the client (or HikariCP for 
example) will think, that connection is still valid.
   
   >  ...I think it is on the order of 30 seconds or so, but I may be wrong.   
   
   It is 15 seconds by default and calculates as `drill.exec.rpc.user.timeout` 
* 0.5 
([souserce](https://github.com/apache/drill/blob/4aefcef2b665c5737471664912a26ef6ed9a6cfc/exec/rpc/src/main/java/org/apache/drill/exec/rpc/BasicClient.java#L87)).
   
   > ...a heartbeat message was backed up behind data batches and the 
connection ended up failing as a result.   
   
   I even heard about cases, when zookeeper was losing drillbit from a cluster, 
because it didn't send heartbeat in time due to garbage collector freeze. And 
therefore in infrequent cases, queries may fail, because a foreman decides, 
that one of the drillbits is lost.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to