Sorabh Hamirwasia created DRILL-6187:
----------------------------------------

             Summary: Exception in RPC communication between 
DataClient/ControlClient and respective servers when bit-to-bit security is on
                 Key: DRILL-6187
                 URL: https://issues.apache.org/jira/browse/DRILL-6187
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - RPC, Security
            Reporter: Sorabh Hamirwasia
            Assignee: Sorabh Hamirwasia
             Fix For: 1.13.0


 
{color:#000000}Below is a summary of the issue:{color}
 
{color:#000000}*Scenario:*{color}
{color:#000000}It appears that the first sendRecordBatch was sent to the 
Foreman, which initiated the authentication handshake. However, before starting 
the handshake we establish the connection and store it in a registry. If 
another recordBatch (from a different minor fragment running on the same 
Drillbit) needs to be sent in parallel, it sees the connection available in the 
registry and initiates the send. This second request reaches the Foreman before 
authentication has completed, so the Foreman throws the exception below, saying 
that an RPC type 3 message is not allowed, and closes the connection. This also 
fails the authentication handshake that was in progress.{color}{color:#000000} 
Here are the logs with details:{color}
{color:#000000} {color}
{color:#000000}*Foreman received the SASL_START message from another 
node:*{color}
{color:#000000}*_2018-02-21 18:43:30,759 [_*{color}{color:#000000}_BitServer-4] 
TRACE o.a.d.e.r.s.ServerAuthenticationHandler - Received SASL message 
SASL_START from /10.10.100.161:35482_{color}
{color:#000000} {color}
{color:#000000}*Then, at around the same time, it received another message of 
RPC type 3 (SendRecordBatch) from the client, which fails because the handshake 
is not yet complete.*{color}
{color:#000000} {color}
{color:#000000}*_2018-02-21 18:43:30,762_*{color}{color:#000000} _[BitServer-4] 
ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  
Connection: /10.10.100.162:31012 <--> /__10.10.100.161:35482_ _(data server).  
Closing connection._{color}
{color:#000000}_io.netty.handler.codec.DecoderException: 
org.apache.drill.exec.rpc.RpcException: Request of type 3 is not allowed 
without authentication. Client on /__10.10.100.161:35482_ _must authenticate 
before making requests. Connection dropped. [Details: Encryption: enabled , 
MaxWrappedSize: 65536 , WrapSizeLimit: 0]_{color}
{color:#000000} {color}
{color:#000000}*Then the client receives a channel closed exception:*{color}
{color:#000000} {color}
{color:#000000}*2018-02-21 18:43:30,764 [*{color}{color:#000000}BitClient-4] 
WARN  o.a.d.exec.rpc.RpcExceptionHandler - Exception occurred with closed 
channel.  Connection: /_10.10.100.161:35482_ <--> _10.10.100.162:31012_ (data 
client){color}
{color:#000000} {color}
{color:#000000}*As a result, its initial authentication command also fails. 
Since the channel closed exception appears above, I think it triggered the 
failure of the authentication request as well.*{color}
{color:#000000} {color}
{color:#000000}_Caused by: org.apache.drill.exec.rpc.RpcException: Command 
failed while establishing connection.  Failure type AUTHENTICATION._{color}
{color:#000000}        _at 
org.apache.drill.exec.rpc.RpcException.mapException(RpcException.java:67) 
~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000}        _at 
org.apache.drill.exec.rpc.ListeningCommand.connectionFailed(ListeningCommand.java:66)
 ~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000}        _at 
org.apache.drill.exec.rpc.data.DataTunnel$SendBatchAsyncListen.connectionFailed(DataTunnel.java:166)
 ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000}        _at 
org.apache.drill.exec.rpc.data.DataClient$AuthenticationCommand.connectionSucceeded(DataClient.java:203)
 ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000}        _at 
org.apache.drill.exec.rpc.data.DataClient$AuthenticationCommand.connectionSucceeded(DataClient.java:147)
 ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000}        _at 
org.apache.drill.exec.rpc.ReconnectingConnection$ConnectionListeningFuture.waitAndRun(ReconnectingConnection.java:122)
 ~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000}        _at 
org.apache.drill.exec.rpc.ReconnectingConnection.runCommand(ReconnectingConnection.java:83)
 ~[drill-rpc-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000}        _at 
org.apache.drill.exec.rpc.data.DataTunnel._{color}{color:#000000}*_sendRecordBatch_*{color}{color:#000000}_(DataTunnel.java:84)
 ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000}        _at 
org.apache.drill.exec.ops.AccountingDataTunnel.sendRecordBatch(AccountingDataTunnel.java:45)
 ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000}        _at 
org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:127)
 ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]_{color}
{color:#000000} {color}
{color:#000000}So I think there is a concurrency issue: even though 
authentication has not completed, other requests are sent to the remote node as 
soon as the TCP connection is available. Instead, they should wait until 
authentication is completed. For example, the TCP connection should be made 
available from the registry only once authentication is completed.{color}
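
The idea of exposing the connection only after authentication can be sketched roughly as follows. This is not Drill's actual code: AuthGatedRegistry, Connection, acquire, authenticated and failed are hypothetical names used purely for illustration. The first caller for an endpoint starts the handshake; every other caller receives the same future, which completes only once authentication has succeeded.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical registry (not a Drill class): connections become visible only after auth. */
final class AuthGatedRegistry {

    /** Stand-in for a bit-to-bit connection; illustration only. */
    static final class Connection {
        private final String endpoint;
        Connection(String endpoint) { this.endpoint = endpoint; }
        public String endpoint() { return endpoint; }
    }

    private final ConcurrentMap<String, CompletableFuture<Connection>> entries =
            new ConcurrentHashMap<>();

    /**
     * Returns a future that completes only after authentication succeeds.
     * Only the first caller for an endpoint runs startHandshake; later callers
     * get the same pending future and never see a half-authenticated connection.
     */
    public CompletableFuture<Connection> acquire(String endpoint, Runnable startHandshake) {
        CompletableFuture<Connection> fresh = new CompletableFuture<>();
        CompletableFuture<Connection> existing = entries.putIfAbsent(endpoint, fresh);
        if (existing != null) {
            return existing; // handshake already in flight or finished
        }
        startHandshake.run(); // assumed to be asynchronous in a real client
        return fresh;
    }

    /** Called by the (hypothetical) handshake handler once SASL negotiation succeeds. */
    public void authenticated(String endpoint) {
        CompletableFuture<Connection> f = entries.get(endpoint);
        if (f != null) {
            f.complete(new Connection(endpoint));
        }
    }

    /** On failure, fail all waiters and drop the entry so a later request can retry. */
    public void failed(String endpoint, Throwable cause) {
        CompletableFuture<Connection> f = entries.remove(endpoint);
        if (f != null) {
            f.completeExceptionally(cause);
        }
    }
}
```

With a gate like this, a second sendRecordBatch would chain its send onto the returned future (for example with thenAccept) rather than writing to the channel immediately, so it could not reach the server before the handshake completes.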
 
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
