[ 
https://issues.apache.org/jira/browse/HBASE-23881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17046977#comment-17046977
 ] 

Josh Elser edited comment on HBASE-23881 at 2/27/20 9:16 PM:
-------------------------------------------------------------

{noformat}
2020-02-27 16:03:51,668 INFO  [Time-limited test] 
example.TestShadeSaslAuthenticationProvider$3(243): Caught exception
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
attempts=4, exceptions:
2020-02-27T21:03:51.026Z, RpcRetryingCaller{globalStartTime=1582837371011, 
pause=100, maxAttempts=4}, org.apache.hadoop.hbase.MasterNotRunningException: 
org.apache.hadoop.hbase.ipc.CallTimeoutException: Call to 
mizar.local/192.168.2.28:56690 failed on local exception: 
org.apache.hadoop.hbase.ipc.CallTimeoutException: 
Call[id=0,methodName=IsMasterRunning], waitTime=60011, rpcTimeout=60000
2020-02-27T21:03:51.138Z, RpcRetryingCaller{globalStartTime=1582837371011, 
pause=100, maxAttempts=4}, org.apache.hadoop.hbase.MasterNotRunningException: 
java.io.IOException: Call to mizar.local/192.168.2.28:56690 failed on local 
exception: java.io.IOException: Connection reset by peer
2020-02-27T21:03:51.347Z, RpcRetryingCaller{globalStartTime=1582837371011, 
pause=100, maxAttempts=4}, org.apache.hadoop.hbase.MasterNotRunningException: 
java.io.IOException: Call to mizar.local/192.168.2.28:56690 failed on local 
exception: java.io.IOException: Connection reset by peer
2020-02-27T21:03:51.656Z, RpcRetryingCaller{globalStartTime=1582837371011, 
pause=100, maxAttempts=4}, org.apache.hadoop.hbase.MasterNotRunningException: 
java.io.IOException: Call to mizar.local/192.168.2.28:56690 failed on local 
exception: java.io.IOException: Connection reset by peer
{noformat}
So, when you run this test on branch-2, you see a different set of exceptions. 
There, it looks like the client is seeing the failure properly. Granted, it 
comes back as {{MasterNotRunningException}} instead of {{InvalidToken}}, but 
the test still fails as I'd expect.
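
As an aid to reading the timestamps in that log: the first failure is the 60s 
RPC timeout, and the next three land about 112 ms, 209 ms and 309 ms apart, 
which is what pause=100 scaled by a per-attempt multiplier would give. A 
minimal sketch of that arithmetic is below. The multiplier table is modeled on 
{{HConstants.RETRY_BACKOFF}} and has not been checked against this branch, the 
real client is assumed to add a small random jitter on top, and the class and 
method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

public class RetryPauseSketch {
  // Assumed multipliers, modeled on HConstants.RETRY_BACKOFF (not verified here).
  static final int[] BACKOFF = {1, 2, 3, 5, 10, 20, 40, 100, 100, 100, 100, 200, 200};

  // Pause before the next attempt, ignoring jitter; tries is zero-based.
  static long pauseMillis(long basePause, int tries) {
    int idx = Math.max(0, Math.min(tries, BACKOFF.length - 1));
    return basePause * BACKOFF[idx];
  }

  // Gap in milliseconds between two ISO-8601 timestamps taken from the log.
  static long gapMillis(String earlier, String later) {
    return Duration.between(Instant.parse(earlier), Instant.parse(later)).toMillis();
  }

  public static void main(String[] args) {
    // Failure timestamps copied from the RetriesExhaustedException above.
    String[] failures = {
      "2020-02-27T21:03:51.026Z",
      "2020-02-27T21:03:51.138Z",
      "2020-02-27T21:03:51.347Z",
      "2020-02-27T21:03:51.656Z"
    };
    for (int i = 1; i < failures.length; i++) {
      long expected = pauseMillis(100, i - 1);
      long observed = gapMillis(failures[i - 1], failures[i]);
      System.out.println("attempt " + (i + 1) + ": pause " + expected
          + " ms, observed gap " + observed + " ms");
    }
  }
}
```

The few extra milliseconds in each observed gap would be the failed connection 
attempt itself plus any jitter.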

On master, the server is definitely throwing an exception up the Netty server 
call stack, but the client never gets it. I'm still trying to work out how 
Netty is supposed to propagate the exception. The operationComplete callback 
in NettyRpcConnection#saslNegotiate reports that the call was successful when 
it definitely should not be. That makes me a little worried that we have an 
authentication problem on master with Netty (where Netty is the only RPC 
option).

Do you have any tips that could help me debug this further, [~zhangduo]? :)


> TestShadeSaslAuthenticationProvider failures
> --------------------------------------------
>
>                 Key: HBASE-23881
>                 URL: https://issues.apache.org/jira/browse/HBASE-23881
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 3.0.0, 2.3.0
>            Reporter: Bharath Vissapragada
>            Assignee: Josh Elser
>            Priority: Major
>
> TestShadeSaslAuthenticationProvider now fails deterministically with the 
> following exception:
> {noformat}
> java.lang.Exception: Unexpected exception, 
> expected<org.apache.hadoop.hbase.DoNotRetryIOException> but 
> was<java.io.IOException>
>       at 
> org.apache.hadoop.hbase.security.provider.example.TestShadeSaslAuthenticationProvider.testNegativeAuthentication(TestShadeSaslAuthenticationProvider.java:233)
> {noformat}
> The test now fails in a different place than before merging HBASE-18095 
> because the RPCs are also a part of connection setup. We might need to 
> rewrite the test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
