[ https://issues.apache.org/jira/browse/HADOOP-15609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16548625#comment-16548625 ]
Steve Loughran commented on HADOOP-15609: ----------------------------------------- h2. LoadBalancingKMSClientProvider L50: javax. imports should go after java. imports at the top of the import list L139. Use initCause() to add original stack trace -critical for diagnosing problems later. h2. TestLoadBalancingKMSClientProvider L662. thrown exception should include the original exception, to help debug test failure. L702 use LambdaTestUtils.intercept, something like {code} intercept(ConnectException.class, "SSLHandshakeException: p1 exception message", ()-> kp.createKey("test", new Options(conf))); {code} > Retry KMS calls when SSLHandshakeException occurs > ------------------------------------------------- > > Key: HADOOP-15609 > URL: https://issues.apache.org/jira/browse/HADOOP-15609 > Project: Hadoop Common > Issue Type: Improvement > Components: common, kms > Affects Versions: 3.1.0 > Reporter: Kitti Nanasi > Assignee: Kitti Nanasi > Priority: Major > Attachments: HADOOP-15609.001.patch, HADOOP-15609.002.patch > > > KMS call should retry when javax.net.ssl.SSLHandshakeException occurs and > FailoverOnNetworkExceptionRetry policy is used. > For example in the following stack trace, we can see that the KMS Provider's > connection is lost, an SSLHandshakeException is thrown and the operation is > not retried: > {code} > W0711 18:19:50.213472 1508 LoadBalancingKMSClientProvider.java:132] KMS > provider at [https://example.com:16000/kms/v1/] threw an IOException: > Java exception follows: > javax.net.ssl.SSLHandshakeException: Remote host closed connection during > handshake > at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1002) > at > sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385) > at > sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413) > at > sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397) > at > sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559) > at > sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185) > at > sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1316) > at > sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1291) > at > sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:250) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:512) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:502) > at > org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:791) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:288) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:284) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:124) > at > org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:284) > at > org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:532) > at > org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:927) > at > org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:946) > at > org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:316) > at > org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:311) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:323) > Caused by: java.io.EOFException: SSL peer shut down incorrectly > at sun.security.ssl.InputRecord.read(InputRecord.java:505) > at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983) > ... 22 more > W0711 18:19:50.239328 1508 LoadBalancingKMSClientProvider.java:149] Aborting > since the Request has failed with all KMS providers(depending on > hadoop.security.kms.client.failover.max.retries=1 setting and numProviders=1) > in the group OR the exception is not recoverable > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org