[ 
https://issues.apache.org/jira/browse/HADOOP-15609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545725#comment-16545725
 ] 

Xiao Chen commented on HADOOP-15609:
------------------------------------

Thanks [~knanasi] for filing the Jira and [~jojochuang] for the discussion.

I think this surfaces only after the recent HADOOP-14841 fix (it's masked as 
EOFE before). [~daryn] also mentioned this in one occasion to me. While more 
investigation should be done regarding how to handle SSL more effectively, it 
makes sense to me to retry on these exceptions.

> Retry KMS calls when SSLHandshakeException occurs
> -------------------------------------------------
>
>                 Key: HADOOP-15609
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15609
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common, kms
>    Affects Versions: 3.1.0
>            Reporter: Kitti Nanasi
>            Assignee: Kitti Nanasi
>            Priority: Major
>
> KMS call should retry when javax.net.ssl.SSLHandshakeException occurs and 
> FailoverOnNetworkExceptionRetry policy is used.
> For example in the following stack trace, we can see that the KMS Provider's 
> connection is lost, an SSLHandshakeException is thrown and the operation is 
> not retried:
> {code}
> W0711 18:19:50.213472  1508 LoadBalancingKMSClientProvider.java:132] KMS 
> provider at [https://example.com:16000/kms/v1/] threw an IOException:
> Java exception follows:
> javax.net.ssl.SSLHandshakeException: Remote host closed connection during 
> handshake
>         at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1002)
>         at 
> sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
>         at 
> sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1413)
>         at 
> sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1397)
>         at 
> sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559)
>         at 
> sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:185)
>         at 
> sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1316)
>         at 
> sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1291)
>         at 
> sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:250)
>         at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:512)
>         at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.call(KMSClientProvider.java:502)
>         at 
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:791)
>         at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:288)
>         at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:284)
>         at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:124)
>         at 
> org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:284)
>         at 
> org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:532)
>         at 
> org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(DFSClient.java:927)
>         at 
> org.apache.hadoop.hdfs.DFSClient.createWrappedInputStream(DFSClient.java:946)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:316)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:311)
>         at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:323)
> Caused by: java.io.EOFException: SSL peer shut down incorrectly
>         at sun.security.ssl.InputRecord.read(InputRecord.java:505)
>         at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
>         ... 22 more
> W0711 18:19:50.239328  1508 LoadBalancingKMSClientProvider.java:149] Aborting 
> since the Request has failed with all KMS providers(depending on 
> hadoop.security.kms.client.failover.max.retries=1 setting and numProviders=1) 
> in the group OR the exception is not recoverable
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to