[ https://issues.apache.org/jira/browse/SPARK-38330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510830#comment-17510830 ]
Steve Loughran commented on SPARK-38330:
----------------------------------------

The Hadoop fix is in, but it will take a while to ship. Note that on Hadoop 3.3.1+, if you can switch to the unshaded AWS SDK, you can change the HTTP client version.

> Certificate doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-38330
>                 URL: https://issues.apache.org/jira/browse/SPARK-38330
>             Project: Spark
>          Issue Type: Bug
>          Components: EC2
>    Affects Versions: 3.2.1
>         Environment: Spark 3.2.1 built with `hadoop-cloud` flag.
>                      Direct access to S3 using the default file committer.
>                      JDK8.
>            Reporter: André F.
>            Priority: Major
>
> Running any job after bumping our Spark version from 3.1.2 to 3.2.1 leads to the following exception while reading files on S3:
> {code:java}
> org.apache.hadoop.fs.s3a.AWSClientIOException: getFileStatus on s3a://<bucket>/<path>.parquet: com.amazonaws.SdkClientException: Unable to execute HTTP request: Certificate for <bucket.s3.amazonaws.com> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]: Unable to execute HTTP request: Certificate for <bucket> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:208)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:170)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:3351)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3185)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.isDirectory(S3AFileSystem.java:4277)
>     at org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:54)
>     at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:370)
>     at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:274)
>     at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:245)
>     at scala.Option.getOrElse(Option.scala:189)
>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:245)
>     at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:596)
> {code}
>
> {code:java}
> Caused by: javax.net.ssl.SSLPeerUnverifiedException: Certificate for <bucket.s3.amazonaws.com> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]
>     at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.verifyHostname(SSLConnectionSocketFactory.java:507)
>     at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:437)
>     at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:384)
>     at com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
>     at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
>     at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
>     at com.amazonaws.http.conn.$Proxy16.connect(Unknown Source)
>     at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
>     at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
>     at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
>     at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>     at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>     at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
>     at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1333)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
> {code}
> We found similar problems in the following tickets, but neither resolved our issue:
>  - https://issues.apache.org/jira/browse/HADOOP-17017 (we don't use `.` in our bucket names)
>  - [https://github.com/aws/aws-sdk-java-v2/issues/1786] (we tried to override it by building Spark with `httpclient:4.5.10` or `httpclient:4.5.8`, with no effect. We also made sure we are using the same `httpclient` version in our main jar).
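For context on the HADOOP-17017 reference above: a wildcard entry in a certificate's subject alternative names conventionally covers exactly one leftmost DNS label, which is why bucket names containing `.` fail verification with virtual-hosted-style hostnames. The following is a minimal Python sketch of that rule only. It is not the verifier used by `httpclient` or the AWS SDK, and the names `san_matches`, `matches_any`, and `SANS` are hypothetical, chosen for illustration.

```python
# Simplified illustration of single-label wildcard matching for certificate
# SAN entries. This is an assumption-laden sketch, not the real httpclient logic.

SANS = ["*.s3.amazonaws.com", "s3.amazonaws.com"]  # SAN list from the exception text


def san_matches(hostname: str, pattern: str) -> bool:
    """Return True if hostname matches pattern, where a leading '*' label
    stands for exactly one non-empty DNS label."""
    host_labels = hostname.lower().split(".")
    pattern_labels = pattern.lower().split(".")
    # A wildcard never spans multiple labels, so label counts must agree.
    if len(host_labels) != len(pattern_labels):
        return False
    if pattern_labels[0] == "*":
        return host_labels[0] != "" and host_labels[1:] == pattern_labels[1:]
    return host_labels == pattern_labels


def matches_any(hostname: str, sans=SANS) -> bool:
    """Return True if hostname matches at least one SAN entry."""
    return any(san_matches(hostname, pattern) for pattern in sans)


if __name__ == "__main__":
    for host in ("bucket.s3.amazonaws.com", "my.bucket.s3.amazonaws.com"):
        print(host, matches_any(host))
```

Under this simplified rule a dot-free bucket hostname such as `bucket.s3.amazonaws.com` matches `*.s3.amazonaws.com`, while a dotted one does not. That is consistent with the reporter's remark that HADOOP-17017 does not explain the failure seen here.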