[ https://issues.apache.org/jira/browse/SPARK-38330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17580707#comment-17580707 ]
Steve Loughran edited comment on SPARK-38330 at 8/17/22 9:46 AM:
-----------------------------------------------------------------
bq. Is there a way to work-around this issue while waiting for a version of Spark which uses hadoop 3.3.4 (Spark 3.4?)

Remove all jars with "cos" in the title from your classpath.

Note: EMR is unaffected by this, and so are Cloudera products, primarily because they never backported the cos module. That is why the problem didn't show up in their tests.

was (Author: ste...@apache.org):
bq. Is there a way to work-around this issue while waiting for a version of Spark which uses hadoop 3.3.4 (Spark 3.4?)

Remove all jars with "cos" in the title from your classpath.
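To identify candidate jars, a minimal sketch of the classpath check, assuming only that the offending jars carry "cos" in their file name (the class name `ListCosJars` is illustrative, not part of any project):

{code:java}
// Minimal sketch: print classpath entries whose file name contains "cos"
// (e.g. the hadoop-cos jar); these are the candidates to remove, per the
// workaround in the comment above.
import java.io.File;

public class ListCosJars {
    public static void main(String[] args) {
        String classpath = System.getProperty("java.class.path");
        for (String entry : classpath.split(File.pathSeparator)) {
            String name = new File(entry).getName().toLowerCase();
            if (name.contains("cos")) {
                System.out.println(entry); // candidate jar to remove
            }
        }
    }
}
{code}

Run it with the same classpath as the Spark driver and executors; on a cluster, the jars under `$SPARK_HOME/jars` are the usual place to look.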
> Certificate doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-38330
>                 URL: https://issues.apache.org/jira/browse/SPARK-38330
>             Project: Spark
>          Issue Type: Bug
>          Components: EC2
>    Affects Versions: 3.2.1
>         Environment: Spark 3.2.1 built with `hadoop-cloud` flag.
> Direct access to s3 using the default file committer.
> JDK8.
>            Reporter: André F.
>            Priority: Major
>
> Trying to run any job after bumping our Spark version from 3.1.2 to 3.2.1 led us to the following exception while reading files on s3:
> {code:java}
> org.apache.hadoop.fs.s3a.AWSClientIOException: getFileStatus on s3a://<bucket>/<path>.parquet: com.amazonaws.SdkClientException: Unable to execute HTTP request: Certificate for <bucket.s3.amazonaws.com> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]: Unable to execute HTTP request: Certificate for <bucket> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]
>   at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:208)
>   at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:170)
>   at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:3351)
>   at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3185)
>   at org.apache.hadoop.fs.s3a.S3AFileSystem.isDirectory(S3AFileSystem.java:4277)
>   at org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:54)
>   at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:370)
>   at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:274)
>   at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:245)
>   at scala.Option.getOrElse(Option.scala:189)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:245)
>   at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:596)
> {code}
>
> {code:java}
> Caused by: javax.net.ssl.SSLPeerUnverifiedException: Certificate for <bucket.s3.amazonaws.com> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]
>   at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.verifyHostname(SSLConnectionSocketFactory.java:507)
>   at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:437)
>   at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:384)
>   at com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
>   at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
>   at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
>   at com.amazonaws.http.conn.$Proxy16.connect(Unknown Source)
>   at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
>   at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
>   at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
>   at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>   at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>   at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
>   at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
>   at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1333)
>   at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
> {code}
>
> We found similar problems reported in the following tickets, but:
> - https://issues.apache.org/jira/browse/HADOOP-17017 (we don't use `.` in our bucket names)
> - [https://github.com/aws/aws-sdk-java-v2/issues/1786] (we tried to override it by building Spark with `httpclient:4.5.10` or `httpclient:4.5.8`, with no effect; we also made sure we are using the same `httpclient` version in our main jar. See the diagnostic sketch after this list.)
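A likely reason the `httpclient` override described above had no effect: the class that fails in the trace is the relocated copy under `com.amazonaws.thirdparty.apache.http`, shaded inside a bundle jar, so swapping the plain `org.apache.httpcomponents:httpclient` artifact cannot replace it. A minimal diagnostic sketch (the class name `FindShadedHttpClient` is illustrative) to check which jar actually serves the shaded class:

{code:java}
// Minimal diagnostic sketch: print the jar that provides the shaded
// HttpClient class named in the stack trace. If it resolves to a cos-related
// jar rather than the AWS SDK bundle, that jar is the one to remove.
public class FindShadedHttpClient {
    public static void main(String[] args) throws ClassNotFoundException {
        Class<?> clazz = Class.forName(
            "com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory");
        java.security.CodeSource source =
                clazz.getProtectionDomain().getCodeSource();
        System.out.println(source == null ? "(no code source)" : source.getLocation());
    }
}
{code}

Run it with the same classpath as the failing job; after removing the cos jars as suggested in the comment above, the class should resolve to the `aws-java-sdk-bundle` jar.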