[ https://issues.apache.org/jira/browse/SPARK-38330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17580762#comment-17580762 ]
Maksim Grinman edited comment on SPARK-38330 at 8/17/22 12:28 PM:
------------------------------------------------------------------

Thanks for the response. I built Spark myself from the GitHub repo at the v3.3.0 tagged commit (with the hadoop-aws jar added in the pom) and generated the Python wheel to see which jars it contains. None of them have "cos" in the name:

{code:java}
[110] → ls -l python/build/bdist.linux-x86_64/wheel/pyspark/jars
total 921296
-rw-r--r-- 1 maks staff 227K Aug 12 13:21 JLargeArrays-1.5.jar
-rw-r--r-- 1 maks staff 1.1M Aug 12 13:21 JTransforms-3.1.jar
-rw-r--r-- 1 maks staff 418K Aug 12 13:18 RoaringBitmap-0.9.25.jar
-rw-r--r-- 1 maks staff 68K Oct 4 2021 activation-1.1.1.jar
-rw-r--r-- 1 maks staff 179K Aug 12 13:25 aircompressor-0.21.jar
-rw-r--r-- 1 maks staff 1.1M Aug 12 13:21 algebra_2.12-2.0.1.jar
-rw-r--r-- 1 maks staff 19K Aug 12 13:25 annotations-17.0.0.jar
-rw-r--r-- 1 maks staff 330K Aug 12 13:23 antlr4-runtime-4.8.jar
-rw-r--r-- 1 maks staff 26K Aug 12 13:19 aopalliance-repackaged-2.6.1.jar
-rw-r--r-- 1 maks staff 76K Aug 12 13:28 arpack-2.2.1.jar
-rw-r--r-- 1 maks staff 1.1M Oct 4 2021 arpack_combined_all-0.1.jar
-rw-r--r-- 1 maks staff 107K Aug 12 13:23 arrow-format-7.0.0.jar
-rw-r--r-- 1 maks staff 106K Aug 12 13:23 arrow-memory-core-7.0.0.jar
-rw-r--r-- 1 maks staff 38K Aug 12 13:23 arrow-memory-netty-7.0.0.jar
-rw-r--r-- 1 maks staff 1.8M Aug 12 13:23 arrow-vector-7.0.0.jar
-rw-r--r-- 1 maks staff 20K Aug 12 13:19 audience-annotations-0.5.0.jar
-rw-r--r-- 1 maks staff 580K Aug 12 13:19 avro-1.11.0.jar
-rw-r--r-- 1 maks staff 181K Aug 12 13:19 avro-ipc-1.11.0.jar
-rw-r--r-- 1 maks staff 184K Aug 12 13:19 avro-mapred-1.11.0.jar
-rw-r--r-- 1 maks staff 216M Aug 12 13:19 aws-java-sdk-bundle-1.11.1026.jar
-rw-r--r-- 1 maks staff 194K Aug 12 13:21 blas-2.2.1.jar
-rw-r--r-- 1 maks staff 73K Aug 12 13:21 breeze-macros_2.12-1.2.jar
-rw-r--r-- 1 maks staff 13M Aug 12 13:21 breeze_2.12-1.2.jar
-rw-r--r-- 1 maks staff 3.2M Aug 12 13:21 cats-kernel_2.12-2.1.1.jar
-rw-r--r-- 1 maks staff 57K Aug 12 13:18 chill-java-0.10.0.jar
-rw-r--r-- 1 maks staff 207K Aug 12 13:18 chill_2.12-0.10.0.jar
-rw-r--r-- 1 maks staff 346K Aug 12 13:18 commons-codec-1.15.jar
-rw-r--r-- 1 maks staff 575K Oct 4 2021 commons-collections-3.2.2.jar
-rw-r--r-- 1 maks staff 734K Aug 12 13:19 commons-collections4-4.4.jar
-rw-r--r-- 1 maks staff 70K Aug 12 13:23 commons-compiler-3.0.16.jar
-rw-r--r-- 1 maks staff 994K Aug 12 13:19 commons-compress-1.21.jar
-rw-r--r-- 1 maks staff 162K Aug 12 13:18 commons-crypto-1.1.0.jar
-rw-r--r-- 1 maks staff 319K Aug 12 13:18 commons-io-2.11.0.jar
-rw-r--r-- 1 maks staff 278K Jan 15 2021 commons-lang-2.6.jar
-rw-r--r-- 1 maks staff 574K Aug 12 13:18 commons-lang3-3.12.0.jar
-rw-r--r-- 1 maks staff 61K Oct 4 2021 commons-logging-1.1.3.jar
-rw-r--r-- 1 maks staff 2.1M Aug 12 13:19 commons-math3-3.6.1.jar
-rw-r--r-- 1 maks staff 211K Aug 12 13:18 commons-text-1.9.jar
-rw-r--r-- 1 maks staff 80K Aug 12 13:19 compress-lzf-1.1.jar
-rw-r--r-- 1 maks staff 161K Oct 4 2021 core-1.1.2.jar
-rw-r--r-- 1 maks staff 2.3M Aug 12 13:19 curator-client-2.13.0.jar
-rw-r--r-- 1 maks staff 197K Aug 12 13:19 curator-framework-2.13.0.jar
-rw-r--r-- 1 maks staff 277K Aug 12 13:19 curator-recipes-2.13.0.jar
-rw-r--r-- 1 maks staff 63K Aug 12 13:23 flatbuffers-java-1.12.0.jar
-rw-r--r-- 1 maks staff 235K Aug 12 13:18 gson-2.8.6.jar
-rw-r--r-- 1 maks staff 2.1M Oct 4 2021 guava-14.0.1.jar
-rw-r--r-- 1 maks staff 940K Aug 12 13:19 hadoop-aws-3.3.2.jar
-rw-r--r-- 1 maks staff 19M Aug 12 13:19 hadoop-client-api-3.3.2.jar
-rw-r--r-- 1 maks staff 29M Aug 12 13:19 hadoop-client-runtime-3.3.2.jar
-rw-r--r-- 1 maks staff 3.2M Aug 12 14:02 hadoop-shaded-guava-1.1.1.jar
-rw-r--r-- 1 maks staff 55K Aug 12 14:02 hadoop-yarn-server-web-proxy-3.3.2.jar
-rw-r--r-- 1 maks staff 231K Aug 12 13:25 hive-storage-api-2.7.2.jar
-rw-r--r-- 1 maks staff 196K Aug 12 13:19 hk2-api-2.6.1.jar
-rw-r--r-- 1 maks staff 199K Aug 12 13:19 hk2-locator-2.6.1.jar
-rw-r--r-- 1 maks staff 129K Aug 12 13:19 hk2-utils-2.6.1.jar
-rw-r--r-- 1 maks staff 27K Aug 12 13:28 istack-commons-runtime-3.0.8.jar
-rw-r--r-- 1 maks staff 1.3M Aug 12 13:19 ivy-2.5.0.jar
-rw-r--r-- 1 maks staff 74K Aug 12 13:18 jackson-annotations-2.13.3.jar
-rw-r--r-- 1 maks staff 366K Aug 12 13:18 jackson-core-2.13.3.jar
-rw-r--r-- 1 maks staff 1.5M Aug 12 13:18 jackson-databind-2.13.3.jar
-rw-r--r-- 1 maks staff 448K Aug 12 13:19 jackson-module-scala_2.12-2.13.3.jar
-rw-r--r-- 1 maks staff 24K Aug 12 13:19 jakarta.annotation-api-1.3.5.jar
-rw-r--r-- 1 maks staff 18K Aug 12 13:19 jakarta.inject-2.6.1.jar
-rw-r--r-- 1 maks staff 81K Aug 12 13:19 jakarta.servlet-api-4.0.3.jar
-rw-r--r-- 1 maks staff 90K Aug 12 13:19 jakarta.validation-api-2.0.2.jar
-rw-r--r-- 1 maks staff 137K Aug 12 13:19 jakarta.ws.rs-api-2.1.6.jar
-rw-r--r-- 1 maks staff 113K Aug 12 13:28 jakarta.xml.bind-api-2.3.2.jar
-rw-r--r-- 1 maks staff 905K Aug 12 13:23 janino-3.0.16.jar
-rw-r--r-- 1 maks staff 762K Aug 12 13:19 javassist-3.25.0-GA.jar
-rw-r--r-- 1 maks staff 990K Aug 12 13:28 jaxb-runtime-2.3.2.jar
-rw-r--r-- 1 maks staff 16K Aug 12 13:19 jcl-over-slf4j-1.7.32.jar
-rw-r--r-- 1 maks staff 253K Aug 12 13:19 jersey-client-2.34.jar
-rw-r--r-- 1 maks staff 1.1M Aug 12 13:19 jersey-common-2.34.jar
-rw-r--r-- 1 maks staff 32K Aug 12 13:19 jersey-container-servlet-2.34.jar
-rw-r--r-- 1 maks staff 72K Aug 12 13:19 jersey-container-servlet-core-2.34.jar
-rw-r--r-- 1 maks staff 75K Aug 12 13:19 jersey-hk2-2.34.jar
-rw-r--r-- 1 maks staff 925K Aug 12 13:19 jersey-server-2.34.jar
-rw-r--r-- 1 maks staff 88K Aug 12 13:19 json4s-ast_2.12-3.7.0-M11.jar
-rw-r--r-- 1 maks staff 514K Aug 12 13:19 json4s-core_2.12-3.7.0-M11.jar
-rw-r--r-- 1 maks staff 36K Aug 12 13:19 json4s-jackson_2.12-3.7.0-M11.jar
-rw-r--r-- 1 maks staff 340K Aug 12 13:19 json4s-scalap_2.12-3.7.0-M11.jar
-rw-r--r-- 1 maks staff 32K Aug 12 13:18 jsr305-3.0.0.jar
-rw-r--r-- 1 maks staff 4.5K Aug 12 13:19 jul-to-slf4j-1.7.32.jar
-rw-r--r-- 1 maks staff 401K Oct 4 2021 kryo-shaded-4.0.2.jar
-rw-r--r-- 1 maks staff 794K Aug 12 13:28 lapack-2.2.1.jar
-rw-r--r-- 1 maks staff 1.0M Oct 4 2021 leveldbjni-all-1.8.jar
-rw-r--r-- 1 maks staff 296K Aug 12 13:18 log4j-1.2-api-2.17.2.jar
-rw-r--r-- 1 maks staff 295K Aug 12 13:18 log4j-api-2.17.2.jar
-rw-r--r-- 1 maks staff 1.7M Aug 12 13:18 log4j-core-2.17.2.jar
-rw-r--r-- 1 maks staff 24K Aug 12 13:18 log4j-slf4j-impl-2.17.2.jar
-rw-r--r-- 1 maks staff 667K Aug 12 13:19 lz4-java-1.8.0.jar
-rw-r--r-- 1 maks staff 123K Aug 12 13:18 metrics-core-4.2.7.jar
-rw-r--r-- 1 maks staff 23K Aug 12 13:19 metrics-graphite-4.2.7.jar
-rw-r--r-- 1 maks staff 21K Aug 12 13:19 metrics-jmx-4.2.7.jar
-rw-r--r-- 1 maks staff 16K Aug 12 13:19 metrics-json-4.2.7.jar
-rw-r--r-- 1 maks staff 24K Aug 12 13:19 metrics-jvm-4.2.7.jar
-rw-r--r-- 1 maks staff 5.6K Oct 4 2021 minlog-1.3.0.jar
-rw-r--r-- 1 maks staff 4.3K Aug 12 13:18 netty-all-4.1.74.Final.jar
-rw-r--r-- 1 maks staff 296K Aug 12 13:18 netty-buffer-4.1.74.Final.jar
-rw-r--r-- 1 maks staff 329K Aug 12 13:18 netty-codec-4.1.74.Final.jar
-rw-r--r-- 1 maks staff 635K Aug 12 13:18 netty-common-4.1.74.Final.jar
-rw-r--r-- 1 maks staff 516K Aug 12 13:18 netty-handler-4.1.74.Final.jar
-rw-r--r-- 1 maks staff 37K Aug 12 13:18 netty-resolver-4.1.74.Final.jar
-rw-r--r-- 1 maks staff 34K Aug 12 13:18 netty-tcnative-classes-2.0.48.Final.jar
-rw-r--r-- 1 maks staff 468K Aug 12 13:18 netty-transport-4.1.74.Final.jar
-rw-r--r-- 1 maks staff 135K Aug 12 13:18 netty-transport-classes-epoll-4.1.74.Final.jar
-rw-r--r-- 1 maks staff 105K Aug 12 13:18 netty-transport-classes-kqueue-4.1.74.Final.jar
-rw-r--r-- 1 maks staff 37K Aug 12 13:18 netty-transport-native-epoll-4.1.74.Final-linux-aarch_64.jar
-rw-r--r-- 1 maks staff 35K Aug 12 13:18 netty-transport-native-epoll-4.1.74.Final-linux-x86_64.jar
-rw-r--r-- 1 maks staff 24K Aug 12 13:18 netty-transport-native-kqueue-4.1.74.Final-osx-aarch_64.jar
-rw-r--r-- 1 maks staff 25K Aug 12 13:18 netty-transport-native-kqueue-4.1.74.Final-osx-x86_64.jar
-rw-r--r-- 1 maks staff 39K Aug 12 13:18 netty-transport-native-unix-common-4.1.74.Final.jar
-rw-r--r-- 1 maks staff 48K Aug 12 13:18 objenesis-3.2.jar
-rw-r--r-- 1 maks staff 19K Oct 4 2021 opencsv-2.3.jar
-rw-r--r-- 1 maks staff 1.0M Aug 12 13:25 orc-core-1.7.4.jar
-rw-r--r-- 1 maks staff 47K Aug 12 13:25 orc-mapreduce-1.7.4.jar
-rw-r--r-- 1 maks staff 29K Aug 12 13:25 orc-shims-1.7.4.jar
-rw-r--r-- 1 maks staff 64K Jan 15 2021 oro-2.0.8.jar
-rw-r--r-- 1 maks staff 19K Aug 12 13:19 osgi-resource-locator-1.0.3.jar
-rw-r--r-- 1 maks staff 34K Oct 4 2021 paranamer-2.8.jar
-rw-r--r-- 1 maks staff 1.9M Aug 12 13:25 parquet-column-1.12.2.jar
-rw-r--r-- 1 maks staff 94K Aug 12 13:25 parquet-common-1.12.2.jar
-rw-r--r-- 1 maks staff 829K Aug 12 13:25 parquet-encoding-1.12.2.jar
-rw-r--r-- 1 maks staff 691K Aug 12 13:25 parquet-format-structures-1.12.2.jar
-rw-r--r-- 1 maks staff 955K Aug 12 13:25 parquet-hadoop-1.12.2.jar
-rw-r--r-- 1 maks staff 1.8M Aug 12 13:25 parquet-jackson-1.12.2.jar
-rw-r--r-- 1 maks staff 53K Aug 12 13:19 pickle-1.2.jar
-rw-r--r-- 1 maks staff 521K Oct 4 2021 protobuf-java-2.5.0.jar
-rw-r--r-- 1 maks staff 120K Aug 12 13:19 py4j-0.10.9.5.jar
-rw-r--r-- 1 maks staff 34M Aug 12 13:18 rocksdbjni-6.20.3.jar
-rw-r--r-- 1 maks staff 110K Aug 12 13:21 scala-collection-compat_2.12-2.1.1.jar
-rw-r--r-- 1 maks staff 10M Aug 12 13:18 scala-compiler-2.12.15.jar
-rw-r--r-- 1 maks staff 5.2M Aug 12 13:18 scala-library-2.12.15.jar
-rw-r--r-- 1 maks staff 218K Aug 12 13:18 scala-parser-combinators_2.12-1.1.2.jar
-rw-r--r-- 1 maks staff 3.5M Aug 12 13:18 scala-reflect-2.12.15.jar
-rw-r--r-- 1 maks staff 544K Aug 12 13:18 scala-xml_2.12-1.2.0.jar
-rw-r--r-- 1 maks staff 3.1M Aug 12 13:21 shapeless_2.12-2.3.7.jar
-rw-r--r-- 1 maks staff 2.5K Aug 12 13:18 shims-0.9.25.jar
-rw-r--r-- 1 maks staff 41K Aug 12 13:18 slf4j-api-1.7.32.jar
-rw-r--r-- 1 maks staff 1.9M Aug 12 13:19 snappy-java-1.1.8.4.jar
-rw-r--r-- 1 maks staff 12M Aug 12 14:19 spark-catalyst_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 10M Aug 12 14:16 spark-core_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 424K Aug 12 14:16 spark-graphx_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 82K Aug 12 14:13 spark-kvstore_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 76K Aug 12 14:14 spark-launcher_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 113K Aug 12 14:16 spark-mllib-local_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 5.9M Aug 12 14:37 spark-mllib_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 2.3M Aug 12 14:14 spark-network-common_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 156K Aug 12 14:14 spark-network-shuffle_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 50K Aug 12 14:40 spark-repl_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 30K Aug 12 14:13 spark-sketch_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 8.4M Aug 12 14:30 spark-sql_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 1.1M Aug 12 14:17 spark-streaming_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 15K Aug 12 14:13 spark-tags_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 52K Aug 12 14:14 spark-unsafe_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 349K Aug 12 14:49 spark-yarn_2.12-3.3.0.jar
-rw-r--r-- 1 maks staff 112K Aug 12 13:21 spire-macros_2.12-0.17.0.jar
-rw-r--r-- 1 maks staff 8.3K Aug 12 13:21 spire-platform_2.12-0.17.0.jar
-rw-r--r-- 1 maks staff 34K Aug 12 13:21 spire-util_2.12-0.17.0.jar
-rw-r--r-- 1 maks staff 6.9M Aug 12 13:21 spire_2.12-0.17.0.jar
-rw-r--r-- 1 maks staff 174K Aug 12 13:19 stream-2.9.6.jar
-rw-r--r-- 1 maks staff 228K Aug 12 13:25 threeten-extra-1.5.0.jar
-rw-r--r-- 1 maks staff 1.3M Aug 12 13:18 tink-1.6.1.jar
-rw-r--r-- 1 maks staff 437K Aug 12 13:23 univocity-parsers-2.9.1.jar
-rw-r--r-- 1 maks staff 414K Aug 12 13:19 wildfly-openssl-1.0.7.Final.jar
-rw-r--r-- 1 maks staff 288K Aug 12 13:19 xbean-asm9-shaded-4.20.jar
-rw-r--r-- 1 maks staff 106K Aug 12 13:18 xz-1.8.jar
-rw-r--r-- 1 maks staff 1.2M Aug 12 13:19 zookeeper-3.6.2.jar
-rw-r--r-- 1 maks staff 245K Aug 12 13:19 zookeeper-jute-3.6.2.jar
-rw-r--r-- 1 maks staff 5.6M Aug 12 13:19 zstd-jni-1.5.2-1.jar
{code}

Reading from S3 in Spark, using one of our buckets with no dot in its name, still threw the error. I also tried the same with 3.2.1.
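
For anyone repeating this check, a filename scan alone cannot rule out classes or resources that are bundled inside another jar. The following is a minimal Python sketch, not part of the original report, that checks both the jar names and the entries inside each archive. The function names `jars_named` and `jars_with_entry` are hypothetical helpers introduced only for illustration.

```python
import zipfile
from pathlib import Path


def jars_named(jars_dir, needle):
    """Return the sorted names of jars in jars_dir whose file name
    contains needle (case-insensitive)."""
    needle = needle.lower()
    return sorted(
        p.name for p in Path(jars_dir).glob("*.jar") if needle in p.name.lower()
    )


def jars_with_entry(jars_dir, fragment):
    """Return the sorted names of jars in jars_dir that contain at least
    one archive entry whose path includes fragment."""
    hits = []
    for jar in sorted(Path(jars_dir).glob("*.jar")):
        # A jar is a zip archive, so the standard zipfile module can list it.
        with zipfile.ZipFile(jar) as zf:
            if any(fragment in name for name in zf.namelist()):
                hits.append(jar.name)
    return hits
```

For example, `jars_named("python/build/bdist.linux-x86_64/wheel/pyspark/jars", "cos")` repeats the name check above, and `jars_with_entry(..., "com/qcloud/")` looks inside the archives. The `com/qcloud/` fragment is only my assumption about where COS SDK classes might live; substitute whatever package path or resource you are hunting for.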
To be sure, I also built with the following change to emulate the Hadoop fix (I'm not sure it's right, but it built successfully):

{code:java}
diff --git a/hadoop-cloud/pom.xml b/hadoop-cloud/pom.xml
index 08bcae6e0f..ceb2677648 100644
--- a/hadoop-cloud/pom.xml
+++ b/hadoop-cloud/pom.xml
@@ -77,6 +77,14 @@
       <version>${hadoop.version}</version>
       <scope>${hadoop.deps.scope}</scope>
       <exclusions>
+        <!--
+          Trying to replicate the fix here while we wait for fixed Spark release
+          https://github.com/apache/hadoop/pull/4481/files
+        -->
+        <exclusion>
+          <groupId>org.apache.hadoop</groupId>
+          <artifactId>hadoop-cos</artifactId>
+        </exclusion>
         <exclusion>
           <groupId>org.apache.hadoop</groupId>
           <artifactId>hadoop-common</artifactId>
{code}

> Certificate doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-38330
>                 URL: https://issues.apache.org/jira/browse/SPARK-38330
>             Project: Spark
>          Issue Type: Bug
>          Components: EC2
>    Affects Versions: 3.2.1
>         Environment: Spark 3.2.1 built with `hadoop-cloud` flag.
> Direct access to s3 using default file committer.
> JDK8.
>
>            Reporter: André F.
>            Priority: Major
>
> Trying to run any job after bumping our Spark version from 3.1.2 to 3.2.1 led us to the following exception while reading files on s3:
> {code:java}
> org.apache.hadoop.fs.s3a.AWSClientIOException: getFileStatus on s3a://<bucket>/<path>.parquet: com.amazonaws.SdkClientException: Unable to execute HTTP request: Certificate for <bucket.s3.amazonaws.com> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]: Unable to execute HTTP request: Certificate for <bucket> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]
>   at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:208)
>   at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:170)
>   at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:3351)
>   at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3185)
>   at org.apache.hadoop.fs.s3a.S3AFileSystem.isDirectory(S3AFileSystem.java:4277)
>   at org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:54)
>   at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:370)
>   at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:274)
>   at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:245)
>   at scala.Option.getOrElse(Option.scala:189)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:245)
>   at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:596)
> {code}
>
> {code:java}
> Caused by: javax.net.ssl.SSLPeerUnverifiedException: Certificate for <bucket.s3.amazonaws.com> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]
>   at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.verifyHostname(SSLConnectionSocketFactory.java:507)
>   at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:437)
>   at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:384)
>   at com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
>   at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
>   at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
>   at com.amazonaws.http.conn.$Proxy16.connect(Unknown Source)
>   at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
>   at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
>   at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
>   at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>   at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>   at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
>   at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
>   at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1333)
>   at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
> {code}
> We found similar problems in the following tickets, but neither resolved our case:
> - https://issues.apache.org/jira/browse/HADOOP-17017 (we don't use `.` in our bucket names)
> - [https://github.com/aws/aws-sdk-java-v2/issues/1786] (we tried to override it by building Spark with `httpclient:4.5.10` or `httpclient:4.5.8`, with no effect. We also made sure we are using the same `httpclient` version on our main jar).
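
To illustrate why the reporter mentions dots in bucket names, here is a minimal Python sketch of the usual wildcard rule for certificate names, where a leading `*` stands for exactly one DNS label. It is a simplification I wrote for illustration, not the code path in the stack trace; real hostname verifiers apply additional rules. The names `san_matches`, `matches_any`, and `SANS` are hypothetical.

```python
def san_matches(hostname, pattern):
    """Simplified check of a hostname against one subjectAltName DNS pattern.

    A leading "*" label matches exactly one non-empty DNS label; every other
    label must be equal (case-insensitive).
    """
    host_labels = hostname.lower().rstrip(".").split(".")
    pat_labels = pattern.lower().rstrip(".").split(".")
    # A wildcard never spans several labels, so the label counts must agree.
    if len(host_labels) != len(pat_labels):
        return False
    if pat_labels[0] == "*":
        return bool(host_labels[0]) and host_labels[1:] == pat_labels[1:]
    return host_labels == pat_labels


def matches_any(hostname, sans):
    """Return True if hostname matches at least one pattern in sans."""
    return any(san_matches(hostname, san) for san in sans)


# The subject alternative names quoted in the exception message.
SANS = ["*.s3.amazonaws.com", "s3.amazonaws.com"]
```

Under this rule alone, `matches_any("bucket.s3.amazonaws.com", SANS)` is true while `matches_any("my.bucket.s3.amazonaws.com", SANS)` is false. That is why a dotted bucket name is a plausible culprit, and why the failure for a dot-free bucket name points to something else in the verification path.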