[ https://issues.apache.org/jira/browse/SPARK-28903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-28903.
-------------------------------
    Fix Version/s: 3.0.0
                   2.4.5
       Resolution: Fixed

Issue resolved by pull request 25559
[https://github.com/apache/spark/pull/25559]

> Fix AWS JDK version conflict that breaks Pyspark Kinesis tests
> --------------------------------------------------------------
>
>                 Key: SPARK-28903
>                 URL: https://issues.apache.org/jira/browse/SPARK-28903
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 3.0.0, 2.4.3
>            Reporter: Sean Owen
>            Assignee: Sean Owen
>            Priority: Major
>             Fix For: 2.4.5, 3.0.0
>
>
> The Pyspark Kinesis tests are failing, at least in master:
> {code}
> ======================================================================
> ERROR: test_kinesis_stream (pyspark.streaming.tests.test_kinesis.KinesisStreamTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/jenkins/workspace/SparkPullRequestBuilder@2/python/pyspark/streaming/tests/test_kinesis.py", line 44, in test_kinesis_stream
>     kinesisTestUtils = self.ssc._jvm.org.apache.spark.streaming.kinesis.KinesisTestUtils(2)
>   File "/home/jenkins/workspace/SparkPullRequestBuilder@2/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1554, in __call__
>     answer, self._gateway_client, None, self._fqn)
>   File "/home/jenkins/workspace/SparkPullRequestBuilder@2/python/lib/py4j-0.10.8.1-src.zip/py4j/protocol.py", line 328, in get_return_value
>     format(target_id, ".", name), value)
> Py4JJavaError: An error occurred while calling None.org.apache.spark.streaming.kinesis.KinesisTestUtils.
> : java.lang.NoSuchMethodError: com.amazonaws.regions.Region.getAvailableEndpoints()Ljava/util/Collection;
>       at org.apache.spark.streaming.kinesis.KinesisTestUtils$.$anonfun$getRegionNameByEndpoint$1(KinesisTestUtils.scala:211)
>       at org.apache.spark.streaming.kinesis.KinesisTestUtils$.$anonfun$getRegionNameByEndpoint$1$adapted(KinesisTestUtils.scala:211)
>       at scala.collection.Iterator.find(Iterator.scala:993)
>       at scala.collection.Iterator.find$(Iterator.scala:990)
>       at scala.collection.AbstractIterator.find(Iterator.scala:1429)
>       at scala.collection.IterableLike.find(IterableLike.scala:81)
>       at scala.collection.IterableLike.find$(IterableLike.scala:80)
>       at scala.collection.AbstractIterable.find(Iterable.scala:56)
>       at org.apache.spark.streaming.kinesis.KinesisTestUtils$.getRegionNameByEndpoint(KinesisTestUtils.scala:211)
>       at org.apache.spark.streaming.kinesis.KinesisTestUtils.<init>(KinesisTestUtils.scala:46)
>       ...
> {code}
> The non-Python Kinesis tests are fine though. It turns out that this is
> because Pyspark tests use the output of the Spark assembly, and it pulls in
> hadoop-cloud, which in turn pulls in an old AWS Java SDK.
> Per [~ste...@apache.org], it seems like we can just resolve this by excluding
> the aws-java-sdk dependency. See the attached PR for some more detail about
> the debugging and other options.
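
A NoSuchMethodError of this kind usually means that more than one jar on the classpath provides the same class, and an older copy was loaded first. The following is a minimal diagnostic sketch, not part of the actual fix in the PR: it lists which jars in a directory contain a given class. The helper name find_class_providers and the choice of directory to scan are assumptions made for illustration.

```python
import zipfile
from pathlib import Path


def find_class_providers(jar_dir, class_name):
    """Return the file names of jars directly under jar_dir that contain class_name.

    Hypothetical helper for illustration; it only inspects jar entries and
    does not load or run any Java code.
    """
    # A class com.foo.Bar is stored inside a jar as the entry com/foo/Bar.class.
    entry = class_name.replace(".", "/") + ".class"
    providers = []
    for jar_path in sorted(Path(jar_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar_path) as jar:
                if entry in jar.namelist():
                    providers.append(jar_path.name)
        except zipfile.BadZipFile:
            # Skip files that have a .jar suffix but are not valid archives.
            continue
    return providers
```

Calling find_class_providers(jars_dir, "com.amazonaws.regions.Region") on the directory holding the assembly's jars (the exact path depends on the build and is not given in this issue) should show whether more than one jar ships that class. If two jars turn up, comparing their versions would indicate which dependency to exclude.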