Hi,
I get a NoClassDefFoundError from IcebergSparkSessionExtensions when running
Spark 3.3 with iceberg-spark-runtime-3.3_2.12-1.0.0.jar. I noticed that this
jar doesn't contain Scala classes, unlike the previous jar,
iceberg-spark-runtime-3.3_2.12-0.14.1.jar.
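
For anyone who wants to repeat the jar comparison, here is a minimal sketch in
Python. It assumes that bundled Scala library classes would sit under a
top-level scala/ path inside the jar, which is a guess about the jar layout;
the function name bundled_scala_entries is only illustrative.

```python
import zipfile


def bundled_scala_entries(jar_path):
    """Return the .class entries stored under scala/ in the given jar."""
    # A jar is a zip archive, so the standard zipfile module can list it.
    with zipfile.ZipFile(jar_path) as jar:
        return [
            name
            for name in jar.namelist()
            # Assumption: an unshaded Scala library lives under scala/.
            if name.startswith("scala/") and name.endswith(".class")
        ]


if __name__ == "__main__":
    import sys

    # Usage: python check_jar.py <jar> [<jar> ...]
    for path in sys.argv[1:]:
        entries = bundled_scala_entries(path)
        print(f"{path}: {len(entries)} classes under scala/")
```

Running it against both the 0.14.1 and 1.0.0 runtime jars should show whether
the two differ in this respect.
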

scala> spark.sql("show databases").show
java.lang.NoClassDefFoundError: scala/collection/SeqOps
  at org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions.$anonfun$apply$2(IcebergSparkSessionExtensions.scala:50)
  at org.apache.spark.sql.SparkSessionExtensions.$anonfun$buildResolutionRules$1(SparkSessionExtensions.scala:152)
  at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
  at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
  at scala.collection.TraversableLike.map(TraversableLike.scala:286)
  at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
  at scala.collection.AbstractTraversable.map(Traversable.scala:108)
  at org.apache.spark.sql.SparkSessionExtensions.buildResolutionRules(SparkSessionExtensions.scala:152)
  at org.apache.spark.sql.internal.BaseSessionStateBuilder.customResolutionRules(BaseSessionStateBuilder.scala:216)
  at org.apache.spark.sql.hive.HiveSessionStateBuilder$$anon$1.<init>(HiveSessionStateBuilder.scala:94)
  at org.apache.spark.sql.hive.HiveSessionStateBuilder.analyzer(HiveSessionStateBuilder.scala:85)
  at org.apache.spark.sql.internal.BaseSessionStateBuilder.$anonfun$build$2(BaseSessionStateBuilder.scala:360)
  at org.apache.spark.sql.internal.SessionState.analyzer$lzycompute(SessionState.scala:87)
  at org.apache.spark.sql.internal.SessionState.analyzer(SessionState.scala:87)
  at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:76)
  at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
  at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:185)
  at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
  at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
  at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:184)
  at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:76)
  at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:74)
  at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:66)
  at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
  at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:622)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:617)
  ... 47 elided
Caused by: java.lang.ClassNotFoundException: scala.collection.SeqOps
  at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
  at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
  at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
  ... 79 more

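
The trace runs through scala.collection.TraversableLike, which belongs to the
Scala 2.12 collections, while the missing scala.collection.SeqOps is a Scala
2.13 type. That would be consistent with the extension classes having been
compiled against a different Scala version than the one Spark is running,
though that is only a guess. A rough way to check is to scan the jar's class
files for the missing name. The sketch below is in Python; classes_referencing
is a hypothetical helper, and the byte search is a coarse substring match
rather than a real constant-pool parse.

```python
import zipfile


def classes_referencing(jar_path, internal_name):
    """Return .class entries whose bytes mention the given JVM internal name.

    internal_name uses slashes, e.g. "scala/collection/SeqOps". ASCII class
    names appear as plain bytes in a class file, so a substring search is
    enough for a first look (it also matches longer names sharing the prefix).
    """
    needle = internal_name.encode("ascii")
    hits = []
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            # jar.read() returns the decompressed bytes of the entry.
            if name.endswith(".class") and needle in jar.read(name):
                hits.append(name)
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_refs.py <jar> scala/collection/SeqOps
    for hit in classes_referencing(sys.argv[1], sys.argv[2]):
        print(hit)
```

If the _2.12 jar lists any classes for scala/collection/SeqOps, that would
point at the build rather than at how the jar was installed.
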
Note: I usually verify by copying the spark-runtime jar into the Spark jars
directory. I usually can't get the --packages flag to work as described at
https://iceberg.apache.org/how-to-release/#verifying-with-spark, because the
version is not released yet. Let me know if I am using the wrong jar.
Thanks
Szehon
On Mon, Oct 10, 2022 at 9:22 AM Eduard Tudenhoefner <[email protected]>
wrote:
> +1 (non-binding)
>
> - validated checksum and signature
> - checked license docs & ran RAT checks
> - ran build and tests with JDK11
>
>
> Eduard
>
> On Mon, Oct 10, 2022 at 8:01 AM Ajantha Bhat <[email protected]>
> wrote:
>
>> +1 (non-binding)
>>
>>
>> - Verified the Spark runtime jar contents.
>> - Checked license docs, ran RAT checks.
>> - Validated checksum and signature.
>>
>>
>> Thanks,
>> Ajantha
>>
>> On Mon, Oct 10, 2022 at 10:45 AM Prashant Singh <[email protected]>
>> wrote:
>>
>>> Hello Everyone,
>>>
>>> Wanted to know your thoughts on whether we should also include the
>>> following bug fixes in this release:
>>>
>>> 1. MERGE INTO nullability fix; queries fail without it:
>>> *Reported instances :*
>>> a.
>>> https://stackoverflow.com/questions/73424454/spark-iceberg-merge-into-issue-caused-by-org-apache-spark-sql-analysisexcep
>>> b. https://github.com/apache/iceberg/issues/5739
>>> c. https://github.com/apache/iceberg/issues/5424#issuecomment-1220688298
>>>
>>> *PRs (merged):*
>>> a. https://github.com/apache/iceberg/pull/5880
>>> b. https://github.com/apache/iceberg/pull/5679
>>>
>>> 2. Query failure when running RewriteManifestProcedure on a Date /
>>> Timestamp partitioned table when
>>> `spark.sql.datetime.java8API.enabled` is true.
>>> *Reported instances :*
>>> a. https://github.com/apache/iceberg/issues/5104
>>> b.
>>> https://apache-iceberg.slack.com/archives/C025PH0G1D4/p1663982635731469
>>>
>>> *PR* :
>>> a. https://github.com/apache/iceberg/pull/5860
>>>
>>> Regards,
>>> Prashant Singh
>>>
>>> On Mon, Oct 10, 2022 at 4:15 AM Ryan Blue <[email protected]> wrote:
>>>
>>>> +1 (binding)
>>>>
>>>> - Checked license docs, ran RAT checks
>>>> - Validated checksum and signature
>>>> - Built and tested with Java 11
>>>> - Built binary artifacts with Java 8
>>>>
>>>>
>>>> On Sun, Oct 9, 2022 at 3:42 PM Ryan Blue <[email protected]> wrote:
>>>>
>>>>> Hi Everyone,
>>>>>
>>>>> I propose that we release the following RC as the official Apache
>>>>> Iceberg 1.0.0 release.
>>>>>
>>>>> The commit ID is e2bb9ad7e792efca419fa7c4a1afde7c4c44fa01
>>>>> * This corresponds to the tag: apache-iceberg-1.0.0-rc0
>>>>> * https://github.com/apache/iceberg/commits/apache-iceberg-1.0.0-rc0
>>>>> *
>>>>> https://github.com/apache/iceberg/tree/e2bb9ad7e792efca419fa7c4a1afde7c4c44fa01
>>>>>
>>>>> The release tarball, signature, and checksums are here:
>>>>> *
>>>>> https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.0.0-rc0
>>>>>
>>>>> You can find the KEYS file here:
>>>>> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>>>>>
>>>>> Convenience binary artifacts are staged on Nexus. The Maven repository
>>>>> URL is:
>>>>> *
>>>>> https://repository.apache.org/content/repositories/orgapacheiceberg-1106/
>>>>>
>>>>> Please download, verify, and test.
>>>>>
>>>>> This release is based on the latest 0.14.1 release. It includes
>>>>> changes to remove deprecated APIs and the following additional bug fixes:
>>>>> * Increase metrics limit to 100 columns
>>>>> * Bump Spark patch versions for CVE-2022-33891
>>>>> * Exclude Scala from Spark runtime Jars
>>>>>
>>>>> Please vote in the next 72 hours.
>>>>>
>>>>> [ ] +1 Release this as Apache Iceberg 1.0.0
>>>>> [ ] +0
>>>>> [ ] -1 Do not release this because...
>>>>>
>>>>>
>>>>> --
>>>>> Ryan Blue
>>>>>
>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>>
>>>