Hi Mich,
Spark 3.4.0 prebuilt with Scala 2.13 is built with Scala 2.13.8. Since you
are using spark-core_2.13 and spark-sql_2.13, you should stick to both the major
version (13) and the minor version (8). Deviating from either may cause
unexpected behaviour. Although Scala claims compatibility across minor versions,
I have run into problems using Scala packages with the same major version but a
different minor version, possibly because of bug fixes and upgrades in Scala
itself.
Although I have not hit this particular error myself, it can be a pitfall for
you.
Best Regards!
...........................................................................
Lingzhe Sun
Hirain Technology
From: Mich Talebzadeh
Date: 2023-05-29 17:55
To: Bjørn Jørgensen
CC: user @spark
Subject: Re: maven with Spark 3.4.0 fails compilation
Thanks for your helpful comments Bjorn.
I managed to compile the code with Maven, but when it runs it fails with
Application is ReduceByKey
Exception in thread "main" java.lang.NoSuchMethodError: scala.package$.Seq()Lscala/collection/immutable/Seq$;
    at ReduceByKey$.main(ReduceByKey.scala:23)
    at ReduceByKey.main(ReduceByKey.scala)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1020)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I attach the pom.xml and the sample scala code is self contained and basic.
Again it runs with SBT with no issues.
FYI, my scala version on host is
scala -version
Scala code runner version 2.13.6 -- Copyright 2002-2021, LAMP/EPFL and
Lightbend, Inc.
I think I have a Scala incompatibility somewhere again
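A quick way to check for such an incompatibility is to print the Scala version
that is actually on the classpath when the job runs under spark-submit, and
compare it with the `_2.13` suffix of the Spark artifacts. The following is only
a minimal sketch: the object name `ScalaVersionCheck` and the helper
`binaryVersion` are hypothetical, while `scala.util.Properties.versionNumberString`
is part of the Scala standard library.

```scala
import scala.util.Properties

// Hypothetical diagnostic object; the name is illustrative only.
object ScalaVersionCheck {
  // Reduce a full version such as "2.13.8" to its binary version "2.13".
  def binaryVersion(full: String): String = {
    val parts = full.split("\\.") // java.lang.String.split with a regex
    if (parts.length >= 2) parts(0) + "." + parts(1) else full
  }

  def main(args: Array[String]): Unit = {
    // Version of the scala-library that was loaded at run time.
    val runtime = Properties.versionNumberString
    println("Runtime Scala version: " + runtime)
    println("Runtime Scala binary version: " + binaryVersion(runtime))
  }
}
```

If the binary version printed under spark-submit is not 2.13, that would suggest
the Spark installation was built for a different Scala binary version than the
application.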
Cheers
Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies Limited
London
United Kingdom
view my Linkedin profile
https://en.everybodywiki.com/Mich_Talebzadeh
Disclaimer: Use it at your own risk. Any and all responsibility for any loss,
damage or destruction of data or any other property which may arise from
relying on this email's technical content is explicitly disclaimed. The author
will in no case be liable for any monetary damages arising from such loss,
damage or destruction.
On Sun, 28 May 2023 at 20:29, Bjørn Jørgensen <[email protected]> wrote:
From ChatGPT-4:
The problem appears to be that there is a mismatch between the version of Scala
used by the Scala Maven plugin and the version of the Scala library defined as
a dependency in your POM. You've defined your Scala version in your properties
as `2.12.17` but you're pulling in `scala-library` version `2.13.6` as a
dependency.
The Scala Maven plugin will be using the Scala version defined in the
`scala.version` property for compilation, but then it tries to load classes
from a different Scala version, hence the error.
To resolve this issue, make sure the `scala.version` property matches the
version of `scala-library` defined in your dependencies. In your case, you may
want to change `scala.version` to `2.13.6`.
Here's the corrected part of your POM:
```xml
<properties>
  <maven.compiler.source>1.7</maven.compiler.source>
  <maven.compiler.target>1.7</maven.compiler.target>
  <encoding>UTF-8</encoding>
  <scala.version>2.13.6</scala.version> <!-- Here's the change -->
  <maven-scala-plugin.version>2.15.2</maven-scala-plugin.version>
</properties>
```
Additionally, ensure that the Scala versions in the Spark dependencies match
the `scala.version` property as well. If you've updated the Scala version to
`2.13.6`, the artifactIds for Spark dependencies should be `spark-core_2.13`
and `spark-sql_2.13`.
Another thing to consider: your Java version defined in `maven.compiler.source`
and `maven.compiler.target` is `1.7`, which is quite outdated and might not be
compatible with the latest versions of these libraries. Consider updating to a
more recent version of Java, such as Java 8 or above, depending on the
requirements of the libraries you're using.
The same problem persists in this updated POM file - there's a mismatch in the
Scala version declared in the properties and the version used in your
dependencies. Here's what you need to update:
1. Update the Scala version in your properties to match the Scala library and
your Spark dependencies:
```xml
<properties>
  <maven.compiler.source>1.7</maven.compiler.source>
  <maven.compiler.target>1.7</maven.compiler.target>
  <encoding>UTF-8</encoding>
  <scala.version>2.13.6</scala.version> <!-- updated Scala version -->
  <maven-scala-plugin.version>2.15.2</maven-scala-plugin.version>
</properties>
```
2. Make sure all your Spark dependencies use the same Scala version. In this
case, I see `spark-streaming-kafka_2.11`, which has to move to a `_2.13`
artifact if you're using Scala `2.13.6`. The old `spark-streaming-kafka`
artifact (version 1.6.3) was never published for Scala 2.13; for Spark 3.x the
Kafka integration is `spark-streaming-kafka-0-10_2.13`, versioned like Spark
itself.
```xml
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming-kafka-0-10_2.13</artifactId> <!-- updated to 2.13 -->
  <version>3.4.0</version> <!-- replaces the very old 1.6.3 -->
  <scope>provided</scope>
</dependency>
```
3. As mentioned in the previous message, your Java version
(`maven.compiler.source` and `maven.compiler.target`) is also quite outdated.
Depending on the requirements of the libraries you're using, you might want to
update this to a newer version, such as Java 8 or above.
Finally, ensure that the correct versions of these libraries exist in your
Maven repository or are available in the central Maven repository. If the
versions don't match, Maven will not be able to find and download the correct
dependencies, which can lead to problems.
Please note that it's crucial to maintain consistency in your Scala and Java
versions across your project and its dependencies to avoid these kinds of
issues. Mixing versions can lead to binary incompatibility errors, such as the
one you're seeing.
The differences in behavior between SBT and Maven when resolving Scala
dependencies can be attributed to how they each handle Scala binary versions.
SBT is specifically designed for Scala projects and has built-in support for
Scala binary versions. When you declare a Scala library dependency in SBT with
the `%%` operator, you omit the "_2.12" or "_2.13" suffix from the artifact
name, and SBT appends the binary version derived from the `scalaVersion`
setting of your build. SBT also adds the matching `scala-library` dependency
itself, so the compiler, the standard library and the cross-built dependencies
all follow a single setting.
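As a rough illustration of the SBT side, a build.sbt along these lines keeps
everything on one Scala version. This is a sketch, not the build file from this
thread: the project name is hypothetical, and the version numbers are the ones
mentioned in the messages above.

```scala
// build.sbt: minimal sketch; the project name is hypothetical.
name := "reduce-by-key-example"

// Single source of truth for the Scala version.
scalaVersion := "2.13.8"

// %% appends the Scala binary version, so these resolve to
// spark-core_2.13 and spark-sql_2.13 for the scalaVersion above.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.4.0" % "provided",
  "org.apache.spark" %% "spark-sql"  % "3.4.0" % "provided"
)
```

In a POM the equivalent suffix and the `scala-library` version have to be kept
in step by hand.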
Maven, on the other hand, is more generic and does not have the same level of
built-in support for Scala's binary versions. In Maven, the Scala version is
typically hardcoded in the artifact name, and if this doesn't match the actual
Scala version used in your project, it can lead to binary compatibility issues.
This difference between SBT and Maven means that a project can work fine when
built with SBT but fail when built with Maven, due to these Scala version
discrepancies. To avoid these issues in Maven, you need to ensure that the
Scala version is consistent across your project configuration and all your
dependencies.
Another reason can be that SBT automatically adds the Scala standard library
that matches the configured Scala version, even if it is not explicitly
declared, while Maven will only use the versions that have been specified in
the POM file, whether or not they match the compiler the Scala plugin runs.
To summarize, SBT is more flexible and Scala-oriented than Maven, which can
lead to different behavior when handling Scala dependencies. When using Maven
for Scala projects, it's essential to ensure that all Scala versions match
across the project.
søn. 28. mai 2023 kl. 21:03 skrev Mich Talebzadeh <[email protected]>:
This compilation works fine with SBT but fails with maven!
Spark version 3.4.0
Apache Maven 3.9.2 (c9616018c7a021c1c39be70fb2843d6f5f9b8a1c)
Java version: 11.0.1, vendor: Oracle Corporation, runtime: /opt/jdk-11.0.1
This is from the pom.xml file
<project xmlns="https://maven.apache.org/POM/4.0.0"
         xmlns:xsi="https://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="https://maven.apache.org/POM/4.0.0
         https://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>spark</groupId>
  <version>3.0</version>
  <name>${project.artifactId}</name>
  <properties>
    <maven.compiler.source>1.7</maven.compiler.source>
    <maven.compiler.target>1.7</maven.compiler.target>
    <encoding>UTF-8</encoding>
    <scala.version>2.12.17</scala.version>
    <maven-scala-plugin.version>2.15.2</maven-scala-plugin.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>2.13.6</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.13</artifactId>
      <version>3.4.0</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.13</artifactId>
      <version>3.4.0</version>
      <scope>provided</scope>
The pom file is attached. These are the errors I am getting
[ERROR] error: error while loading package, class file '/home/hduser/.m2/repository/org/scala-lang/scala-library/2.13.6/scala-library-2.13.6.jar(scala/reflect/package.class)' is broken
[ERROR] (class java.lang.RuntimeException/error reading Scala signature of package.class: Scala signature package has wrong version
[ERROR] error: error while loading package, class file '/home/hduser/.m2/repository/org/scala-lang/scala-library/2.13.6/scala-library-2.13.6.jar(scala/package.class)' is broken
[ERROR] (class java.lang.RuntimeException/error reading Scala signature of package.class: Scala signature package has wrong version
[ERROR] error: error while loading package, class file '/home/hduser/.m2/repository/org/scala-lang/scala-library/2.13.6/scala-library-2.13.6.jar(scala/collection/package.class)' is broken
[ERROR] (class java.lang.RuntimeException/error reading Scala signature of package.class: Scala signature package has wrong version
[ERROR] three errors found
[ERROR] Failed to execute goal org.scala-tools:maven-scala-plugin:2.15.2:compile (default) on project scala: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 1(Exit value: 1) -> [Help 1]
Any ideas will be appreciated.
--
Bjørn Jørgensen