Re: [Spark Structured Streaming]: Dynamic Scaling of Executors

2023-05-29 Thread Aishwarya Panicker
Hi,


Thanks for your response.


I understand there is no explicit way to configure dynamic scaling for
Spark Structured Streaming, as the ticket for that is still open. But is
there a way to manage dynamic scaling with the existing batch dynamic
allocation algorithm, since that is what kicks in when dynamic allocation is
enabled with Structured Streaming? The issue I'm facing with batch dynamic
allocation is that it requests executors based on pending/running tasks. To
get parallelism we have set spark.sql.shuffle.partitions: "100", which
creates 100 partitions and therefore 100 tasks, and that causes more
executors to be requested (so it does not scale based on load). Is there a
mechanism to control this autoscaling of executors based on the data load?
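
To illustrate the knobs I am referring to, here is a minimal sketch of how
the settings could be passed in code (the values are placeholders for
illustration, not our actual configuration):

```scala
import org.apache.spark.sql.SparkSession

// Placeholder values for illustration only: cap how far batch dynamic
// allocation can scale out and reduce the number of shuffle tasks it sees.
val spark = SparkSession.builder()
  .appName("StructuredStreamingJob")
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.dynamicAllocation.minExecutors", "1")
  .config("spark.dynamicAllocation.maxExecutors", "5")
  // request executors for only a fraction of the pending tasks
  .config("spark.dynamicAllocation.executorAllocationRatio", "0.5")
  // fewer shuffle partitions -> fewer tasks -> fewer executors requested
  .config("spark.sql.shuffle.partitions", "20")
  .getOrCreate()
```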


Additionally, the Spark Streaming dynamic allocation algorithm autoscales
executors based on the ratio of processing time to batch interval, which
would be the preferred method for a streaming use case. So is there a
provision to use the streaming configurations instead of the batch-mode
configurations with Structured Streaming?
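
To make that concrete, the decision rule I mean looks roughly like the
sketch below; this is only an illustration of the documented behaviour
(using the ratios from the configuration further down), not the actual
Spark source:

```scala
// Simplified illustration of the Spark Streaming dynamic allocation rule:
// compare the average batch processing time with the batch interval and
// decide whether to scale. Thresholds mirror scaleUpRatio / scaleDownRatio.
def scalingDecision(avgProcessingTimeMs: Double,
                    batchIntervalMs: Double,
                    scaleUpRatio: Double = 0.7,
                    scaleDownRatio: Double = 0.2): String = {
  val ratio = avgProcessingTimeMs / batchIntervalMs
  if (ratio >= scaleUpRatio) "request more executors"
  else if (ratio <= scaleDownRatio) "release an executor"
  else "keep the current number of executors"
}
```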


Any suggestions on the above would be helpful.


Thanks and Regards,

Aishwarya


On Thu, 25 May, 2023, 11:46 PM Mich Talebzadeh, 
wrote:

> Hi,
> Autoscaling is not compatible with Spark Structured Streaming, since
> Spark Structured Streaming currently does not support dynamic allocation
> (see SPARK-24815: Structured Streaming should support dynamic allocation).
>
> That ticket is still open
>
> HTH
>
> Mich Talebzadeh,
> Lead Solutions Architect/Engineering Lead
> Palantir Technologies Limited
> London
> United Kingdom
>
>
>view my Linkedin profile
> 
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Thu, 25 May 2023 at 18:44, Aishwarya Panicker <
> aishwaryapanicke...@gmail.com> wrote:
>
>> Hi Team,
>>
>> I have been working on Spark Structured Streaming and trying to autoscale
>> our application through dynamic allocation. But I couldn't find any
>> documentation or configurations that supports dynamic scaling in Spark
>> Structured Streaming, due to which I had been using Spark Batch mode
>> dynamic scaling which is not so efficient with streaming use case.
>>
>> I also tried with Spark streaming dynamic allocation configurations which
>> didn't work with structured streaming.
>>
>> Below are the configurations I tried for dynamic scaling of my Spark
>> Structured Streaming Application:
>>
>> With Batch Spark configurations:
>>
>> spark.dynamicAllocation.enabled: true
>> spark.dynamicAllocation.executorAllocationRatio: 0.5
>> spark.dynamicAllocation.minExecutors: 1
>> spark.dynamicAllocation.maxExecutors: 5
>>
>>
>> With Streaming Spark configurations:
>>
>> spark.dynamicAllocation.enabled: false
>> spark.streaming.dynamicAllocation.enabled: true
>> spark.streaming.dynamicAllocation.scaleUpRatio: 0.7
>> spark.streaming.dynamicAllocation.scaleDownRatio: 0.2
>> spark.streaming.dynamicAllocation.minExecutors: 1
>> spark.streaming.dynamicAllocation.maxExecutors: 5
>>
>> Kindly let me know if there is any configuration for the dynamic
>> allocation of Spark Structured Streaming which I'm missing due to which
>> autoscaling of my application is not working properly.
>>
>> Awaiting your response.
>>
>> Thanks and Regards,
>> Aishwarya
>>
>>
>>
>>
>>


Re: Re: maven with Spark 3.4.0 fails compilation

2023-05-29 Thread Bjørn Jørgensen
2.13.8


You must change 2.13.6 to 2.13.8.


On Mon, 29 May 2023 at 18:02, Mich Talebzadeh wrote:

> Thanks everyone. Still not much progress :(. It is becoming a bit
> confusing as I am getting this error
>
> Compiling ReduceByKey
> [INFO] Scanning for projects...
> [INFO]
> [INFO] -< spark:ReduceByKey
> >--
> [INFO] Building ReduceByKey 3.0
> [INFO]   from pom.xml
> [INFO] [ jar
> ]-
> [INFO]
> [INFO] --- resources:3.3.0:resources (default-resources) @ ReduceByKey ---
> [WARNING] Using platform encoding (ANSI_X3.4-1968 actually) to copy
> filtered resources, i.e. build is platform dependent!
> [INFO] skip non existing resourceDirectory
> /data6/hduser/scala/ReduceByKey/src/main/resources
> [INFO]
> [INFO] --- compiler:3.10.1:compile (default-compile) @ ReduceByKey ---
> [INFO] Nothing to compile - all classes are up to date
> [INFO]
> [INFO] --- scala:2.15.2:compile (default) @ ReduceByKey ---
> [INFO] Checking for multiple versions of scala
> [WARNING]  Expected all dependencies to require Scala version: 2.13.8
> [WARNING]  spark:ReduceByKey:3.0 requires scala version: 2.13.8
> [WARNING]  org.scala-lang.modules:scala-parallel-collections_2.13:1.0.4
> requires scala version: 2.13.6
> [WARNING] Multiple versions of scala libraries detected!
> [INFO] includes = [**/*.java,**/*.scala,]
> [INFO] excludes = []
> [INFO] Nothing to compile - all classes are up to date
> [INFO]
> [INFO] --- resources:3.3.0:testResources (default-testResources) @
> ReduceByKey ---
> [WARNING] Using platform encoding (ANSI_X3.4-1968 actually) to copy
> filtered resources, i.e. build is platform dependent!
> [INFO] skip non existing resourceDirectory
> /data6/hduser/scala/ReduceByKey/src/test/resources
> [INFO]
> [INFO] --- compiler:3.10.1:testCompile (default-testCompile) @ ReduceByKey
> ---
> [INFO] No sources to compile
> [INFO]
> [INFO] --- surefire:3.0.0:test (default-test) @ ReduceByKey ---
> [INFO] No tests to run.
> [INFO]
> [INFO] --- jar:3.3.0:jar (default-jar) @ ReduceByKey ---
> [INFO] Building jar: /data6/hduser/scala/ReduceByKey/target/ReduceByKey.jar
> [INFO]
> [INFO] --- shade:1.6:shade (default) @ ReduceByKey ---
> [INFO] Including org.scala-lang:scala-library:jar:2.13.8 in the shaded jar.
> [INFO] Replacing original artifact with shaded artifact.
> [INFO] Replacing /data6/hduser/scala/ReduceByKey/target/ReduceByKey.jar
> with /data6/hduser/scala/ReduceByKey/target/ReduceByKey-3.0-shaded.jar
> [INFO]
> 
> [INFO] BUILD SUCCESS
> [INFO]
> 
> [INFO] Total time:  3.321 s
> [INFO] Finished at: 2023-05-29T16:54:47+01:00
> [INFO]
> 
> [WARNING]
> [WARNING] Plugin validation issues were detected in 4 plugin(s)
> [WARNING]
> [WARNING]  * org.scala-tools:maven-scala-plugin:2.15.2
> [WARNING]  * org.apache.maven.plugins:maven-compiler-plugin:3.10.1
> [WARNING]  * org.apache.maven.plugins:maven-shade-plugin:1.6
> [WARNING]  * org.apache.maven.plugins:maven-resources-plugin:3.3.0
> [WARNING]
> [WARNING] For more or less details, use 'maven.plugin.validation' property
> with one of the values (case insensitive): [BRIEF, DEFAULT, VERBOSE]
> [WARNING]
> Completed compiling
> Mon May 29 16:54:47 BST 2023 , Running in **local mode**
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
> details.
>
>  Application is ReduceByKey
>
> Exception in thread "main" java.lang.NoSuchMethodError:
> scala.package$.Seq()Lscala/collection/immutable/Seq$;
> at ReduceByKey$.main(ReduceByKey.scala:23)
> at ReduceByKey.main(ReduceByKey.scala)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> at
> org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
> at org.apache.spark.deploy.SparkSubmit.org
> $apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1020)
> at
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
> at
> org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
> at
> org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
> at
> 

Re: Re: maven with Spark 3.4.0 fails compilation

2023-05-29 Thread Mich Talebzadeh
Thanks everyone. Still not much progress :(. It is becoming a bit confusing
as I am getting this error

Compiling ReduceByKey
[INFO] Scanning for projects...
[INFO]
[INFO] -< spark:ReduceByKey
>--
[INFO] Building ReduceByKey 3.0
[INFO]   from pom.xml
[INFO] [ jar
]-
[INFO]
[INFO] --- resources:3.3.0:resources (default-resources) @ ReduceByKey ---
[WARNING] Using platform encoding (ANSI_X3.4-1968 actually) to copy
filtered resources, i.e. build is platform dependent!
[INFO] skip non existing resourceDirectory
/data6/hduser/scala/ReduceByKey/src/main/resources
[INFO]
[INFO] --- compiler:3.10.1:compile (default-compile) @ ReduceByKey ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- scala:2.15.2:compile (default) @ ReduceByKey ---
[INFO] Checking for multiple versions of scala
[WARNING]  Expected all dependencies to require Scala version: 2.13.8
[WARNING]  spark:ReduceByKey:3.0 requires scala version: 2.13.8
[WARNING]  org.scala-lang.modules:scala-parallel-collections_2.13:1.0.4
requires scala version: 2.13.6
[WARNING] Multiple versions of scala libraries detected!
[INFO] includes = [**/*.java,**/*.scala,]
[INFO] excludes = []
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- resources:3.3.0:testResources (default-testResources) @
ReduceByKey ---
[WARNING] Using platform encoding (ANSI_X3.4-1968 actually) to copy
filtered resources, i.e. build is platform dependent!
[INFO] skip non existing resourceDirectory
/data6/hduser/scala/ReduceByKey/src/test/resources
[INFO]
[INFO] --- compiler:3.10.1:testCompile (default-testCompile) @ ReduceByKey
---
[INFO] No sources to compile
[INFO]
[INFO] --- surefire:3.0.0:test (default-test) @ ReduceByKey ---
[INFO] No tests to run.
[INFO]
[INFO] --- jar:3.3.0:jar (default-jar) @ ReduceByKey ---
[INFO] Building jar: /data6/hduser/scala/ReduceByKey/target/ReduceByKey.jar
[INFO]
[INFO] --- shade:1.6:shade (default) @ ReduceByKey ---
[INFO] Including org.scala-lang:scala-library:jar:2.13.8 in the shaded jar.
[INFO] Replacing original artifact with shaded artifact.
[INFO] Replacing /data6/hduser/scala/ReduceByKey/target/ReduceByKey.jar
with /data6/hduser/scala/ReduceByKey/target/ReduceByKey-3.0-shaded.jar
[INFO]

[INFO] BUILD SUCCESS
[INFO]

[INFO] Total time:  3.321 s
[INFO] Finished at: 2023-05-29T16:54:47+01:00
[INFO]

[WARNING]
[WARNING] Plugin validation issues were detected in 4 plugin(s)
[WARNING]
[WARNING]  * org.scala-tools:maven-scala-plugin:2.15.2
[WARNING]  * org.apache.maven.plugins:maven-compiler-plugin:3.10.1
[WARNING]  * org.apache.maven.plugins:maven-shade-plugin:1.6
[WARNING]  * org.apache.maven.plugins:maven-resources-plugin:3.3.0
[WARNING]
[WARNING] For more or less details, use 'maven.plugin.validation' property
with one of the values (case insensitive): [BRIEF, DEFAULT, VERBOSE]
[WARNING]
Completed compiling
Mon May 29 16:54:47 BST 2023 , Running in **local mode**
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
details.

 Application is ReduceByKey

Exception in thread "main" java.lang.NoSuchMethodError:
scala.package$.Seq()Lscala/collection/immutable/Seq$;
at ReduceByKey$.main(ReduceByKey.scala:23)
at ReduceByKey.main(ReduceByKey.scala)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
Method)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at
org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org
$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1020)
at
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
at
org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
at
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)



Part of pom.xml is here


<project xmlns="https://maven.apache.org/POM/4.0.0" xmlns:xsi="https://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="https://maven.apache.org/POM/4.0.0 https://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>spark</groupId>
    <version>3.0</version>
    <artifactId>ReduceByKey</artifactId>

Re: Re: maven with Spark 3.4.0 fails compilation

2023-05-29 Thread Bjørn Jørgensen
Change



<dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>2.13.11-M2</version>
</dependency>


to



<dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>${scala.version}</version>
</dependency>

On Mon, 29 May 2023 at 13:20, Lingzhe Sun wrote:

> Hi Mich,
>
> Spark 3.4.0 prebuilt with Scala 2.13 is built with version 2.13.8.
> Since you are using spark-core_2.13 and spark-sql_2.13, you should stick to
> that major (13) and minor version (8). Not doing so may cause unexpected
> behaviour (although Scala claims compatibility across minor version changes,
> I've encountered problems using a Scala package with the same major version
> and a different minor version; that may be due to bug fixes and upgrades of
> Scala itself).
> And although I have not encountered such a problem myself, this can be a
> pitfall for you.
>
> --
> Best Regards!
> ...
> Lingzhe Sun
> Hirain Technology
>
>
> *From:* Mich Talebzadeh 
> *Date:* 2023-05-29 17:55
> *To:* Bjørn Jørgensen 
> *CC:* user @spark 
> *Subject:* Re: maven with Spark 3.4.0 fails compilation
> Thanks for your helpful comments Bjorn.
>
> I managed to compile the code with Maven, but when I run it, it fails with
>
>   Application is ReduceByKey
>
> Exception in thread "main" java.lang.NoSuchMethodError:
> scala.package$.Seq()Lscala/collection/immutable/Seq$;
> at ReduceByKey$.main(ReduceByKey.scala:23)
> at ReduceByKey.main(ReduceByKey.scala)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
> at
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> at
> org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
> at org.apache.spark.deploy.SparkSubmit.org
> $apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1020)
> at
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
> at
> org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
> at
> org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
> at
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:)
> at
> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> I attach the pom.xml and the sample scala code is self contained and
> basic. Again it runs with SBT with no issues.
>
> FYI, my scala version on host is
>
>  scala -version
> Scala code runner version 2.13.6 -- Copyright 2002-2021, LAMP/EPFL and
> Lightbend, Inc.
>
> I think I have a Scala incompatibility somewhere again
>
> Cheers
>
>
> Mich Talebzadeh,
> Lead Solutions Architect/Engineering Lead
> Palantir Technologies Limited
> London
> United Kingdom
>
>
>view my Linkedin profile
> 
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Sun, 28 May 2023 at 20:29, Bjørn Jørgensen 
> wrote:
>
>> From chatgpt4
>>
>>
>> The problem appears to be that there is a mismatch between the version of
>> Scala used by the Scala Maven plugin and the version of the Scala library
>> defined as a dependency in your POM. You've defined your Scala version in
>> your properties as `2.12.17` but you're pulling in `scala-library` version
>> `2.13.6` as a dependency.
>>
>> The Scala Maven plugin will be using the Scala version defined in the
>> `scala.version` property for compilation, but then it tries to load classes
>> from a different Scala version, hence the error.
>>
>> To resolve this issue, make sure the `scala.version` property matches the
>> version of `scala-library` defined in your dependencies. In your case, you
>> may want to change `scala.version` to `2.13.6`.
>>
>> Here's the corrected part of your POM:
>>
>> ```xml
>> <properties>
>>   <maven.compiler.source>1.7</maven.compiler.source>
>>   <maven.compiler.target>1.7</maven.compiler.target>
>>   <encoding>UTF-8</encoding>
>>   <scala.version>2.13.6</scala.version>
>>   <scala.maven.plugin.version>2.15.2</scala.maven.plugin.version> <!-- maven-scala-plugin version -->
>> </properties>
>> ```
>>
>> Additionally, ensure that the Scala versions in the Spark dependencies
>> match the `scala.version` property as well. If you've updated the Scala
>> version to `2.13.6`, the artifactIds for Spark dependencies should be
>> `spark-core_2.13` and `spark-sql_2.13`.
>>
>> Another thing to consider: your Java version defined in
>> `maven.compiler.source` and `maven.compiler.target` is `1.7`, 

Re: JDK version support information

2023-05-29 Thread Sean Owen
Per the docs, it is Java 8. It's possible that Java 11 partly works with 2.x,
but it is not supported. Then again, 2.x itself is no longer supported either.
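
If it helps, a quick way to confirm which JVM a driver or spark-shell is
actually running on (just a sketch, nothing version-specific):

```scala
// Print the JVM the current process runs on; handy from a spark-shell on the
// driver. On JDK 8 this prints something like 1.8.0_xxx, on JDK 11 like 11.0.x.
println(System.getProperty("java.version"))
println(System.getProperty("java.vm.name"))
```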

On Mon, May 29, 2023, 6:43 AM Poorna Murali  wrote:

> We are currently using JDK 11 and spark 2.4.5.1 is working fine with that.
> So, we wanted to check the maximum JDK version supported for 2.4.5.1.
>
> On Mon, 29 May, 2023, 5:03 pm Aironman DirtDiver, 
> wrote:
>
>> Spark version 2.4.5.1 is based on Apache Spark 2.4.5. According to the
>> official Spark documentation for version 2.4.5, the maximum supported JDK
>> (Java Development Kit) version is JDK 8 (Java 8).
>>
>> Spark 2.4.5 is not compatible with JDK versions higher than Java 8.
>> Therefore, you should use JDK 8 to ensure compatibility and avoid any
>> potential issues when running Spark 2.4.5.
>>
>> On Mon, 29 May 2023 at 13:28, Poorna Murali wrote:
>>
>>> Hi,
>>>
>>> We are using spark version 2.4.5.1. We would like to know the maximum
>>> JDK version supported for the same.
>>>
>>> Thanks,
>>> Poorna
>>>
>>
>>
>> --
>> Alonso Isidoro Roman
>> about.me/alonso.isidoro.roman
>>
>> 
>>
>


Re: JDK version support information

2023-05-29 Thread Poorna Murali
We are currently using JDK 11 and spark 2.4.5.1 is working fine with that.
So, we wanted to check the maximum JDK version supported for 2.4.5.1.

On Mon, 29 May, 2023, 5:03 pm Aironman DirtDiver, 
wrote:

> Spark version 2.4.5.1 is based on Apache Spark 2.4.5. According to the
> official Spark documentation for version 2.4.5, the maximum supported JDK
> (Java Development Kit) version is JDK 8 (Java 8).
>
> Spark 2.4.5 is not compatible with JDK versions higher than Java 8.
> Therefore, you should use JDK 8 to ensure compatibility and avoid any
> potential issues when running Spark 2.4.5.
>
> On Mon, 29 May 2023 at 13:28, Poorna Murali wrote:
>
>> Hi,
>>
>> We are using spark version 2.4.5.1. We would like to know the maximum JDK
>> version supported for the same.
>>
>> Thanks,
>> Poorna
>>
>
>
> --
> Alonso Isidoro Roman
> about.me/alonso.isidoro.roman
>
> 
>


Re: JDK version support information

2023-05-29 Thread Aironman DirtDiver
Spark version 2.4.5.1 is based on Apache Spark 2.4.5. According to the
official Spark documentation for version 2.4.5, the maximum supported JDK
(Java Development Kit) version is JDK 8 (Java 8).

Spark 2.4.5 is not compatible with JDK versions higher than Java 8.
Therefore, you should use JDK 8 to ensure compatibility and avoid any
potential issues when running Spark 2.4.5.

On Mon, 29 May 2023 at 13:28, Poorna Murali wrote:

> Hi,
>
> We are using spark version 2.4.5.1. We would like to know the maximum JDK
> version supported for the same.
>
> Thanks,
> Poorna
>


-- 
Alonso Isidoro Roman
about.me/alonso.isidoro.roman



JDK version support information

2023-05-29 Thread Poorna Murali
Hi,

We are using spark version 2.4.5.1. We would like to know the maximum JDK
version supported for the same.

Thanks,
Poorna


Re: Re: maven with Spark 3.4.0 fails compilation

2023-05-29 Thread Lingzhe Sun
Hi Mich,

Spark 3.4.0 prebuilt with Scala 2.13 is built with version 2.13.8. Since you
are using spark-core_2.13 and spark-sql_2.13, you should stick to that major
(13) and minor version (8). Not doing so may cause unexpected behaviour
(although Scala claims compatibility across minor version changes, I've
encountered problems using a Scala package with the same major version and a
different minor version; that may be due to bug fixes and upgrades of Scala
itself).
And although I have not encountered such a problem myself, this can be a
pitfall for you.
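
As a quick sanity check, something like the sketch below prints the Scala
library and Spark versions that actually end up on the runtime classpath,
which is what has to line up with the _2.13 artifacts:

```scala
import org.apache.spark.sql.SparkSession

// Minimal check: report the Scala library and Spark versions loaded at runtime.
object VersionCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("VersionCheck")
      .master("local[*]") // assume a local run just for the check
      .getOrCreate()

    // scala.util.Properties reports the scala-library jar actually loaded.
    println(s"Scala library: ${scala.util.Properties.versionNumberString}")
    println(s"Spark version: ${spark.version}")

    spark.stop()
  }
}
```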



Best Regards!
...
Lingzhe Sun 
Hirain Technology

 
From: Mich Talebzadeh
Date: 2023-05-29 17:55
To: Bjørn Jørgensen
CC: user @spark
Subject: Re: maven with Spark 3.4.0 fails compilation
Thanks for your helpful comments Bjorn.

I managed to compile the code with Maven, but when I run it, it fails with

  Application is ReduceByKey

Exception in thread "main" java.lang.NoSuchMethodError: 
scala.package$.Seq()Lscala/collection/immutable/Seq$;
at ReduceByKey$.main(ReduceByKey.scala:23)
at ReduceByKey.main(ReduceByKey.scala)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at 
org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1020)
at 
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
at 
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

I attach the pom.xml and the sample scala code is self contained and basic. 
Again it runs with SBT with no issues.

FYI, my scala version on host is

 scala -version
Scala code runner version 2.13.6 -- Copyright 2002-2021, LAMP/EPFL and 
Lightbend, Inc.

I think I have a Scala incompatibility somewhere again

Cheers


Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies Limited
London
United Kingdom

   view my Linkedin profile

 https://en.everybodywiki.com/Mich_Talebzadeh
 
Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
damage or destruction of data or any other property which may arise from 
relying on this email's technical content is explicitly disclaimed. The author 
will in no case be liable for any monetary damages arising from such loss, 
damage or destruction. 
 


On Sun, 28 May 2023 at 20:29, Bjørn Jørgensen  wrote:
From chatgpt4 


The problem appears to be that there is a mismatch between the version of Scala 
used by the Scala Maven plugin and the version of the Scala library defined as 
a dependency in your POM. You've defined your Scala version in your properties 
as `2.12.17` but you're pulling in `scala-library` version `2.13.6` as a 
dependency.

The Scala Maven plugin will be using the Scala version defined in the 
`scala.version` property for compilation, but then it tries to load classes 
from a different Scala version, hence the error.

To resolve this issue, make sure the `scala.version` property matches the 
version of `scala-library` defined in your dependencies. In your case, you may 
want to change `scala.version` to `2.13.6`.

Here's the corrected part of your POM:

```xml

<properties>
  <maven.compiler.source>1.7</maven.compiler.source>
  <maven.compiler.target>1.7</maven.compiler.target>
  <encoding>UTF-8</encoding>
  <scala.version>2.13.6</scala.version>
  <scala.maven.plugin.version>2.15.2</scala.maven.plugin.version> <!-- maven-scala-plugin version -->
</properties>
```

Additionally, ensure that the Scala versions in the Spark dependencies match 
the `scala.version` property as well. If you've updated the Scala version to 
`2.13.6`, the artifactIds for Spark dependencies should be `spark-core_2.13` 
and `spark-sql_2.13`. 

Another thing to consider: your Java version defined in `maven.compiler.source` 
and `maven.compiler.target` is `1.7`, which is quite outdated and might not be 
compatible with the latest versions of these libraries. Consider updating to a 
more recent version of Java, such as Java 8 or above, depending on the 
requirements of the libraries you're using.



The same problem persists in this updated POM file - there's a mismatch in the 
Scala version declared in the properties and the version used in your 
dependencies. Here's what you need to update:

1. Update the Scala version in your properties to match the Scala library and 
your Spark dependencies:

```xml

<properties>
  <maven.compiler.source>1.7</maven.compiler.source>
  <maven.compiler.target>1.7</maven.compiler.target>
  <encoding>UTF-8</encoding>
  <scala.version>2.13.6</scala.version>
  <scala.maven.plugin.version>2.15.2</scala.maven.plugin.version> <!-- maven-scala-plugin version -->
</properties>
```

2. Make sure all your Spark 

Re: maven with Spark 3.4.0 fails compilation

2023-05-29 Thread Mich Talebzadeh
Thanks for your helpful comments Bjorn.

I managed to compile the code with Maven, but when I run it, it fails with

  Application is ReduceByKey

Exception in thread "main" java.lang.NoSuchMethodError:
scala.package$.Seq()Lscala/collection/immutable/Seq$;
at ReduceByKey$.main(ReduceByKey.scala:23)
at ReduceByKey.main(ReduceByKey.scala)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native
Method)
at
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at
org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org
$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1020)
at
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
at
org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
at
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

I attach the pom.xml and the sample scala code is self contained and basic.
Again it runs with SBT with no issues.

FYI, my scala version on host is

 scala -version
Scala code runner version 2.13.6 -- Copyright 2002-2021, LAMP/EPFL and
Lightbend, Inc.

I think I have a Scala incompatibility somewhere again

Cheers


Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies Limited
London
United Kingdom


   view my Linkedin profile



 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Sun, 28 May 2023 at 20:29, Bjørn Jørgensen 
wrote:

> From chatgpt4
>
>
> The problem appears to be that there is a mismatch between the version of
> Scala used by the Scala Maven plugin and the version of the Scala library
> defined as a dependency in your POM. You've defined your Scala version in
> your properties as `2.12.17` but you're pulling in `scala-library` version
> `2.13.6` as a dependency.
>
> The Scala Maven plugin will be using the Scala version defined in the
> `scala.version` property for compilation, but then it tries to load classes
> from a different Scala version, hence the error.
>
> To resolve this issue, make sure the `scala.version` property matches the
> version of `scala-library` defined in your dependencies. In your case, you
> may want to change `scala.version` to `2.13.6`.
>
> Here's the corrected part of your POM:
>
> ```xml
> <properties>
>   <maven.compiler.source>1.7</maven.compiler.source>
>   <maven.compiler.target>1.7</maven.compiler.target>
>   <encoding>UTF-8</encoding>
>   <scala.version>2.13.6</scala.version>
>   <scala.maven.plugin.version>2.15.2</scala.maven.plugin.version> <!-- maven-scala-plugin version -->
> </properties>
> ```
>
> Additionally, ensure that the Scala versions in the Spark dependencies
> match the `scala.version` property as well. If you've updated the Scala
> version to `2.13.6`, the artifactIds for Spark dependencies should be
> `spark-core_2.13` and `spark-sql_2.13`.
>
> Another thing to consider: your Java version defined in
> `maven.compiler.source` and `maven.compiler.target` is `1.7`, which is
> quite outdated and might not be compatible with the latest versions of
> these libraries. Consider updating to a more recent version of Java, such
> as Java 8 or above, depending on the requirements of the libraries you're
> using.
>
>
>
> The same problem persists in this updated POM file - there's a mismatch in
> the Scala version declared in the properties and the version used in your
> dependencies. Here's what you need to update:
>
> 1. Update the Scala version in your properties to match the Scala library
> and your Spark dependencies:
>
> ```xml
> <properties>
>   <maven.compiler.source>1.7</maven.compiler.source>
>   <maven.compiler.target>1.7</maven.compiler.target>
>   <encoding>UTF-8</encoding>
>   <scala.version>2.13.6</scala.version>
>   <scala.maven.plugin.version>2.15.2</scala.maven.plugin.version> <!-- maven-scala-plugin version -->
> </properties>
> ```
>
> 2. Make sure all your Spark dependencies use the same Scala version. In
> this case, I see `spark-streaming-kafka_2.11` which should be
> `spark-streaming-kafka_2.13` if you're using Scala `2.13.6`.
>
> ```xml
> <dependency>
>     <groupId>org.apache.spark</groupId>
>     <artifactId>spark-streaming-kafka_2.13</artifactId>
>     <version>1.6.3</version>
>     <scope>provided</scope>
> </dependency>
> ```
>
> 3. As mentioned in the previous message, your Java version
> (`maven.compiler.source` and `maven.compiler.target`) is also quite
> outdated. Depending on the requirements of the libraries you're using, you
> might want to update this to a newer version, such as Java 8 or above.
>
> Finally, ensure that the correct versions of these libraries exist in your
> Maven repository or are available in the central