Re: DataStreamReader cleanSource option

2022-02-03 Thread Jungtaek Lim
Hi, could you please set the config "spark.sql.streaming.fileSource.cleaner.numThreads" to 0 and see whether it works? (NOTE: this will slow down your process, since the cleaning phase will then happen in the foreground. The default is to clean in the background with 1 thread. You can also try more than one thread.) If it
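For readers following along, here is a minimal sketch of how the suggested setting could be applied when building the session. The config key and the value 0 come from the message above; the app name, the input path, the "text" format, and the choice of cleanSource = "delete" are illustrative assumptions, not details from the thread.

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: appName and the input path below are hypothetical placeholders.
val spark = SparkSession.builder()
  .appName("clean-source-foreground-test")
  // Per the suggestion above: 0 threads moves cleaning to the foreground.
  .config("spark.sql.streaming.fileSource.cleaner.numThreads", "0")
  .getOrCreate()

// A file-source stream with the cleanSource option enabled.
// "delete" is assumed here; "archive" would also need sourceArchiveDir.
val input = spark.readStream
  .format("text")
  .option("cleanSource", "delete")
  .load("/path/to/input")
```

Setting the value back to 1 or higher should return cleaning to background threads, as the note in the message describes.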

Re: Spark 3.1.2 full thread dumps

2022-02-03 Thread Maksim Grinman
It's actually on AWS EMR. The job bootstraps and runs fine -- the autoscaling group is there to bring up a service that Spark will be calling. Some code waits for the autoscaling group to come up before continuing processing in Spark, since the Spark cluster will need to make requests to the service in

Re: Spark 3.1.2 full thread dumps

2022-02-03 Thread Mich Talebzadeh
Sounds like you are running this on a Google Dataproc cluster (Spark 3.1.2) with an autoscaling policy? Can you describe whether this happens before Spark starts a new job on the cluster or halfway through processing an existing job? Also, does the job involve Spark Structured Streaming?

Spark 3.1.2 full thread dumps

2022-02-03 Thread Maksim Grinman
We've got a Spark task that, after some processing, starts an autoscaling group and waits for it to be up before continuing processing. While waiting for the autoscaling group, Spark starts throwing full thread dumps, presumably at the spark.executor.heartbeatInterval interval. Is there a way to prevent

Re: Spark 3.1 Json4s-native jar compatibility

2022-02-03 Thread Sean Owen
You can look it up: https://github.com/apache/spark/blob/branch-3.2/pom.xml#L916
3.7.0-M11

On Thu, Feb 3, 2022 at 1:57 PM Amit Sharma wrote:
> Hello, everyone. I am migrating my spark stream to spark version 3.1. I
> also upgraded json version as below
>
> libraryDependencies += "org.json4s"
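As a rough illustration of acting on that lookup, the build.sbt fragment below pins json4s-native to the version quoted in the reply. Note that the linked pom.xml is from branch-3.2; whether 3.7.0-M11 is also the right value for a Spark 3.1 deployment is an assumption to verify against the pom.xml of the matching branch. The `json4sVersion` name is a hypothetical helper introduced here.

```scala
// build.sbt fragment (sketch). json4sVersion is a hypothetical helper name.
// 3.7.0-M11 is the value quoted in the reply, taken from branch-3.2's pom.xml;
// check the pom.xml of the Spark branch you actually run against.
val json4sVersion = "3.7.0-M11"

libraryDependencies += "org.json4s" %% "json4s-native" % json4sVersion
```

Keeping the version in one named value makes it easier to change in step with the Spark version the job runs on.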

Spark 3.1 Json4s-native jar compatibility

2022-02-03 Thread Amit Sharma
Hello, everyone. I am migrating my Spark streaming job to Spark version 3.1. I also upgraded the json4s version as below:
libraryDependencies += "org.json4s" %% "json4s-native" % "3.7.0-M5"
While running the job I am getting an error for the code below, where I am serializing the given inputs.
implicit val