Hi,
Could you please set the config
"spark.sql.streaming.fileSource.cleaner.numThreads"
to 0 and see whether it works? (NOTE: this will slow down your process, since
the cleaning phase will then happen in the foreground. The default is cleaning
in the background with 1 thread; you can also try more than 1 thread.)
If it
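A minimal sketch of setting that config at session startup (the builder
boilerplate and app name below are my own, not from this thread):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("streaming-job") // hypothetical app name
  // 0 = run the completed-file cleanup in the foreground; the default is 1
  // background thread, and values above 1 add more cleaner threads.
  .config("spark.sql.streaming.fileSource.cleaner.numThreads", "0")
  .getOrCreate()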
It's actually on AWS EMR. The job bootstraps and runs fine -- the
autoscaling group is there to bring up a service that Spark will be calling.
Some code waits for the autoscaling group to come up before continuing
processing in Spark, since the Spark cluster will need to make requests to
the service in
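For what it's worth, the "wait for the autoscaling group" step typically
boils down to polling DescribeAutoScalingGroups until enough instances are
InService. A rough sketch with the AWS SDK v2 (group name, poll interval,
and readiness test are all assumptions, not the actual code):

import software.amazon.awssdk.services.autoscaling.AutoScalingClient
import software.amazon.awssdk.services.autoscaling.model.DescribeAutoScalingGroupsRequest
import scala.collection.JavaConverters._ // scala.jdk.CollectionConverters on 2.13

def waitForAsg(groupName: String): Unit = {
  val client = AutoScalingClient.create()
  val request = DescribeAutoScalingGroupsRequest.builder()
    .autoScalingGroupNames(groupName)
    .build()
  var ready = false
  while (!ready) {
    val group = client.describeAutoScalingGroups(request).autoScalingGroups().asScala.head
    val inService = group.instances().asScala.count(_.lifecycleStateAsString == "InService")
    ready = inService >= group.desiredCapacity()
    if (!ready) Thread.sleep(15000) // poll every 15s until capacity is in service
  }
}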
Sounds like you are running this on a Google Dataproc cluster (Spark 3.1.2)
with an autoscaling policy?
Can you describe whether this happens before Spark starts a new job on the
cluster, or somewhere halfway through processing an existing job?
Also, does the job involve Spark Structured Streaming?
We've got a Spark task that, after some processing, starts an autoscaling
group and waits for it to be up before continuing processing. While waiting
for the autoscaling group, Spark starts throwing full thread dumps,
presumably at the spark.executor.heartbeatInterval. Is there a way to
prevent
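For reference, the setting involved here is spark.executor.heartbeatInterval
(default 10s). A sketch of raising it, with illustrative values only; the
heartbeat interval must stay well below spark.network.timeout (default 120s):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.executor.heartbeatInterval", "60s") // default 10s; illustrative
  .set("spark.network.timeout", "600s") // keep well above the heartbeat interval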
You can look it up:
https://github.com/apache/spark/blob/branch-3.2/pom.xml#L916
3.7.0-M11
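In sbt terms that means pinning json4s to whatever version the pom for your
Spark branch declares, e.g. for branch-3.2 (per the link above):

libraryDependencies += "org.json4s" %% "json4s-native" % "3.7.0-M11"

Check the pom.xml of the branch you actually run before copying the version.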
On Thu, Feb 3, 2022 at 1:57 PM Amit Sharma wrote:
> Hello, everyone. I am migrating my Spark streaming job to Spark version
> 3.1. I also upgraded the json4s version as below
>
> libraryDependencies += "org.json4s"
Hello, everyone. I am migrating my Spark streaming job to Spark version 3.1.
I also upgraded the json4s version as below:
libraryDependencies += "org.json4s" %% "json4s-native" % "3.7.0-M5"
While running the job I am getting an error in the code below, where I am
serializing the given inputs.
implicit val
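For context, a typical json4s-native serialization setup along these lines
looks roughly like the following; the Person case class and its fields are
invented for illustration:

import org.json4s.{DefaultFormats, Formats}
import org.json4s.native.Serialization.write

case class Person(name: String, age: Int) // hypothetical input type

implicit val formats: Formats = DefaultFormats
val json: String = write(Person("Amit", 30)) // {"name":"Amit","age":30}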