spark.sparkContext.textFile("s3a://a_bucket/models/random_forest_zepp/bestModel/metadata",
1).getNumPartitions()
When I run the above code, I get the error below. Can anyone advise how to troubleshoot? I'm
using Spark 3.3.0. The above file path exists.
--
Hey everyone,
I wanted to share my latest paper, "A Grey Literature Review on Data Stream
Processing Applications Testing," published in the Journal of Systems and Software
(JSS), Elsevier.
This paper provides unique industry insights, addresses the challenges
faced in Data Stream Processing (DSP) applicat
Just to correct the last sentence: if we end up starting a new instance of
Spark, I don't think it will be able to read the shuffle data written to storage
by another instance. I stand corrected.
Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies Limited
London
United Ki
Hi Maksym.
Let us understand the basics here first.
My thoughts: Spark replicates the partitions among multiple nodes. If one
executor fails, it moves the processing over to the other executor.
However, if the data is lost, it re-executes the processing that generated
the data,
and might have to go b
Hey vaquar,
The link doesn't explain the crucial detail we're interested in: does an executor
reuse the data that exists on a node from a previous executor, and if not, how
can we configure it to do so?
We are not running on Kubernetes, so EKS/Kubernetes-specific advice isn't
very relevant.
We are ru