To: Anfernee Xu
Cc: "user@spark.apache.org"
Subject: Re: Custom Hadoop InputSplit, Spark partitions, spark executors/task and Yarn containers
> From: Sandy Ryza
> Date: Thursday, September 24, 2015 at 2:43 AM
> To: Anfernee Xu
> Cc: "user@spark.apache.org"
> Subject: Re: Custom Hadoop InputSplit, Spark partitions, spark
> executors/task and Yarn containers
Hi Spark experts,
I'm coming across these terms and I'm having some confusion; could you
please help me understand them better?
For instance, I have implemented a Hadoop InputFormat to load my external
data into Spark; in turn, my custom InputFormat will create a bunch of
InputSplits, my
Hi Anfernee,
That's correct that each InputSplit will map to exactly one Spark partition.
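To make that one-to-one mapping concrete, here is a simplified Python model of it. This is not real Spark or Hadoop code; all class and method names below are illustrative stand-ins for how a Hadoop-backed RDD derives one partition per InputSplit returned by the InputFormat's getSplits:

```python
# Simplified model of the InputSplit -> partition mapping.
# All names here are illustrative, not the real Spark/Hadoop API.

from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class InputSplit:
    path: str
    start: int
    length: int


class MyInputFormat:
    """Stand-in for a custom Hadoop InputFormat."""

    def __init__(self, files: List[Tuple[str, int]]):
        # files: (path, size-in-bytes) pairs of the external data
        self.files = files

    def get_splits(self, split_size: int) -> List[InputSplit]:
        # Chop each file into splits of at most split_size bytes.
        splits = []
        for path, size in self.files:
            offset = 0
            while offset < size:
                length = min(split_size, size - offset)
                splits.append(InputSplit(path, offset, length))
                offset += length
        return splits


@dataclass
class Partition:
    index: int
    split: InputSplit


def get_partitions(input_format: MyInputFormat, split_size: int) -> List[Partition]:
    # The core of the mapping: exactly one partition per InputSplit.
    return [Partition(i, s) for i, s in enumerate(input_format.get_splits(split_size))]


fmt = MyInputFormat([("a.dat", 250), ("b.dat", 100)])
parts = get_partitions(fmt, 100)
print(len(parts))  # 4: a.dat yields 3 splits (100, 100, 50 bytes), b.dat yields 1
```

So however many splits your custom InputFormat returns, that is the partition count (and hence the number of tasks) of the resulting RDD's first stage.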
On YARN, each Spark executor maps to a single YARN container. Each
executor can run multiple tasks over its lifetime, both in parallel and
sequentially.
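That executor/container sizing is fixed at submit time. As a sketch (a config fragment, assuming a YARN deployment; `my_app.jar` and the values are placeholders), each of the 4 executors below gets one YARN container and runs up to 2 tasks concurrently:

```shell
# One YARN container per executor; --executor-cores bounds how many
# tasks each executor runs in parallel (with the default spark.task.cpus=1).
spark-submit \
  --master yarn \
  --num-executors 4 \
  --executor-cores 2 \
  --executor-memory 2g \
  my_app.jar
```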
If you enable dynamic allocation, after the stage
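For reference, dynamic allocation is switched on with configuration along these lines (a config fragment; the values and `my_app.jar` are placeholders, not recommendations). On YARN it also requires the external shuffle service:

```shell
# Illustrative dynamic-allocation settings; executors idle longer than
# the idle timeout are released back to YARN.
spark-submit \
  --master yarn \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=1 \
  --conf spark.dynamicAllocation.maxExecutors=10 \
  --conf spark.dynamicAllocation.executorIdleTimeout=60s \
  my_app.jar
```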