I am fairly new to Spark. I have configured 3 machines (2 slaves) as a standalone
cluster. I just wanted to know what exactly the following means:
[Stage 0:==(25 + 4) / 500]
This gets printed to the terminal when I submit my app. I realize that there
are a lot of ways to configure my application in Spark.
The part that is not clear to me is how to decide, for example, how many
partitions I should divide my data into, how much RAM I should allocate, or
how many workers I should initialize.
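To make the question concrete, this is roughly how I am submitting the app at the moment (the host name, memory, and core numbers are just placeholder values for my cluster):

```shell
# Submit to the standalone master; all resource numbers below are
# placeholders -- these are exactly the knobs I don't know how to choose.
spark-submit \
  --master spark://master:7077 \
  --executor-memory 4G \
  --total-executor-cores 8 \
  my_app.py
```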
How do I decide how many partitions to break my data into, and how many
executors I should have? I guess memory and cores will be allocated based on
the number of executors I have.
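For reference, here is the back-of-the-envelope calculation I am doing right now. The core counts are placeholders for my machines, and the "2-3 tasks per CPU core" figure is just something I read in the Spark tuning guide; please correct me if this reasoning is wrong:

```python
# Rough sizing sketch -- all numbers are placeholders for my cluster.
workers = 2            # my two slave machines
cores_per_worker = 4   # assumed; whatever each box actually has
total_cores = workers * cores_per_worker

# The Spark tuning guide suggests 2-3 tasks per CPU core in the cluster,
# so I multiply total cores by 3 to pick a partition count.
partitions = total_cores * 3

print(total_cores)   # 8
print(partitions)    # 24
```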
Thanks