Another aspect to keep in mind is JVM above 8-10GB starts to misbehave.
Typically better to split up ~ 15GB intervals.
if you are choosing machines 10GB/Core is a approx to maintain.
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi
Hi There,
I am new to Spark and I was wondering when you have so much memory on each
machine of the cluster, is it better to run multiple workers with limited
memory on each machine or is it better to run a single worker with access
to the majority of the machine memory? If the answer is it