I've been using a standalone cluster all this time and it worked fine.
Recently I've started using another Spark cluster that is based on YARN, and I
have no experience with YARN.

The YARN cluster has 10 nodes with 480 GB of memory in total.

I'm having trouble starting the spark-shell with enough memory.
I'm doing a very simple operation - reading a 100 GB file from HDFS and
running a count on it. This fails with out-of-memory errors on the executors.
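
Concretely, the operation is nothing more than the following (the HDFS path
here is just a placeholder):

    val lines = sc.textFile("hdfs:///path/to/100gb/file")
    lines.count()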

Can someone point me to the command line parameters I should use for
spark-shell so that this count succeeds?
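
For reference, I assume the invocation should look roughly like the sketch
below, but all the flag values are guesses on my part (e.g. one executor per
node, 10 x 40g = 400 GB, leaving some headroom for YARN overhead):

    # flag values below are placeholders I'm unsure about
    spark-shell \
      --master yarn \
      --num-executors 10 \
      --executor-memory 40g \
      --executor-cores 4 \
      --driver-memory 4g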


Thanks
-Soumya
