Hello,

I am currently working on a project in which:

I spawn an Apache Spark MLlib job in standalone mode from an already
running Java process.
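
For context, the parent process just launches spark-submit as a child
process, roughly like this (the spark-submit path, master URL, main class,
and jar path below are placeholders, not my actual values):

import java.io.IOException;

public class SparkJobLauncher {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Placeholder paths/class names; the real ones come from my project configuration.
        ProcessBuilder pb = new ProcessBuilder(
                "/opt/spark/bin/spark-submit",
                "--master", "spark://master-host:7077",
                "--class", "com.example.SparkParallelLoad",
                "/opt/jobs/spark-parallel-load.jar");
        pb.inheritIO();                 // forward the job's stdout/stderr to the parent JVM
        Process sparkJob = pb.start();  // the MLlib job runs in its own JVM
        int exitCode = sparkJob.waitFor();
        System.out.println("spark-submit exited with code " + exitCode);
    }
}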

In the code of the Spark Job I have the following code:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

SparkConf sparkConf = new SparkConf().setAppName("SparkParallelLoad");
sparkConf.set("spark.executor.memory", "8g");
JavaSparkContext sc = new JavaSparkContext(sparkConf);

...

Also, in my ~/spark/conf/spark-env.sh I have the following values:

SPARK_WORKER_CORES=1
export SPARK_WORKER_CORES=1
SPARK_WORKER_MEMORY=2g
export SPARK_WORKER_MEMORY=2g
SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.spark.executor.memory=4g"
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.spark.executor.memory=4g"

At runtime I get a Java OutOfMemoryError and a core dump. My dataset is
less than 1 GB, and I want to make sure it is all cached in memory for my
ML task.
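
For reference, the caching itself is nothing special, along these lines
(the input path and the use of a plain text file are simplified
placeholders for my actual loading code):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.storage.StorageLevel;

public class CacheSketch {
    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf().setAppName("SparkParallelLoad");
        sparkConf.set("spark.executor.memory", "8g");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);

        // "hdfs:///data/training.txt" is a placeholder for my real input path.
        JavaRDD<String> lines = sc.textFile("hdfs:///data/training.txt");

        // Keep the whole (< 1 GB) dataset in executor memory for the ML iterations.
        lines.persist(StorageLevel.MEMORY_ONLY());
        long count = lines.count();     // materializes and caches the RDD
        System.out.println("Cached " + count + " records");

        sc.stop();
    }
}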

Am I increasing the JVM heap memory correctly? Am I doing something wrong?

Thank you,

Nick
