A few more details I would like to provide (sorry, I should have included these in the previous post):
- Spark version = 0.9.1 (using the pre-built spark-0.9.1-bin-hadoop2)
- Hadoop version = 2.4.0 (Hortonworks)
- I am trying to execute a Spark Streaming program
Because I am using Hortonworks
I was actually able to get this to work. I was NOT setting the classpath
properly originally.
Simply running

java -cp /etc/hadoop/conf/:<yarn and hadoop jars> com.domain.JobClass

and setting yarn-client as the Spark master worked for me. Originally I had not put the Hadoop configuration directory on the classpath.
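For reference, the working invocation looked roughly like the sketch below. The jar paths and the main class name are placeholders for my setup, not something to copy verbatim; in Spark 0.9 the master can also be passed as a spark.* system property, which SparkConf picks up:

```shell
# Sketch of the launch command (paths and class name are placeholders).
# /etc/hadoop/conf must come on the classpath so YARN picks up the cluster
# configuration, followed by the YARN/Hadoop jars and the application jar.
java -cp /etc/hadoop/conf/:/path/to/yarn-and-hadoop-jars/*:/path/to/app.jar \
     -Dspark.master=yarn-client \
     com.domain.JobClass
```

This needs a live YARN cluster, so treat it as a launch recipe rather than a runnable snippet.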
I'm assuming you're running Spark 0.9.x, because in the latest version of Spark you shouldn't have to add HADOOP_CONF_DIR to the Java classpath manually. I tested this on my own YARN cluster and was able to confirm it.
In Spark 1.0, SPARK_MEM is deprecated and should not be used.
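For anyone migrating off SPARK_MEM: memory is now set per role instead of through one environment variable. A minimal sketch using spark-submit (the memory values, class name, and jar path are illustrative only, not recommendations):

```shell
# Spark 1.0+: driver and executor memory are configured separately,
# replacing the old SPARK_MEM environment variable.
# All values and paths below are placeholders for illustration.
spark-submit --master yarn-client \
  --driver-memory 512m \
  --executor-memory 2g \
  --class com.domain.JobClass /path/to/app.jar
```

The same settings can also go in conf/spark-defaults.conf as spark.driver.memory and spark.executor.memory.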
Yes, we are on Spark 0.9.0 so that explains the first piece, thanks!
Also, yes, I meant SPARK_WORKER_MEMORY. Thanks for the hierarchy.
Similarly, is there a best practice for setting SPARK_WORKER_INSTANCES and spark.default.parallelism?
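Not authoritative, but the rule of thumb I've seen (and what the tuning guide suggests for parallelism) sketches roughly like this in conf/spark-env.sh; the numbers assume a hypothetical 8-core, 32 GB worker and a 4-node cluster:

```shell
# conf/spark-env.sh — sketch for a hypothetical 8-core, 32 GB machine.
# SPARK_WORKER_INSTANCES: usually 1 per machine; raise it only if a single
# JVM heap would otherwise be very large (GC pressure).
export SPARK_WORKER_INSTANCES=1
export SPARK_WORKER_MEMORY=24g   # leave headroom for the OS and daemons

# spark.default.parallelism: the tuning guide suggests roughly 2-3 tasks
# per CPU core in the cluster, e.g. 4 workers x 8 cores x 2 = 64.
# In 0.9 this can be passed as a system property via SPARK_JAVA_OPTS,
# or set in SparkConf inside the application.
export SPARK_JAVA_OPTS="-Dspark.default.parallelism=64"
```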
Thanks,
Arun
On Tue, May 20, 2014 at 3:04 PM, Andrew Or
I am encountering the same thing. Basic YARN apps work, as does the SparkPi example, but my custom application gives this result. I am using compute-classpath to create the proper classpath for my application, the same as with SparkPi. Was there a resolution to this issue?
Thanks,
Arun
On Wed, Feb