When running in standalone cluster mode, how can I verify that more than one
worker is being utilized? I can see from the files in $SPARK_HOME/logs that
multiple workers are being started up, but I don't see any difference in
execution time when I specify 1 worker versus 4, which surprises me.
Here's my SparkConf as defined in my application:
val conf = new SparkConf()
  .setAppName("My App")
  .setMaster("local")
  .setSparkHome("/home/me/spark-1.3.0-bin-hadoop2.4")
  .setJars(List("target/scala-2.10/my-app-project_2.10-1.0.jar"))
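If hard-coding the master turns out to be the problem, here's what I'd try next (just a sketch of what I mean, not something I've confirmed fixes it): drop the setMaster call entirely and let the --master flag passed to spark-submit determine where the job runs.

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("My App")
  // No setMaster here: the master URL would come from spark-submit's
  // --master flag instead, so the same jar could run either locally
  // or against the standalone cluster.
  .setSparkHome("/home/me/spark-1.3.0-bin-hadoop2.4")
  .setJars(List("target/scala-2.10/my-app-project_2.10-1.0.jar"))
```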
And here's how I'm launching the master and slaves and submitting the job:
start-all.sh
spark-submit --class MyApp --master spark://localhost:7077 \
  target/scala-2.10/my-app-project_2.10-1.0.jar
stop-all.sh
In particular, I wonder if there's a problem because I specify "local" as the
master inside my application; perhaps that keeps the driver from ever sending
work to the workers, even though they're registered? (Looking at the master
and worker logs, the workers and the master do find each other: I see messages
like "Registering worker" and "Successfully registered with master", etc.)
Cheers,
Michael
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]