Dear Spark developers,

I have created a simple Spark application to run with spark-submit. It calls a
machine learning routine from Spark MLlib that runs for a number of iterations,
each corresponding to a task in Spark. It appears that Spark creates an
executor for each task and then removes it. The following messages from my log
indicate this:

15/09/29 12:21:02 INFO AppClient$ClientEndpoint: Executor updated: 
app-20150929120924-0000/24463 is now RUNNING
15/09/29 12:21:02 INFO AppClient$ClientEndpoint: Executor updated: 
app-20150929120924-0000/24463 is now EXITED (Command exited with code 1)
15/09/29 12:21:02 INFO SparkDeploySchedulerBackend: Executor 
app-20150929120924-0000/24463 removed: Command exited with code 1
15/09/29 12:21:02 INFO SparkDeploySchedulerBackend: Asked to remove 
non-existent executor 24463
15/09/29 12:21:02 INFO AppClient$ClientEndpoint: Executor added: 
app-20150929120924-0000/24464 on worker-20150929120330-16.111.35.101-46374 
(16.111.35.101:46374) with 12 cores
15/09/29 12:21:02 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20150929120924-0000/24464 on hostPort 16.111.35.101:46374 with 12 cores, 
30.0 GB RAM
15/09/29 12:21:02 INFO AppClient$ClientEndpoint: Executor updated: 
app-20150929120924-0000/24464 is now LOADING
15/09/29 12:21:02 INFO AppClient$ClientEndpoint: Executor updated: 
app-20150929120924-0000/24464 is now RUNNING
15/09/29 12:21:02 INFO AppClient$ClientEndpoint: Executor updated: 
app-20150929120924-0000/24464 is now EXITED (Command exited with code 1)
15/09/29 12:21:02 INFO SparkDeploySchedulerBackend: Executor 
app-20150929120924-0000/24464 removed: Command exited with code 1
15/09/29 12:21:02 INFO SparkDeploySchedulerBackend: Asked to remove 
non-existent executor 24464
15/09/29 12:21:02 INFO AppClient$ClientEndpoint: Executor added: 
app-20150929120924-0000/24465 on worker-20150929120330-16.111.35.101-46374 
(16.111.35.101:46374) with 12 cores
15/09/29 12:21:02 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20150929120924-0000/24465 on hostPort 16.111.35.101:46374 with 12 cores, 
30.0 GB RAM
15/09/29 12:21:02 INFO AppClient$ClientEndpoint: Executor updated: 
app-20150929120924-0000/24465 is now LOADING
15/09/29 12:21:02 INFO AppClient$ClientEndpoint: Executor updated: 
app-20150929120924-0000/24465 is now EXITED (Command exited with code 1)
15/09/29 12:21:02 INFO SparkDeploySchedulerBackend: Executor 
app-20150929120924-0000/24465 removed: Command exited with code 1
15/09/29 12:21:02 INFO SparkDeploySchedulerBackend: Asked to remove 
non-existent executor 24465
15/09/29 12:21:02 INFO AppClient$ClientEndpoint: Executor added: 
app-20150929120924-0000/24466 on worker-20150929120330-16.111.35.101-46374 
(16.111.35.101:46374) with 12 cores
15/09/29 12:21:02 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20150929120924-0000/24466 on hostPort 16.111.35.101:46374 with 12 cores, 
30.0 GB RAM
15/09/29 12:21:02 INFO AppClient$ClientEndpoint: Executor updated: 
app-20150929120924-0000/24466 is now LOADING
15/09/29 12:21:02 INFO AppClient$ClientEndpoint: Executor updated: 
app-20150929120924-0000/24466 is now RUNNING

It ends up creating and removing thousands of executors. Is this normal
behavior?

If I run the same code within spark-shell, this does not happen. Could you
suggest what might be wrong with my setup?
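
For context, the application is structured roughly like the sketch below; the
specific MLlib algorithm, input path, and parameters are illustrative
placeholders, not my exact code:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

object IterativeMLlibApp {
  def main(args: Array[String]): Unit = {
    // Master and resources are supplied by spark-submit, not hard-coded here.
    val conf = new SparkConf().setAppName("IterativeMLlibApp")
    val sc = new SparkContext(conf)

    // Parse the input into feature vectors (placeholder path).
    val data = sc.textFile("hdfs:///path/to/data")
      .map(line => Vectors.dense(line.split(' ').map(_.toDouble)))
      .cache()

    // Iterative MLlib call: each iteration launches a round of Spark tasks.
    val model = KMeans.train(data, 10, 100)
    println(s"Cost: ${model.computeCost(data)}")

    sc.stop()
  }
}
```

It is during this iterative training call that the executor churn in the log
above occurs.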

Best regards, Alexander
