A few more questions for Marcelo. Sorry, Marcelo, for the very long question list. I'd really appreciate your help answering these questions so I can fully understand the design decisions and architecture you had in mind while implementing the very helpful SparkLauncher.
*Scenario*: A Spark job is launched via SparkLauncher#startApplication() from inside a map task on the cluster. That is, I launch a map-only Hadoop job, and inside that map task I launch a Spark job (a minimal sketch of this launch call is included after the quoted thread below). These are the related processes when the Spark job is launched from the map task:

40588 YarnChild (the map task container process)
40550 MRAppMaster (the MR ApplicationMaster container)

*Spark-related processes:*
40602 SparkSubmit
40875 CoarseGrainedExecutorBackend
40846 CoarseGrainedExecutorBackend
40815 ExecutorLauncher

When the Spark app is started via SparkLauncher#startApplication(), the Spark driver (inside SparkSubmit) is started as a child process, i.e. a new JVM process.

1) This child process lives outside the map task's YARN container and JVM process, but on the same machine, right? The child process (SparkSubmit) has its own JVM, right? As shown in the process list above, SparkSubmit is a separate process.

2) Since everything is external to the map task's JVM - the Spark app/driver (inside SparkSubmit) runs in its own JVM on the same machine where the map container is running - you use the Process API, which offers the destroy() and destroyForcibly() methods that apply the appropriate platform-specific process-stopping procedures. *To keep the parent-child process tie, and to make sure the child process dies when the parent process dies or is killed (even non-gracefully), you used this technique:* a thread with a server-side socket in accept mode is created on the parent, on some port. When the child starts, that port number is passed to it as a parameter (environment variable). The child creates a thread, opens a connection to that socket, and the thread sits on the socket forever. If the connection ever drops, the child exits. (A generic sketch of this technique follows the quoted thread below.) Marcelo, *please correct me if I am wrong.* Is this how you make sure the child process is also killed when the parent process is killed?

3) Let's say I kill the map task forcefully, or use the Hadoop client to kill the job by jobId, where that map task spawned the Spark job via appHandle.startApplication():
a) The Spark driver (SparkSubmit process) will also be killed, right? Even if the code never gets a chance to call appHandle.stop() or appHandle.kill(), the child process will die too because of the parent-child relationship I described above. Is this correct?
b) Assuming (3a) is correct and the driver was killed through the parent-child relationship, *without* appHandle.stop() or appHandle.kill() being executed, will the executors clean up the environment (remove temp files) before stopping?

4) To add another level of safety, is it a good idea to attach a shutdown hook (Runtime.getRuntime().addShutdownHook(new ShutdownThread());) to the map task, and inside that hook call these two methods: appHandle.stop(); appHandle.kill(); ? (See the sketch after the quoted thread below.)

Thanks.

P.S: *In the thread below you will find the design decisions behind the appHandle.kill() implementation, as replied by Marcelo (thanks a lot) - which is interesting to know.*

On Thu, Nov 10, 2016 at 9:22 AM Marcelo Vanzin <van...@cloudera.com> wrote:
> Hi Elkhan,
>
> I'd prefer if these questions were asked in the mailing list.
>
> The launcher code cannot call YARN APIs directly because Spark
> supports more than just YARN. So its API and implementation has to be
> cluster-agnostic.
>
> As for kill, this is what the docs say:
>
> """
> This will not send a {@link #stop()} message to the application, so
> it's recommended that users first try to
> stop the application cleanly and only resort to this method if that fails.
> """
>
> So if you want to stop the application first, call stop().
>
> On Thu, Nov 10, 2016 at 12:55 AM, Elkhan Dadashov <elkhan8...@gmail.com> wrote:
> > Hi Marcelo,
> >
> > I have a few more questions related to SparkLauncher. I'd be glad and
> > thankful if you could answer them.
> >
> > It seems the SparkLauncher yarn-client vs. yarn-cluster deploy mode does not
> > matter much, as even in yarn-cluster mode the client that launches the
> > application must remain alive for the duration of the application (or until
> > the app handle is disconnected), as described in the LauncherServer.java
> > JavaDoc.
> >
> > 1) In yarn-cluster mode, if the client dies, will only the appHandle
> > be lost, or will the Spark application also die?
> >
> > 2) May I know why you preferred implementing appHandle.kill() by killing
> > the process instead of, say:
> >
> > a) yarn application -kill <application_ID>
> > b) ./bin/spark-class org.apache.spark.deploy.Client kill <driverId>
> >
> > 3) If the Spark driver is killed (by killing the process, not gracefully),
> > will the executors clean up the environment (remove temp files)?
> >
> > Thanks a lot.
> >
>
> --
> Marcelo
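
P.P.S: For reference, here is a minimal sketch of the launch call from the scenario above, as I invoke it from inside the map task. The jar path, main class, and listener body are placeholders, not my actual job configuration:

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class LaunchFromMapTask {
    public static void launch() throws Exception {
        SparkAppHandle handle = new SparkLauncher()
                .setAppResource("/path/to/spark-job.jar")   // placeholder jar
                .setMainClass("com.example.SparkJob")       // placeholder main class
                .setMaster("yarn")
                .setDeployMode("cluster")
                .startApplication(new SparkAppHandle.Listener() {
                    @Override
                    public void stateChanged(SparkAppHandle h) {
                        System.out.println("Spark app state: " + h.getState());
                    }
                    @Override
                    public void infoChanged(SparkAppHandle h) {
                        // the YARN application id becomes available here
                        System.out.println("Spark app id: " + h.getAppId());
                    }
                });

        // Keep the map task alive until the Spark app reaches a final state.
        while (!handle.getState().isFinal()) {
            Thread.sleep(1000L);
        }
    }
}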
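
And here is a generic illustration of the socket technique I described in question (2). To be clear, this is only my understanding of the general approach, not Spark's actual LauncherServer/LauncherBackend code; the class name and environment variable are made up:

import java.io.IOException;
import java.net.Socket;

public class ParentConnectionWatcher {

    /** Call this early in the child process's startup. */
    public static void watchParent() {
        String port = System.getenv("PARENT_LIVENESS_PORT"); // made-up variable name
        if (port == null) {
            return; // not launched by a parent that monitors us
        }
        Thread watcher = new Thread(() -> {
            try (Socket socket = new Socket("localhost", Integer.parseInt(port))) {
                // Block here forever; read() only returns -1 (or throws) when
                // the parent's end of the connection goes away.
                while (socket.getInputStream().read() != -1) {
                    // ignore anything the parent sends
                }
            } catch (IOException ignored) {
                // connection dropped or could not be established
            }
            // Parent is gone (exited, crashed, or was killed): shut down too.
            System.exit(1);
        });
        watcher.setDaemon(true);
        watcher.start();
    }
}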
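
Finally, this is roughly what I have in mind for the shutdown hook in question (4). Whether this is actually a good idea is exactly what I am asking, so please treat it only as an illustration; 'handle' is assumed to be the SparkAppHandle returned by startApplication() in the map task:

import java.util.concurrent.TimeUnit;
import org.apache.spark.launcher.SparkAppHandle;

public final class SparkShutdownHook {

    /** Register a hook that tries stop() first and falls back to kill(). */
    public static void register(final SparkAppHandle handle) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                if (!handle.getState().isFinal()) {
                    handle.stop();   // ask the driver to shut down cleanly
                    // give the app a short grace period to reach a final state
                    for (int i = 0; i < 10 && !handle.getState().isFinal(); i++) {
                        TimeUnit.MILLISECONDS.sleep(500);
                    }
                }
            } catch (Exception e) {
                // stop() can fail if the connection to the child is already gone
            } finally {
                if (!handle.getState().isFinal()) {
                    handle.kill();   // last resort: terminate the child process
                }
            }
        }));
    }
}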