A few more questions for Marcelo.

Sorry, Marcelo, for the very long list of questions. I would really
appreciate your help in answering them, so that I can fully understand the
design decisions and architecture you had in mind while implementing the
very helpful SparkLauncher.

*Scenario*: a Spark job is launched via SparkLauncher#startApplication()
inside a map task on the cluster. That is, I launch a map-only Hadoop job,
and inside that map task I launch a Spark job.

These are the related processes running when the Spark job is launched
from the map task:

40588 YarnChild (the map task container process)

40550 MRAppMaster (the MR ApplicationMaster container)


*Spark-related processes:*

40602 SparkSubmit

40875 CoarseGrainedExecutorBackend

40846 CoarseGrainedExecutorBackend

40815 ExecutorLauncher

When the Spark app is started via SparkLauncher#startApplication(), the
Spark driver (inside SparkSubmit) is started as a child process, i.e. a new
JVM process.
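For context, this is roughly how the Spark job is launched from inside the
map task. This is only a minimal sketch: the jar path, main class and the
yarn/client settings are illustrative placeholders, not my actual
configuration (client mode is assumed here since the driver shows up inside
the SparkSubmit process above).

    import java.io.IOException;
    import org.apache.spark.launcher.SparkAppHandle;
    import org.apache.spark.launcher.SparkLauncher;

    public class LaunchFromMapTask {
      // Called from inside the map task; resource and class names are placeholders.
      static SparkAppHandle launchSparkJob() throws IOException {
        return new SparkLauncher()
            .setAppResource("/path/to/spark-job.jar")    // placeholder
            .setMainClass("com.example.SparkJob")        // placeholder
            .setMaster("yarn")
            .setDeployMode("client")                     // assumption, see note above
            .startApplication(new SparkAppHandle.Listener() {
              @Override public void stateChanged(SparkAppHandle h) {
                System.out.println("Spark app state: " + h.getState());
              }
              @Override public void infoChanged(SparkAppHandle h) { }
            });
      }
    }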

1) This child process lives outside the map task's YARN container and JVM
process, but on the same machine, right?
     The child process (SparkSubmit) has its own JVM, right?

     As shown in the process list above, SparkSubmit is a separate process.

2) Since everything is external to the map task's JVM (the Spark
app/driver inside SparkSubmit runs in its own JVM on the same machine where
the map container is running), you use the Process API, whose destroy() and
destroyForcibly() methods apply the appropriate platform-specific
process-stopping procedures.
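For example, something like this generic java.lang.Process usage (a sketch
of my understanding, not the actual launcher code; "child" stands for
whatever Process reference the parent holds):

    import java.util.concurrent.TimeUnit;

    // Sketch: stopping an external child process via the java.lang.Process API.
    static void stopChild(Process child) throws InterruptedException {
      child.destroy();                              // ask the OS to terminate it
      if (!child.waitFor(10, TimeUnit.SECONDS)) {   // give it a moment to exit
        child.destroyForcibly();                    // escalate if it does not
      }
    }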

*In order to maintain the parent-child process tie, and to make sure the
child process dies when the parent process dies or is killed (even
non-gracefully), you used this technique*:

You create a thread in the parent that holds a server-side socket in
accept mode on some port. When the child starts, you pass that port number
to it as a parameter (environment variable). The child creates a thread,
opens a connection to that socket, and the thread sits on the socket
forever. If the connection ever drops, the child exits.
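In other words, something like this (a minimal sketch of the technique as I
understand it, in plain Java; it is not the actual LauncherServer code, and
the PARENT_LIVENESS_PORT environment variable name is made up for
illustration):

    import java.io.IOException;
    import java.net.ServerSocket;
    import java.net.Socket;

    public class LivenessLink {

      // Parent side: open a server socket, hand its port to the child, and keep
      // the accepted connection open for as long as the parent JVM lives.
      static int openParentSide() throws IOException {
        ServerSocket server = new ServerSocket(0);   // ephemeral port
        new Thread(() -> {
          try (Socket child = server.accept()) {
            child.getInputStream().read();           // block; OS closes this when the parent dies
          } catch (IOException ignored) { }
        }).start();
        return server.getLocalPort();                // pass this port to the child process
      }

      // Child side: connect back to the parent and exit as soon as the connection drops.
      static void watchParent() {
        int port = Integer.parseInt(System.getenv("PARENT_LIVENESS_PORT")); // made-up name
        new Thread(() -> {
          try (Socket s = new Socket("localhost", port)) {
            while (s.getInputStream().read() != -1) { }  // sit on the socket forever
          } catch (IOException ignored) { }
          System.exit(1);                            // connection dropped: parent is gone
        }).start();
      }
    }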

Marcelo, *please correct me if I am wrong*. Is this how you make sure the
child process is also killed when the parent process is killed?

3) Let's say I kill the map task forcefully, or kill the job by jobId via
the Hadoop client, where that map task spawned the Spark job using
SparkLauncher#startApplication():

    a) The Spark driver (the SparkSubmit process) will also be killed,
right? Even if the code never gets a chance to call appHandle.stop() or
appHandle.kill(), the child process will die too because of the
parent-child relationship I described above. Is this correct?

    b) Assuming (3a) is correct and the driver was killed through the
parent-child relationship, *without* appHandle.stop() or appHandle.kill()
being executed, will the executors clean up the environment (remove temp
files) before stopping?

4) To add another level of safety, is it a good idea to attach a shutdown
hook (Runtime.getRuntime().addShutdownHook(new ShutdownThread());) to the
map task, and inside that hook call these two methods (see the sketch after
this list)?

     appHandle.stop();
     appHandle.kill();
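A minimal sketch of what I have in mind, assuming the SparkAppHandle
returned by startApplication() is in scope; registerSparkCleanupHook is
just an illustrative helper name:

    import org.apache.spark.launcher.SparkAppHandle;

    // Sketch only: register the hook right after startApplication() returns.
    static void registerSparkCleanupHook(final SparkAppHandle appHandle) {
      Runtime.getRuntime().addShutdownHook(new Thread(() -> {
        try {
          appHandle.stop();   // ask the application to stop cleanly first
        } catch (Exception e) {
          // ignore and fall through to kill()
        }
        appHandle.kill();     // then forcibly terminate the child process if still alive
      }));
    }

One caveat I am aware of: a JVM shutdown hook runs only on a normal exit or
SIGTERM; if the map task JVM is killed with SIGKILL, the hook never runs,
so this would only complement (not replace) the parent-child mechanism
described above.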

Thanks.

P.S.: *In the thread below you will find the design decisions behind the
appHandle.kill() implementation, as answered by Marcelo (thanks a lot),
which are interesting to know.*

On Thu, Nov 10, 2016 at 9:22 AM Marcelo Vanzin <van...@cloudera.com> wrote:

> Hi Elkhan,
>
> I'd prefer if these questions were asked in the mailing list.
>
> The launcher code cannot call YARN APIs directly because Spark
> supports more than just YARN. So its API and implementation has to be
> cluster-agnostic.
>
> As for kill, this is what the docs say:
>
> """
> This will not send a {@link #stop()} message to the application, so
> it's recommended that users first try to
> stop the application cleanly and only resort to this method if that fails.
> """
>
> So if you want to stop the application first, call stop().
>
>
> On Thu, Nov 10, 2016 at 12:55 AM, Elkhan Dadashov <elkhan8...@gmail.com>
> wrote:
> > Hi Marcelo,
> >
> > I have few more questions related to SparkLauncher. Will be glad and
> > thankful if you could answer them.
> >
> > It seems SparkLauncher Yarn-client or Yarn-Cluster deploy mode does not
> > matter much, as even in yarn-cluster mode  the client that launches the
> > application must remain alive for the duration of the application (or
> until
> > the app handle is disconnected) which is described in LauncherServer.java
> > JavaDoc.
> >
> > 1) In yarn-cluster mode, if the client dies, will only the appHandle
> > be lost, or will the Spark application also die ?
> >
> > 2) May i know why did you prefer implementing appHandle.kill() with
> killing
> > process instead of let's say :
> >
> >     a) yarn application -kill <application_ID>
> >     b) ./bin/spark-class org.apache.spark.deploy.Client kill <driverId>
> >
> > 3) If Spark Driver is killed (by killing the process, not gracefully),
> will
> > Executors clean the environment (remove temp files) ?
> >
> > Thanks a lot.
> >
> >
>
>
>
> --
> Marcelo
>
