Yes, your understanding about the separation of the cluster manager and
the application driver (which creates the SparkContext) is correct.
There are HA solutions for both. Let me explain, assuming the cluster
manager is Spark Standalone. The master of the Spark Standalone
cluster manager can be made HA by running multiple masters on different
nodes and using ZooKeeper to coordinate them. Additionally, the driver
of the application can be made HA by running it in "cluster mode" of
the Spark Standalone cluster manager with "--supervise" enabled. The
Standalone master will launch the driver on one of the worker machines
in the cluster and continuously monitor it. If the driver exits with a
non-zero exit code (or the worker machine fails), then the master will
automatically restart the driver application. If your streaming
application is written such that it can be restarted, then driver
HA will be achieved.
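
As a rough sketch of the setup described above, something like the
following could be used to submit the driver in cluster mode with
supervision. The host names, class name, and jar path are made up for
illustration, so substitute your own. The script only prints the
command (a dry run), so it can be tried without a cluster.

```shell
#!/usr/bin/env bash
# Master HA (set on each master, e.g. in conf/spark-env.sh); the ZooKeeper
# hosts below are hypothetical:
#   SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
#     -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181"

# With multiple masters, list all of them in the master URL so the
# application can find whichever one is currently the leader.
# master1/master2 are placeholder host names.
MASTERS="spark://master1:7077,master2:7077"

# --deploy-mode cluster : run the driver on a worker inside the cluster
# --supervise           : ask the Standalone master to restart the driver
#                         if it exits with a non-zero exit code
# The class name and jar location are hypothetical.
SUBMIT_CMD=(
  spark-submit
  --master "$MASTERS"
  --deploy-mode cluster
  --supervise
  --class com.example.StreamingApp
  hdfs:///apps/streaming-app.jar
)

# Dry run: print the command instead of executing it.
printf '%s ' "${SUBMIT_CMD[@]}"
echo

# To actually launch the application, run:
#   "${SUBMIT_CMD[@]}"
```

Note that in cluster mode the jar has to be reachable from the worker
that ends up running the driver, which is why the sketch assumes a
shared location such as HDFS.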

This is discussed in the updated Spark documentation.

Cluster mode + supervise - See
https://github.com/apache/spark/blob/master/docs/spark-standalone.md#launching-spark-applications
Writing a streaming application suitable for driver HA - See
http://people.apache.org/~tdas/spark-1.2-temp/streaming-programming-guide.html#deploying-applications

YARN and Mesos probably also have similar functionality: both
cluster manager HA and application driver HA (by automatic restarts).

TD


On Fri, Dec 12, 2014 at 11:47 PM, twizansk <twiza...@gmail.com> wrote:
> Thanks for the reply.  I might be misunderstanding something basic.  As far
> as I can tell, the cluster manager (e.g. Mesos) manages the master and
> worker nodes but not the drivers or receivers, those are external to the
> spark cluster:
>
> http://spark.apache.org/docs/latest/cluster-overview.html
>
>
> I know that the spark-submit script has a "--deploy-mode cluster" option.
> Does this mean that the receiver will be managed on the cluster?
>
> Thanks
>
>
>
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-in-Production-tp20644p20662.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>

