Perhaps the author is referring to Spark Streaming applications? They're examples of long-running applications.
You still have to implement the application/domain-level protocol yourself, as Sandy pointed out.

On Wed, Jul 9, 2014 at 11:03 AM, John Omernik <j...@omernik.com> wrote:

> So how do I do the "long-lived server continually satisfying requests" in
> the Cloudera application? I am very confused by that at this point.
>
> On Wed, Jul 9, 2014 at 12:49 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>
>> Spark doesn't currently offer you anything special to do this. I.e., if
>> you want to write a Spark application that fires off jobs on behalf of
>> remote processes, you would need to implement the communication between
>> those remote processes and your Spark application code yourself.
>>
>> On Wed, Jul 9, 2014 at 10:41 AM, John Omernik <j...@omernik.com> wrote:
>>
>>> Thank you for the link. In that link the following is written:
>>>
>>> "For those familiar with the Spark API, an application corresponds to
>>> an instance of the SparkContext class. An application can be used for a
>>> single batch job, an interactive session with multiple jobs spaced
>>> apart, or a long-lived server continually satisfying requests."
>>>
>>> So, if I wanted to use "a long-lived server continually satisfying
>>> requests" and then start a shell that connected to that context, how
>>> would I do that in YARN? That's the problem I am having right now: I
>>> just want there to be a long-lived service that I can utilize.
>>>
>>> Thanks!
>>>
>>> On Wed, Jul 9, 2014 at 11:14 AM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>>>
>>>> To add to Ron's answer, this post explains what it means to run Spark
>>>> against a YARN cluster, the difference between yarn-client and
>>>> yarn-cluster mode, and the reason spark-shell only works in
>>>> yarn-client mode.
>>>>
>>>> http://blog.cloudera.com/blog/2014/05/apache-spark-resource-management-and-yarn-app-models/
>>>>
>>>> -Sandy
>>>>
>>>> On Wed, Jul 9, 2014 at 9:09 AM, Ron Gonzalez <zlgonza...@yahoo.com> wrote:
>>>>
>>>>> The idea behind YARN is that you can run different application types,
>>>>> like MapReduce, Storm, and Spark.
>>>>>
>>>>> I would recommend that you build your Spark jobs in the main method
>>>>> without specifying how they are deployed. Then you can use
>>>>> spark-submit to tell Spark how to deploy them, using yarn-cluster as
>>>>> the master. The key point here is that once you have YARN set up, the
>>>>> Spark client connects to it using the $HADOOP_CONF_DIR that contains
>>>>> the resource manager address. In particular, this directory needs to
>>>>> be accessible from the classpath of the submitter, since the client
>>>>> implicitly uses it when it instantiates a YarnConfiguration. If you
>>>>> want more details, read org.apache.spark.deploy.yarn.Client.scala.
>>>>>
>>>>> You should be able to download a standalone YARN cluster from any of
>>>>> the Hadoop providers, like Cloudera or Hortonworks. Once you have
>>>>> that, the Spark programming guide describes what I mention above in
>>>>> sufficient detail for you to proceed.
>>>>>
>>>>> Thanks,
>>>>> Ron
>>>>>
>>>>> Sent from my iPad
>>>>>
>>>>> > On Jul 9, 2014, at 8:31 AM, John Omernik <j...@omernik.com> wrote:
>>>>> >
>>>>> > I am trying to get my head around using Spark on YARN from the
>>>>> > perspective of a cluster. I can start a Spark shell in YARN with no
>>>>> > issues; it works easily. This is done in yarn-client mode, and it
>>>>> > all works well.
>>>>> >
>>>>> > In multiple examples, I see instances where people have set up
>>>>> > Spark clusters in standalone mode, and then in the examples they
>>>>> > "connect" to this cluster in standalone mode. This is often done
>>>>> > using the spark:// string for the connection. Cool.
>>>>> >
>>>>> > But what I don't understand is: how do I set up a YARN instance
>>>>> > that I can "connect" to? I.e., I tried running the Spark shell in
>>>>> > yarn-cluster mode, and it gave me an error telling me to use
>>>>> > yarn-client. I see information on using spark-class or
>>>>> > spark-submit. But what I'd really like is an instance I can connect
>>>>> > a spark-shell to, and have that instance stay up. I'd like to be
>>>>> > able to run other things on that instance, etc. Is that possible
>>>>> > with YARN? I know there may be long-running job challenges with
>>>>> > YARN, but I am just testing. I am curious whether I am looking at
>>>>> > something completely bonkers here, or just missing something
>>>>> > simple.
>>>>> >
>>>>> > Thanks!
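To make Sandy's point concrete: the "long-lived server continually satisfying requests" pattern just means your driver process stays up, holds a single SparkContext, and speaks a wire protocol that you design yourself for remote callers. The sketch below is a hypothetical illustration, not a Spark API: the port, the JSON message shape, and the job dispatch are all invented, and the PySpark calls are left as comments so the protocol shape is visible on its own.

```python
# long_lived_driver.py -- sketch of a driver that stays up and accepts
# job requests over a homemade newline-delimited JSON protocol. The
# message format and port are hypothetical; only the shape matters.
import json
import socketserver

def parse_request(line):
    """Decode one newline-delimited JSON request into (job_name, args)."""
    msg = json.loads(line)
    return msg["job"], msg.get("args", [])

class JobHandler(socketserver.StreamRequestHandler):
    def handle(self):
        job, args = parse_request(self.rfile.readline().decode())
        # In a real driver you would dispatch to Spark here, e.g.:
        #   result = self.server.jobs[job](self.server.sc, *args)
        result = {"job": job, "status": "accepted"}
        self.wfile.write((json.dumps(result) + "\n").encode())

def serve(port=9999):
    # The SparkContext would be created once, before serving, and shared
    # by every request, e.g. (requires PySpark, so left commented):
    #   from pyspark import SparkContext
    #   server.sc = SparkContext(appName="long-lived-driver")
    with socketserver.TCPServer(("", port), JobHandler) as server:
        server.serve_forever()
```

A caller (your "shell") would then open a socket to this driver and send requests, rather than creating its own SparkContext; that client side is equally your own code to write.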
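Ron's advice above, write the job's main without pinning a deployment mode and let spark-submit decide, can be sketched roughly like this. It is a hedged illustration: the app name and data are invented, and the PySpark import is deferred into main() so the aggregation helper can be read and tested without a cluster.

```python
# my_app.py -- a job whose main does not hard-code a master, so the
# same script can be submitted locally, to standalone, or to YARN.

def count_words(words):
    """Pure aggregation logic, kept separate so it is testable without Spark."""
    counts = {}
    for w in words:
        counts[w] = counts.get(w, 0) + 1
    return counts

def main():
    # Imported here so the module loads even where PySpark is absent.
    from pyspark import SparkConf, SparkContext
    # Note: no .setMaster(...) call; spark-submit supplies the master.
    sc = SparkContext(conf=SparkConf().setAppName("word-count-sketch"))
    pairs = (sc.parallelize(["a", "b", "a"])
               .map(lambda w: (w, 1))
               .reduceByKey(lambda x, y: x + y)
               .collect())
    print(dict(pairs))
    sc.stop()

# Submitted from a machine where $HADOOP_CONF_DIR points at the config
# containing the resource manager address (as Ron notes, the Spark
# client picks it up implicitly when it builds a YarnConfiguration):
#
#   spark-submit --master yarn-cluster my_app.py
```

With `--master yarn-cluster` the driver itself runs inside a YARN container, which is also why spark-shell refuses that mode: an interactive shell needs its driver local, i.e. yarn-client.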