It used to be that you had to read the Spark code to figure this information out. However, the Spark team has recently published it here: http://spark.incubator.apache.org/docs/latest/cluster-overview.html
On Feb 17, 2014, at 11:35 AM, purav aggarwal <puravaggarwal...@gmail.com> wrote:

> Sorry for the incorrect information. Where can I pick up these
> architectural/design concepts for Spark? I seem to have misunderstood
> the responsibilities of the master and the driver.
>
> On Mon, Feb 17, 2014 at 10:51 PM, Michael (Bach) Bui <free...@adatao.com> wrote:
>
> Spark has the concepts of Driver and Master.
>
> The Driver is the Spark program that you run on your local machine. The
> SparkContext resides in the Driver, together with the DAG scheduler. The
> Master is responsible for managing cluster resources, e.g. giving the
> Driver the workers that it needs. The Master can be either a Mesos master
> (for a Mesos cluster), a Spark master (for a Spark standalone cluster),
> or the ResourceManager (for a Hadoop cluster).
>
> Given the resources assigned by the Master, the Driver uses the DAG to
> assign tasks to workers.
>
> So yes, the result of a Spark action will be sent back to the Driver,
> which is your local console.
>
> On Feb 17, 2014, at 10:54 AM, David Thomas <dt5434...@gmail.com> wrote:
>
>> So if I do a Spark action, say collect, will I be able to see the result
>> on my local console? Or would it be available only on the cluster master?
>>
>> On Mon, Feb 17, 2014 at 9:50 AM, purav aggarwal <puravaggarwal...@gmail.com> wrote:
>>
>> Your local machine simply submits your job (in the form of a jar) to the
>> cluster. The master node is where the SparkContext object is created, a
>> DAG of your job is formed, and tasks (stages) are assigned to different
>> workers - which are not aware of anything but the computation of the
>> task being assigned.
>>
>> On Mon, Feb 17, 2014 at 10:07 PM, David Thomas <dt5434...@gmail.com> wrote:
>>
>> Where is the SparkContext object created then? On my local machine or on
>> the master node in the cluster?
>>
>> On Mon, Feb 17, 2014 at 4:17 AM, Nhan Vu Lam Chi <nhani...@adatao.com> wrote:
>>
>> Your local app will be called the "driver program", which creates jobs
>> and submits them to the cluster for running.
>>
>> On Mon, Feb 17, 2014 at 9:19 AM, David Thomas <dt5434...@gmail.com> wrote:
>>
>> From the docs:
>>
>> Connecting an Application to the Cluster
>>
>> To run an application on the Spark cluster, simply pass the
>> spark://IP:PORT URL of the master to the SparkContext constructor.
>>
>> Could someone enlighten me on what happens if I run the app from, say,
>> Eclipse on my local machine, but use the URL of the master node which is
>> in the cloud. What role does my local JVM play then?
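Michael's point that action results come back to the driver can be sketched in a small program (a minimal sketch against the Scala API of that era; the master URL, host name, and app name below are placeholders, not values from this thread):

```scala
import org.apache.spark.SparkContext

object CollectToDriver {
  def main(args: Array[String]): Unit = {
    // Placeholder standalone-master URL. The JVM that constructs the
    // SparkContext (e.g. the one Eclipse launches) becomes the driver.
    val sc = new SparkContext("spark://master-host:7077", "CollectToDriver")

    val rdd = sc.parallelize(1 to 100)

    // collect() pulls every partition's results from the workers back
    // into this driver JVM, so the Array lives in your local process and
    // anything you print appears on your local console, not on the master.
    val result: Array[Int] = rdd.collect()
    println(result.length)

    sc.stop()
  }
}
```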
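The doc snippet David quotes maps onto the four-argument SparkContext constructor available at the time: when the driver runs on your laptop against a cloud master, you also pass the application jar so the workers can load your classes. A hedged sketch, where the master address, Spark home, and jar path are all placeholder values:

```scala
import org.apache.spark.SparkContext

// Connecting a locally launched driver to a remote standalone master.
val sc = new SparkContext(
  "spark://1.2.3.4:7077",   // master URL, as in the quoted docs
  "MyAppRunFromEclipse",    // application name shown in the master UI
  "/opt/spark-0.9.0",       // Spark home on the worker nodes
  Seq("target/myapp.jar")   // jar shipped to workers with your classes
)
```

Without the jar argument, the remote executors would hit ClassNotFoundException on your application's classes, since only the driver JVM has them on its classpath.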