It used to be that you had to read the Spark code to figure this information out. 
However, the Spark team has recently published it here: 
http://spark.incubator.apache.org/docs/latest/cluster-overview.html

On Feb 17, 2014, at 11:35 AM, purav aggarwal <puravaggarwal...@gmail.com> wrote:

> Sorry for the incorrect information. Where can I pick up these 
> architectural/design concepts for Spark?
> I seem to have misunderstood the responsibilities of the master and the 
> driver.
> 
> 
> On Mon, Feb 17, 2014 at 10:51 PM, Michael (Bach) Bui <free...@adatao.com> 
> wrote:
> Spark has the concepts of a Driver and a Master.
> 
> The Driver is the Spark program that you run on your local machine. The 
> SparkContext resides in the driver, together with the DAG scheduler.
> The Master is responsible for managing cluster resources, e.g. giving the 
> Driver the workers it needs. The Master can be the Mesos master (for a Mesos 
> cluster), the Spark master (for a Spark standalone cluster), or the 
> ResourceManager (for a Hadoop YARN cluster).
> Given the resources assigned by the Master, the Driver uses the DAG to assign 
> tasks to the workers.
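> 
> To make that concrete, here is a minimal Scala sketch (the host names and 
> ports are placeholders, not taken from this thread): the master URL you pass 
> to the SparkContext constructor determines which kind of Master the Driver 
> asks for resources.
> 
>     import org.apache.spark.SparkContext
> 
>     // Spark standalone cluster: the Spark master hands out the workers
>     val sc = new SparkContext("spark://master-host:7077", "MyApp")
> 
>     // Mesos cluster: the Mesos master manages the resources instead
>     // val sc = new SparkContext("mesos://mesos-host:5050", "MyApp")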
> 
> So yes, the results of Spark's actions will be sent back to the driver, which 
> is your local console.
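> 
> For example (a sketch, assuming the SparkContext from above), collect() ships 
> the results back to the driver JVM, so the output appears on your local 
> console:
> 
>     // map() runs on the workers, but collect() brings the Array to the driver
>     val doubled = sc.parallelize(1 to 10).map(_ * 2).collect()
>     doubled.foreach(println)   // prints 2, 4, ..., 20 locally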
> 
> 
> On Feb 17, 2014, at 10:54 AM, David Thomas <dt5434...@gmail.com> wrote:
> 
>> So if I do a Spark action, say, collect, will I be able to see the result on 
>> my local console? Or would it only be available on the cluster master?
>> 
>> 
>> On Mon, Feb 17, 2014 at 9:50 AM, purav aggarwal <puravaggarwal...@gmail.com> 
>> wrote:
>> Your local machine simply submits your job (in the form of a jar) to the 
>> cluster.
>> The master node is where the SparkContext object is created, a DAG of your 
>> job is formed, and tasks (stages) are assigned to different workers - which 
>> are not aware of anything but the computation of the tasks assigned to them.
>> 
>> 
>> On Mon, Feb 17, 2014 at 10:07 PM, David Thomas <dt5434...@gmail.com> wrote:
>> Where is the SparkContext object created then? On my local machine or on the 
>> master node in the cluster?
>> 
>> 
>> On Mon, Feb 17, 2014 at 4:17 AM, Nhan Vu Lam Chi <nhani...@adatao.com> wrote:
>> Your local app will be called the "driver program"; it creates jobs and 
>> submits them to the cluster for execution.
>> 
>> 
>> On Mon, Feb 17, 2014 at 9:19 AM, David Thomas <dt5434...@gmail.com> wrote:
>> From the docs:
>> Connecting an Application to the Cluster
>> 
>> To run an application on the Spark cluster, simply pass the spark://IP:PORT 
>> URL of the master to the SparkContext constructor.
>> 
>> Could someone enlighten me on what happens if I run the app from, say, 
>> Eclipse on my local machine, but use the URL of the master node, which is in 
>> the cloud? What role does my local JVM play then?
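>> 
>> For concreteness, a minimal sketch of what I mean (the IP and port are 
>> placeholders), run from Eclipse with the Spark jars on the classpath:
>> 
>>     import org.apache.spark.SparkContext
>> 
>>     // this local JVM would become the driver, connecting to the remote master
>>     val sc = new SparkContext("spark://10.0.0.1:7077", "MyApp")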
>> 
>> 
>> 
>> 
> 
> 
