This response is for Sai.

The easiest way to verify your current spark-shell setting is to type "sc.master".

If your setting is correct, it should return:

    scala> sc.master
    res0: String = spark://master.ip.url.com:5050

If your SPARK_MASTER_IP is not set correctly, it will respond:

    scala> sc.master
    res0: String = local

That means your spark-shell is running in local mode. You can also check on the Spark master's web UI: you should see a spark-shell application in the master's application list.
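If you want more than an eyeball check, the same test can be scripted inside the shell. A minimal sketch, assuming the 0.9-era spark-shell, which predefines sc as your SparkContext:

    // Paste into spark-shell: warns if the shell silently fell back to local mode.
    if (sc.master.startsWith("spark://"))
      println("OK: attached to standalone master " + sc.master)
    else
      println("NOT attached to the cluster: sc.master = " + sc.master)

String concatenation rather than interpolation keeps it compatible with older Scala shells of that era.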
Wisely Chen


On Wed, Mar 26, 2014 at 10:12 PM, Nan Zhu <zhunanmcg...@gmail.com> wrote:

> and, yes, I think that picture is a bit misleading, though the following
> paragraph does mention that:
>
> "Because the driver *schedules* tasks on the cluster, it should be run
> close to the worker nodes, preferably on the same local area network. If
> you'd like to send requests to the cluster remotely, it's better to open
> an RPC to the driver and have it submit operations from nearby than to
> run a driver far away from the worker nodes."
>
> --
> Nan Zhu
>
> On Wednesday, March 26, 2014 at 9:59 AM, Nan Zhu wrote:
>
> the master actually does more work than that; I was just explaining why
> he should set SPARK_MASTER_IP correctly.
>
> a simplified list:
>
> 1. maintain worker status
>
> 2. maintain in-cluster driver status
>
> 3. maintain executor status (the workers tell the master what happened on
> their executors)
>
> --
> Nan Zhu
>
> On Wednesday, March 26, 2014 at 9:46 AM, Yana Kadiyska wrote:
>
> Nan (or anyone who feels they understand the cluster architecture well),
> can you clarify something for me?
>
> From reading this user group and your explanation above, it appears that
> the cluster master is only involved during application startup -- to
> allocate executors (from what you wrote, it sounds like the driver itself
> passes the job/tasks to the executors). From there onwards all
> computation is done on the executors, which communicate results directly
> to the driver when certain actions (say, collect) are performed. Is that
> right? The only description of the cluster I've seen came from
> https://spark.apache.org/docs/0.9.0/cluster-overview.html, but that
> picture suggests there is no direct communication between driver and
> executors, which I believe is wrong (unless I am misreading the picture
> -- I believe Master and "Cluster Manager" refer to the same thing?).
>
> The very short form of my question is: does the master do anything other
> than executor allocation?
>
> On Wed, Mar 26, 2014 at 9:23 AM, Nan Zhu <zhunanmcg...@gmail.com> wrote:
>
> all you need to do is ensure your Spark cluster is running well (you can
> check by accessing the Spark UI to see if all workers are displayed)
>
> then, you have to set the correct SPARK_MASTER_IP on the machine where
> you run spark-shell
>
> the details are:
>
> when you run bin/spark-shell, it starts the driver program on that
> machine, which interacts with the Master to start the application (in
> this case, spark-shell)
>
> the Master tells the Workers to start executors for your application, and
> the executors try to register with your driver
>
> then your driver can distribute tasks to the executors, i.e. run in a
> distributed fashion
>
> Best,
>
> --
> Nan Zhu
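A quick way to see Nan Zhu's last point in action -- tasks running on the executors, with results coming back to the driver -- is to paste a small job into a cluster-attached shell. A sketch (the element range and slice count are arbitrary):

    // Each task reports the hostname of the executor it ran on.
    import java.net.InetAddress

    val hosts = sc.parallelize(1 to 100, 10)              // 10 tasks
      .map(_ => InetAddress.getLocalHost.getHostName)     // runs on the executors
      .distinct()
      .collect()                                          // gathered by the driver

    println(hosts.mkString(", "))  // more than one hostname => work is distributed

This also bears on Yana's question above: the collect() results travel directly from the executors to the driver, not through the master.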
> On Wednesday, March 26, 2014 at 9:01 AM, Sai Prasanna wrote:
>
> Nan Zhu, it's the latter: I want to distribute the tasks to the cluster
> [the machines available].
>
> If I set SPARK_MASTER_IP at the other machines and set the slave IPs in
> conf/slaves at the master node, will the interactive shell code run at
> the master get distributed across multiple machines?
>
> On Wed, Mar 26, 2014 at 6:32 PM, Nan Zhu <zhunanmcg...@gmail.com> wrote:
>
> what do you mean by "run across the cluster"?
>
> do you want to start the spark-shell across the cluster, or do you want
> to distribute tasks to multiple machines?
>
> if the former, yes, as long as you indicate the right master URL
>
> if the latter, also yes, and you can observe the distributed tasks in the
> Spark UI
>
> --
> Nan Zhu
>
> On Wednesday, March 26, 2014 at 8:54 AM, Sai Prasanna wrote:
>
> Is it possible to run across a cluster using the Spark interactive shell?
>
> To be more explicit: is the procedure similar to running standalone
> master-slave Spark?
>
> I want to execute my code in the interactive shell on the master node,
> and it should run across the cluster [say 5 nodes]. Is the procedure
> similar?
>
> --
> *Sai Prasanna. AN*
> *II M.Tech (CS), SSSIHL*
>
> *Entire water in the ocean can never sink a ship, unless it gets inside.
> All the pressures of life can never hurt you, unless you let them in.*
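Closing the loop on Sai's original question: once the master URL is set correctly, the sc that spark-shell hands you is already cluster-backed, and anything typed at the prompt is distributed. A minimal end-to-end sketch of what the shell wires up for you (0.9-era API; the master URL and input path are placeholders):

    // Roughly what spark-shell does before handing you sc (sketch only).
    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._   // pair-RDD ops such as reduceByKey

    val sc = new SparkContext(
      "spark://master.ip.url.com:7077",  // must match the URL on the master's web UI
      "shell-like-app")                  // name shown in the master's application list

    val counts = sc.textFile("hdfs://master.ip.url.com:9000/input.txt")
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)                // executed on the executors
    counts.take(10).foreach(println)     // results collected back to the driver
    sc.stop()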