Re: Can't submit job to stand alone cluster

2015-12-30 Thread SparkUser
Sorry, need to clarify. When you say: When the docs say "If your application is launched through Spark submit, then the application jar is automatically distributed to all worker nodes," it is actually saying that your executors get their jars from the driver. This is true

Re: Can't submit job to stand alone cluster

2015-12-30 Thread Andrew Or
Hi Jim, Just to clarify further: - *Driver* is the process with SparkContext. A driver represents an application (e.g. spark-shell, SparkPi) so there is exactly one driver in each application. - *Executor* is the process that runs the tasks scheduled by the driver. There should

Re: Can't submit job to stand alone cluster

2015-12-29 Thread Andrew Or
Hi Greg, It's actually intentional for standalone cluster mode to not upload jars. One reason YARN takes at least 10 seconds before running any simple application is that there's a lot of random overhead (e.g. putting jars in HDFS). If this missing functionality is not documented

Re: Can't submit job to stand alone cluster

2015-12-29 Thread Daniel Valdivia
That makes things clearer! Thanks. Issue resolved.

Re: Can't submit job to stand alone cluster

2015-12-29 Thread Andrew Or
http://spark.apache.org/docs/latest/spark-standalone.html#launching-spark-applications

Re: Can't submit job to stand alone cluster

2015-12-29 Thread Annabel Melongo
Thanks, Andrew, for this awesome explanation.

Re: Can't submit job to stand alone cluster

2015-12-29 Thread Andrew Or
> The confusion here is the expression "standalone cluster mode". Either it's stand-alone or it's cluster mode but it can't be both.
@Annabel That's not true. There *is* a standalone cluster mode where the driver runs on one of the workers instead of on the client machine. What you're describing

Re: Can't submit job to stand alone cluster

2015-12-29 Thread Annabel Melongo
Greg, The confusion here is the expression "standalone cluster mode". Either it's stand-alone or it's cluster mode but it can't be both. With this in mind, here's how jars are uploaded: 1. Spark Stand-alone mode: client and driver run on the same machine; use the --packages option to submit a

Re: Can't submit job to stand alone cluster

2015-12-29 Thread Annabel Melongo
Greg, Can you please send me a doc describing the standalone cluster mode? Honestly, I never heard about it. The three different modes I've listed appear in the last paragraph of this doc: Running Spark Applications

Re: Can't submit job to stand alone cluster

2015-12-29 Thread Annabel Melongo
Andrew, Now I see where the confusion lies. Standalone cluster mode, your link, is nothing but a combination of client-mode and standalone mode, my link, without YARN. But I'm confused by this paragraph in your link: If your application is launched through Spark submit, then the

Re: Can't submit job to stand alone cluster

2015-12-29 Thread Andrew Or
Let me clarify a few things for everyone: There are three *cluster managers*: standalone, YARN, and Mesos. Each cluster manager can run in two *deploy modes*, client or cluster. In client mode, the driver runs on the machine that submitted the application (the client). In cluster mode, the driver
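The distinction between cluster managers and deploy modes can be sketched as a small model. This is only an illustration, not Spark code: the `Deployment` class and its method are hypothetical names, and the cluster-mode branch is filled in from the Spark documentation because the snippet above is cut off before that point.

```python
from dataclasses import dataclass

# The three cluster managers and two deploy modes named in the message above.
CLUSTER_MANAGERS = ("standalone", "yarn", "mesos")
DEPLOY_MODES = ("client", "cluster")


@dataclass(frozen=True)
class Deployment:
    """Hypothetical model of one (cluster manager, deploy mode) pair."""

    cluster_manager: str
    deploy_mode: str

    def __post_init__(self):
        if self.cluster_manager not in CLUSTER_MANAGERS:
            raise ValueError(f"unknown cluster manager: {self.cluster_manager}")
        if self.deploy_mode not in DEPLOY_MODES:
            raise ValueError(f"unknown deploy mode: {self.deploy_mode}")

    def driver_location(self) -> str:
        # Client mode: the driver runs on the machine that submitted the app.
        if self.deploy_mode == "client":
            return "client machine"
        # Cluster mode (assumption taken from the Spark docs): the driver
        # runs inside the cluster, e.g. on a worker for standalone.
        return "inside the cluster"


def all_deployments():
    """Every combination: any cluster manager can use either deploy mode."""
    return [Deployment(m, d) for m in CLUSTER_MANAGERS for d in DEPLOY_MODES]


if __name__ == "__main__":
    for dep in all_deployments():
        print(dep.cluster_manager, dep.deploy_mode, "->", dep.driver_location())
```

The point of the model is that "standalone" and "cluster" sit on different axes, so "standalone cluster mode" is a valid combination rather than a contradiction.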

Re: Can't submit job to stand alone cluster

2015-12-29 Thread Greg Hill
On 12/28/15, 5:16 PM, "Daniel Valdivia" wrote:
> Hi, I'm trying to submit a job to a small spark cluster running in stand alone mode, however it seems like the jar file I'm submitting to the cluster is "not found" by the worker nodes. I might have understood

Re: Can't submit job to stand alone cluster

2015-12-28 Thread Ted Yu
Have you verified that the following file exists? /home/hadoop/git/scalaspark/./target/scala-2.10/cluster-incidents_2.10-1.0.jar Thanks

Re: Can't submit job to stand alone cluster

2015-12-28 Thread vivek.meghanathan
+ if it exists, whether it has read permission for the user who tries to run the job. Regards, Vivek
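The two checks suggested in these replies (does the jar exist, and can the submitting user read it) can be sketched in a few lines of Python. The helper name `check_jar` is hypothetical and this is not part of Spark. It only inspects the local filesystem of the machine it runs on, so it would need to be run on whichever node is expected to open the jar.

```python
import os
from typing import List


def check_jar(path: str) -> List[str]:
    """Return a list of problems found with the jar at `path` (empty if none)."""
    problems = []
    if not os.path.isfile(path):
        # Covers both a missing file and a path that points at a directory.
        problems.append(f"not found: {path}")
    elif not os.access(path, os.R_OK):
        # The file exists but the current user lacks read permission.
        problems.append(f"not readable by current user: {path}")
    return problems


if __name__ == "__main__":
    import sys

    for jar in sys.argv[1:]:
        issues = check_jar(jar)
        print(jar, "OK" if not issues else "; ".join(issues))
```

Run it as the same user that launches the job, since `os.access` reports permissions for the current user.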

Can't submit job to stand alone cluster

2015-12-28 Thread Daniel Valdivia
Hi, I'm trying to submit a job to a small spark cluster running in stand alone mode; however, it seems like the jar file I'm submitting to the cluster is "not found" by the worker nodes. I might have understood wrong, but I thought the Driver node would send this jar file to the worker nodes,