Are you able to share the error you're getting?
On Wed, Jul 9, 2014 at 9:25 AM, Jerry Lam <chiling...@gmail.com> wrote:

> Sandy, I experienced similar behavior to what Koert just mentioned. I
> don't understand why there is a difference between using spark-submit
> and programmatic execution. Maybe there is something else we need to
> add to the SparkConf/SparkContext in order to launch Spark jobs
> programmatically that wasn't needed before?
>
> On Wed, Jul 9, 2014 at 12:14 PM, Koert Kuipers <ko...@tresata.com> wrote:
>
>> Sandy, that makes sense. However, I had trouble doing programmatic
>> execution on YARN in client mode as well. The application master in
>> YARN came up but then bombed because it was looking for jars that
>> don't exist (it was looking in the original file paths on the driver
>> side, which are not available on the YARN node). My guess is that
>> spark-submit is changing some settings (perhaps preparing the
>> distributed cache and modifying settings accordingly), which makes it
>> harder to run things programmatically. I could be wrong, however. I
>> gave up debugging and resorted to using spark-submit for now.
>>
>> On Wed, Jul 9, 2014 at 12:05 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>>
>>> Spark still supports the ability to submit jobs programmatically
>>> without shell scripts.
>>>
>>> Koert,
>>> The main reason that the unification can't be part of SparkContext
>>> is that YARN and standalone support deploy modes where the driver
>>> runs in a managed process on the cluster. In this case, the
>>> SparkContext is created on a remote node well after the application
>>> is launched.
>>>
>>> On Wed, Jul 9, 2014 at 8:34 AM, Andrei <faithlessfri...@gmail.com> wrote:
>>>
>>>> Another +1. For me it's a question of embedding. With
>>>> SparkConf/SparkContext I can easily create larger projects with
>>>> Spark as a separate service (just like MySQL and JDBC, for
>>>> example). With spark-submit I'm bound to Spark as the main
>>>> framework that defines what my application should look like. In my
>>>> humble opinion, using Spark as an embeddable library rather than as
>>>> the main framework and runtime is much easier.
>>>>
>>>> On Wed, Jul 9, 2014 at 5:14 PM, Jerry Lam <chiling...@gmail.com> wrote:
>>>>
>>>>> +1 as well for being able to submit jobs programmatically without
>>>>> using a shell script.
>>>>>
>>>>> We also experience issues submitting jobs programmatically without
>>>>> using spark-submit. In fact, even in the Hadoop world, I rarely
>>>>> used "hadoop jar" to submit jobs from a shell.
>>>>>
>>>>> On Wed, Jul 9, 2014 at 9:47 AM, Robert James <srobertja...@gmail.com> wrote:
>>>>>
>>>>>> +1 to being able to do anything via SparkConf/SparkContext. Our
>>>>>> app worked fine in Spark 0.9, but, after several days of
>>>>>> wrestling with uber jars and spark-submit, and so far failing to
>>>>>> get Spark 1.0 working, we'd like to go back to doing it ourselves
>>>>>> with SparkConf.
>>>>>>
>>>>>> As the previous poster said, a few scripts should be able to give
>>>>>> us the classpath and any other params we need, and be a lot more
>>>>>> transparent and debuggable.
>>>>>>
>>>>>> On 7/9/14, Surendranauth Hiraman <suren.hira...@velos.io> wrote:
>>>>>>
>>>>>>> Are there any gaps beyond convenience and code/config separation
>>>>>>> in using spark-submit versus SparkConf/SparkContext if you are
>>>>>>> willing to set your own config?
>>>>>>>
>>>>>>> If there are any gaps, +1 on having parity within
>>>>>>> SparkConf/SparkContext where possible. In my use case, we launch
>>>>>>> our jobs programmatically. In theory, we could shell out to
>>>>>>> spark-submit, but it's not the best option for us.
>>>>>>>
>>>>>>> So far, we are only using Standalone Cluster mode, so I'm not
>>>>>>> knowledgeable about the complexities of other modes, though.
>>>>>>>
>>>>>>> -Suren
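For concreteness, here is a minimal sketch of the kind of programmatic
launch discussed above, against a standalone cluster in client mode.
The master URL and jar path are hypothetical placeholders; setJars
ships the application jar to the executors, which is part of what
spark-submit otherwise arranges:

    import org.apache.spark.{SparkConf, SparkContext}

    object ProgrammaticLaunch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("embedded-example")
          // Hypothetical standalone master URL -- replace with your own.
          .setMaster("spark://master-host:7077")
          // Ship the application jar to the executors ourselves; this
          // is part of what spark-submit would otherwise handle.
          .setJars(Seq("/path/to/my-app-assembly.jar"))

        val sc = new SparkContext(conf)
        try {
          // Trivial job to confirm the context works end to end.
          val sum = sc.parallelize(1 to 1000).reduce(_ + _)
          println("sum = " + sum)
        } finally {
          sc.stop()
        }
      }
    }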
>>>>>>> On Wed, Jul 9, 2014 at 8:20 AM, Koert Kuipers <ko...@tresata.com> wrote:
>>>>>>>
>>>>>>>> Not sure I understand why unifying how you submit an app for
>>>>>>>> different platforms, and dynamic configuration, cannot be part
>>>>>>>> of SparkConf and SparkContext?
>>>>>>>>
>>>>>>>> For the classpath, a simple script similar to "hadoop classpath"
>>>>>>>> that shows what needs to be added should be sufficient.
>>>>>>>>
>>>>>>>> On Spark standalone I can launch a program just fine with only
>>>>>>>> SparkConf and SparkContext. Not on YARN, so the spark-submit
>>>>>>>> script must be doing a few extra things there that I am
>>>>>>>> missing... which makes things more difficult, because I am not
>>>>>>>> sure it's realistic to expect every application that needs to
>>>>>>>> run something on Spark to be launched using spark-submit.
>>>>>>>>
>>>>>>>> On Jul 9, 2014 3:45 AM, "Patrick Wendell" <pwend...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> It fulfills a few different functions. The main one is giving
>>>>>>>>> users a way to inject Spark as a runtime dependency separately
>>>>>>>>> from their program and make sure they get exactly the right
>>>>>>>>> version of Spark. So a user can bundle an application and then
>>>>>>>>> use spark-submit to send it to different types of clusters (or
>>>>>>>>> use different versions of Spark).
>>>>>>>>>
>>>>>>>>> It also unifies the way you bundle and submit an app for YARN,
>>>>>>>>> Mesos, etc.; this was something that had become very
>>>>>>>>> fragmented over time before it was added.
>>>>>>>>>
>>>>>>>>> Another feature is allowing users to set configuration values
>>>>>>>>> dynamically rather than compiling them into their program.
>>>>>>>>> That's the one you mention here. You can choose to use this
>>>>>>>>> feature or not. If you know your configs are not going to
>>>>>>>>> change, then you don't need to set them with spark-submit.
>>>>>>>>>
>>>>>>>>> On Wed, Jul 9, 2014 at 10:22 AM, Robert James <srobertja...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> What is the purpose of spark-submit? Does it do anything
>>>>>>>>>> outside of the standard "val conf = new SparkConf ... val sc
>>>>>>>>>> = new SparkContext ..."?
>>>>>>>
>>>>>>> --
>>>>>>> SUREN HIRAMAN, VP TECHNOLOGY
>>>>>>> Velos
>>>>>>> Accelerating Machine Learning
>>>>>>>
>>>>>>> 440 NINTH AVENUE, 11TH FLOOR
>>>>>>> NEW YORK, NY 10001
>>>>>>> O: (917) 525-2466 ext. 105
>>>>>>> F: 646.349.4063
>>>>>>> E: suren.hiraman@velos.io
>>>>>>> W: www.velos.io
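For concreteness on Patrick's dynamic-configuration point: a SparkConf
built with its default constructor also picks up any spark.* Java
system properties, so values can be injected at launch time rather
than compiled into the program. A minimal sketch, with a hypothetical
class name and property values:

    import org.apache.spark.{SparkConf, SparkContext}

    object DynamicConf {
      def main(args: Array[String]): Unit = {
        // new SparkConf() loads spark.* system properties by default,
        // so this program can be launched with, e.g.:
        //   java -Dspark.master=spark://master-host:7077 \
        //        -Dspark.executor.memory=2g -cp <classpath> DynamicConf
        // and nothing is hardcoded except the app name.
        val conf = new SparkConf().setAppName("dynamic-conf-example")

        val sc = new SparkContext(conf)
        println("running against master: " + sc.master)
        sc.stop()
      }
    }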