Are you able to share the error you're getting?
On Wed, Jul 9, 2014 at 9:25 AM, Jerry Lam <chiling...@gmail.com> wrote:

> Sandy, I experienced similar behavior to what Koert just mentioned. I
> don't understand why there is a difference between using spark-submit
> and programmatic execution. Maybe there is something else we need to
> add to the SparkConf/SparkContext in order to launch Spark jobs
> programmatically that wasn't needed before?
>
> On Wed, Jul 9, 2014 at 12:14 PM, Koert Kuipers <ko...@tresata.com> wrote:
>
>> Sandy, that makes sense. However, I had trouble doing programmatic
>> execution on YARN in client mode as well. The application master in
>> YARN came up but then bombed because it was looking for jars that
>> don't exist (it was looking in the original file paths on the driver
>> side, which are not available on the YARN node). My guess is that
>> spark-submit is changing some settings (perhaps preparing the
>> distributed cache and modifying settings accordingly), which makes it
>> harder to run things programmatically. I could be wrong, however. I
>> gave up debugging and resorted to using spark-submit for now.
>>
>> On Wed, Jul 9, 2014 at 12:05 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>>
>>> Spark still supports the ability to submit jobs programmatically
>>> without shell scripts.
>>>
>>> Koert,
>>> The main reason that the unification can't be part of SparkContext
>>> is that YARN and standalone support deploy modes where the driver
>>> runs in a managed process on the cluster. In this case, the
>>> SparkContext is created on a remote node well after the application
>>> is launched.
>>>
>>> On Wed, Jul 9, 2014 at 8:34 AM, Andrei <faithlessfri...@gmail.com> wrote:
>>>
>>>> Another +1. For me it's a question of embedding. With
>>>> SparkConf/SparkContext I can easily create larger projects with
>>>> Spark as a separate service (just like MySQL and JDBC, for
>>>> example). With spark-submit I'm bound to Spark as the main
>>>> framework that defines what my application should look like. In my
>>>> humble opinion, using Spark as an embeddable library rather than as
>>>> the main framework and runtime is much easier.
>>>>
>>>> On Wed, Jul 9, 2014 at 5:14 PM, Jerry Lam <chiling...@gmail.com> wrote:
>>>>
>>>>> +1 as well for being able to submit jobs programmatically without
>>>>> using a shell script.
>>>>>
>>>>> We also experience issues submitting jobs programmatically without
>>>>> using spark-submit. In fact, even in the Hadoop world, I rarely
>>>>> used "hadoop jar" to submit jobs from a shell.
>>>>>
>>>>> On Wed, Jul 9, 2014 at 9:47 AM, Robert James <srobertja...@gmail.com> wrote:
>>>>>
>>>>>> +1 to being able to do anything via SparkConf/SparkContext. Our
>>>>>> app worked fine in Spark 0.9, but, after several days of
>>>>>> wrestling with uber jars and spark-submit, and so far failing to
>>>>>> get Spark 1.0 working, we'd like to go back to doing it ourselves
>>>>>> with SparkConf.
>>>>>>
>>>>>> As the previous poster said, a few scripts should be able to give
>>>>>> us the classpath and any other params we need, and be a lot more
>>>>>> transparent and debuggable.
>>>>>>
>>>>>> On 7/9/14, Surendranauth Hiraman <suren.hira...@velos.io> wrote:
>>>>>>
>>>>>>> Are there any gaps beyond convenience and code/config separation
>>>>>>> in using spark-submit versus SparkConf/SparkContext if you are
>>>>>>> willing to set your own config?
>>>>>>>
>>>>>>> If there are any gaps, +1 on having parity within
>>>>>>> SparkConf/SparkContext where possible. In my use case, we launch
>>>>>>> our jobs programmatically. In theory, we could shell out to
>>>>>>> spark-submit, but it's not the best option for us.
>>>>>>>
>>>>>>> So far, we are only using Standalone Cluster mode, so I'm not
>>>>>>> knowledgeable about the complexities of other modes, though.
>>>>>>>
>>>>>>> -Suren
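For concreteness, here is a minimal sketch of the kind of programmatic
launch discussed above, against a standalone cluster in client mode.
The master URL and jar path are hypothetical placeholders; setJars
ships the application jar to the executors, which is part of what
spark-submit otherwise arranges:

    import org.apache.spark.{SparkConf, SparkContext}

    object ProgrammaticLaunch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("embedded-example")
          // Hypothetical standalone master URL -- replace with your own.
          .setMaster("spark://master-host:7077")
          // Ship the application jar to the executors ourselves; this
          // is part of what spark-submit would otherwise handle.
          .setJars(Seq("/path/to/my-app-assembly.jar"))

        val sc = new SparkContext(conf)
        try {
          // Trivial job to confirm the context works end to end.
          val sum = sc.parallelize(1 to 1000).reduce(_ + _)
          println("sum = " + sum)
        } finally {
          sc.stop()
        }
      }
    }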
>>>>>>> On Wed, Jul 9, 2014 at 8:20 AM, Koert Kuipers <ko...@tresata.com> wrote:
>>>>>>>
>>>>>>>> Not sure I understand why unifying how you submit an app for
>>>>>>>> different platforms, and dynamic configuration, cannot be part
>>>>>>>> of SparkConf and SparkContext?
>>>>>>>>
>>>>>>>> For the classpath, a simple script similar to "hadoop classpath"
>>>>>>>> that shows what needs to be added should be sufficient.
>>>>>>>>
>>>>>>>> On Spark standalone I can launch a program just fine with only
>>>>>>>> SparkConf and SparkContext. Not on YARN, so the spark-submit
>>>>>>>> script must be doing a few extra things there that I am
>>>>>>>> missing... which makes things more difficult, because I am not
>>>>>>>> sure it's realistic to expect every application that needs to
>>>>>>>> run something on Spark to be launched using spark-submit.
>>>>>>>>
>>>>>>>> On Jul 9, 2014 3:45 AM, "Patrick Wendell" <pwend...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> It fulfills a few different functions. The main one is giving
>>>>>>>>> users a way to inject Spark as a runtime dependency separately
>>>>>>>>> from their program and make sure they get exactly the right
>>>>>>>>> version of Spark. So a user can bundle an application and then
>>>>>>>>> use spark-submit to send it to different types of clusters (or
>>>>>>>>> use different versions of Spark).
>>>>>>>>>
>>>>>>>>> It also unifies the way you bundle and submit an app for YARN,
>>>>>>>>> Mesos, etc.; this was something that had become very
>>>>>>>>> fragmented over time before it was added.
>>>>>>>>>
>>>>>>>>> Another feature is allowing users to set configuration values
>>>>>>>>> dynamically rather than compiling them into their program.
>>>>>>>>> That's the one you mention here. You can choose to use this
>>>>>>>>> feature or not. If you know your configs are not going to
>>>>>>>>> change, then you don't need to set them with spark-submit.
>>>>>>>>>
>>>>>>>>> On Wed, Jul 9, 2014 at 10:22 AM, Robert James <srobertja...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> What is the purpose of spark-submit? Does it do anything
>>>>>>>>>> outside of the standard "val conf = new SparkConf ... val sc
>>>>>>>>>> = new SparkContext ..."?
>>>>>>>
>>>>>>> --
>>>>>>> SUREN HIRAMAN, VP TECHNOLOGY
>>>>>>> Velos
>>>>>>> Accelerating Machine Learning
>>>>>>>
>>>>>>> 440 NINTH AVENUE, 11TH FLOOR
>>>>>>> NEW YORK, NY 10001
>>>>>>> O: (917) 525-2466 ext. 105
>>>>>>> F: 646.349.4063
>>>>>>> E: suren.hiraman@velos.io
>>>>>>> W: www.velos.io
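For concreteness on Patrick's dynamic-configuration point: a SparkConf
built with its default constructor also picks up any spark.* Java
system properties, so values can be injected at launch time rather
than compiled into the program. A minimal sketch, with a hypothetical
class name and property values:

    import org.apache.spark.{SparkConf, SparkContext}

    object DynamicConf {
      def main(args: Array[String]): Unit = {
        // new SparkConf() loads spark.* system properties by default,
        // so this program can be launched with, e.g.:
        //   java -Dspark.master=spark://master-host:7077 \
        //        -Dspark.executor.memory=2g -cp <classpath> DynamicConf
        // and nothing is hardcoded except the app name.
        val conf = new SparkConf().setAppName("dynamic-conf-example")

        val sc = new SparkContext(conf)
        println("running against master: " + sc.master)
        sc.stop()
      }
    }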