Funny, someone from my team talked to me about that idea yesterday.
We use SparkLauncher, but it just calls spark-submit, which calls other
scripts that start a new Java program that tries to submit (in our case in
cluster mode - the driver is started in the Spark cluster) and then exits.
That makes it a challenge to troubleshoot cases where submit fails,
especially when users try our app in their own Spark environment. He hoped
to get a more specific exception when submit fails, or to be able to debug
it in an IDE (the actual call to the master, its response, etc.).
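
For what it's worth, what we do today is roughly the following (a sketch;
the jar path, main class and master URL are placeholders). The listener at
least surfaces state transitions, but a terminal FAILED state is about all
the detail we get back:

import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

val handle = new SparkLauncher()
  .setAppResource("/path/to/our-app.jar")  // placeholder
  .setMainClass("com.example.Main")        // placeholder
  .setMaster("spark://master-host:7077")   // placeholder
  .setDeployMode("cluster")                // driver starts in the cluster
  .startApplication(new SparkAppHandle.Listener {
    override def stateChanged(h: SparkAppHandle): Unit =
      println(s"submit state: ${h.getState}")  // often just FAILED, no cause
    override def infoChanged(h: SparkAppHandle): Unit = ()
  })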

Ofir Manor

Co-Founder & CTO | Equalum

Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io

On Mon, Oct 10, 2016 at 9:13 PM, Russell Spitzer <russell.spit...@gmail.com>
wrote:

> Just folks who don't want to use spark-submit, no real use cases I've seen
> yet.
>
> I didn't know about SparkLauncher myself, and I don't think there are any
> official docs on it or on launching Spark as an embedded library for tests.
>
> On Mon, Oct 10, 2016 at 11:09 AM Matei Zaharia <matei.zaha...@gmail.com>
> wrote:
>
>> What are the main use cases you've seen for this? Maybe we can add a page
>> to the docs about how to launch Spark as an embedded library.
>>
>> Matei
>>
>> On Oct 10, 2016, at 10:21 AM, Russell Spitzer <russell.spit...@gmail.com>
>> wrote:
>>
>> I actually had not seen SparkLauncher before; that looks pretty great :)
>>
>> On Mon, Oct 10, 2016 at 10:17 AM Russell Spitzer <
>> russell.spit...@gmail.com> wrote:
>>
>>> I'm definitely only talking about non-embedded uses here, as I also use
>>> embedded Spark (Cassandra and Kafka) to run tests. This is almost always
>>> safe since everything is in the same JVM. It's only once we get to
>>> launching against a real distributed env that we end up with issues.
>>>
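>>> (For reference, the embedded pattern I mean is just a local-mode session
>>> inside the test JVM. A minimal sketch; the app name is arbitrary:)
>>>
>>> import org.apache.spark.sql.SparkSession
>>>
>>> // everything runs in this one JVM, so the tests stay hermetic
>>> val spark = SparkSession.builder()
>>>   .master("local[*]")
>>>   .appName("embedded-test")
>>>   .getOrCreate()
>>> assert(spark.range(10).count() == 10)
>>> spark.stop()
>>>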
>>> Since PySpark uses spark-submit in the Java gateway, I'm not sure that
>>> matters :)
>>>
>>> The cases I see are usually going through main directly and adding jars
>>> programmatically (see the sketch below).
>>>
>>> This usually ends up with:
>>> - classpath errors (Spark not on the classpath, their jar not on the
>>> classpath, dependencies not on the classpath),
>>> - conf errors (executors have the incorrect environment, the executor
>>> classpath is broken, not understanding that spark-defaults won't do
>>> anything),
>>> - jar version mismatches,
>>> etc.
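>>>
>>> (The hand-rolled pattern I mean looks roughly like this; the master URL
>>> and jar path are made up:)
>>>
>>> import org.apache.spark.{SparkConf, SparkContext}
>>>
>>> // building a context against a real cluster by hand, skipping spark-submit
>>> val conf = new SparkConf()
>>>   .setMaster("spark://master-host:7077")  // placeholder
>>>   .setAppName("no-spark-submit")
>>>   .setJars(Seq("/path/to/app.jar"))       // the step that usually breaks
>>> val sc = new SparkContext(conf)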
>>>
>>> On Mon, Oct 10, 2016 at 10:05 AM Sean Owen <so...@cloudera.com> wrote:
>>>
>>>> I have also 'embedded' a Spark driver without much trouble. It isn't
>>>> that it can't work.
>>>>
>>>> The Launcher API is probably the recommended way to do that, though.
>>>> spark-submit is the way to go for non-programmatic access.
>>>>
>>>> If you're not doing one of those things and it is not working, yeah I
>>>> think people would tell you you're on your own. I think that's consistent
>>>> with all the JIRA discussions I have seen over time.
>>>>
>>>>
>>>> On Mon, Oct 10, 2016, 17:33 Russell Spitzer <russell.spit...@gmail.com>
>>>> wrote:
>>>>
>>>>> I've seen a variety of users attempting to work around using Spark
>>>>> Submit with at best middling levels of success. I think it would be
>>>>> helpful if the project had a clear statement that submitting an
>>>>> application without using Spark Submit is truly for experts only or is
>>>>> unsupported entirely.
>>>>>
>>>>> I know this is a pretty strong stance, and other people have had
>>>>> different experiences than I have, so please let me know what you
>>>>> think :)
>>>>>
>>>>
>>
