I am aware of that, but two things are working against me here with
spark-kernel: Python is our language, and we are really looking for a
supported way to approach this for the enterprise.  I like the
concept; it just doesn't work for us given our constraints.

This does raise an interesting point, though: if side projects are
spinning up to support this, why not make it a feature of the main
project? Or is it just so esoteric that it's not important for the
main project to be looking into it?



On Tue, Feb 24, 2015 at 9:25 AM, Chip Senkbeil <chip.senkb...@gmail.com> wrote:
> Hi John,
>
> This would be a potential application for the Spark Kernel project
> (https://github.com/ibm-et/spark-kernel). The Spark Kernel serves as your
> driver application, allowing you to feed it snippets of code (or load up
> entire jars via magics) in Scala to execute against a Spark cluster.
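>
> Just to illustrate the idea (the %AddJar magic and the pre-defined sc
> variable are how I recall the wiki describing it, so treat this as a
> sketch rather than exact syntax), a snippet sent to the kernel might
> look like:
>
>     // Scala evaluated inside the kernel, which owns the driver's SparkContext (sc)
>     // the jar URL below is purely illustrative
>     %AddJar http://repo.example.com/my-udfs.jar
>     val counts = sc.textFile("hdfs:///data/events.txt")
>       .flatMap(_.split(" "))
>       .map(word => (word, 1))
>       .reduceByKey(_ + _)
>     counts.take(10).foreach(println)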
>
> Although not technically supported, you can connect multiple applications to
> the same Spark Kernel instance to use the same resources (both on the
> cluster and on the driver).
>
> If you're curious, you can find a getting started section here:
> https://github.com/ibm-et/spark-kernel/wiki/Getting-Started-with-the-Spark-Kernel
>
> Signed,
> Chip Senkbeil
>
> On Tue Feb 24 2015 at 8:04:08 AM John Omernik <j...@omernik.com> wrote:
>>
>> I have been posting on the Mesos list, as I am looking to see
>> whether it's possible to share Spark drivers.  Obviously, in
>> standalone cluster mode, the Master handles requests, and you can
>> instantiate a new SparkContext against a currently running master.
>> However, in Mesos (and perhaps YARN) I don't see how this is possible.
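>>
>> To make that concrete (the host names and app name below are made
>> up), this is the kind of thing I mean by attaching a new SparkContext
>> to a running standalone master:
>>
>>     import org.apache.spark.{SparkConf, SparkContext}
>>
>>     // standalone mode: a new driver can point at the already-running master
>>     val conf = new SparkConf()
>>       .setMaster("spark://master-host:7077")  // made-up host
>>       .setAppName("adhoc-queries")            // made-up app name
>>     val sc = new SparkContext(conf)
>>
>>     // on Mesos the master URL changes (e.g. mesos://zk://zk-host:2181/mesos),
>>     // but each new SparkContext still brings up its own driver rather than
>>     // attaching to a shared one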
>>
>> I guess I am curious as to why. It could make quite a bit of sense
>> to have one driver act as a master, running as a certain user
>> (ideally running out in the Mesos cluster, which I believe Tim Chen
>> is working on).   That driver could belong to a user and serve as a
>> long-term, resource-controlled instance that the user could use for
>> ad hoc queries.  Running many little drivers out on the cluster
>> seems to be a waste of resources, as each driver would be using the
>> same resources, and rarely would many be used at once (if they were
>> for a user's ad hoc environment). Additionally, the advantages of a
>> shared driver seem to play out for a user as they come back to the
>> environment over and over again.
>>
>> Does this make sense? I really want to understand how looking at it
>> this way is wrong, either from a Spark paradigm perspective or a
>> technological perspective.  I will grant that I am coming from a
>> traditional background, so some of the older ideas for how to set
>> things up may be creeping into my thinking, but if that's the case,
>> I'd love to understand better.
>>
>> Thanks!
>>
>> John
>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
