Hello Sean, others -

Just to confirm: is it OK for client applications to access
SparkContext._active_spark_context, provided they wrap the accesses in
`with SparkContext._lock:`?

If that's acceptable to Spark, I'll implement the modifications in the
Jupyter extensions.
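
Concretely, the extension-side access would look something like this (a
sketch only -- the helper name is just for illustration, and it leans
on private members, so I'd treat it as subject to change):

    from pyspark import SparkContext

    def active_context_or_none():
        # _active_spark_context holds the current singleton, or None if
        # no context has been created yet. Take the same lock that
        # SparkContext itself uses to guard this member.
        with SparkContext._lock:
            return SparkContext._active_spark_context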

thanks!
Andrew

On Tue, Aug 7, 2018 at 5:52 PM, Andrew Melo <andrew.m...@gmail.com> wrote:
> Hi Sean,
>
> On Tue, Aug 7, 2018 at 5:44 PM, Sean Owen <sro...@gmail.com> wrote:
>> Ah, python.  How about SparkContext._active_spark_context then?
>
> Ah yes, that looks like the right member, but I'm a bit wary of
> depending on members with leading underscores. I assumed those were
> "private" and subject to change. Is that something I shouldn't worry
> about?
>
> The other thought is that accesses to this member within SparkContext
> are protected by "SparkContext._lock" -- should I take that lock as well?
>
> Thanks for your help!
> Andrew
>
>>
>> On Tue, Aug 7, 2018 at 5:34 PM Andrew Melo <andrew.m...@gmail.com> wrote:
>>>
>>> Hi Sean,
>>>
>>> On Tue, Aug 7, 2018 at 5:16 PM, Sean Owen <sro...@gmail.com> wrote:
>>> > Is SparkSession.getActiveSession what you're looking for?
>>>
>>> Perhaps -- though there's no corresponding Python function, and I'm
>>> not sure how to call the Scala getActiveSession without first
>>> instantiating the Python version and starting a JVM as a side effect.
>>>
>>> Is there an easy way to call getActiveSession that doesn't start a JVM?
>>>
>>> Cheers
>>> Andrew
>>>
>>> >
>>> > On Tue, Aug 7, 2018 at 5:11 PM Andrew Melo <andrew.m...@gmail.com>
>>> > wrote:
>>> >>
>>> >> Hello,
>>> >>
>>> >> One pain point with various Jupyter extensions [1][2] that provide
>>> >> visual feedback about running Spark jobs is the lack of a public
>>> >> API to discover the web UI URL. The notebook server needs that URL
>>> >> to fetch information about the current SparkContext.
>>> >>
>>> >> Simply looking for "localhost:4040" works most of the time, but fails
>>> >> when multiple Spark notebooks run on the same host -- Spark increments
>>> >> the UI port for each new context, so the notebooks end up probing the
>>> >> wrong web interface for information.
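>>> >>
>>> >> For illustration, the probing today amounts to something like this
>>> >> (simplified sketch; the port range and helper name are mine):
>>> >>
>>> >>     import requests
>>> >>
>>> >>     def find_spark_ui(start=4040, tries=10):
>>> >>         # Scan the ports Spark may have incremented to and return
>>> >>         # the first UI that responds -- which may belong to another
>>> >>         # notebook's context entirely.
>>> >>         for port in range(start, start + tries):
>>> >>             try:
>>> >>                 url = "http://localhost:%d/api/v1/applications" % port
>>> >>                 requests.get(url, timeout=1).raise_for_status()
>>> >>                 return "http://localhost:%d" % port
>>> >>             except requests.RequestException:
>>> >>                 continue
>>> >>         return None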
>>> >>
>>> >> I'd like to implement an analog to SparkContext.getOrCreate(), perhaps
>>> >> called "getIfExists()", that returns the current singleton if it
>>> >> exists, or None otherwise. The Jupyter code could then use this
>>> >> entrypoint to obtain the active context, which it could use to look
>>> >> up the web URL.
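>>> >>
>>> >> Roughly, I'm imagining something like this on SparkContext (just a
>>> >> sketch; the name and exact placement are up for discussion):
>>> >>
>>> >>     @classmethod
>>> >>     def getIfExists(cls):
>>> >>         # Return the active SparkContext if one exists, else None.
>>> >>         # Unlike getOrCreate(), this never constructs a context (and
>>> >>         # so never starts a JVM as a side effect).
>>> >>         with cls._lock:
>>> >>             return cls._active_spark_context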
>>> >>
>>> >> It's a minor change, but this would be my first contribution to Spark,
>>> >> and I want to make sure my plan is kosher before I implement it.
>>> >>
>>> >> Thanks!
>>> >> Andrew
>>> >>
>>> >> [1] https://krishnan-r.github.io/sparkmonitor/
>>> >>
>>> >> [2] https://github.com/mozilla/jupyter-spark
>>> >>
