Sure, I don't think you should wait on that being merged in. If you want to
take the JIRA go ahead (although if you're already familiar with the Spark
code base it might make sense to leave it as a starter issue for someone
who is just getting started).

On Mon, Aug 27, 2018 at 12:18 PM Andrew Melo <andrew.m...@gmail.com> wrote:

> Hi Holden,
>
> I'm agnostic to the approach (though it seems cleaner to have an
> explicit API for it). If you would like, I can take that JIRA and
> implement it (should be a 3-line function).
>
> Cheers
> Andrew
>
> On Mon, Aug 27, 2018 at 2:14 PM, Holden Karau <hol...@pigscanfly.ca>
> wrote:
> > Seems reasonable. We should probably add `getActiveSession` to the PySpark
> > API (filed a starter JIRA:
> > https://issues.apache.org/jira/browse/SPARK-25255)
> >
> > On Mon, Aug 27, 2018 at 12:09 PM Andrew Melo <andrew.m...@gmail.com> wrote:
> >>
> >> Hello Sean, others -
> >>
> >> Just to confirm, is it OK for client applications to access
> >> SparkContext._active_spark_context, if they wrap the accesses in `with
> >> SparkContext._lock:`?
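A minimal sketch of the access pattern being asked about, using a stand-in class rather than pyspark itself (the real `_active_spark_context` and `_lock` are private attributes of `pyspark.SparkContext` and may change between releases):

```python
import threading

# Stand-in for pyspark.SparkContext; only the two private attributes
# discussed in this thread are modeled here.
class FakeSparkContext:
    _lock = threading.Lock()
    _active_spark_context = None

def active_context_if_any():
    # Hold the class-level lock while reading the singleton, mirroring the
    # proposed `with SparkContext._lock:` wrapper around the access.
    with FakeSparkContext._lock:
        return FakeSparkContext._active_spark_context
```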
> >>
> >> If that's acceptable to Spark, I'll implement the modifications in the
> >> Jupyter extensions.
> >>
> >> thanks!
> >> Andrew
> >>
> >> On Tue, Aug 7, 2018 at 5:52 PM, Andrew Melo <andrew.m...@gmail.com> wrote:
> >> > Hi Sean,
> >> >
> >> > On Tue, Aug 7, 2018 at 5:44 PM, Sean Owen <sro...@gmail.com> wrote:
> >> >> Ah, python.  How about SparkContext._active_spark_context then?
> >> >
> >> > Ah yes, that looks like the right member, but I'm a bit wary about
> >> > depending on functionality of objects with leading underscores. I
> >> > assumed that was "private" and subject to change. Is that something I
> >> > should be unconcerned about?
> >> >
> >> > The other thought is that the accesses within SparkContext are protected
> >> > by "SparkContext._lock" -- should I also use that lock?
> >> >
> >> > Thanks for your help!
> >> > Andrew
> >> >
> >> >>
> >> >> On Tue, Aug 7, 2018 at 5:34 PM Andrew Melo <andrew.m...@gmail.com> wrote:
> >> >>>
> >> >>> Hi Sean,
> >> >>>
> >> >>> On Tue, Aug 7, 2018 at 5:16 PM, Sean Owen <sro...@gmail.com> wrote:
> >> >>> > Is SparkSession.getActiveSession what you're looking for?
> >> >>>
> >> >>> Perhaps -- though there's not a corresponding python function, and I'm
> >> >>> not exactly sure how to call the scala getActiveSession without first
> >> >>> instantiating the python version and causing a JVM to start.
> >> >>>
> >> >>> Is there an easy way to call getActiveSession that doesn't start a JVM?
> >> >>>
> >> >>> Cheers
> >> >>> Andrew
> >> >>>
> >> >>> >
> >> >>> > On Tue, Aug 7, 2018 at 5:11 PM Andrew Melo <andrew.m...@gmail.com> wrote:
> >> >>> >>
> >> >>> >> Hello,
> >> >>> >>
> >> >>> >> One pain point with various Jupyter extensions [1][2] that provide
> >> >>> >> visual feedback about running spark processes is the lack of a public
> >> >>> >> API to introspect the web URL. The notebook server needs to know the
> >> >>> >> URL to find information about the current SparkContext.
> >> >>> >>
> >> >>> >> Simply looking for "localhost:4040" works most of the time, but fails
> >> >>> >> if multiple spark notebooks are being run on the same host -- spark
> >> >>> >> increments the port for each new context, leading to confusion when
> >> >>> >> the notebooks are trying to probe the web interface for information.
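The fragile probing described above can be sketched roughly as follows. The `/api/v1/applications` path is Spark's monitoring REST endpoint; the function name, timeout, and number of ports tried are illustrative assumptions, not anything the extensions actually standardize on:

```python
import urllib.request
import urllib.error

def probe_spark_ui(host="localhost", base_port=4040, attempts=5):
    # Spark increments the UI port for each new context on the same host,
    # so walk 4040, 4041, ... and return the first UI that answers.
    for port in range(base_port, base_port + attempts):
        url = "http://%s:%d/api/v1/applications" % (host, port)
        try:
            with urllib.request.urlopen(url, timeout=1):
                return "http://%s:%d" % (host, port)
        except (urllib.error.URLError, OSError):
            continue
    return None  # nothing answered -- or we may have found someone else's UI
```

The failure mode in the paragraph above is exactly the last comment: with several notebooks on one host, the first UI to answer may belong to a different user's context.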
> >> >>> >>
> >> >>> >> I'd like to implement an analog to SparkContext.getOrCreate(), perhaps
> >> >>> >> called "getIfExists()", that returns the current singleton if it
> >> >>> >> exists, or None otherwise. The Jupyter code would then be able to use
> >> >>> >> this entrypoint to query Spark for an active Spark context, which it
> >> >>> >> could use to probe the web URL.
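The proposed getIfExists() might look something like the sketch below. A stand-in class is used here so the snippet is self-contained; in an actual patch this would be a classmethod on pyspark.SparkContext reusing its existing private `_lock` and `_active_spark_context` attributes:

```python
import threading

class SparkContextSketch:
    # Stand-ins for SparkContext's private singleton bookkeeping.
    _lock = threading.Lock()
    _active_spark_context = None

    @classmethod
    def getIfExists(cls):
        """Return the active context if one exists, otherwise None."""
        with cls._lock:
            return cls._active_spark_context
```

A Jupyter extension could then call getIfExists() and, on a non-None result, read the context's public uiWebUrl property instead of guessing ports.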
> >> >>> >>
> >> >>> >> It's a minor change, but this would be my first contribution to Spark,
> >> >>> >> and I want to make sure my plan is kosher before I implement it.
> >> >>> >>
> >> >>> >> Thanks!
> >> >>> >> Andrew
> >> >>> >>
> >> >>> >> [1] https://krishnan-r.github.io/sparkmonitor/
> >> >>> >>
> >> >>> >> [2] https://github.com/mozilla/jupyter-spark
> >> >>> >>
> >> >>> >>
> >> >>> >>
> >> >>> >> ---------------------------------------------------------------------
> >> >>> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> >> >>> >>
> >> >>> >
> >>
> >>
> >
> >
> > --
> > Twitter: https://twitter.com/holdenkarau
> > Books (Learning Spark, High Performance Spark, etc.):
> > https://amzn.to/2MaRAG9
> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>


-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
