Seems reasonable. We should probably add `getActiveSession` to the PySpark
API (filed a starter JIRA: https://issues.apache.org/jira/browse/SPARK-25255 ).
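
Very roughly, something like this added to pyspark.sql.SparkSession (a
sketch only -- it assumes we can reuse the private _instantiatedSession
singleton, which the real patch may not do):

    @classmethod
    def getActiveSession(cls):
        # Return the SparkSession already created in this process, or
        # None, without triggering JVM startup. _instantiatedSession is
        # an internal detail and may change.
        return cls._instantiatedSession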

On Mon, Aug 27, 2018 at 12:09 PM Andrew Melo <andrew.m...@gmail.com> wrote:

> Hello Sean, others -
>
> Just to confirm, is it OK for client applications to access
> SparkContext._active_spark_context, provided they wrap the accesses in
> `with SparkContext._lock:`?
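>
> i.e., something like this (a minimal sketch against the current private
> members, which I understand may change):
>
>     from pyspark import SparkContext
>
>     def current_context():
>         # Return the running SparkContext, or None, without creating
>         # one. Both _lock and _active_spark_context are private.
>         with SparkContext._lock:
>             return SparkContext._active_spark_context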
>
> If that's acceptable to Spark, I'll implement the modifications in the
> Jupyter extensions.
>
> thanks!
> Andrew
>
> On Tue, Aug 7, 2018 at 5:52 PM, Andrew Melo <andrew.m...@gmail.com> wrote:
> > Hi Sean,
> >
> > On Tue, Aug 7, 2018 at 5:44 PM, Sean Owen <sro...@gmail.com> wrote:
> >> Ah, Python. How about SparkContext._active_spark_context then?
> >
> > Ah yes, that looks like the right member, but I'm a bit wary of
> > depending on members with leading underscores. I assumed those were
> > "private" and subject to change. Is that something I should be
> > unconcerned about?
> >
> > The other thought is that accesses within SparkContext itself are
> > protected by "SparkContext._lock" -- should I also take that lock?
> >
> > Thanks for your help!
> > Andrew
> >
> >>
> >>> On Tue, Aug 7, 2018 at 5:34 PM Andrew Melo <andrew.m...@gmail.com> wrote:
> >>>
> >>> Hi Sean,
> >>>
> >>> On Tue, Aug 7, 2018 at 5:16 PM, Sean Owen <sro...@gmail.com> wrote:
> >>> > Is SparkSession.getActiveSession what you're looking for?
> >>>
> >>> Perhaps -- though there's no corresponding Python function, and I'm
> >>> not exactly sure how to call the Scala getActiveSession without first
> >>> instantiating the Python version and causing a JVM to start.
> >>>
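> >>> The best I've come up with is reaching through the private py4j
> >>> gateway, something like (a sketch; I'm not sure this is sanctioned):
> >>>
> >>>     from pyspark import SparkContext
> >>>
> >>>     # Only touch the Scala side if a JVM gateway already exists;
> >>>     # SparkContext._jvm is private and None until a context starts.
> >>>     if SparkContext._jvm is not None:
> >>>         opt = SparkContext._jvm.org.apache.spark.sql.SparkSession \
> >>>             .getActiveSession()
> >>>         session = opt.get() if opt.isDefined() else None
> >>>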
> >>> Is there an easy way to call getActiveSession that doesn't start a JVM?
> >>>
> >>> Cheers
> >>> Andrew
> >>>
> >>> >
> >>> > On Tue, Aug 7, 2018 at 5:11 PM Andrew Melo <andrew.m...@gmail.com>
> >>> > wrote:
> >>> >>
> >>> >> Hello,
> >>> >>
> >>> >> One pain point with various Jupyter extensions [1][2] that provide
> >>> >> visual feedback about running Spark processes is the lack of a
> >>> >> public API to introspect the web UI's URL. The notebook server
> >>> >> needs to know the URL to find information about the current
> >>> >> SparkContext.
> >>> >>
> >>> >> Simply looking for "localhost:4040" works most of the time, but it
> >>> >> fails if multiple Spark notebooks run on the same host -- Spark
> >>> >> increments the port for each new context, leading to confusion
> >>> >> when the notebooks try to probe the web interface for information.
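> >>> >>
> >>> >> For illustration, the guessing looks something like this (a
> >>> >> sketch; the upper port bound is arbitrary):
> >>> >>
> >>> >>     import requests
> >>> >>
> >>> >>     def find_spark_uis(ports=range(4040, 4060)):
> >>> >>         # Probe candidate UI ports; there's no way to tell which
> >>> >>         # context (if any) belongs to this notebook.
> >>> >>         found = []
> >>> >>         for port in ports:
> >>> >>             url = "http://localhost:%d/api/v1/applications" % port
> >>> >>             try:
> >>> >>                 if requests.get(url, timeout=1).ok:
> >>> >>                     found.append(port)
> >>> >>             except requests.RequestException:
> >>> >>                 pass
> >>> >>         return found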
> >>> >>
> >>> >> I'd like to implement an analog to SparkContext.getOrCreate(),
> >>> >> perhaps called "getIfExists()", that returns the current singleton
> >>> >> if it exists, or None otherwise. The Jupyter code would then be
> >>> >> able to use this entrypoint to query Spark for an active
> >>> >> SparkContext, which it could use to probe the web URL.
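> >>> >>
> >>> >> Roughly (a sketch only -- the name and exact semantics are what
> >>> >> I'd like feedback on):
> >>> >>
> >>> >>     @classmethod
> >>> >>     def getIfExists(cls):
> >>> >>         # Return the active SparkContext singleton, or None if no
> >>> >>         # context has been created in this process.
> >>> >>         with cls._lock:
> >>> >>             return cls._active_spark_context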
> >>> >>
> >>> >> It's a minor change, but this would be my first contribution to
> >>> >> Spark, and I want to make sure my plan is kosher before I
> >>> >> implement it.
> >>> >>
> >>> >> Thanks!
> >>> >> Andrew
> >>> >>
> >>> >> [1] https://krishnan-r.github.io/sparkmonitor/
> >>> >>
> >>> >> [2] https://github.com/mozilla/jupyter-spark
> >>> >>
> >>> >
>

-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
