Hm, interesting! I don't think many of us have used SparkSession.builder.getOrCreate repeatedly in the same process. What happens if you manually stop the Spark session first (session.stop() <https://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=sparksession#pyspark.sql.SparkSession.stop>), or try to explicitly create a new session via newSession() <https://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=sparksession#pyspark.sql.SparkSession.newSession>?
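For what it's worth, here's a minimal sketch of why getOrCreate() can hand back a dead session: PySpark caches the active session in a class-level singleton, and getOrCreate() can return that cached object without checking whether its underlying context has been stopped. This toy model is plain Python (no Spark required); ToySession and its methods are illustrative stand-ins for the pattern, not the real pyspark internals.

```python
class ToySession:
    _instantiated = None  # class-level cache, analogous to the session singleton

    def __init__(self):
        self.stopped = False

    @classmethod
    def get_or_create(cls):
        # Returns the cached session if one exists, even if its
        # underlying context has already been stopped.
        if cls._instantiated is None:
            cls._instantiated = cls()
        return cls._instantiated

    def sql(self, query):
        if self.stopped:
            raise RuntimeError("Cannot call methods on a stopped SparkContext.")
        return f"result of {query!r}"

    def stop(self):
        # Explicitly stopping clears the cache, so the next
        # get_or_create() builds a genuinely new session.
        self.stopped = True
        type(self)._instantiated = None


s1 = ToySession.get_or_create()
s1.stopped = True                 # simulate the OOM killing the context
s2 = ToySession.get_or_create()
assert s2 is s1                   # the stale session is returned again

s2.stop()                         # clears the cached singleton
s3 = ToySession.get_or_create()
assert s3 is not s1               # now we get a fresh, working session
```

If the real SparkSession behaves like this, calling session.stop() first (even on the already-dead session) should clear the cached singleton so that the next getOrCreate() starts a new context.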
On Thu, Feb 6, 2020 at 7:31 PM Neil Shah-Quinn <nshahqu...@wikimedia.org> wrote:

> Hi Luca!
>
> Those were separate Yarn jobs I started later. When I got this error, I found that the Yarn job corresponding to the SparkContext was marked as "successful", but I still couldn't get SparkSession.builder.getOrCreate to open a new one.
>
> Any idea what might have caused that, or how I could recover without restarting the notebook, which could mean losing a lot of in-progress work? I had already restarted that kernel, so I don't know if I'll encounter this problem again. If I do, I'll file a task.
>
> On Wed, 5 Feb 2020 at 23:24, Luca Toscano <ltosc...@wikimedia.org> wrote:
>
>> Hey Neil,
>>
>> There were two Yarn jobs running related to your notebooks; I just killed them. Let's see if that solves the problem (you might need to restart your notebook again). If not, let's open a task and investigate :)
>>
>> Luca
>>
>> On Thu, 6 Feb 2020 at 02:08, Neil Shah-Quinn <nshahqu...@wikimedia.org> wrote:
>>
>>> Whoa! I just got the same stopped SparkContext error on the query even after restarting the notebook, without an intermediate Java heap space error. That seems very strange to me.
>>>
>>> On Wed, 5 Feb 2020 at 16:09, Neil Shah-Quinn <nshahqu...@wikimedia.org> wrote:
>>>
>>>> Hey there!
>>>>
>>>> I was running SQL queries via PySpark (using the wmfdata package <https://github.com/neilpquinn/wmfdata/blob/master/wmfdata/hive.py>) on SWAP when one of my queries failed with "java.lang.OutOfMemoryError: Java heap space".
>>>>
>>>> After that, when I tried to call the spark.sql function again (via wmfdata.hive.run), it failed with "java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext."
>>>>
>>>> When I tried to create a new Spark session using SparkSession.builder.getOrCreate (whether via wmfdata.spark.get_session or directly), it returned a SparkSession object properly, but calling that object's sql function still gave the "stopped SparkContext" error.
>>>>
>>>> Any idea what's going on? I assume restarting the notebook kernel would take care of the problem, but it seems like there has to be a better way to recover.
>>>>
>>>> Thank you!
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics