Hm, interesting!  I don't think many of us have used
SparkSession.builder.getOrCreate repeatedly in the same process.  What
happens if you manually stop the Spark session first (session.stop()
<https://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=sparksession#pyspark.sql.SparkSession.stop>),
or explicitly create a new session via newSession()
<https://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=sparksession#pyspark.sql.SparkSession.newSession>?
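
Something like this might do it (an untested sketch; spark is just
whatever name you have for the session object):

    from pyspark.sql import SparkSession

    # The cached session, possibly one whose SparkContext already died.
    spark = SparkSession.builder.getOrCreate()

    # Stop it explicitly so the builder drops the cached instance
    # instead of handing it back again.
    spark.stop()

    # With the old session stopped, this should build a fresh
    # SparkContext and SparkSession rather than reusing the dead ones.
    spark = SparkSession.builder.getOrCreate()

    # Or, if the underlying SparkContext is actually still alive, derive
    # a sibling session that shares it:
    # other = spark.newSession()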

On Thu, Feb 6, 2020 at 7:31 PM Neil Shah-Quinn <nshahqu...@wikimedia.org>
wrote:

> Hi Luca!
>
> Those were separate Yarn jobs I started later. When I got this error, I
> found that the Yarn job corresponding to the SparkContext was marked as
> "successful", but I still couldn't get SparkSession.builder.getOrCreate to
> open a new one.
>
> Any idea what might have caused that or how I could recover without
> restarting the notebook, which could mean losing a lot of in-progress work?
> I had already restarted that kernel so I don't know if I'll encounter this
> problem again. If I do, I'll file a task.
>
> On Wed, 5 Feb 2020 at 23:24, Luca Toscano <ltosc...@wikimedia.org> wrote:
>
>> Hey Neil,
>>
>> There were two Yarn jobs running related to your notebooks; I just killed
>> them. Let's see if that solves the problem (you might need to restart your
>> notebook again). If not, let's open a task and investigate :)
>>
>> Luca
>>
>> On Thu, 6 Feb 2020 at 02:08, Neil Shah-Quinn <
>> nshahqu...@wikimedia.org> wrote:
>>
>>> Whoa—I just got the same stopped SparkContext error on the query even
>>> after restarting the notebook, without an intermediate Java heap space
>>> error. That seems very strange to me.
>>>
>>> On Wed, 5 Feb 2020 at 16:09, Neil Shah-Quinn <nshahqu...@wikimedia.org>
>>> wrote:
>>>
>>>> Hey there!
>>>>
>>>> I was running SQL queries via PySpark (using the wmfdata package
>>>> <https://github.com/neilpquinn/wmfdata/blob/master/wmfdata/hive.py>)
>>>> on SWAP when one of my queries failed with "java.lang.OutOfMemoryError:
>>>> Java heap space".
>>>>
>>>> After that, when I tried to call the spark.sql function again (via
>>>> wmfdata.hive.run), it failed with "java.lang.IllegalStateException: Cannot
>>>> call methods on a stopped SparkContext."
>>>>
>>>> When I tried to create a new Spark context using
>>>> SparkSession.builder.getOrCreate (whether using wmfdata.spark.get_session
>>>> or directly), it returned a SparkSession object without complaint, but
>>>> calling that object's sql function still gave the "stopped SparkContext"
>>>> error.
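>>>>
>>>> Concretely, the failing sequence was roughly this (a sketch, with a
>>>> stand-in query):
>>>>
>>>>     from pyspark.sql import SparkSession
>>>>
>>>>     spark = SparkSession.builder.getOrCreate()  # returns without error
>>>>     spark.sql("SELECT 1")
>>>>     # java.lang.IllegalStateException: Cannot call methods on a
>>>>     # stopped SparkContext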
>>>>
>>>> Any idea what's going on? I assume restarting the notebook kernel would
>>>> take care of the problem, but it seems like there has to be a better way to
>>>> recover.
>>>>
>>>> Thank you!