Adam Binford created SPARK-32589:
------------------------------------

             Summary: NoSuchElementException: None.get for needsUnsafeRowConversion
                 Key: SPARK-32589
                 URL: https://issues.apache.org/jira/browse/SPARK-32589
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Adam Binford


I intermittently hit an error where a query fails with 

{{NoSuchElementException: None.get}}

which happens at 
[https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L182]

getActiveSession is apparently returning None. I only use PySpark, and I think 
this is a threading issue, since the active session is stored in an 
InheritableThreadLocal. I encounter this both when I manually use threading to 
run multiple jobs at the same time and, occasionally, when I have multiple 
streams active at the same time. I tried setting the PYSPARK_PIN_THREAD flag, 
but it didn't seem to help. For the former case, I hacked around it in my 
manual threading code by calling

{{spark._jvm.SparkSession.setActiveSession(spark._jvm.SparkSession.builder().getOrCreate())}}

at the start of each new thread, but even this doesn't work reliably.
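
For illustration, the workaround above could be wrapped as follows. This is 
only a sketch of the hack, not a fix: the helper names 
(run_with_active_session, start_job_thread) are made up for this example, and 
the only Spark calls used are the ones in the one-liner above. As noted, it 
does not make the error go away reliably.

```python
import threading


def run_with_active_session(spark, target, *args, **kwargs):
    """Set the JVM-side active session, then run target in the calling thread.

    `spark` is assumed to be an existing PySpark SparkSession.
    """
    jvm_session_cls = spark._jvm.SparkSession
    # Same call as the one-liner above: fetch the existing JVM session and
    # mark it as the active session before doing any work.
    jvm_session_cls.setActiveSession(jvm_session_cls.builder().getOrCreate())
    return target(*args, **kwargs)


def start_job_thread(spark, target, *args, **kwargs):
    """Start a Python thread that sets the active session before target runs."""
    thread = threading.Thread(
        target=run_with_active_session,
        args=(spark, target) + args,
        kwargs=kwargs,
    )
    thread.start()
    return thread
```

Each job is then started with start_job_thread(spark, job_fn) instead of 
creating a threading.Thread directly, where job_fn is whatever function runs 
the query.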

This was also mentioned in a comment on 
[SPARK-21418|https://issues.apache.org/jira/browse/SPARK-21418?focusedCommentId=16174642&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16174642].

I'm not sure whether the fix lies in how Python threads are handled, in adding 
a default value, or in some other change to this function. One other note: I 
started encountering this when using Delta Lake OSS, which reads parquet files 
as part of its transaction log, and that is always where this error occurs. 
However, it doesn't seem that the library is doing anything incorrect that 
would cause this issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
