[ https://issues.apache.org/jira/browse/SPARK-24334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16486832#comment-16486832 ]
Mateusz Pieniak commented on SPARK-24334:
-----------------------------------------

[~icexelloss] I didn't notice an exception in my UDF when I tried to collect 1000 rows to the driver and display the dataframe. However, when I tried to dump it to Parquet, I noticed that some executors failed because of the exception in my UDF and others because of the exception related to this issue. I made sure that there is no exception thrown in my UDF. Right now my executors fail because of a different issue; I don't know whether it's related.

{code:java}
ExecutorLostFailure (executor 1 exited caused by one of the running tasks) Reason: Executor heartbeat timed out after 390732 ms{code}

> Race condition in ArrowPythonRunner causes unclean shutdown of Arrow memory allocator
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-24334
>                 URL: https://issues.apache.org/jira/browse/SPARK-24334
>             Project: Spark
>          Issue Type: Sub-task
>          Components: PySpark
>    Affects Versions: 2.3.0
>            Reporter: Li Jin
>            Priority: Major
>
> Currently, ArrowPythonRunner has two threads that free the Arrow vector schema root and allocator: the main writer thread and the task completion listener thread.
> Having both threads do the cleanup leads to odd failure modes (e.g., negative reference counts, NPEs, and memory-leak exceptions) when an exception is thrown from the user function.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
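The issue described above boils down to a double-free race: two threads each try to release the same Arrow resources. A common remedy for this pattern (sketched below as an assumption, not Spark's actual patch) is to make the cleanup idempotent with an atomic guard, so that whichever thread gets there first performs the release and the other caller becomes a no-op. The class and method names here are illustrative only.

{code:java}
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: serialize cleanup between the writer thread and the
// task completion listener thread so the underlying free runs at most once.
class GuardedCloser {
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private final Runnable cleanup; // e.g. close the vector schema root and allocator

    GuardedCloser(Runnable cleanup) {
        this.cleanup = cleanup;
    }

    void close() {
        // compareAndSet succeeds for exactly one caller, even under a race,
        // so a double free (and the negative ref-count / NPE failures it
        // produces) cannot happen.
        if (closed.compareAndSet(false, true)) {
            cleanup.run();
        }
    }
}
{code}

With a guard like this, both the writer thread's normal-path shutdown and the task completion listener would call the same close() instead of freeing the root and allocator directly.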