[ https://issues.apache.org/jira/browse/SPARK-28485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Li updated SPARK-28485:
----------------------------

PythonBroadcast may delete the broadcast file while a Python worker still needs it
----------------------------------------------------------------------------------

                 Key: SPARK-28485
                 URL: https://issues.apache.org/jira/browse/SPARK-28485
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
    Affects Versions: 3.0.0
            Reporter: Xiao Li
            Assignee: wuyi
            Priority: Major

How to reproduce: start PySpark with "bin/pyspark --master local[1,1] --conf spark.memory.fraction=0.0001" and run the following code:

{code:python}
b = sc.broadcast([100])
sc.parallelize([0], 1).map(lambda x: 0 if x == 0 else b.value[0]).collect()
sc._jvm.java.lang.System.gc()
import time
time.sleep(5)
sc._jvm.java.lang.System.gc()
time.sleep(5)
sc.parallelize([1], 1).map(lambda x: 0 if x == 0 else b.value[0]).collect()
{code}

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
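The hazard the repro exercises can be illustrated in plain Python, with no Spark required: an object's finalizer deletes a backing file while another component has kept only the file's path and still intends to read it. The class and names below are purely illustrative stand-ins for how a file-backed broadcast can disappear out from under a late reader; they are not Spark internals.

```python
import os
import tempfile


class FileBackedValue:
    """Illustrative stand-in for a value whose payload lives in a temp file."""

    def __init__(self, data: bytes):
        fd, self.path = tempfile.mkstemp()
        with os.fdopen(fd, "wb") as f:
            f.write(data)

    def __del__(self):
        # The finalizer removes the backing file. This is dangerous if some
        # other component captured only self.path and reads the file later.
        if os.path.exists(self.path):
            os.remove(self.path)


value = FileBackedValue(b"100")
path = value.path   # a "worker" records only the path, not the object itself
del value           # the last reference is dropped; the finalizer runs
print(os.path.exists(path))  # -> False: a reader arriving now would fail
```

In the JIRA repro, the tiny `spark.memory.fraction` plus the explicit `System.gc()` calls play the role of `del value`: they let the JVM-side broadcast be collected and its file removed before the second job's Python worker tries to read it.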