Have you deactivated the Spark UI? I have read several threads explaining that the UI can lead to OOM because it retains metadata for up to 1000 completed jobs/stages by default.
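For reference, a minimal sketch of what I mean, assuming you build the session yourself (the limit values below are just illustrative; the relevant keys are spark.ui.enabled, spark.ui.retainedJobs, spark.ui.retainedStages and spark.sql.ui.retainedExecutions, which all default to 1000):

```
from pyspark.sql import SparkSession

# Option 1: turn the UI off entirely (local mode rarely needs it).
# Option 2: keep the UI but shrink how much completed-job state it retains.
spark = (
    SparkSession.builder
    .master("local[*]")
    .appName("many-small-jobs")                    # illustrative name
    .config("spark.ui.enabled", "false")           # option 1
    # .config("spark.ui.retainedJobs", "50")       # option 2
    # .config("spark.ui.retainedStages", "50")
    # .config("spark.sql.ui.retainedExecutions", "50")
    .getOrCreate()
)
```

If the java process stops ballooning with the UI disabled, the retention settings above let you keep the UI while bounding the memory it holds.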
On Sun, Oct 20, 2019 at 03:18:20AM -0700, Paul Wais wrote:
> Dear List,
>
> I've observed some sort of memory leak when using pyspark to run ~100
> jobs in local mode. Each job is essentially a create RDD -> create DF
> -> write DF sort of flow. The RDDs and DFs go out of scope after each
> job completes, hence I call this issue a "memory leak." Here's
> pseudocode:
>
> ```
> row_rdds = []
> for i in range(100):
>     row_rdd = spark.sparkContext.parallelize([{'a': i} for i in range(1000)])
>     row_rdds.append(row_rdd)
>
> for row_rdd in row_rdds:
>     df = spark.createDataFrame(row_rdd)
>     df.persist()
>     print(df.count())
>     df.write.save(...)  # Save parquet
>     df.unpersist()
>
>     # Does not help:
>     # del df
>     # del row_rdd
> ```
>
> In my real application:
> * rows are much larger, perhaps 1MB each
> * row_rdds are sized to fit available RAM
>
> I observe that after 100 or so iterations of the second loop (each of
> which creates a "job" in the Spark WebUI), the following happens:
> * pyspark workers have fairly stable resident and virtual RAM usage
> * the java process eventually approaches the resident RAM cap (8GB standard),
>   but its virtual RAM usage keeps ballooning
>
> Eventually the machine runs out of RAM and the Linux OOM killer kills
> the java process, resulting in an "IndexError: pop from an empty
> deque" error from py4j/java_gateway.py.
>
> Does anybody have any ideas about what's going on? Note that this is
> local mode. I have personally run standalone masters and submitted a
> ton of jobs and never seen something like this over time. Those were
> very different jobs, but perhaps this issue is bespoke to local mode?
>
> Emphasis: I did try to del the pyspark objects and run python GC.
> That didn't help at all.
>
> pyspark 2.4.4 on java 1.8 on ubuntu bionic (tensorflow docker image)
>
> 12-core i7 with 16GB of RAM and a 22GB swap file (swap is *on*).
>
> Cheers,
> -Paul

--
nicolas