Hi,

Recently, I have been building a system that continuously processes Spark jobs.
Under the hood, I keep a spark-shell alive so I can use RDD caching to cache
the jobs' input (our jobs sometimes share the same input data). This works
well until we hit a PermGen space problem: after about 500 job runs in
spark-shell, the Spark driver throws a Java OOM exception: PermGen Space.

At first I thought there might be a memory leak in my code, but after digging
deeper I realized that is probably not the case: every time I send a command
to spark-shell, the PermGen usage increases. Here is how I measured the
spark-shell driver's PermGen usage:

I launch a spark-shell and run a simple command multiple times:

    scala> for (i <- 1 to 50) { val rdd = sc.binaryFiles("/share/HIGGS") }

The PermGen usage keeps increasing (see the PU column below). Even with
explicit GC calls ( scala> for (i <- 1 to 50) { System.gc() } ), the PermGen
usage still grows:

[dev@sandbox ~]$ jstat -gc 20581
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC      PU     YGC   YGCT   FGC    FGCT    GCT
2560.0 2560.0  0.0   2541.0 344576.0 81066.0   699392.0   206814.0  97280.0 96796.2  207   0.803  198   37.506  38.309
[dev@sandbox ~]$ jstat -gc 20581
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC      PU     YGC   YGCT   FGC    FGCT    GCT
2560.0 8704.0 2545.4  0.0   332288.0 331671.6  699392.0   214443.6  97280.0 96851.0  208   0.813  198   37.506  38.319
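For reference, the same numbers can also be read from inside the shell via the
standard JVM management API, with no Spark involved. This is a minimal sketch
(the pool-name filter is an assumption: on Java 7 the pool is named e.g. "PS
Perm Gen" or "CMS Perm Gen", on Java 8+ it is "Metaspace"):

```scala
import java.lang.management.ManagementFactory
import scala.collection.JavaConverters._

object PermGenMonitor {
  def main(args: Array[String]): Unit = {
    // Find the class-metadata pool: "... Perm Gen" on Java 7, "Metaspace" on Java 8+.
    val pools = ManagementFactory.getMemoryPoolMXBeans.asScala
      .filter(p => p.getName.contains("Perm") || p.getName.contains("Metaspace"))
    pools.foreach { p =>
      val u = p.getUsage
      // Usage is reported in bytes; print it in KB to match jstat's columns.
      println(s"${p.getName}: used=${u.getUsed / 1024} KB, committed=${u.getCommitted / 1024} KB")
    }
  }
}
```

Pasting the body of main into the spark-shell and re-running it between jobs
shows the same steady growth as the jstat PU column.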

The heap-dump comparison in the attached graph shows that a large number of
Scala reflection classes are loaded. I am not sure whether that is the root
cause.

Moreover, even calling the garbage collector itself seems to increase PermGen
usage. We could increase the PermGen size to mitigate the problem, but that
would only postpone it: we want to keep spark-shell alive as long as
possible, and with ever-increasing PermGen usage we will eventually run out
of memory.

1. Is there any interdependence between the garbage collector and spark-shell?
Can spark-shell prevent the garbage collector from cleaning PermGen space?
2. We don't want a full Spark restart because it takes too long: is there a
way to clean PermGen space in a running Spark driver?

This also happens with these switches:
./bin/spark-shell --conf
"spark.executor.extraJavaOptions=-XX:+CMSClassUnloadingEnabled
-XX:+CMSPermGenSweepingEnabled"
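Note that spark.executor.extraJavaOptions only affects executor JVMs, while
the OOM above occurs in the driver. A driver-side variant (untested; flag
names taken from the spark-shell/spark-submit options) would look like:

```shell
# Apply the class-unloading flags to the driver JVM itself, since that is
# where the PermGen OOM is thrown; spark.executor.extraJavaOptions does not
# reach the driver.
./bin/spark-shell --driver-java-options \
  "-XX:MaxPermSize=512m -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled"
```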

I would appreciate any advice. Thank you.

Thanks,
Yunshan 

<http://apache-spark-user-list.1001560.n3.nabble.com/file/n25713/heap-comparison_%2800000002%29.png>
 



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-keep-long-running-spark-shell-but-avoid-hitting-Java-Out-of-Memory-Exception-PermGen-Space-tp25713.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
