Hi,

The off-heap memory usage of our 3 Spark executor processes keeps increasing 
steadily until it hits the limits of the physical RAM. This happened two 
weeks ago, at which point the system came to a grinding halt because it was 
unable to spawn new processes. At that point restarting Spark is the only 
remedy. In the collectd memory usage graph linked below (1), two Spark 
restarts are visible: last week, when we upgraded Spark from 1.4.1 to 1.5.1, 
and two weeks ago, when physical memory was exhausted.
(1) http://i.stack.imgur.com/P4DE3.png

As can be seen at the bottom of this mail (2), the Spark executor process uses 
approx. 62 GB of memory, while the maximum heap size is set to 20 GB. This 
means the off-heap memory usage is approx. 42 GB.
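For reference, that arithmetic can be reproduced directly from the values in 
section (2) below (a quick sketch; the RSS value and -Xmx setting are copied 
from the ps output):

```shell
# RSS of the executor process in KB, from the `ps aux` output in (2).
rss_kb=62181644
# Configured max heap: -Xmx20480M, converted to KB.
heap_kb=$((20480 * 1024))
# Everything resident above the heap ceiling is off-heap (native) memory.
offheap_gb=$(( (rss_kb - heap_kb) * 1024 / 1000000000 ))
echo "approx off-heap: ${offheap_gb} GB"   # prints "approx off-heap: 42 GB"
```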

Some info:
 - We use the Spark Streaming library.
 - Our code is written in Java.
 - We run Oracle Java v1.7.0_76.
 - Data is read from Kafka (Kafka runs on different boxes).
 - Data is written to Cassandra (Cassandra runs on different boxes).
 - 1 Spark master and 3 Spark executors/workers, running on 4 separate boxes.
 - We recently upgraded Spark to 1.4.1 and then to 1.5.1; the memory usage 
pattern is identical across all these versions.

What could be causing this ever-increasing off-heap memory usage?
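One way to dig into where the off-heap memory lives is the kernel's 
per-mapping accounting (a Linux-only sketch, not something we've exhausted 
yet; 40724 is the executor PID from section (2), adjust for your system):

```shell
# List the largest resident mappings of the executor process; big
# anonymous segments outside the Java heap typically point at direct
# ByteBuffers, memory-mapped files, or native allocations.
sudo -u apache-spark pmap -x 40724 | sort -k3 -n | tail -20

# Cross-check the total resident size against ps by summing the Rss
# fields in /proc/<pid>/smaps (values are in KB).
awk '/^Rss:/ { sum += $2 } END { print sum " KB resident" }' /proc/40724/smaps
```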

PS: I posted this question on StackOverflow yesterday: 
http://stackoverflow.com/questions/33668035/spark-executors-off-heap-memory-usage-keeps-increasing

(2)
$ ps aux | grep 40724
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
apache-+ 40724  140 47.1 75678780 62181644 ?   Sl   Nov06 11782:27 
/usr/lib/jvm/java-7-oracle/jre/bin/java -cp 
/opt/spark-1.5.1-bin-hadoop2.4/conf/:/opt/spark-1.5.1-bin-hadoop2.4/lib/spark-assembly-1.5.1-hadoop2.4.0.jar:/opt/spark-1.5.1-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark-1.5.1-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar:/opt/spark-1.5.1-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar
 -Xms20480M -Xmx20480M -Dspark.driver.port=7201 -Dspark.blockManager.port=7206 
-Dspark.executor.port=7202 -Dspark.broadcast.port=7204 
-Dspark.fileserver.port=7203 -Dspark.replClassServer.port=7205 
-XX:MaxPermSize=256m org.apache.spark.executor.CoarseGrainedExecutorBackend 
--driver-url 
akka.tcp://sparkdri...@xxx.xxx.xxx.xxx:7201/user/CoarseGrainedScheduler 
--executor-id 2 --hostname xxx.xxx.xxx.xxx --cores 10 --app-id 
app-20151106125547-0000 --worker-url 
akka.tcp://sparkwor...@xxx.xxx.xxx.xxx:7200/user/Worker
$ sudo -u apache-spark jps
40724 CoarseGrainedExecutorBackend
40517 Worker
30664 Jps
$ sudo -u apache-spark jstat -gc 40724
 S0C      S1C      S0U      S1U    EC        EU        OC         OU        PC      PU      YGC    YGCT     FGC  FGCT    GCT
 158720.0 157184.0 110339.8 0.0    6674944.0 1708036.1 13981184.0 2733206.2 59904.0 59551.9 41944  1737.864 39   13.464  1751.328
$ sudo -u apache-spark jps -v
40724 CoarseGrainedExecutorBackend -Xms20480M -Xmx20480M 
-Dspark.driver.port=7201 -Dspark.blockManager.port=7206 
-Dspark.executor.port=7202 -Dspark.broadcast.port=7204 
-Dspark.fileserver.port=7203 -Dspark.replClassServer.port=7205 
-XX:MaxPermSize=256m
40517 Worker -Xms2048m -Xmx2048m -XX:MaxPermSize=256m
10693 Jps -Dapplication.home=/usr/lib/jvm/java-7-oracle -Xms8m

Kind regards,

Balthasar Schopman
Software Developer
LeaseWeb Technologies B.V.

T: +31 20 316 0232
M:
E: b.schop...@tech.leaseweb.com
W: http://www.leaseweb.com

Luttenbergweg 8, 1101 EC Amsterdam, Netherlands


