I noticed that, by default, in CDH 5.1 (Spark 1.0.0), in both standalone and YARN mode, no GC options are set when an executor is launched. The only options passed in standalone mode are "-XX:MaxPermSize=128m -Xms16384M -Xmx16384M" (when I give each executor 16G).
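(Those -Xms/-Xmx values just reflect the configured executor memory. A minimal sketch of how that gets set, assuming the Spark 1.0.x SparkConf API, with a hypothetical app name:)

    import org.apache.spark.SparkConf

    // Executor heap: "16g" here is what shows up as
    // -Xms16384M -Xmx16384M on the executor command line.
    val conf = new SparkConf()
      .setAppName("my-app")                  // hypothetical
      .set("spark.executor.memory", "16g")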
In YARN mode, even fewer JVM options are set: "-server -XX:OnOutOfMemoryError=kill %p -Xms16384m -Xmx16384m".

Monitoring OS and heap usage side by side (using top and jmap), I see that my physical memory usage is anywhere between 2x and 5x the heap usage (the whole heap, not just live objects). So I set this:

SPARK_JAVA_OPTS="-XX:MaxPermSize=128m -XX:NewSize=1024m -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70"

I am still monitoring, but I think my app is more stable now in standalone mode, whereas earlier, under YARN, the container would get killed for using too much memory.

How do I get YARN to enforce SPARK_JAVA_OPTS? Setting "spark.executor.extraJavaOptions" doesn't seem to work.

On Thu, Sep 11, 2014 at 1:50 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote:
> Which version of Spark are you running?
>
> If you are running the latest one, could you try running not a window but
> a simple event count on every 2-second batch, and see if you still run out
> of memory?
>
> TD
>
>
> On Thu, Sep 11, 2014 at 10:34 AM, Aniket Bhatnagar
> <aniket.bhatna...@gmail.com> wrote:
>>
>> I did change it to 1 GB. It still ran out of memory, just a little later.
>>
>> The streaming job isn't handling a lot of data: in any 2 seconds, it
>> doesn't get more than 50 records, and each record is no more than 500
>> bytes.
>>
>> On Sep 11, 2014 10:54 PM, "Bharat Venkat" <bvenkat.sp...@gmail.com> wrote:
>>>
>>> You could set "spark.executor.memory" to something bigger than the
>>> default (512mb).
>>>
>>>
>>> On Thu, Sep 11, 2014 at 8:31 AM, Aniket Bhatnagar
>>> <aniket.bhatna...@gmail.com> wrote:
>>>>
>>>> I am running a simple Spark Streaming program that pulls in data from
>>>> Kinesis at a batch interval of 10 seconds, windows it for 10 seconds,
>>>> maps the data, and persists it to a store.
>>>>
>>>> The program is running in local mode right now and runs out of memory
>>>> after a while. I have yet to look at heap dumps, but I suspect Spark
>>>> isn't releasing memory after processing completes. I have even tried
>>>> changing the storage level to disk only.
>>>>
>>>> Help!
>>>>
>>>> Thanks,
>>>> Aniket
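For reference, a minimal sketch of wiring those GC flags into the executors via "spark.executor.extraJavaOptions" (assuming the Spark 1.0.x SparkConf API; the heap size has to stay in spark.executor.memory, since heap-size flags are not accepted in extraJavaOptions):

    import org.apache.spark.SparkConf

    // GC flags for every executor JVM. The heap size itself is set
    // separately via spark.executor.memory, not in extraJavaOptions.
    val conf = new SparkConf()
      .set("spark.executor.memory", "16g")
      .set("spark.executor.extraJavaOptions",
        "-XX:MaxPermSize=128m -XX:NewSize=1024m -XX:+UseConcMarkSweepGC " +
        "-XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70")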
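As for the streaming job described above, a rough sketch of the shape of that program (the socket source, the map step, and the sink are stand-ins for the real Kinesis receiver and store, which aren't shown in the thread):

    import org.apache.spark.SparkConf
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // 10-second batches, a 10-second window, a map, then persistence.
    // local[2] gives the receiver its own thread in local mode.
    val conf = new SparkConf().setMaster("local[2]").setAppName("kinesis-window")
    val ssc = new StreamingContext(conf, Seconds(10))

    val records = ssc.socketTextStream("localhost", 9999) // stand-in for Kinesis
    records
      .window(Seconds(10))
      .map(_.length)                        // stand-in for the real map step
      .persist(StorageLevel.DISK_ONLY)      // the "disk only" level tried above
      .foreachRDD(rdd => println("batch size: " + rdd.count())) // stand-in store

    ssc.start()
    ssc.awaitTermination()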