Spark on EMR is configured to use the CMS GC, specifically with the following flags:

spark.executor.extraJavaOptions -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p'

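To compare like for like on the K8 side, one option is to pass the same executor options explicitly at submit time. The following is only a minimal sketch: the helper names (`executor_java_options`, `spark_submit_conf_args`) are made up for illustration, and it assumes a Java 8 JVM, since newer JVMs changed the GC logging options and later removed CMS.

```python
import shlex

# Executor JVM flags quoted in this message as the EMR configuration.
EMR_EXECUTOR_GC_FLAGS = [
    "-verbose:gc",
    "-XX:+PrintGCDetails",
    "-XX:+PrintGCDateStamps",
    "-XX:+UseConcMarkSweepGC",
    "-XX:CMSInitiatingOccupancyFraction=70",
    "-XX:MaxHeapFreeRatio=70",
    "-XX:+CMSClassUnloadingEnabled",
    "-XX:OnOutOfMemoryError='kill -9 %p'",
]


def executor_java_options(flags):
    """Join JVM flags into the single string Spark expects for the property."""
    return " ".join(flags)


def spark_submit_conf_args(flags):
    """Build the argv fragment that sets spark.executor.extraJavaOptions.

    Hypothetical helper: returns a list suitable for appending to a
    spark-submit argument list (e.g. for subprocess.run).
    """
    value = executor_java_options(flags)
    return ["--conf", "spark.executor.extraJavaOptions=" + value]


if __name__ == "__main__":
    # Print a shell-quoted form that can be pasted after `spark-submit`.
    print(shlex.join(spark_submit_conf_args(EMR_EXECUTOR_GC_FLAGS)))
```

The same value can instead go on a `spark.executor.extraJavaOptions` line in spark-defaults.conf, which is the form shown above.
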
Regards,
Keith.

http://keith-chapman.com

On Mon, Jun 11, 2018 at 8:22 PM, ankit jain <ankitjain....@gmail.com> wrote:

> Hi,
> Does anybody know if Yarn uses a different Garbage Collector from Spark
> standalone?
>
> We migrated our application recently from EMR to K8 (not using native Spark
> on K8 yet) and see quite a bit of performance degradation.
>
> Diving further, it seems garbage collection is running too often, up to 50%
> of task time even with a small amount of data - PFA Spark UI screenshot.
>
> I have updated GC to G1GC and it has helped a bit - GC time has come down
> from 50% to 30%, still too high though.
>
> Also enabled -verbose:gc, so there will be many more metrics to play with,
> but any pointers meanwhile will be appreciated.
>
> --
> Thanks & Regards,
> Ankit.