Thank you for the response. Yes, I am sure because the driver was working fine. Only 2 workers went down with OOM.
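The exception reports a max heap of 1024 MB, which matches the 1g default heap of the standalone Master and Worker daemons. If the crashed process really is the Worker daemon itself (an assumption, not yet confirmed from the logs), a minimal sketch of what could be tried in spark-env.sh on each worker node is below. The 2g value and the heap dump path are illustrative placeholders, not tested recommendations.

```shell
# spark-env.sh on each worker node (sketch only; values are assumptions)

# Heap for the standalone Master/Worker daemons themselves (default: 1g).
# This does not change executor memory.
export SPARK_DAEMON_MEMORY=2g

# JVM options for the daemons: write a heap dump on OOM so the cause can be
# inspected afterwards. /tmp/spark-worker-dumps is a hypothetical path.
export SPARK_DAEMON_JAVA_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/spark-worker-dumps"
```

The workers would need a restart to pick this up. A larger heap may only delay the problem, so the heap dump is probably the more useful part.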
Regards,
Behroz

On Fri, Mar 24, 2017 at 2:12 PM, Yong Zhang <java8...@hotmail.com> wrote:

> I am not 100% sure, but normally "dispatcher-event-loop" OOM means the
> driver OOM. Are you sure your workers OOM?
>
> Yong
>
> ------------------------------
> *From:* bsikander <behro...@gmail.com>
> *Sent:* Friday, March 24, 2017 5:48 AM
> *To:* user@spark.apache.org
> *Subject:* [Worker Crashing] OutOfMemoryError: GC overhead limit execeeded
>
> Spark version: 1.6.2
> Hadoop: 2.6.0
>
> Cluster:
> All VMs are deployed on AWS.
> 1 Master (t2.large)
> 1 Secondary Master (t2.large)
> 5 Workers (m4.xlarge)
> Zookeeper (t2.large)
>
> Recently, 2 of our workers went down with an out-of-memory exception:
> java.lang.OutOfMemoryError: GC overhead limit exceeded (max heap: 1024 MB)
>
> Both of these worker processes were in a hung state. We restarted them to
> bring them back to a normal state.
>
> Here is the complete exception:
> https://gist.github.com/bsikander/84f1a0f3cc831c7a120225a71e435d91
>
> Master's spark-defaults.conf file:
> https://gist.github.com/bsikander/4027136f6a6c91eabad576495c4d797d
>
> Master's spark-env.sh:
> https://gist.github.com/bsikander/42f76d7a8e4079098d8a2df3cdee8ee0
>
> Slave's spark-defaults.conf file:
> https://gist.github.com/bsikander/54264349b49e6227c6912eb14d344b8c
>
> So, what could be the reason for our workers crashing with OutOfMemoryError?
> How can we avoid that in the future?
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Worker-Crashing-OutOfMemoryError-GC-overhead-limit-execeeded-tp28535.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.