Thank you for the response.

Yes, I am sure because the driver was working fine. Only 2 workers went
down with OOM.
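
In case it is useful for the diagnosis, this is roughly how the source of the
OOM can be confirmed on a worker box (the log path below assumes the default
standalone log layout, so treat it as illustrative):

    # list the JVMs on a worker box; the Worker daemon is a separate process
    # from the executors and from the driver
    jps -l | grep org.apache.spark.deploy.worker.Worker

    # the GC overhead error shows up in the Worker daemon's own log
    grep -n "OutOfMemoryError" \
        $SPARK_HOME/logs/spark-*-org.apache.spark.deploy.worker.Worker-*.out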

Regards,
Behroz

On Fri, Mar 24, 2017 at 2:12 PM, Yong Zhang <java8...@hotmail.com> wrote:

> I am not 100% sure, but normally an OOM in "dispatcher-event-loop" means the
> driver went OOM. Are you sure it is your workers that OOMed?
>
>
> Yong
>
>
> ------------------------------
> *From:* bsikander <behro...@gmail.com>
> *Sent:* Friday, March 24, 2017 5:48 AM
> *To:* user@spark.apache.org
> *Subject:* [Worker Crashing] OutOfMemoryError: GC overhead limit execeeded
>
> Spark version: 1.6.2
> Hadoop: 2.6.0
>
> Cluster:
> All VMS are deployed on AWS.
> 1 Master (t2.large)
> 1 Secondary Master (t2.large)
> 5 Workers (m4.xlarge)
> Zookeeper (t2.large)
>
> Recently, 2 of our workers went down with an out-of-memory exception:
> java.lang.OutOfMemoryError: GC overhead limit exceeded (max heap: 1024 MB)
>
> Both of these worker processes were left in a hung state. We restarted them to
> bring them back to a normal state.
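>
> For reference, the standard standalone scripts for such a restart on an
> affected worker look roughly like this (the master URL is illustrative):
>
>     sbin/stop-slave.sh
>     sbin/start-slave.sh spark://<master-host>:7077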
>
> Here is the complete exception
> https://gist.github.com/bsikander/84f1a0f3cc831c7a120225a71e435d91
>
> Master's spark-defaults.conf file:
> https://gist.github.com/bsikander/4027136f6a6c91eabad576495c4d797d
>
> Master's spark-env.sh
> https://gist.github.com/bsikander/42f76d7a8e4079098d8a2df3cdee8ee0
>
> Slave's spark-defaults.conf file:
> https://gist.github.com/bsikander/54264349b49e6227c6912eb14d344b8c
>
> So, what could be the reason for our workers crashing with an OutOfMemoryError?
> How can we avoid that in the future?
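>
> One thing that might be relevant (a sketch only, assuming the 1024 MB above
> is just the default SPARK_DAEMON_MEMORY of 1g for the standalone daemons;
> values are illustrative) would be raising the daemon heap in spark-env.sh on
> each worker:
>
>     export SPARK_DAEMON_MEMORY=2g    # heap for the Worker daemon JVM itself
>     export SPARK_WORKER_MEMORY=12g   # memory for executors, a separate setting
>
> Not sure whether that is the right knob, hence the question.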
>
>
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Worker-Crashing-OutOfMemoryError-GC-overhead-limit-execeeded-tp28535.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>
