Yeah, we also didn't find anything related to this online.

Are you aware of any memory leaks in the worker in Spark 1.6.2 which might be
causing this?
Do you know of any documentation which explains all the tasks that a worker
performs? Maybe we can get some clues from there.

Regards,
Behroz

On Fri, Mar 24, 2017 at 2:21 PM, Yong Zhang <java8...@hotmail.com> wrote:

> I have never experienced a worker OOM and have very rarely seen it reported
> online. So my guess is that you will have to generate a heap dump file to
> analyze it.
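>
> As a rough sketch of how to get that dump (assuming it is the standalone
> Worker daemon JVM itself that runs out of memory, and that /var/tmp is
> writable on the workers), the JVM can be told to write an .hprof file
> automatically on the next OOM via spark-env.sh; SPARK_DAEMON_JAVA_OPTS is
> picked up by both the Master and Worker daemons:
>
>   # write a heap dump whenever the worker daemon JVM hits an OutOfMemoryError
>   export SPARK_DAEMON_JAVA_OPTS="$SPARK_DAEMON_JAVA_OPTS \
>     -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/tmp"
>
> While a worker is still alive, "jmap -dump:live,format=b,file=worker.hprof
> <pid>" also captures a dump on demand for the same kind of analysis.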
>
>
> Yong
>
>
> ------------------------------
> *From:* Behroz Sikander <behro...@gmail.com>
> *Sent:* Friday, March 24, 2017 9:15 AM
> *To:* Yong Zhang
> *Cc:* user@spark.apache.org
> *Subject:* Re: [Worker Crashing] OutOfMemoryError: GC overhead limit
> exceeded
>
> Thank you for the response.
>
> Yes, I am sure because the driver was working fine. Only 2 workers went
> down with OOM.
>
> Regards,
> Behroz
>
> On Fri, Mar 24, 2017 at 2:12 PM, Yong Zhang <java8...@hotmail.com> wrote:
>
>> I am not 100% sure, but normally an OOM in the "dispatcher-event-loop"
>> thread means the driver ran out of memory. Are you sure it is your workers
>> that OOMed?
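>>
>> (As a quick check, and only a sketch: "jps -l" on the affected node lists
>> the running JVMs with their main classes, so you can confirm whether the
>> process that logged the OOM is org.apache.spark.deploy.worker.Worker or an
>> executor/driver JVM.)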
>>
>>
>> Yong
>>
>>
>> ------------------------------
>> *From:* bsikander <behro...@gmail.com>
>> *Sent:* Friday, March 24, 2017 5:48 AM
>> *To:* user@spark.apache.org
>> *Subject:* [Worker Crashing] OutOfMemoryError: GC overhead limit
>> exceeded
>>
>> Spark version: 1.6.2
>> Hadoop: 2.6.0
>>
>> Cluster:
>> All VMs are deployed on AWS.
>> 1 Master (t2.large)
>> 1 Secondary Master (t2.large)
>> 5 Workers (m4.xlarge)
>> Zookeeper (t2.large)
>>
>> Recently, 2 of our workers went down with an out-of-memory exception:
>> java.lang.OutOfMemoryError: GC overhead limit exceeded (max heap: 1024 MB)
>>
>> Both of these worker processes were left in a hung state. We restarted them
>> to bring them back to a normal state.
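>>
>> For context, the 1024 MB max heap matches the default heap given to the
>> standalone Master/Worker daemon JVMs. As a sketch of one possible mitigation
>> (assuming the limit is simply too small for our setup rather than a leak),
>> the daemon heap could be raised in spark-env.sh on each worker:
>>
>>   # heap size for the Master/Worker daemon JVMs themselves (default: 1g)
>>   export SPARK_DAEMON_MEMORY=2g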
>>
>> Here is the complete exception
>> https://gist.github.com/bsikander/84f1a0f3cc831c7a120225a71e435d91
>>
>> Master's spark-default.conf file:
>> https://gist.github.com/bsikander/4027136f6a6c91eabad576495c4d797d
>>
>> Master's spark-env.sh
>> https://gist.github.com/bsikander/42f76d7a8e4079098d8a2df3cdee8ee0
>>
>> Slave's spark-default.conf file:
>> https://gist.github.com/bsikander/54264349b49e6227c6912eb14d344b8c
>>
>> So, what could be the reason for our workers crashing with an OutOfMemoryError?
>> How can we avoid that in the future?
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Worker-Crashing-OutOfMemoryError-GC-overhead-limit-execeeded-tp28535.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>>
>
