This much isn't related to Spark. Are you looking at /tmp on the executor? You can use jmap against running JVMs too.
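For reference, a minimal sketch of capturing and retrieving a dump from a running executor pod, assuming kubectl access to the cluster; <executor-pod> is a placeholder for the actual pod name, and PID 16 is taken from the ps output quoted further down in this thread:

    # open a shell in the executor pod
    kubectl exec -it <executor-pod> -- bash

    # inside the pod: dump the live heap of the executor JVM (PID 16 here);
    # jmap ships with the JDK, not the JRE, and
    # "jcmd 16 GC.heap_dump /tmp/heapdump.hprof" is an equivalent alternative
    jmap -dump:live,format=b,file=/tmp/heapdump.hprof 16

    # back on the workstation: copy the dump out before the pod is recycled
    # (kubectl cp needs tar present in the container image)
    kubectl cp <executor-pod>:/tmp/heapdump.hprof ./heapdump.hprof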
Not sure what error you are facing installing Java

On Thu, Oct 7, 2021, 6:20 PM Kiran Biswal <biswalki...@gmail.com> wrote:

> Hello Sean, thanks for the pointer.
>
> I followed the guidelines in the link you sent and added options to
> generate the dump. Now when I get into the bash shell of the executor, I
> see those options:
>
> nobody@ndra-writer-deployment-6db79c4b67-k9qhf-1073a07c5456bbc9-exec-6:/opt/app/ni-kafka-to-cassandra-writer$ ps -aux
>
> USER    PID %CPU %MEM      VSZ     RSS TTY   STAT START   TIME COMMAND
> nobody    1  0.0  0.0     2288     740 ?     Ss   Oct06   0:03 /usr/bin/tini -s -- /usr/local/openjdk-8/bin/java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heapdump.bin -XX:+UseG1GC -Dlog4j.con
> nobody   16 37.9  0.9 15574616 5188712 ?    Sl   Oct06 744:14 /usr/local/openjdk-8/bin/java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heapdump.bin -XX:+UseG1GC -Dlog4j.configurationFile=log4j
> nobody 3832  0.4  0.0     5756    3560 pts/0 Ss   23:12   0:00 bash
> nobody 3872  0.0  0.0     9396    2984 pts/0 R+   23:12   0:00 ps -aux
>
> However, even when the executor crashes, no heap dump is seen in /tmp. If
> the previous executor dies and a new one starts, is this expected? In that
> case, how do we retain the heap dump?
>
> Secondly, installing the JDK in the Dockerfile has not worked for me yet;
> wondering if you might have pointers?
>
> RUN apt-get update && \
>     apt-install software-properties-common && \
>     add-apt-repository ppa:openjdk-r/ppa && \
>     apt-get install -y openjdk-8-jdk && \
>
> Thanks a lot
>
> -Kiran
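The Dockerfile quoted above has a few likely failure points: apt-install is not a command, the add-apt-repository PPA step applies only to Ubuntu (the /usr/local/openjdk-8 path in the ps output suggests the official Debian-based openjdk image), and the RUN instruction ends in a dangling "&& \". A minimal sketch of a working layer, assuming a Debian-based image whose release still packages OpenJDK 8 (package names and availability vary by release):

    RUN apt-get update && \
        apt-get install -y --no-install-recommends openjdk-8-jdk-headless && \
        rm -rf /var/lib/apt/lists/*

If the base image is an official openjdk -jre variant, switching to the corresponding -jdk tag also brings in jmap and jcmd without any extra layer.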
> On Mon, Sep 27, 2021 at 12:16 PM Sean Owen <sro...@gmail.com> wrote:
>
>> This isn't specific to Spark, just use any standard Java approach, for
>> example:
>> https://dzone.com/articles/how-to-capture-java-heap-dumps-7-options
>>
>> You need the JDK installed to use jmap.
>>
>> On Mon, Sep 27, 2021 at 1:41 PM Kiran Biswal <biswalki...@gmail.com>
>> wrote:
>>
>>> Thanks Sean.
>>>
>>> When the executors had only 2 GB, they restarted every 2 to 3 hours
>>> with OOMKilled errors.
>>>
>>> When I increased executor memory to 12 GB and the number of cores to
>>> 12 (2 executors, 6 cores per executor), the OOMKilled restarts
>>> stopped, but memory usage peaks at 14 GB after a few hours and stays
>>> there.
>>>
>>> Does this indicate it was a memory allocation issue and that we should
>>> use this higher memory configuration?
>>>
>>> -EXEC_CORES=2
>>> -TOTAL_CORES=4
>>> -EXEC_MEMORY=2G
>>> +EXEC_CORES=6
>>> +TOTAL_CORES=12
>>> +EXEC_MEMORY=12G
>>>
>>> Could you provide exact steps and documentation on how to collect the
>>> heap dump and install the right packages/environment?
>>>
>>> I tried the steps below in the bash shell of the executor; the jmap
>>> command does not work:
>>>
>>> Mem: 145373940K used, 118435384K free, 325196K shrd, 4452968K buff, 20344056K cached
>>> CPU:  7% usr  2% sys  0% nic 89% idle  0% io  0% irq  0% sirq
>>> Load average: 9.57 10.67 12.05 24/21360 7741
>>>
>>>  PID PPID USER   STAT   VSZ %VSZ CPU %CPU COMMAND
>>>   16    1 nobody S    3809m   0%  26   0% /usr/bin/java -Dlog4j.configurationFile=log4j.properties -Dspark.driver.port=34137 -Xms2G -Xmx2G -cp /opt/hadoop/etc/hadoop::/opt/sp
>>> 7705    0 nobody S     2620   0%  41   0% bash
>>> 7731 7705 nobody R     1596   0%  22   0% top
>>>    1    0 nobody S      804   0%  74   0% /sbin/tini -s -- /usr/bin/java -Dlog4j.configurationFile=log4j.properties -Dspark.driver.port=34137 -Xms2G -Xmx2G -cp /opt/hadoop/et
>>>
>>> bash-5.1$ jmap -dump:live,format=b,file=application_heap_dump.bin 16
>>> bash: jmap: command not found
>>> bash-5.1$ jmap
>>> bash: jmap: command not found
>>>
>>> Thanks
>>> Kiran
>>>
>>> On Sat, Sep 25, 2021 at 5:28 AM Sean Owen <sro...@gmail.com> wrote:
>>>
>>>> It could be 'normal': executors won't GC unless they need to. It
>>>> could be state in your application, if you're storing state. You'd
>>>> want to dump the heap to take a first look.
>>>>
>>>> On Sat, Sep 25, 2021 at 7:24 AM Kiran Biswal <biswalki...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello Experts,
>>>>>
>>>>> I have a Spark Streaming application (DStream) on Spark 3.0.2 and
>>>>> Scala 2.12. It reads about 20 different Kafka topics into a single
>>>>> stream, and I filter the RDD per topic and store the results in
>>>>> Cassandra.
>>>>>
>>>>> I see a steady increase in executor memory over the hours until it
>>>>> reaches the maximum allocated memory, and then it stays at that
>>>>> value. This pattern appears no matter how much memory I allocate to
>>>>> the executor. I suspect a memory leak.
>>>>>
>>>>> Any guidance you can provide on how to debug this will be highly
>>>>> appreciated.
>>>>>
>>>>> Thanks in advance
>>>>> Regards
>>>>> Kiran
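On the retention question raised in the thread: an executor pod's container filesystem, including /tmp, is discarded when the pod dies, so a dump written on OOM disappears with the pod; that part is expected. One way to keep dumps is to mount a volume into the executors and write the dump there. A sketch, assuming Spark on Kubernetes with a pre-created PersistentVolumeClaim; the volume name "dumps" and the claim name "heapdump-pvc" are placeholders:

    --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.dumps.mount.path=/dumps \
    --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.dumps.options.claimName=heapdump-pvc \
    --conf spark.executor.extraJavaOptions="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/dumps"

When -XX:HeapDumpPath points at a directory, the JVM writes java_pid<pid>.hprof into it, so dumps from successive executors do not overwrite each other. A hostPath volume (spark.kubernetes.executor.volumes.hostPath....options.path) works similarly if a PVC is not available, at the cost of the dump living on whichever node ran the executor.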