This much isn't related to Spark. Are you looking at /tmp on the executor? You can use jmap against running JVMs too.
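For reference, a minimal sketch of capturing and retrieving a dump from a running executor pod, assuming kubectl access to the cluster; <executor-pod> is a placeholder for the actual pod name, and PID 16 is taken from the ps output quoted further down in this thread:

    # open a shell in the executor pod
    kubectl exec -it <executor-pod> -- bash

    # inside the pod: dump the live heap of the executor JVM (PID 16 here);
    # jmap ships with the JDK, not the JRE, and
    # "jcmd 16 GC.heap_dump /tmp/heapdump.hprof" is an equivalent alternative
    jmap -dump:live,format=b,file=/tmp/heapdump.hprof 16

    # back on the workstation: copy the dump out before the pod is recycled
    # (kubectl cp needs tar present in the container image)
    kubectl cp <executor-pod>:/tmp/heapdump.hprof ./heapdump.hprof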
Not sure what error you are facing installing Java

On Thu, Oct 7, 2021, 6:20 PM Kiran Biswal <biswalki...@gmail.com> wrote:

> Hello Sean, thanks for the pointer.
>
> I followed the guidelines in the link you sent and added options to
> generate the dump. Now when I get into the bash shell of the executor, I
> see those options:
>
> nobody@ndra-writer-deployment-6db79c4b67-k9qhf-1073a07c5456bbc9-exec-6:/opt/app/ni-kafka-to-cassandra-writer$ ps -aux
>
> USER    PID %CPU %MEM      VSZ     RSS TTY   STAT START   TIME COMMAND
> nobody    1  0.0  0.0     2288     740 ?     Ss   Oct06   0:03 /usr/bin/tini -s -- /usr/local/openjdk-8/bin/java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heapdump.bin -XX:+UseG1GC -Dlog4j.con
> nobody   16 37.9  0.9 15574616 5188712 ?    Sl   Oct06 744:14 /usr/local/openjdk-8/bin/java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heapdump.bin -XX:+UseG1GC -Dlog4j.configurationFile=log4j
> nobody 3832  0.4  0.0     5756    3560 pts/0 Ss   23:12   0:00 bash
> nobody 3872  0.0  0.0     9396    2984 pts/0 R+   23:12   0:00 ps -aux
>
> However, even when the executor crashes, no heap dump is seen in /tmp. If
> the previous executor dies and a new one starts, is this expected? In that
> case, how do we retain the heap dump?
>
> Secondly, installing the JDK in the Dockerfile has not worked for me yet;
> wondering if you might have pointers?
>
> RUN apt-get update && \
>     apt-install software-properties-common && \
>     add-apt-repository ppa:openjdk-r/ppa && \
>     apt-get install -y openjdk-8-jdk && \
>
> Thanks a lot
>
> -Kiran
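The Dockerfile quoted above has a few likely failure points: apt-install is not a command, the add-apt-repository PPA step applies only to Ubuntu (the /usr/local/openjdk-8 path in the ps output suggests the official Debian-based openjdk image), and the RUN instruction ends in a dangling "&& \". A minimal sketch of a working layer, assuming a Debian-based image whose release still packages OpenJDK 8 (package names and availability vary by release):

    RUN apt-get update && \
        apt-get install -y --no-install-recommends openjdk-8-jdk-headless && \
        rm -rf /var/lib/apt/lists/*

If the base image is an official openjdk -jre variant, switching to the corresponding -jdk tag also brings in jmap and jcmd without any extra layer.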
> On Mon, Sep 27, 2021 at 12:16 PM Sean Owen <sro...@gmail.com> wrote:
>
>> This isn't specific to Spark, just use any standard Java approach, for
>> example:
>> https://dzone.com/articles/how-to-capture-java-heap-dumps-7-options
>>
>> You need the JDK installed to use jmap.
>>
>> On Mon, Sep 27, 2021 at 1:41 PM Kiran Biswal <biswalki...@gmail.com>
>> wrote:
>>
>>> Thanks Sean.
>>>
>>> When the executors had only 2 GB, they restarted every 2 to 3 hours
>>> with OOMKilled errors.
>>>
>>> When I increased executor memory to 12 GB and the number of cores to
>>> 12 (2 executors, 6 cores per executor), the OOMKilled restarts
>>> stopped, but memory usage peaks at 14 GB after a few hours and stays
>>> there.
>>>
>>> Does this indicate it was a memory allocation issue and that we should
>>> use this higher memory configuration?
>>>
>>> -EXEC_CORES=2
>>> -TOTAL_CORES=4
>>> -EXEC_MEMORY=2G
>>> +EXEC_CORES=6
>>> +TOTAL_CORES=12
>>> +EXEC_MEMORY=12G
>>>
>>> Could you provide exact steps and documentation on how to collect the
>>> heap dump and install the right packages/environment?
>>>
>>> I tried the steps below in the bash shell of the executor; the jmap
>>> command does not work:
>>>
>>> Mem: 145373940K used, 118435384K free, 325196K shrd, 4452968K buff, 20344056K cached
>>> CPU:  7% usr  2% sys  0% nic 89% idle  0% io  0% irq  0% sirq
>>> Load average: 9.57 10.67 12.05 24/21360 7741
>>>
>>>  PID PPID USER   STAT   VSZ %VSZ CPU %CPU COMMAND
>>>   16    1 nobody S    3809m   0%  26   0% /usr/bin/java -Dlog4j.configurationFile=log4j.properties -Dspark.driver.port=34137 -Xms2G -Xmx2G -cp /opt/hadoop/etc/hadoop::/opt/sp
>>> 7705    0 nobody S     2620   0%  41   0% bash
>>> 7731 7705 nobody R     1596   0%  22   0% top
>>>    1    0 nobody S      804   0%  74   0% /sbin/tini -s -- /usr/bin/java -Dlog4j.configurationFile=log4j.properties -Dspark.driver.port=34137 -Xms2G -Xmx2G -cp /opt/hadoop/et
>>>
>>> bash-5.1$ jmap -dump:live,format=b,file=application_heap_dump.bin 16
>>> bash: jmap: command not found
>>> bash-5.1$ jmap
>>> bash: jmap: command not found
>>>
>>> Thanks
>>> Kiran
>>>
>>> On Sat, Sep 25, 2021 at 5:28 AM Sean Owen <sro...@gmail.com> wrote:
>>>
>>>> It could be 'normal': executors won't GC unless they need to. It
>>>> could be state in your application, if you're storing state. You'd
>>>> want to dump the heap to take a first look.
>>>>
>>>> On Sat, Sep 25, 2021 at 7:24 AM Kiran Biswal <biswalki...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello Experts,
>>>>>
>>>>> I have a Spark Streaming application (DStream) on Spark 3.0.2 and
>>>>> Scala 2.12. It reads about 20 different Kafka topics into a single
>>>>> stream, and I filter the RDD per topic and store the results in
>>>>> Cassandra.
>>>>>
>>>>> I see a steady increase in executor memory over the hours until it
>>>>> reaches the maximum allocated memory, and then it stays at that
>>>>> value. This pattern appears no matter how much memory I allocate to
>>>>> the executor. I suspect a memory leak.
>>>>>
>>>>> Any guidance you can provide on how to debug this will be highly
>>>>> appreciated.
>>>>>
>>>>> Thanks in advance
>>>>> Regards
>>>>> Kiran
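On the retention question raised in the thread: an executor pod's container filesystem, including /tmp, is discarded when the pod dies, so a dump written on OOM disappears with the pod; that part is expected. One way to keep dumps is to mount a volume into the executors and write the dump there. A sketch, assuming Spark on Kubernetes with a pre-created PersistentVolumeClaim; the volume name "dumps" and the claim name "heapdump-pvc" are placeholders:

    --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.dumps.mount.path=/dumps \
    --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.dumps.options.claimName=heapdump-pvc \
    --conf spark.executor.extraJavaOptions="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/dumps"

When -XX:HeapDumpPath points at a directory, the JVM writes java_pid<pid>.hprof into it, so dumps from successive executors do not overwrite each other. A hostPath volume (spark.kubernetes.executor.volumes.hostPath....options.path) works similarly if a PVC is not available, at the cost of the dump living on whichever node ran the executor.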