Re: [kubernetes-users] Need some guidance/help: howto diagnose an oomkill
You can also run jstatd in a running pod and then attach JVisualVM. I haven't done it myself, but the general procedure is:

- kubectl exec into the pod
- Write the policy file to disk:
    echo 'grant codebase "file:${java.home}/../lib/tools.jar" { permission java.security.AllPermission; };' > all.policy
- Start jstatd. This is a daemon process that exposes information on all JVMs running on the host:
    jstatd -p 1099 -J-Djava.security.policy=all.policy
- Connect JVisualVM using the pod IP (kubectl get pod -o wide). This may be tricky if you can't reach pod IPs directly, e.g. because of an overlay network. I think kubectl can help you proxy to it.

/MR
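The jstatd steps above can be collected into a small helper. This is only a sketch: it builds the kubectl command lines and does not run them. The function name, the pod name used in the usage note, and the `default` namespace are illustrative assumptions, and `kubectl port-forward` is one possible reading of the "proxy" hint. Because jstatd uses RMI, it may open further ports beyond 1099, so forwarding that single port may not be enough.

```python
import shlex

# Policy text taken from the message above (JDK 8 layout with tools.jar).
POLICY = (
    'grant codebase "file:${java.home}/../lib/tools.jar" '
    "{ permission java.security.AllPermission; };"
)


def jstatd_commands(pod, namespace="default", port=1099):
    """Return the kubectl invocations as argv lists; nothing is executed."""
    # Single-quote the policy so the pod's shell does not expand ${java.home}.
    remote = (
        f"echo {shlex.quote(POLICY)} > all.policy && "
        f"jstatd -p {port} -J-Djava.security.policy=all.policy"
    )
    base = ["kubectl", "-n", namespace]
    return [
        # Write the policy file and start jstatd (runs in the foreground).
        base + ["exec", pod, "--", "sh", "-c", remote],
        # Look up the pod IP for a direct JVisualVM connection.
        base + ["get", "pod", pod, "-o", "wide"],
        # Assumption: forward the registry port if pod IPs are unreachable.
        base + ["port-forward", pod, f"{port}:{port}"],
    ]
```

Calling `jstatd_commands("my-pod")` (a hypothetical pod name) returns three argv lists that can be reviewed and then passed to `subprocess.run` one at a time.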
Re: [kubernetes-users] Need some guidance/help: howto diagnose an oomkill
It's been a while since I've dealt with this sort of issue, but there are various libraries that use "native" memory outside the Java heap. The -Xmx flag only limits the Java heap, so it isn't surprising that some processes need a much higher container memory limit than the Java GC heap limit.

However, if the memory usage increases over time without limit, you might have a native memory leak caused by not closing things (e.g. direct ByteBuffers, GZIP streams, many others). You can watch the container memory usage of the pod over time; if it seems to increase without bound, this may be what is happening. The JVM's native memory tracking summary statistics can also be useful:
https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/tooldescr007.html

I've had success tracking down native memory leaks using jemalloc's profiling:
http://www.evanjones.ca/java-native-leak-bug.html

Hope this helps, good luck!

Evan
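One way to act on the "watch the container memory usage over time" advice is to fit a straight line to periodic readings and look at the slope. The sketch below assumes you already have (timestamp in seconds, bytes used) samples, for instance collected by hand from `kubectl top pod`. The function name and the sample values in the usage note are made up for illustration.

```python
def growth_per_hour(samples):
    """Least-squares slope of (seconds, bytes) samples, in bytes per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    # Spread of the timestamps; zero means every sample shares one timestamp.
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    return cov / var_t * 3600
```

For example, `growth_per_hour([(0, 100), (3600, 200), (7200, 300)])` comes out at roughly 100 bytes per hour. A slope that stays clearly positive over many hours, with the heap already fixed by -Xms/-Xmx, would be consistent with the native leak described above, though it does not prove one.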
Re: [kubernetes-users] Need some guidance/help: howto diagnose an oomkill
That helps some. We gave the kube pods almost twice as much memory as we allocate to the JVM, and that seems to get us out of the woods, but it definitely means we need to look into a JDK upgrade from 8.

Thanks
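The sizing described above (pod memory at roughly twice the JVM heap) can be written down as a small calculation. This is a sketch under stated assumptions: the 2.0 headroom factor mirrors the "almost twice" figure from this thread and is not a general rule, the function names are illustrative, and the parser assumes the last -Xmx on the command line is the one that takes effect. The JVM's `m` suffix means 1024 * 1024 bytes, which corresponds to the Kubernetes `Mi` suffix. A Kubernetes `M` means 10^6 bytes.

```python
import math
import re

# JVM size suffixes are binary multiples.
_UNITS = {"": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def max_heap_bytes(java_opts):
    """Return the last -Xmx value in java_opts as bytes, or None if absent."""
    matches = re.findall(r"-Xmx(\d+)([kKmMgG]?)", java_opts)
    if not matches:
        return None
    value, unit = matches[-1]
    return int(value) * _UNITS[unit.lower()]


def suggested_limit_mi(java_opts, headroom=2.0):
    """Heap size times a headroom factor, as a Kubernetes 'Mi' quantity."""
    heap = max_heap_bytes(java_opts)
    if heap is None:
        raise ValueError("no -Xmx flag found")
    return f"{math.ceil(heap * headroom / 1024**2)}Mi"
```

With the JAVA_MEM value from the original post, `suggested_limit_mi("-Xms9000m -Xmx9000m -XX:+UseG1GC")` gives `"18000Mi"`. The right factor depends on how much native memory the application actually uses, so treat the result as a starting point to verify against observed usage.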
Re: [kubernetes-users] Need some guidance/help: howto diagnose an oomkill
John,

Does this help?
https://developers.redhat.com/blog/2017/03/14/java-inside-docker/

There are some details here as well:
https://github.com/moby/moby/issues/15020

Thanks,
Dims

On Tue, Sep 26, 2017 at 7:37 AM, John VanRyn wrote:
> I have a kube cluster running on n1-highmem-16 (16 vCPUs, 104 GB memory),
> using the unmodified cos-stable-60-9592-84-0 image.
>
> I have a java app running under wildfly
>
>
> apiVersion: extensions/v1beta1
> kind: Deployment
> metadata:
>   name: cas-unicas-ws
>   labels:
>     name: cas-unicas-ws
>     model: cas
> spec:
>   replicas: 1
>   template:
>     metadata:
>       labels:
>         name: cas-unicas-ws
>         model: cas
>     spec:
>       containers:
>       - name: cas-unicas-ws
>         image: liaisonintl/cas-unicas-ws:__CAS_TAG__
>         imagePullPolicy: Always
>         ports:
>         - containerPort: 8080
>         readinessProbe:
>           periodSeconds: 20
>           timeoutSeconds: 5
>           successThreshold: 1
>           failureThreshold: 3
>           httpGet:
>             path: /services/getPdfServiceConfig
>             port: 8080
>         resources:
>           limits:
>             memory: "1M"
>           requests:
>             memory: "1M"
>         env:
>         - name: JAVA_MEM
>           value: -Xms9000m -Xmx9000m -XX:+UseG1GC
>             -XX:+UseStringDeduplication -XX:+AlwaysPreTouch
>         - name: SPRING_PROFILE
>           value: __SPRING_PROFILE__
>         command: ["/bin/bash","-ic"]
>         args:
>         - "set -xeo pipefail ; source /interpolate ; exec
>           /opt/jboss/wildfly/bin/standalone.sh -b 0.0.0.0"
>
>
> Here is the important parts of the dockerFile
>
>
> FROM liaisonintl/docker-cas-base:master
> MAINTAINER John VanRyn
>
> EXPOSE 8080
> EXPOSE 9990
>
> LABEL "GITHASH"="__GIT_HASH__"
> ENV WILDFLY_HOME /opt/jboss/wildfly
> ENV PATH $WILDFLY_HOME/bin:$PATH
>
> ADD *.war ${WILDFLY_HOME}/standalone/deployments/
>
> ## App config
> #
> ADD config/ ${WILDFLY_HOME}/appConfigTemplate/
>
> ## Temporary fix just to see things working
> ADD config/gen.unicas-ws.docker ${WILDFLY_HOME}/appConfig/unicas-ws.docker
>
> USER root
> ENV CAS_CONFIGS ${WILDFLY_HOME}/appConfig
> ENV SPRING_PROFILE QA
>
> ENV JAVA_OPTS="${JAVA_OPTS} ${JAVA_MEM} -XX:+UseG1GC
> -XX:+UseStringDeduplication -DCAS_CONFIGS=${CAS_CONFIGS}
> -Dspring.profiles.active=${SPRING_PROFILE}"
>
> RUN \
>   mkdir -p $CAS_CONFIGS && \
>   chmod 777 ${WILDFLY_HOME}/appConfig && \
>   chmod 777 ${WILDFLY_HOME}/appConfigTemplate && \
>   /opt/jboss/wildfly/bin/add-user.sh admin REDACTED --silent
>
> # Add REVISION FILE FOR GITHASH Reporting
> ADD config/REVISION REVISION
>
> CMD ["/opt/jboss/wildfly/bin/standalone.sh", "-b", "0.0.0.0"]
>
>
> Log looks like this..
>
> + exec /opt/jboss/wildfly/bin/standalone.sh -b 0.0.0.0
> JAVA_OPTS already set in environment; overriding default settings with
> values: -XX:+UseG1GC -XX:+UseStringDeduplication
> -DCAS_CONFIGS=/opt/jboss/wildfly/appConfig -Dspring.profiles.active=QA
> -Xms9000m -Xmx9000m -XX:+UseG1GC -XX:+UseStringDeduplication
> -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:+UseStringDeduplication
> =
>
> JBoss Bootstrap Environment
>
> JBOSS_HOME: /opt/jboss/wildfly
>
> JAVA: /usr/lib/jvm/java/bin/java
>
> JAVA_OPTS: -server -XX:+UseCompressedOops -server -XX:+UseCompressedOops
> -XX:+UseG1GC -XX:+UseStringDeduplication
> -DCAS_CONFIGS=/opt/jboss/wildfly/appConfig -Dspring.profiles.active=QA
> -Xms9000m -Xmx9000m -XX:+UseG1GC -XX:+UseStringDeduplication
> -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:+UseStringDeduplication
>
> =
>
> 11:28:24,386 INFO [org.jboss.modules] (main) JBoss Modules version
> 1.3.3.Final
> 11:28:24,539 INFO [org.jboss.msc] (main) JBoss MSC version 1.2.2.Final
> 11:28:24,602 INFO [org.jboss.as] (MSC service thread 1-6) JBAS015899:
> WildFly 8.2.1.Final "Tweek" starting
> 11:28:25,430 INFO [org.jboss.as.controller.management-deprecated]
> (Controller Boot Thread) JBAS014627: Attribute any-ipv4-address is
> deprecated, and it might be removed in future version!
> 11:28:25,479 INFO [org.jboss.as.server] (Controller Boot Thread)
> JBAS015888: Creating http management service using socket-binding
> (management-http)
> 11:28:25,498 INFO [org.xnio] (MSC service thread 1-10) XNIO version
> 3.3.0.Final
> 11:28:25,510 INFO [org.xnio.nio] (MSC service thread 1-10) XNIO NIO
> Implementation Version 3.3.0.Final
> 11:28:25,534 INFO [org.jboss.as.clustering.infinispan] (ServerService
> Thread Pool -- 32) JBAS010280: Activating Infinispan subsystem.
> 11:28:25,542 WARN [org.jboss.as.txn] (ServerService Thread Pool -- 46)
> JBAS010153: Node identifier property is set to the default value. Please
> make sure it is unique.
> 11:28:25,546 INFO [org.jboss.as.security] (ServerService Thread Pool -- 45)
> JBAS013171: Activating Security Subs