Hello all, I have a Kafka cluster running version 3.2.1 with JDK 11 and log4j 2.18.0, using a Kafka image I built myself. One of my Kafka brokers is showing high CPU usage, and based on the jstack output, log4j appears to be the cause because of its use of StackWalker. How can I solve this issue?
Here is the jstack information:

"data-plane-kafka-request-handler-6" #59 daemon prio=5 os_prio=0 cpu=86381259.23ms elapsed=1948787.21s tid=0x00007f8939c04800 nid=0x190 runnable [0x00007f883f6f5000]
   java.lang.Thread.State: RUNNABLE
    at java.lang.StackStreamFactory$AbstractStackWalker.fetchStackFrames(java.base@11.0.9/Native Method)
    at java.lang.StackStreamFactory$AbstractStackWalker.fetchStackFrames(java.base@11.0.9/Unknown Source)
    at java.lang.StackStreamFactory$AbstractStackWalker.getNextBatch(java.base@11.0.9/Unknown Source)
    at java.lang.StackStreamFactory$AbstractStackWalker.peekFrame(java.base@11.0.9/Unknown Source)
    at java.lang.StackStreamFactory$AbstractStackWalker.hasNext(java.base@11.0.9/Unknown Source)
    at java.lang.StackStreamFactory$StackFrameTraverser.tryAdvance(java.base@11.0.9/Unknown Source)
    at java.util.stream.ReferencePipeline.forEachWithCancel(java.base@11.0.9/Unknown Source)
    at java.util.stream.AbstractPipeline.copyIntoWithCancel(java.base@11.0.9/Unknown Source)
    at java.util.stream.AbstractPipeline.copyInto(java.base@11.0.9/Unknown Source)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(java.base@11.0.9/Unknown Source)
    at java.util.stream.FindOps$FindOp.evaluateSequential(java.base@11.0.9/Unknown Source)
    at java.util.stream.AbstractPipeline.evaluate(java.base@11.0.9/Unknown Source)
    at java.util.stream.ReferencePipeline.findFirst(java.base@11.0.9/Unknown Source)
    at org.apache.logging.log4j.util.StackLocator.lambda$getCallerClass$2(StackLocator.java:57)
    at org.apache.logging.log4j.util.StackLocator$$Lambda$117/0x00000008001a6c40.apply(Unknown Source)
    at java.lang.StackStreamFactory$StackFrameTraverser.consumeFrames(java.base@11.0.9/Unknown Source)
    at java.lang.StackStreamFactory$AbstractStackWalker.doStackWalk(java.base@11.0.9/Unknown Source)
    at java.lang.StackStreamFactory$AbstractStackWalker.callStackWalk(java.base@11.0.9/Native Method)
    at java.lang.StackStreamFactory$AbstractStackWalker.beginStackWalk(java.base@11.0.9/Unknown Source)
    at java.lang.StackStreamFactory$AbstractStackWalker.walk(java.base@11.0.9/Unknown Source)
    at java.lang.StackWalker.walk(java.base@11.0.9/Unknown Source)
    at org.apache.logging.log4j.util.StackLocator.getCallerClass(StackLocator.java:51)
    at org.apache.logging.log4j.util.StackLocatorUtil.getCallerClass(StackLocatorUtil.java:104)
    at org.apache.logging.slf4j.Log4jLoggerFactory.getContext(Log4jLoggerFactory.java:50)
    at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:47)
    at org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:33)
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:363)
    at kafka.utils.Logging.logger(Logging.scala:43)
    at kafka.utils.Logging.logger$(Logging.scala:43)
    at kafka.server.SessionlessFetchContext.logger$lzycompute(FetchSession.scala:364)
    - locked <0x00000007fa037e58> (a kafka.server.SessionlessFetchContext)
    at kafka.server.SessionlessFetchContext.logger(FetchSession.scala:364)
    at kafka.utils.Logging.debug(Logging.scala:62)
    at kafka.utils.Logging.debug$(Logging.scala:62)
    at kafka.server.SessionlessFetchContext.debug(FetchSession.scala:364)
    at kafka.server.SessionlessFetchContext.updateAndGenerateResponseData(FetchSession.scala:377)
    at kafka.server.KafkaApis.processResponseCallback$1(KafkaApis.scala:932)
    at kafka.server.KafkaApis.$anonfun$handleFetchRequest$33(KafkaApis.scala:965)
    at kafka.server.KafkaApis.$anonfun$handleFetchRequest$33$adapted(KafkaApis.scala:965)
    at kafka.server.KafkaApis$$Lambda$1241/0x00000008007e4040.apply(Unknown Source)