Yu Yang created KAFKA-7364:
------------------------------
Summary: kafka periodically run into high cpu usage with ssl
writing
Key: KAFKA-7364
URL: https://issues.apache.org/jira/browse/KAFKA-7364
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 2.0.0
Reporter: Yu Yang
while testing ssl writing to kafka, we found that kafka often run into high cpu
usage due to inefficiency in jdk ssl implementation.
In detail, we use a test cluster that have 12 d2.8xlarge instances, jdk-10.0.2,
and hosts only one topic that have ~20k producers write to through ssl
channel. We observed that the network threads often get 100% cpu usage after
enabling ssl writing to kafka. To improve kafka's throughput, we have
"num.network.threads=32" for the broker. Even with 32 network threads, we see
the broker cpu usage jump right after ssl writing is enabled.
!Screen Shot 2018-08-30 at 10.57.32 PM.png!
When the broker's cpu usage is high, 'perf top' shows that kafka is busy with
executing code in libsunec.so. The following is a sample stack track that we
get when the broker's cpu usage was high.
{code}
Thread 77562: (state = IN_NATIVE)
- sun.security.ec.ECDSASignature.verifySignedDigest(byte[], byte[], byte[],
byte[]) @bci=0 (Compiled frame; information may be imprecise)
- sun.security.ec.ECDSASignature.engineVerify(byte[]) @bci=70, line=321
(Compiled frame)
- java.security.Signature$Delegate.engineVerify(byte[]) @bci=9, line=1222
(Compiled frame)
- java.security.Signature.verify(byte[]) @bci=10, line=655 (Compiled frame)
- sun.security.x509.X509CertImpl.verify(java.security.PublicKey,
java.lang.String) @bci=136, line=444 (Compiled frame)
-
sun.security.provider.certpath.BasicChecker.verifySignature(java.security.cert.X509Certificate)
@bci=48, line=166 (Compiled frame)
-
sun.security.provider.certpath.BasicChecker.check(java.security.cert.Certificate,
java.util.Collection) @bci=24, line=147 (Compiled frame)
-
sun.security.provider.certpath.PKIXMasterCertPathValidator.validate(java.security.cert.CertPath,
java.util.List, java.util.List) @bci=316, line=125 (Compiled frame)
-
sun.security.provider.certpath.PKIXCertPathValidator.validate(java.security.cert.TrustAnchor,
sun.security.provider.certpath.PKIX$ValidatorParams) @bci=390, line=233
(Compiled frame)
-
sun.security.provider.certpath.PKIXCertPathValidator.validate(sun.security.provider.certpath.PKIX$ValidatorParams)
@bci=217, line=141 (Compiled frame)
-
sun.security.provider.certpath.PKIXCertPathValidator.engineValidate(java.security.cert.CertPath,
java.security.cert.CertPathParameters) @bci=7, line=80 (Compiled frame)
- java.security.cert.CertPathValidator.validate(java.security.cert.CertPath,
java.security.cert.CertPathParameters) @bci=6, line=292 (Compiled frame)
-
sun.security.validator.PKIXValidator.doValidate(java.security.cert.X509Certificate[],
java.security.cert.PKIXBuilderParameters) @bci=34, line=357 (Compiled frame)
-
sun.security.validator.PKIXValidator.engineValidate(java.security.cert.X509Certificate[],
java.util.Collection, java.security.AlgorithmConstraints, java.lang.Object)
@bci=232, line=259 (Compiled frame)
-
sun.security.validator.Validator.validate(java.security.cert.X509Certificate[],
java.util.Collection, java.security.AlgorithmConstraints, java.lang.Object)
@bci=6, line=260 (Compiled frame)
-
sun.security.ssl.X509TrustManagerImpl.validate(sun.security.validator.Validator,
java.security.cert.X509Certificate[], java.security.AlgorithmConstraints,
java.lang.String) @bci=10, line=324 (Compiled frame)
-
sun.security.ssl.X509TrustManagerImpl.checkTrusted(java.security.cert.X509Certificate[],
java.lang.String, javax.net.ssl.SSLEngine, boolean) @bci=179, line=279
(Compiled frame)
-
sun.security.ssl.X509TrustManagerImpl.checkClientTrusted(java.security.cert.X509Certificate[],
java.lang.String, javax.net.ssl.SSLEngine) @bci=5, line=130 (Compiled frame)
-
sun.security.ssl.ServerHandshaker.clientCertificate(sun.security.ssl.HandshakeMessage$CertificateMsg)
@bci=190, line=1966 (Compiled frame)
- sun.security.ssl.ServerHandshaker.processMessage(byte, int) @bci=160,
line=237 (Compiled frame)
- sun.security.ssl.Handshaker.processLoop() @bci=96, line=1052 (Compiled frame)
- sun.security.ssl.Handshaker$1.run() @bci=4, line=992 (Compiled frame)
- sun.security.ssl.Handshaker$1.run() @bci=1, line=989 (Compiled frame)
-
java.security.AccessController.doPrivileged(java.security.PrivilegedExceptionAction,
java.security.AccessControlContext) @bci=0 (Compiled frame)
- sun.security.ssl.Handshaker$DelegatedTask.run() @bci=24, line=1467 (Compiled
frame)
- org.apache.kafka.common.network.SslTransportLayer.runDelegatedTasks()
@bci=13, line=393 (Compiled frame)
- org.apache.kafka.common.network.SslTransportLayer.handshakeUnwrap(boolean)
@bci=88, line=473 (Compiled frame)
- org.apache.kafka.common.network.SslTransportLayer.doHandshake() @bci=570,
line=331 (Compiled frame)
- org.apache.kafka.common.network.SslTransportLayer.handshake() @bci=48,
line=258 (Compiled frame)
- org.apache.kafka.common.network.KafkaChannel.prepare() @bci=16, line=125
(Compiled frame)
- org.apache.kafka.common.network.Selector.pollSelectionKeys(java.util.Set,
boolean, long) @bci=251, line=494 (Compiled frame)
- org.apache.kafka.common.network.Selector.poll(long) @bci=312, line=432
(Compiled frame)
- kafka.network.Processor.poll() @bci=7, line=678 (Compiled frame)
- kafka.network.Processor.run() @bci=31, line=583 (Compiled frame)
- java.lang.Thread.run() @bci=11, line=748 (Interpreted frame)
{code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)