xiaotong.wang created KAFKA-15796:
-
Summary: High CPU issue in Kafka Producer when Auth Failed
Key: KAFKA-15796
URL: https://issues.apache.org/jira/browse/KAFKA-15796
Project: Kafka
Issue Type: Bug
Components: clients
Affects Versions: 3.5.1, 3.6.0, 3.4.1, 3.5.0, 3.3.2, 3.3.1, 3.2.3, 3.2.2
Reporter: xiaotong.wang
How to reproduce:
1. Use a kafka-client 3.x.x Producer with enable.idempotence=true (this is
the default).
2. Start a Kafka server that does not contain the client user's auth info.
3. Start the client producer. In 3.x the producer calls initProducerId and the
TCM (TransactionManager) state transitions to INITIALIZING.
4. The server rejects the client request and the producer raises an
AuthenticationException
(org.apache.kafka.clients.producer.internals.Sender#maybeSendAndPollTransactionalRequest).
5. kafka-client org.apache.kafka.clients.producer.internals.Sender#runOnce catches the
AuthenticationException
and calls transactionManager.authenticationFailed(e):
    synchronized void authenticationFailed(AuthenticationException e) {
        for (TxnRequestHandler request : pendingRequests)
            request.fatalError(e);
    }
This method only handles the pending requests; the in-flight request is missed.
6. The TCM state will therefore always stay in INITIALIZING,
because of the judgment condition: currentState != State.INITIALIZING &&
!hasProducerId()
7. The producer sends a message. The message goes into the batch queue, and the Sender wakes up,
sets pollTimeout=0, and prepares to send the message.
8. However, before the Sender calls sendProducerData, it filters the messages:
RecordAccumulator drain
--> drainBatchesForOneNode --> shouldStopDrainBatchesForPartition
When producerIdAndEpoch.isValid() == false, this returns true, so it does not collect
any messages.
9. The Kafka producer network thread's CPU usage now goes to 100%.
10. Even if we add the user auth info and permissions on the Kafka server, the producer cannot
self-heal.
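
The loop described in steps 6 to 9 can be illustrated with a small toy model. This is only a sketch under stated assumptions: StuckSenderModel and all of its members are hypothetical names invented for illustration, not the real Sender, RecordAccumulator, or TransactionManager classes, and the 100 ms idle timeout is an arbitrary placeholder.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of steps 6-9. All names are hypothetical; this is NOT Kafka client code.
class StuckSenderModel {
    enum State { UNINITIALIZED, INITIALIZING, READY }

    State state = State.INITIALIZING;   // left over from the failed InitProducerId
    boolean hasProducerId = false;      // no producer id was ever assigned
    final Deque<String> queue = new ArrayDeque<>();
    int sent = 0;

    // Mirrors the condition quoted in step 6: it can never be true
    // while the state is stuck in INITIALIZING.
    boolean shouldRetryInit() {
        return state != State.INITIALIZING && !hasProducerId;
    }

    // Mirrors step 8: nothing is drained while the producer id is invalid.
    int drain() {
        if (!hasProducerId) {
            return 0;
        }
        int n = queue.size();
        queue.clear();
        return n;
    }

    // Mirrors step 7: queued data gives a poll timeout of 0, so poll() would not block.
    long pollTimeoutMs() {
        return queue.isEmpty() ? 100L : 0L; // 100 ms is an assumed idle timeout
    }

    // Runs a bounded number of loop iterations and counts how many of them
    // would have returned from poll() immediately.
    int countNonBlockingIterations(int iterations) {
        int spins = 0;
        for (int i = 0; i < iterations; i++) {
            if (shouldRetryInit()) {
                state = State.INITIALIZING; // never reached in the stuck scenario
            }
            sent += drain();
            if (pollTimeoutMs() == 0L) {
                spins++;
            }
        }
        return spins;
    }
}
```

With one record queued, every iteration in this model is non-blocking and nothing is ever sent, which matches the busy loop reported in step 9.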
Suggestion:
Also catch the AuthenticationException in
org.apache.kafka.clients.producer.internals.Sender#maybeSendAndPollTransactionalRequest
and fail the in-flight InitProducerId request.
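
A minimal sketch of the idea behind this suggestion, using a toy model rather than the real client: TxnManagerModel, Request, and both authenticationFailed* methods are hypothetical names, and the FATAL_ERROR transition is an assumption modelled on the fatalError call quoted in step 5. It is not a proposed patch.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model of the suggested fix. All names are hypothetical; this is NOT Kafka client code.
class TxnManagerModel {
    enum State { UNINITIALIZED, INITIALIZING, FATAL_ERROR }

    static final class Request {
        RuntimeException error;          // set once the request has been failed
    }

    State state = State.UNINITIALIZED;
    final Queue<Request> pending = new ArrayDeque<>();
    Request inFlight;                    // request already handed to the network layer

    Request enqueueInitProducerId() {
        state = State.INITIALIZING;
        Request r = new Request();
        pending.add(r);
        return r;
    }

    // Moves the next pending request to in-flight, as a sender loop would on send.
    void sendNext() {
        inFlight = pending.poll();
    }

    // Behaviour described in step 5: only pending requests are failed.
    void authenticationFailedPendingOnly(RuntimeException e) {
        for (Request r : pending) {
            fail(r, e);
        }
    }

    // Suggested behaviour: the in-flight request is failed as well.
    void authenticationFailedIncludingInFlight(RuntimeException e) {
        authenticationFailedPendingOnly(e);
        if (inFlight != null) {
            fail(inFlight, e);
            inFlight = null;
        }
    }

    private void fail(Request r, RuntimeException e) {
        r.error = e;
        state = State.FATAL_ERROR;       // assumed transition: the state leaves INITIALIZING
    }
}
```

In this model, once the InitProducerId request has been sent, the pending-only variant leaves the state in INITIALIZING, while the variant that also fails the in-flight request moves it out of INITIALIZING and records the error on the request.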
--
This message was sent by Atlassian Jira
(v8.20.10#820010)