[ https://issues.apache.org/jira/browse/KAFKA-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17719585#comment-17719585 ]
Luke Chen commented on KAFKA-14914:
-----------------------------------

[~flashmouse], thanks for reporting this issue. Does the offset index still exist? Could you upload these indexes for investigation?

> binarySearch in AbstractIndex may execute with infinite loop
> ------------------------------------------------------------
>
>                 Key: KAFKA-14914
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14914
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.4.0
>            Reporter: li xiangyuan
>            Priority: Major
>         Attachments: stack.1.txt, stack.2.txt, stack.3.txt
>
>
> Recently, servers in our production environment have repeatedly and suddenly
> stopped handling requests (3 times in less than 10 days so far). Please check
> the uploaded stack files. They show that one ioThread
> (data-plane-kafka-request-handler-11) holds the ReadLock of the Partition's
> leaderIsrUpdateLock and keeps running the binarySearch function. Once any
> thread (kafka-scheduler-2) requests this lock in write mode, every request
> that reads this partition and needs the lock in read mode blocks behind it.
> These blocked requests use up all ioThreads, and the broker can no longer
> handle any request.
> The 3 stack files were captured at intervals of about 6 minutes. From my
> standpoint, the only explanation I can think of is that the binarySearch
> function never returns and causes this deadlock, and I presume that the index
> entries in the offsetIndex (at least as seen through mmap) are not sorted.
>
> Detailed information:
> this problem has appeared on 2 brokers
> broker version: 2.4.0
> jvm: openjdk 11
> hardware: aws c7g 4xlarge. This is an arm64 server. We recently upgraded our
> servers from c6g 4xlarge to this type. We did not see this problem on c6g,
> and we don't know whether arm or the aws c7g servers have a problem.
> other: once we restart the broker, it recovers, so we suspect the offset index
> file may not be corrupted and that something may be wrong with mmap.
> Please give any suggestions for solving this problem, thanks.
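
One way to test the reporter's hypothesis that the offset index entries are not sorted is to scan a copy of the `.index` file offline. The sketch below is not Kafka code and is not taken from this issue. It assumes each offset index entry is 8 bytes (a 4-byte relative offset followed by a 4-byte file position, big-endian) and that trailing all-zero entries are unused preallocated slots. The class name `OffsetIndexCheck` and the method `firstUnsortedEntry` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class OffsetIndexCheck {
    // Assumption: 8 bytes per entry = 4-byte relative offset + 4-byte position.
    static final int ENTRY_SIZE = 8;

    /**
     * Returns the slot of the first entry whose relative offset is not strictly
     * greater than the previous entry's, or -1 if all used entries are increasing.
     */
    public static int firstUnsortedEntry(ByteBuffer index) {
        int base = index.position();
        int used = index.remaining() / ENTRY_SIZE;

        // Assumption: trailing all-zero entries are preallocated, unused slots.
        while (used > 0
                && index.getInt(base + (used - 1) * ENTRY_SIZE) == 0
                && index.getInt(base + (used - 1) * ENTRY_SIZE + 4) == 0) {
            used--;
        }

        int previous = -1; // relative offsets are expected to be non-negative
        for (int slot = 0; slot < used; slot++) {
            int relativeOffset = index.getInt(base + slot * ENTRY_SIZE);
            if (relativeOffset <= previous) {
                return slot;
            }
            previous = relativeOffset;
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: OffsetIndexCheck <path-to-.index-file>");
            return;
        }
        // Map the file read-only, similar in spirit to how a broker reads it via mmap.
        try (FileChannel channel =
                 FileChannel.open(Paths.get(args[0]), StandardOpenOption.READ)) {
            ByteBuffer index =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            int slot = firstUnsortedEntry(index);
            System.out.println(slot < 0
                ? "all used entries are strictly increasing"
                : "first out-of-order entry at slot " + slot);
        }
    }
}
```

Running this against a copy of the suspect index file only shows whether the on-disk contents are ordered. If the file is ordered but the broker still hangs, that would be consistent with the reporter's suspicion that the problem lies in the mapped view rather than in the file itself, though it would not prove it.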