[ 
https://issues.apache.org/jira/browse/KAFKA-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17719585#comment-17719585
 ] 

Luke Chen commented on KAFKA-14914:
-----------------------------------

[~flashmouse], thanks for reporting this issue. Does the offset index still 
exist? If so, could you upload the index files for investigation?

> binarySearch in AbstractIndex may execute with infinite loop
> ------------------------------------------------------------
>
>                 Key: KAFKA-14914
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14914
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.4.0
>            Reporter: li xiangyuan
>            Priority: Major
>         Attachments: stack.1.txt, stack.2.txt, stack.3.txt
>
>
> Recently, brokers in our production environment have repeatedly stopped 
> handling requests without warning (3 times in less than 10 days so far). 
> Please check the attached stack files. They show that one I/O thread 
> (data-plane-kafka-request-handler-11) holds the read lock of the 
> Partition's leaderIsrUpdateLock and keeps running the binarySearch 
> function. Once any thread (kafka-scheduler-2) needs the write lock, every 
> request that reads this partition and needs the read lock blocks, which 
> uses up all I/O threads, and the broker can no longer handle any request.
> The 3 stack files were captured about 6 minutes apart. From my standpoint, 
> the only explanation I can think of is that the binarySearch function never 
> returns and so causes the deadlock, and I suspect the entries in the 
> offsetIndex (at least as seen through mmap) are not sorted.
>  
> Detailed information:
> This problem has appeared on 2 brokers.
> Broker version: 2.4.0
> JVM: OpenJDK 11
> Hardware: AWS c7g 4xlarge, which is an arm64 server. We recently upgraded 
> our servers from c6g 4xlarge to this type and never hit this problem on 
> c6g, so we don't know whether arm64 or the AWS c7g servers have a problem.
> Other: once we restart the broker, it recovers, so we suspect the offset 
> index file is not corrupted and that something may be wrong with mmap.
> Please suggest how to solve this problem. Thanks.
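
For readers following the thread, the thread-starvation pattern described in 
the report can be sketched in Java. This is only an illustration under stated 
assumptions: it assumes leaderIsrUpdateLock behaves like a default (non-fair) 
java.util.concurrent.locks.ReentrantReadWriteLock, and the class, method, and 
thread names are hypothetical. Nothing here is taken from the Kafka source or 
from the attached stack traces.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; none of these names come from Kafka.
final class ReadLockStarvationSketch {

    // Returns true if a new reader obtains the read lock within 200 ms while
    // one reader is stuck holding it and one writer is parked waiting for it.
    static boolean newReaderAcquires() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(); // non-fair by default
        CountDownLatch readerHolds = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // Stands in for the request handler that never leaves binarySearch.
        Thread stuckReader = new Thread(() -> {
            lock.readLock().lock();
            try {
                readerHolds.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        });

        // Stands in for the scheduler thread that needs the write lock.
        Thread writer = new Thread(() -> {
            lock.writeLock().lock();
            lock.writeLock().unlock();
        });

        try {
            stuckReader.start();
            readerHolds.await();
            writer.start();
            // Wait until the writer is parked, i.e. fully queued behind the reader.
            while (writer.getState() != Thread.State.WAITING) {
                Thread.sleep(10);
            }

            // A new reader (another request handler) now queues behind the writer.
            boolean acquired = lock.readLock().tryLock(200, TimeUnit.MILLISECONDS);
            if (acquired) {
                lock.readLock().unlock();
            }

            release.countDown(); // let the stuck reader finish so the demo exits
            stuckReader.join();
            writer.join();
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo interrupted", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("new reader acquired lock: " + newReaderAcquires());
    }
}
```

On the OpenJDK implementations I am aware of, the timed tryLock gives up 
because new readers are held back once a writer is at the head of the wait 
queue. That would match the report: one reader that never releases the lock, 
plus one waiting writer, is enough to block every later reader.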



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
