[ https://issues.apache.org/jira/browse/KAFKA-4848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15953804#comment-15953804 ]
Eno Thereska edited comment on KAFKA-4848 at 4/3/17 4:54 PM:
-------------------------------------------------------------

Sachin, stay tuned, the committers are looking into it for now.


was (Author: enothereska):
Saching, stay tuned, the committers are looking into it for now.

> Stream thread getting into deadlock state while trying to get rocksdb lock in
> retryWithBackoff
> ----------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4848
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4848
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.2.0
>            Reporter: Sachin Mittal
>            Assignee: Sachin Mittal
>             Fix For: 0.11.0.0, 0.10.2.1
>
>         Attachments: thr-1
>
>
> We see a deadlock state when a stream thread takes longer than
> MAX_POLL_INTERVAL_MS_CONFIG to process a task. In this case the thread's
> partitions are assigned to some other thread, including the rocksdb lock.
> When it tries to process the next task it cannot get the rocksdb lock and
> simply keeps waiting for that lock forever.
> In retryWithBackoff for AbstractTaskCreator we have backoffTimeMs = 50L.
> If it does not get the lock, then we simply increase the time by 10x and keep
> trying inside the while (true) loop.
> We need to have an upper bound for this backoffTimeMs. If the time is greater
> than MAX_POLL_INTERVAL_MS_CONFIG and it still hasn't got the lock, that means
> this thread's partitions have moved somewhere else and it may not get the
> lock again.
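
For illustration only, the upper bound proposed in the description could be sketched roughly as below. This is not the actual Kafka code or the committed fix: the class BoundedBackoff, the method retryWithBoundedBackoff and its parameters are hypothetical names, and maxWaitMs is assumed to be set by the caller to the configured max.poll.interval.ms value.

```java
import java.util.function.BooleanSupplier;

public class BoundedBackoff {

    /**
     * Hypothetical helper: calls tryLock repeatedly, sleeping between attempts
     * with a backoff that grows 10x each time, and gives up once the total
     * time slept reaches maxWaitMs instead of looping forever.
     *
     * @return true if the lock was acquired, false if the bound was reached
     *         or the thread was interrupted while waiting
     */
    public static boolean retryWithBoundedBackoff(BooleanSupplier tryLock,
                                                  long initialBackoffMs,
                                                  long maxWaitMs) {
        if (initialBackoffMs <= 0) {
            throw new IllegalArgumentException("initialBackoffMs must be positive");
        }
        long backoffMs = initialBackoffMs;
        long waitedMs = 0L;
        while (true) {
            if (tryLock.getAsBoolean()) {
                return true;
            }
            if (waitedMs >= maxWaitMs) {
                // Assumption: by now the partitions have likely been
                // reassigned, so the caller should stop waiting for the lock.
                return false;
            }
            // Never sleep past the overall bound.
            long sleepMs = Math.min(backoffMs, maxWaitMs - waitedMs);
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
            waitedMs += sleepMs;
            // Grow the backoff 10x, guarding against long overflow.
            backoffMs = backoffMs > Long.MAX_VALUE / 10 ? Long.MAX_VALUE : backoffMs * 10;
        }
    }
}
```

A caller might invoke it as retryWithBoundedBackoff(() -> tryToGetStateLock(), 50L, maxPollIntervalMs), where tryToGetStateLock and maxPollIntervalMs are again placeholders, and treat a false return as a signal to abandon the task rather than keep waiting.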