[ 
https://issues.apache.org/jira/browse/KAFKA-9607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyang Chen updated KAFKA-9607:
-------------------------------
    Summary: Should not clear partition queue or throw illegal state exception 
during task revocation  (was: Should not clear partition group if the task will 
be revived again)

> Should not clear partition queue or throw illegal state exception during task 
> revocation
> ----------------------------------------------------------------------------------------
>
>                 Key: KAFKA-9607
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9607
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: Boyang Chen
>            Assignee: Boyang Chen
>            Priority: Major
>
> We detected an issue with a corrupted task failed to revive:
> {code:java}
> [2020-02-25T08:23:38-08:00] 
> (streams-soak-trunk_soak_i-06305ad57801079ae_streamslog) [2020-02-25 
> 16:23:38,137] INFO 
> [stream-soak-test-8f2124ec-8bd0-410a-8e3d-f202a32ab774-StreamThread-1] 
> stream-thread 
> [stream-soak-test-8f2124ec-8bd0-410a-8e3d-f202a32ab774-StreamThread-1] Handle 
> new assignment with:
>         New active tasks: [0_0, 3_1]
>         New standby tasks: []
>         Existing active tasks: [0_0]
>         Existing standby tasks: [] 
> (org.apache.kafka.streams.processor.internals.TaskManager)
> [2020-02-25T08:23:38-08:00] 
> (streams-soak-trunk_soak_i-06305ad57801079ae_streamslog) [2020-02-25 
> 16:23:38,138] INFO 
> [stream-soak-test-8f2124ec-8bd0-410a-8e3d-f202a32ab774-StreamThread-1] 
> [Consumer 
> clientId=stream-soak-test-8f2124ec-8bd0-410a-8e3d-f202a32ab774-StreamThread-1-consumer,
>  groupId=stream-soak-test] Adding newly assigned partitions: 
> k8sName-id-repartition-1 
> (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
> [2020-02-25T08:23:38-08:00] 
> (streams-soak-trunk_soak_i-06305ad57801079ae_streamslog) [2020-02-25 
> 16:23:38,138] INFO 
> [stream-soak-test-8f2124ec-8bd0-410a-8e3d-f202a32ab774-StreamThread-1] 
> stream-thread 
> [stream-soak-test-8f2124ec-8bd0-410a-8e3d-f202a32ab774-StreamThread-1] State 
> transition from RUNNING to PARTITIONS_ASSIGNED 
> (org.apache.kafka.streams.processor.internals.StreamThread)
> [2020-02-25T08:23:39-08:00] 
> (streams-soak-trunk_soak_i-06305ad57801079ae_streamslog) [2020-02-25 
> 16:23:38,419] WARN 
> [stream-soak-test-8f2124ec-8bd0-410a-8e3d-f202a32ab774-StreamThread-1] 
> stream-thread 
> [stream-soak-test-8f2124ec-8bd0-410a-8e3d-f202a32ab774-StreamThread-1] 
> Encountered org.apache.kafka.clients.consumer.OffsetOutOfRangeException 
> fetching records from restore consumer for partitions 
> [stream-soak-test-KSTREAM-AGGREGATE-STATE-STORE-0000000049-changelog-1], it 
> is likely that the consumer's position has fallen out of the topic partition 
> offset range because the topic was truncated or compacted on the broker, 
> marking the corresponding tasks as corrupted and re-initializingit later. 
> (org.apache.kafka.streams.processor.internals.StoreChangelogReader)
> [2020-02-25T08:23:38-08:00] 
> (streams-soak-trunk_soak_i-06305ad57801079ae_streamslog) [2020-02-25 
> 16:23:38,139] INFO 
> [stream-soak-test-8f2124ec-8bd0-410a-8e3d-f202a32ab774-StreamThread-1] 
> [Consumer 
> clientId=stream-soak-test-8f2124ec-8bd0-410a-8e3d-f202a32ab774-StreamThread-1-consumer,
>  groupId=stream-soak-test] Setting offset for partition 
> k8sName-id-repartition-1 to the committed offset 
> FetchPosition{offset=3592242, offsetEpoch=Optional.empty, 
> currentLeader=LeaderAndEpoch{leader=Optional[ip-172-31-25-115.us-west-2.compute.internal:9092
>  (id: 1003 rack: null)], epoch=absent}} 
> (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
> [2020-02-25T08:23:39-08:00] 
> (streams-soak-trunk_soak_i-06305ad57801079ae_streamslog) [2020-02-25 
> 16:23:38,463] ERROR 
> [stream-soak-test-8f2124ec-8bd0-410a-8e3d-f202a32ab774-StreamThread-1] 
> stream-thread 
> [stream-soak-test-8f2124ec-8bd0-410a-8e3d-f202a32ab774-StreamThread-1] 
> Encountered the following exception during processing and the thread is going 
> to shut down:  (org.apache.kafka.streams.processor.internals.StreamThread)
> [2020-02-25T08:23:39-08:00] 
> (streams-soak-trunk_soak_i-06305ad57801079ae_streamslog) 
> java.lang.IllegalStateException: Partition k8sName-id-repartition-1 not found.
>         at 
> org.apache.kafka.streams.processor.internals.PartitionGroup.setPartitionTime(PartitionGroup.java:99)
>         at 
> org.apache.kafka.streams.processor.internals.StreamTask.initializeTaskTime(StreamTask.java:651)
>         at 
> org.apache.kafka.streams.processor.internals.StreamTask.initializeMetadata(StreamTask.java:631)
>         at 
> org.apache.kafka.streams.processor.internals.StreamTask.completeRestoration(StreamTask.java:209)
>         at 
> org.apache.kafka.streams.processor.internals.TaskManager.tryToCompleteRestoration(TaskManager.java:270)
>         at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:834)
>         at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:749)
>         at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:725)
> {code}
> The root cause is that we accidentally cleanup the partition group map so 
> that next time we reboot the task it would miss input partitions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to