[jira] [Commented] (KAFKA-13131) Consumer offsets lost during partition reassignment

2021-07-24 Thread Sergejs Andrejevs (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17386781#comment-17386781
 ] 

Sergejs Andrejevs commented on KAFKA-13131:
---

This happened once on a production cluster and, unfortunately, I have not yet 
been able to reproduce the issue in a test environment (running the same version).

Therefore, I currently cannot test it with any other version.

> Consumer offsets lost during partition reassignment
> ---
>
> Key: KAFKA-13131
> URL: https://issues.apache.org/jira/browse/KAFKA-13131
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: Sergejs Andrejevs
> Priority: Major
>
> While reassigning the replicas of a __consumer_offsets partition from one 
> set of brokers to another, the consumer group offset was lost (it appears to 
> have been reset to earliest).
>  
> offsets.retention.minutes: 10080
> Consumers are constantly reading and regularly committing offsets.
> Initial setup:
>  __consumer_offsets-18 
>  Replicas: 9,7,6
> Desired setup:
>  __consumer_offsets-18
>  Replicas: 11,10,5
> File_with_desired_state:
> {code:java}
> {
>   "version": 1,
>   "partitions": [
>     {
>       "topic": "__consumer_offsets",
>       "partition": 18,
>       "replicas": [
>         11,
>         10,
>         5
>       ],
>       "log_dirs": [
>         "/path_replica_1",
>         "/path_replica_2",
>         "/path_replica_3"
>       ]
>     }
>   ]
> }
> {code}
> Reassignment command:
> {code:java}
> /opt/kafka/bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
>   --execute --reassignment-json-file File_with_desired_state \
>   --throttle 104857600 --replica-alter-log-dirs-throttle 104857600
> {code}
> The error in logs at the broker:
> {code:java}
> [2021-07-22 05:28:11,777] ERROR [GroupMetadataManager brokerId=11] Error loading offsets from __consumer_offsets-18 (kafka.coordinator.group.GroupMetadataManager)
> java.lang.NullPointerException
>     at kafka.log.OffsetIndex.$anonfun$lookup$1(OffsetIndex.scala:90)
>     at kafka.log.AbstractIndex.maybeLock(AbstractIndex.scala:338)
>     at kafka.log.OffsetIndex.lookup(OffsetIndex.scala:89)
>     at kafka.log.LogSegment.translateOffset(LogSegment.scala:274)
>     at kafka.log.LogSegment.read(LogSegment.scala:298)
>     at kafka.log.Log.$anonfun$read$2(Log.scala:1522)
>     at kafka.log.Log.read(Log.scala:2340)
>     at kafka.coordinator.group.GroupMetadataManager.loadGroupsAndOffsets(GroupMetadataManager.scala:589)
>     at kafka.coordinator.group.GroupMetadataManager.$anonfun$scheduleLoadGroupAndOffsets$2(GroupMetadataManager.scala:537)
>     at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:114)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
> Attempts to reproduce the issue in test environments have so far been unsuccessful.
> Let me know if any other configuration, parameters, or details should be added.
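
A reassignment file like the one quoted in the report could also be generated by a script rather than written by hand. The following is a minimal Python sketch under stated assumptions: `build_reassignment` is a hypothetical helper name (not part of Kafka), and the check that `log_dirs` has one entry per replica reflects my understanding of what the reassignment tool expects, so treat it as an assumption rather than a specification.

```python
import json


def build_reassignment(topic, partition, replicas, log_dirs=None):
    """Build a single-partition reassignment plan as a dict.

    Hypothetical helper for illustration; the field names mirror the
    JSON file quoted in the report.
    """
    if len(set(replicas)) != len(replicas):
        raise ValueError("replica broker ids must be unique")
    entry = {
        "topic": topic,
        "partition": partition,
        "replicas": list(replicas),
    }
    if log_dirs is not None:
        # Assumption: the tool expects one log dir per replica, in order.
        if len(log_dirs) != len(replicas):
            raise ValueError("log_dirs must have one entry per replica")
        entry["log_dirs"] = list(log_dirs)
    return {"version": 1, "partitions": [entry]}


# Values taken from the report: partition 18 moved to brokers 11, 10, 5.
plan = build_reassignment(
    "__consumer_offsets",
    18,
    [11, 10, 5],
    ["/path_replica_1", "/path_replica_2", "/path_replica_3"],
)
plan_json = json.dumps(plan, indent=2)
```

Writing `plan_json` to a file would give something that can be passed to `--reassignment-json-file`. Validating the list lengths up front is only a convenience to catch typos before the tool runs.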



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-13131) Consumer offsets lost during partition reassignment

2021-07-23 Thread Ismael Juma (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17386536#comment-17386536
 ] 

Ismael Juma commented on KAFKA-13131:
-

Can you please test with 2.8.0 or trunk?



[jira] [Commented] (KAFKA-13131) Consumer offsets lost during partition reassignment

2021-07-23 Thread Sergejs Andrejevs (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17386342#comment-17386342
 ] 

Sergejs Andrejevs commented on KAFKA-13131:
---

Several comments on https://issues.apache.org/jira/browse/KAFKA-7447 also 
include a similar error message, but I am not sure whether the original issue 
there was the same, as it had several linked tasks that were closed and 
released in previous versions.
