[jira] [Commented] (KAFKA-9263) Reocurrence: Transient failure in kafka.api.PlaintextAdminIntegrationTest.testLogStartOffsetCheckpoint and kafka.api.PlaintextAdminIntegrationTest.testAlterReplicaLogDi

2020-12-01 Thread Chia-Ping Tsai (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17242016#comment-17242016
 ] 

Chia-Ping Tsai commented on KAFKA-9263:
---

PlaintextAdminIntegrationTest.testLogStartOffsetCheckpoint has not failed 
recently; I looped it 200 times and every run passed.

https://github.com/apache/kafka/pull/9423, which fixes 
kafka.api.PlaintextAdminIntegrationTest.testAlterReplicaLogDirs, is about to be 
merged, so I will revise the title of this issue (i.e., remove 
PlaintextAdminIntegrationTest.testLogStartOffsetCheckpoint).

> Reocurrence: Transient failure in 
> kafka.api.PlaintextAdminIntegrationTest.testLogStartOffsetCheckpoint and 
> kafka.api.PlaintextAdminIntegrationTest.testAlterReplicaLogDirs
> --
>
>                 Key: KAFKA-9263
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9263
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 2.4.0
>            Reporter: John Roesler
>            Priority: Major
>              Labels: flaky-test
>
> This test has failed for me on 
> https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/9691/testReport/junit/kafka.api/AdminClientIntegrationTest/testAlterReplicaLogDirs/
> {noformat}
> Error Message
> org.scalatest.exceptions.TestFailedException: only 0 messages are produced 
> within timeout after replica movement. Producer future 
> Some(Failure(java.util.concurrent.TimeoutException: Timeout after waiting for 
> 1 ms.))
> Stacktrace
> org.scalatest.exceptions.TestFailedException: only 0 messages are produced 
> within timeout after replica movement. Producer future 
> Some(Failure(java.util.concurrent.TimeoutException: Timeout after waiting for 
> 1 ms.))
>   at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:530)
>   at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:529)
>   at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1389)
>   at org.scalatest.Assertions.fail(Assertions.scala:1091)
>   at org.scalatest.Assertions.fail$(Assertions.scala:1087)
>   at org.scalatest.Assertions$.fail(Assertions.scala:1389)
>   at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:842)
>   at kafka.api.AdminClientIntegrationTest.testAlterReplicaLogDirs(AdminClientIntegrationTest.scala:459)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> Standard Output
> [2019-12-03 04:54:16,111] ERROR [ReplicaFetcher replicaId=2, leaderId=1, 
> fetcherId=0] Error for partition unclean-test-topic-1-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-12-03 04:54:21,711] ERROR [ReplicaFetcher replicaId=0, leaderId=1, 
> fetcherId=0] Error for partition topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-12-03 04:54:21,712] ERROR [ReplicaFetcher replicaId=2, leaderId=1, 
> fetcherId=0] Error for partition topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-12-03 04:54:27,092] ERROR [ReplicaFetcher replicaId=2, leaderId=1, 
> fetcherId=0] Error for partition unclean-test-topic-1-0 at 

[jira] [Commented] (KAFKA-9263) Reocurrence: Transient failure in kafka.api.PlaintextAdminIntegrationTest.testLogStartOffsetCheckpoint and kafka.api.PlaintextAdminIntegrationTest.testAlterReplicaLogDi

2020-08-14 Thread Bill Bejeck (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17178021#comment-17178021
 ] 

Bill Bejeck commented on KAFKA-9263:


 

Test failure 
[https://builds.apache.org/job/kafka-pr-jdk11-scala2.13/7843/testReport/junit/kafka.api/PlaintextAdminIntegrationTest/testAlterReplicaLogDirs/]

 
{noformat}
[2020-08-14 17:34:06,420] ERROR [ReplicaManager broker=0] Error while changing 
replica dir for partition topic-0 (kafka.server.ReplicaManager:76)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: Error while 
fetching partition state for topic-0
[2020-08-14 17:34:06,420] ERROR [ReplicaManager broker=1] Error while changing 
replica dir for partition topic-0 (kafka.server.ReplicaManager:76)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: Error while 
fetching partition state for topic-0
[2020-08-14 17:34:06,420] ERROR [ReplicaManager broker=2] Error while changing 
replica dir for partition topic-0 (kafka.server.ReplicaManager:76)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: Error while 
fetching partition state for topic-0
[2020-08-14 17:36:24,822] ERROR [Consumer instanceId=test_instance_id_1, 
clientId=test_client_id, groupId=test_group_id] Offset commit failed on 
partition test_topic-1 at offset 0: The coordinator is not aware of this 
member. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1191)
[2020-08-14 17:36:24,823] ERROR Thread Thread[Thread-4,5,FailOnTimeoutGroup] 
died (org.apache.zookeeper.server.NIOServerCnxnFactory:92)
org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be 
completed since the group has already rebalanced and assigned the partitions to 
another member. This means that the time between subsequent calls to poll() was 
longer than the configured max.poll.interval.ms, which typically implies that 
the poll loop is spending too much time message processing. You can address 
this either by increasing max.poll.interval.ms or by reducing the maximum size 
of batches returned in poll() with max.poll.records.
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:1257)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:1164)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:1132)
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:1107)
    at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:206)
    at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:169)
    at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:129)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:602)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:412)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:297)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236)
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:215)
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:1006)
    at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1394)
    at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1348)
    at kafka.api.PlaintextAdminIntegrationTest$$anon$1.run(PlaintextAdminIntegrationTest.scala:1071)
[2020-08-14 17:36:24,848] ERROR [Consumer instanceId=test_instance_id_2, 
clientId=test_client_id, groupId=test_group_id] Offset commit failed on 
partition test_topic1-0 at offset 0: Specified group generation id is not 
valid. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1191)
[2020-08-14 17:36:24,852] ERROR [Consumer clientId=test_client_id, 
groupId=test_group_id] Offset commit failed on partition test_topic2-0 at 
offset 0: Specified group generation id is not valid. 
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1191)
[2020-08-14 18:29:44,855] ERROR [ReplicaManager broker=1] Error while changing 
replica dir for partition topic-0 (kafka.server.ReplicaManager:76)
org.apache.kafka.common.errors.UnknownTopicOrPartitionException: Error while 
fetching partition state for topic-0
[2020-08-14 18:29:44,855] ERROR [ReplicaManager broker=0] 

[jira] [Commented] (KAFKA-9263) Reocurrence: Transient failure in kafka.api.PlaintextAdminIntegrationTest.testLogStartOffsetCheckpoint and kafka.api.PlaintextAdminIntegrationTest.testAlterReplicaLogDi

2020-02-03 Thread Chia-Ping Tsai (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-9263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028762#comment-17028762
 ] 

Chia-Ping Tsai commented on KAFKA-9263:
---

Updated the title, since KAFKA-9183 renamed AdminClientIntegrationTest to 
PlaintextAdminIntegrationTest.
