[jira] [Comment Edited] (KAFKA-8041) Flaky Test LogDirFailureTest#testIOExceptionDuringLogRoll

2024-04-18 Thread Omnia Ibrahim (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838583#comment-17838583
 ] 

Omnia Ibrahim edited comment on KAFKA-8041 at 4/18/24 10:14 AM:


[~soarez] Yes, sorry for the late response. I believe this should be fixed now 
after the merge of [https://github.com/apache/kafka/pull/15335]. It has been 
passing for the last couple of weeks with no flakiness:
[https://ge.apache.org/scans/tests?search.names=Git%20branch=P28D=kafka=America%2FLos_Angeles=trunk=kafka.server.LogDirFailureTest=testIOExceptionDuringLogRoll(String)%5B2%5D]


was (Author: omnia_h_ibrahim):
[~soarez] I believe this should be fixed now after the merge of 
[https://github.com/apache/kafka/pull/15335] . It has been passing for the last 
couple of weeks with no flakiness 
[https://ge.apache.org/scans/tests?search.names=Git%20branch=P28D=kafka=America%2FLos_Angeles=trunk=kafka.server.LogDirFailureTest=testIOExceptionDuringLogRoll(String)%5B2%5D]

> Flaky Test LogDirFailureTest#testIOExceptionDuringLogRoll
> -
>
> Key: KAFKA-8041
> URL: https://issues.apache.org/jira/browse/KAFKA-8041
> Project: Kafka
> Issue Type: Bug
> Components: core, unit tests
> Affects Versions: 2.0.1, 2.3.0
> Reporter: Matthias J. Sax
> Assignee: Bob Barrett
> Priority: Critical
> Labels: flaky-test
> Fix For: 2.4.0
>
>
> [https://builds.apache.org/blue/organizations/jenkins/kafka-2.0-jdk8/detail/kafka-2.0-jdk8/236/tests]
> {quote}java.lang.AssertionError: Expected some messages
> at kafka.utils.TestUtils$.fail(TestUtils.scala:357)
> at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:787)
> at 
> kafka.server.LogDirFailureTest.testProduceAfterLogDirFailureOnLeader(LogDirFailureTest.scala:189)
> at 
> kafka.server.LogDirFailureTest.testIOExceptionDuringLogRoll(LogDirFailureTest.scala:63){quote}
> STDOUT
> {quote}[2019-03-05 03:44:58,614] ERROR [ReplicaFetcher replicaId=1, 
> leaderId=0, fetcherId=0] Error for partition topic-6 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:44:58,614] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-0 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-10 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-4 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-8 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:44:58,615] ERROR [ReplicaFetcher replicaId=1, leaderId=0, 
> fetcherId=0] Error for partition topic-2 at offset 0 
> (kafka.server.ReplicaFetcherThread:76)
> org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server 
> does not host this topic-partition.
> [2019-03-05 03:45:00,248] ERROR Error while rolling log segment for topic-0 
> in dir 
> /home/jenkins/jenkins-slave/workspace/kafka-2.0-jdk8/core/data/kafka-3869208920357262216
>  (kafka.server.LogDirFailureChannel:76)
> java.io.FileNotFoundException: 
> /home/jenkins/jenkins-slave/workspace/kafka-2.0-jdk8/core/data/kafka-3869208920357262216/topic-0/.index
>  (Not a directory)
> at java.io.RandomAccessFile.open0(Native Method)
> at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
> at kafka.log.AbstractIndex.$anonfun$resize$1(AbstractIndex.scala:121)
> at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:12)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
> at 

[jira] [Comment Edited] (KAFKA-8041) Flaky Test LogDirFailureTest#testIOExceptionDuringLogRoll

2024-03-21 Thread Igor Soarez (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829605#comment-17829605
 ] 

Igor Soarez edited comment on KAFKA-8041 at 3/21/24 3:30 PM:
-

This failed again in a PR build: 
[https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-14903/10/tests/]

 

 
{code:java}
[2024-03-20T12:19:04.275Z] Gradle Test Run :core:test > Gradle Test Executor 
105 > LogDirFailureTest > testIOExceptionDuringCheckpoint(String) > 
testIOExceptionDuringCheckpoint(String).quorum=kraft FAILED
[2024-03-20T12:19:04.275Z]     org.opentest4j.AssertionFailedError: expected: <true> but was: <false>
[2024-03-20T12:19:04.275Z]         at 
app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
[2024-03-20T12:19:04.275Z]         at 
app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
[2024-03-20T12:19:04.275Z]         at 
app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
[2024-03-20T12:19:04.275Z]         at 
app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
[2024-03-20T12:19:04.275Z]         at 
app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)
[2024-03-20T12:19:04.275Z]         at 
app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:183)
[2024-03-20T12:19:04.275Z]         at 
app//kafka.utils.TestUtils$.causeLogDirFailure(TestUtils.scala:1715)
[2024-03-20T12:19:04.275Z]         at 
app//kafka.server.LogDirFailureTest.testProduceAfterLogDirFailureOnLeader(LogDirFailureTest.scala:187)
[2024-03-20T12:19:04.275Z]         at 
app//kafka.server.LogDirFailureTest.testIOExceptionDuringCheckpoint(LogDirFailureTest.scala:114)
 {code}
 

 


was (Author: soarez):
This failed again in [a PR 
build|[https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-14903/10/tests/]]:

 

 
{code:java}
[2024-03-20T12:19:04.275Z] Gradle Test Run :core:test > Gradle Test Executor 
105 > LogDirFailureTest > testIOExceptionDuringCheckpoint(String) > 
testIOExceptionDuringCheckpoint(String).quorum=kraft FAILED
[2024-03-20T12:19:04.275Z]     org.opentest4j.AssertionFailedError: expected: <true> but was: <false>
[2024-03-20T12:19:04.275Z]         at 
app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
[2024-03-20T12:19:04.275Z]         at 
app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
[2024-03-20T12:19:04.275Z]         at 
app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
[2024-03-20T12:19:04.275Z]         at 
app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
[2024-03-20T12:19:04.275Z]         at 
app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)
[2024-03-20T12:19:04.275Z]         at 
app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:183)
[2024-03-20T12:19:04.275Z]         at 
app//kafka.utils.TestUtils$.causeLogDirFailure(TestUtils.scala:1715)
[2024-03-20T12:19:04.275Z]         at 
app//kafka.server.LogDirFailureTest.testProduceAfterLogDirFailureOnLeader(LogDirFailureTest.scala:187)
[2024-03-20T12:19:04.275Z]         at 
app//kafka.server.LogDirFailureTest.testIOExceptionDuringCheckpoint(LogDirFailureTest.scala:114)
 {code}
 

 
