[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917669#comment-16917669 ] Harsh Singh edited comment on KAFKA-5998 at 8/28/19 11:36 AM: -- Hi [~pkleindl] [~vvcephei] , I couldn't locate version 2.3.1 in repo yet. Latest I see is 2.3.0, where the fix for [https://github.com/apache/kafka/pull/6846] seems missing. Could you please confirm if 2.3.1 is released yet? Same is the case with 2.2.2 and 2.4.0. Also just curious to know, did we figure out the root cause for base issue of {{FileNotFoundException}} on checkpoint.write, inspite of having locking mechanism in place to avoid cleanup? This will help in reproducing the issue on a small scale dev-ish kind of environment, than waiting for few hours of test run. was (Author: itsharshonly): Hi [~pkleindl] [~vvcephei] , I couldn't locate version 2.3.1 in repo yet. Latest I see is 2.3.0, where the fix for [https://github.com/apache/kafka/pull/6846] seems missing. Could you please confirm if 2.3.1 is released yet? Same is the case with 2.2.2 and 2.4.0. Also just curious to know, did we figure out the root cause for base issue of {{FileNotFoundException}} on checkpoint.write, inspite of having locking mechanism in place to avoid cleanup? > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Assignee: John Roesler >Priority: Critical > Fix For: 2.2.2, 2.4.0, 2.3.1 > > Attachments: 5998.v1.txt, 5998.v2.txt, Kafka5998.zip, Topology.txt, > exc.txt, props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) >
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917669#comment-16917669 ] Harsh Singh edited comment on KAFKA-5998 at 8/28/19 10:56 AM: -- Hi [~pkleindl] [~vvcephei] , I couldn't locate version 2.3.1 in repo yet. Latest I see is 2.3.0, where the fix for [https://github.com/apache/kafka/pull/6846] seems missing. Could you please confirm if 2.3.1 is released yet? Same is the case with 2.2.2 and 2.4.0. Also just curious to know, did we figure out the root cause for base issue of {{FileNotFoundException}} on checkpoint.write, inspite of having locking mechanism in place to avoid cleanup? was (Author: itsharshonly): Hi [~pkleindl] [~vvcephei] , I couldn't locate version 2.3.1 in repo yet. Latest I see is 2.3.0, where the fix for [https://github.com/apache/kafka/pull/6846] seems missing. Could you please confirm if 2.3.1 is released yet? Same is the case with 2.4.0. Also just curious to know, did we figure out the root cause for base issue of {{FileNotFoundException}} on checkpoint.write, inspite of having locking mechanism in place to avoid cleanup? > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Assignee: John Roesler >Priority: Critical > Fix For: 2.2.2, 2.4.0, 2.3.1 > > Attachments: 5998.v1.txt, 5998.v2.txt, Kafka5998.zip, Topology.txt, > exc.txt, props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) >
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890932#comment-16890932 ] Sainath Y edited comment on KAFKA-5998 at 7/23/19 12:16 PM: Team/[~bbejeck] I took kafka streams 2.3 version but still the issue persists. Please suggest if there is any way to suppress this warn log as log files are growing?? 2019-07-23 11:59:51.260 10.227.254.31 task [0_45] Failed to write offset checkpoint file to [/kafkarest/counting-stream/0_45/.checkpoint],exc.stack=java.io.FileNotFoundException: /kafkarest/counting-stream/0_45/.checkpoint.tmp (No such file or directory)\n\tat java.io.FileOutputStream.open0(Native Method)\n\tat java.io.FileOutputStream.open(FileOutputStream.java:270)\n\tat java.io.FileOutputStream.(FileOutputStream.java:213)\n\tat java.io.FileOutputStream.(FileOutputStream.java:162)\n\tat org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:79)\n\tat org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:347)\n\tat org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:476)\n\tat org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:461)\n\tat org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:286)\n\tat org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:412)\n\tat org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1057)\n\tat org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:911)\n\tat org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805)\n\tat org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774)\n was (Author: sainath932): Team, I took kafka streams 2.3 version but still the issue persists. 2019-07-23 11:59:51.260 10.227.254.31 task [0_45] Failed to write offset checkpoint file to [/kafkarest/counting-stream/0_45/.checkpoint],exc.stack=java.io.FileNotFoundException: /kafkarest/counting-stream/0_45/.checkpoint.tmp (No such file or directory)\n\tat java.io.FileOutputStream.open0(Native Method)\n\tat java.io.FileOutputStream.open(FileOutputStream.java:270)\n\tat java.io.FileOutputStream.(FileOutputStream.java:213)\n\tat java.io.FileOutputStream.(FileOutputStream.java:162)\n\tat org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:79)\n\tat org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:347)\n\tat org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:476)\n\tat org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:461)\n\tat org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:286)\n\tat org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:412)\n\tat org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1057)\n\tat org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:911)\n\tat org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805)\n\tat org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774)\n > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Assignee: John Roesler >Priority: Critical > Fix For: 2.2.2, 2.4.0, 2.3.1 > > Attachments: 5998.v1.txt, 5998.v2.txt, Kafka5998.zip, Topology.txt, > exc.txt, props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at >
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878674#comment-16878674 ] Di Campo edited comment on KAFKA-5998 at 7/4/19 1:55 PM: - My topology contains stateless subtopologies as well as stateful, so it may be that case. One question: > The "trigger" for the cleanup is the absence of any traffic on a topic > partition which will allow the directory to be deleted Can you elaborate on this trigger? What will be cleaned in this cleanup process? How long does it wait by default for traffic to determine absence? was (Author: xmar): My topology contains stateless subtopologies, so it may be that case. One question: > The "trigger" for the cleanup is the absence of any traffic on a topic > partition which will allow the directory to be deleted Can you elaborate on this trigger? What will be cleaned in this cleanup process? How long does it wait by default for traffic to determine absence? > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Assignee: Bill Bejeck >Priority: Critical > Attachments: 5998.v1.txt, 5998.v2.txt, Kafka5998.zip, Topology.txt, > exc.txt, props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > 09:43:25.974 [ks_0_inst-StreamThread-15] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint: >
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878549#comment-16878549 ] Patrik Kleindl edited comment on KAFKA-5998 at 7/4/19 10:57 AM: As this is hopefully coming to an end thanks to [~vvcephei] I'll try a little writeup so others can compare their setup and hopefully verify that the fix covers all situations. I'll also attach the code I used to reproduce the issue, should be runnable for anyone with docker. [^Kafka5998.zip] Setup: We have a streams application with two subtopologies where the first one does not have state and the second one does. State directories are created for all tasks, but as the first subtopology does not have state it does not get locked internally and is therefor subject to the cleanup process. The "trigger" for the cleanup is the absence of any traffic on a topic partition which will allow the directory to be deleted, so this has higher probability on any low-volume topics (in our case, CDC from a reference table with little to no changes). After the directory is deleted, the next record received (and any thereafter) will trigger the warning because Kafka Streams erroneously tries to write a checkpoint because all tasks tried to do this for all restored partitions, not only those assigned to them. This is what Johns PR covers. We have EOS disabled, so it would be good if people could tell if this has happened with EOS too. Or if this warning has occurred for anyone with a different topology situation or on a high volume topic. For reference our topology: {code:java} Topologies: Sub-topology: 0 Source: KSTREAM-SOURCE-00 (topics: [inputTopic]) --> KSTREAM-FILTER-01 Processor: KSTREAM-FILTER-01 (stores: []) --> KSTREAM-KEY-SELECT-02 <-- KSTREAM-SOURCE-00 Processor: KSTREAM-KEY-SELECT-02 (stores: []) --> KSTREAM-FILTER-05 <-- KSTREAM-FILTER-01 Processor: KSTREAM-FILTER-05 (stores: []) --> KSTREAM-SINK-04 <-- KSTREAM-KEY-SELECT-02 Sink: KSTREAM-SINK-04 (topic: store-repartition) <-- KSTREAM-FILTER-05 Sub-topology: 1 Source: KSTREAM-SOURCE-06 (topics: [store-repartition]) --> KSTREAM-REDUCE-03 Processor: KSTREAM-REDUCE-03 (stores: [store]) --> KTABLE-TOSTREAM-07 <-- KSTREAM-SOURCE-06 Processor: KTABLE-TOSTREAM-07 (stores: []) --> KSTREAM-SINK-08 <-- KSTREAM-REDUCE-03 Sink: KSTREAM-SINK-08 (topic: outputTopic) <-- KTABLE-TOSTREAM-07{code} was (Author: pkleindl): As this is hopefully coming to an end thanks to [~vvcephei] I'll try a little writeup so others can compare their setup and hopefully verify that the fix covers all situations. I'll also attach the code I used to reproduce the issue, should be runnable for anyone with docker. Setup: We have a streams application with two subtopologies where the first one does not have state and the second one does. State directories are created for all tasks, but as the first subtopology does not have state it does not get locked internally and is therefor subject to the cleanup process. The "trigger" for the cleanup is the absence of any traffic on a topic partition which will allow the directory to be deleted, so this has higher probability on any low-volume topics (in our case, CDC from a reference table with little to no changes). After the directory is deleted, the next record received (and any thereafter) will trigger the warning because Kafka Streams erroneously tries to write a checkpoint because all tasks tried to do this for all restored partitions, not only those assigned to them. This is what Johns PR covers. We have EOS disabled, so it would be good if people could tell if this has happened with EOS too. Or if this warning has occurred for anyone with a different topology situation or on a high volume topic. For reference our topology: {code:java} Topologies: Sub-topology: 0 Source: KSTREAM-SOURCE-00 (topics: [inputTopic]) --> KSTREAM-FILTER-01 Processor: KSTREAM-FILTER-01 (stores: []) --> KSTREAM-KEY-SELECT-02 <-- KSTREAM-SOURCE-00 Processor: KSTREAM-KEY-SELECT-02 (stores: []) --> KSTREAM-FILTER-05 <-- KSTREAM-FILTER-01 Processor: KSTREAM-FILTER-05 (stores: []) --> KSTREAM-SINK-04 <-- KSTREAM-KEY-SELECT-02 Sink: KSTREAM-SINK-04 (topic: store-repartition) <-- KSTREAM-FILTER-05 Sub-topology: 1 Source: KSTREAM-SOURCE-06 (topics: [store-repartition]) --> KSTREAM-REDUCE-03 Processor: KSTREAM-REDUCE-03 (stores: [store]) --> KTABLE-TOSTREAM-07 <-- KSTREAM-SOURCE-06 Processor: KTABLE-TOSTREAM-07 (stores: []) --> KSTREAM-SINK-08 <--
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877136#comment-16877136 ] Di Campo edited comment on KAFKA-5998 at 7/2/19 4:52 PM: - Just in case it helps. I just found it today on 2.1.1 (again, I commented here some months ago). 5 brokers cluster, 3 Kafka Streams instances (2 `num.streams.threads` each). AMZN Linux. Docker on ECS. I've seen that, before the task dies, it prints the following WARNs from one task. Please note that from the 64 partitions, only a few of them fail starting at 13:17. And the same batch of the same partitions start failing again at 13:42. Why are the same partitions failing? Does it match with your findings? {{[2019-07-02 13:17:01,101] WARN task [2_31] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_31/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} {{[2019-07-02 13:17:01,118] WARN task [2_47] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_47/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} {{[2019-07-02 13:17:01,156] WARN task [2_27] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_27/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} {{[2019-07-02 13:20:12,360] WARN task [2_63] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_63/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} {{[2019-07-02 13:20:12,579] WARN task [2_35] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_35/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} {{[2019-07-02 13:20:13,001] WARN task [2_23] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_23/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} {{[2019-07-02 13:23:18,421] WARN task [2_39] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_39/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} {{[2019-07-02 13:23:18,613] WARN task [2_55] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_55/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} {{[2019-07-02 13:42:46,366] WARN task [2_31] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_31/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} {{[2019-07-02 13:42:46,473] WARN task [2_47] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_47/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} {{[2019-07-02 13:42:46,639] WARN task [2_27] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_27/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} {{[2019-07-02 13:46:19,888] WARN task [2_63] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_63/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} {{[2019-07-02 13:46:20,042] WARN task [2_35] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_35/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} {{[2019-07-02 13:46:20,380] WARN task [2_55] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_55/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} {{[2019-07-02 13:46:20,384] WARN task [2_23] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_23/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} {{[2019-07-02 13:48:07,011] WARN task [2_39] Failed to write offset checkpoint file to /data/kafka-streams/stream-processor-0.0.1/2_39/.checkpoint: {} (org.apache.kafka.streams.processor.internals.ProcessorStateManager)}} Later, the application died some minutes later, at 13:59:13. In case there is a relation, it was killed due to OOM. was (Author: xmar): Just in case it helps. I just found it today on 2.1.1 (again, I commented here some months ago). 5 brokers cluster, 3 Kafka Streams instances (2 `num.streams.threads` each). AMZN Linux. Docker on ECS. I've seen that, before the task dies, it prints the following WARNs from one task. Please note that from the 64 partitions, only a few of them fail starting at 13:17. And the same batch of the same partitions start failing again at 13:42. Why are the
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16846566#comment-16846566 ] Andrew edited comment on KAFKA-5998 at 5/23/19 11:32 AM: - I have just seen this error with kafka 2.2.0 and confluent 5.2.1. We are running in docker on kubernetes, and the same container works when run against some topics but not others it seems. {code:java} java.io.FileNotFoundException: /tmp/kafka-streams/kafka-streams-join-to-nearest-spm-draft-position-weather/0_11/.checkpoint.tmp (No such file or directory) at java.io.FileOutputStream.open0(Native Method) at java.io.FileOutputStream.open(FileOutputStream.java:270) at java.io.FileOutputStream.(FileOutputStream.java:213) at java.io.FileOutputStream.(FileOutputStream.java:162) at org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:79) at org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:325) at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:474) at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:459) at org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:286) at org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:412) at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1057) at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:911) at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805) at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774) {code} When I `exec` into the running container I see this : {code:java} /tmp/kafka-streams/kafka-streams-join-to-nearest-spm-draft-position-weather # ls 2_0 2_10 2_12 2_14 2_16 2_18 2_2 2_4 2_6 2_8 3_0 3_10 3_12 3_14 3_16 3_18 3_2 3_4 3_6 3_8 {code} Also I see {Code} 09:55:49,898 INFO org.apache.kafka.streams.processor.internals.StateDirectory - stream-thread [kafka-streams-join-to-nearest-spm-draft-position-weather-404a240b-1e95-4587-80bf-686047658254-CleanupThread] Deleting obsolete state directory 0_1 for task 0_1 as 600898ms has elapsed (cleanup delay is 60ms). {Code} And I see the following in the StreamsConfig log output : {Code} state.cleanup.delay.ms = 60 state.dir = /tmp/kafka-streams {Code} was (Author: the4thamigo_uk): I have just seen this error with kafka 2.2.0 and confluent 5.2.1. We are running in docker on kubernetes, and the same container works when run against some topics but not others it seems. {code:java} java.io.FileNotFoundException: /tmp/kafka-streams/kafka-streams-join-to-nearest-spm-draft-position-weather/0_11/.checkpoint.tmp (No such file or directory) at java.io.FileOutputStream.open0(Native Method) at java.io.FileOutputStream.open(FileOutputStream.java:270) at java.io.FileOutputStream.(FileOutputStream.java:213) at java.io.FileOutputStream.(FileOutputStream.java:162) at org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:79) at org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:325) at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:474) at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:459) at org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:286) at org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:412) at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1057) at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:911) at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805) at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774) {code} When I `exec` into the running container I see this : {Code} /tmp/kafka-streams/kafka-streams-join-to-nearest-spm-draft-position-weather # ls 2_0 2_10 2_12 2_14 2_16 2_18 2_2 2_4 2_6 2_8 3_0 3_10 3_12 3_14 3_16 3_18 3_2 3_4 3_6 3_8 {Code} > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Priority: Critical > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > props.txt,
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16846566#comment-16846566 ] Andrew edited comment on KAFKA-5998 at 5/23/19 9:29 AM: I have just seen this error with kafka 2.2.0 and confluent 5.2.1. We are running in docker on kubernetes, and the same container works when run against some topics but not others it seems. {code:java} java.io.FileNotFoundException: /tmp/kafka-streams/kafka-streams-join-to-nearest-spm-draft-position-weather/0_11/.checkpoint.tmp (No such file or directory) at java.io.FileOutputStream.open0(Native Method) at java.io.FileOutputStream.open(FileOutputStream.java:270) at java.io.FileOutputStream.(FileOutputStream.java:213) at java.io.FileOutputStream.(FileOutputStream.java:162) at org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:79) at org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:325) at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:474) at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:459) at org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:286) at org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:412) at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1057) at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:911) at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805) at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774) {code} When I `exec` into the running container I see this : {Code} /tmp/kafka-streams/kafka-streams-join-to-nearest-spm-draft-position-weather # ls 2_0 2_10 2_12 2_14 2_16 2_18 2_2 2_4 2_6 2_8 3_0 3_10 3_12 3_14 3_16 3_18 3_2 3_4 3_6 3_8 {Code} was (Author: the4thamigo_uk): I have just seen this error with kafka 2.2.0 and confluent 5.2.1. We are running in docker on kubernetes, and the same container works when run against some topics but not others it seems. {Code} java.io.FileNotFoundException: /tmp/kafka-streams/kafka-streams-join-to-nearest-spm-draft-position-weather/0_11/.checkpoint.tmp (No such file or directory) at java.io.FileOutputStream.open0(Native Method) at java.io.FileOutputStream.open(FileOutputStream.java:270) at java.io.FileOutputStream.(FileOutputStream.java:213) at java.io.FileOutputStream.(FileOutputStream.java:162) at org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:79) at org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:325) at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:474) at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:459) at org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:286) at org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:412) at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1057) at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:911) at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805) at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774) {Code} > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Priority: Critical > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827123#comment-16827123 ] Ted Yu edited comment on KAFKA-5998 at 4/26/19 5:31 PM: https://pastebin.com/vyvQ8pkF shows what I mentioned earlier. In StateDirectory#cleanRemovedTasks, we use both last modification time of the directory and whether the task has been committed as criteria. When both are satisfied, we clean the directory for the task. was (Author: yuzhih...@gmail.com): https://pastebin.com/Y247UZgb shows what I mentioned earlier (incomplete). In StateDirectory#cleanRemovedTasks, we use both last modification time of the directory and whether the task has been committed as criteria. When both are satisfied, we clean the directory for the task. > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Priority: Critical > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > 09:43:25.974 [ks_0_inst-StreamThread-15] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827123#comment-16827123 ] Ted Yu edited comment on KAFKA-5998 at 4/26/19 5:12 PM: https://pastebin.com/Y247UZgb shows what I mentioned earlier (incomplete). In StateDirectory#cleanRemovedTasks, we use both last modification time of the directory and whether the task has been committed as criteria. When both are satisfied, we clean the directory for the task. was (Author: yuzhih...@gmail.com): https://pastebin.com/Y247UZgb shows what I mentioned earlier. In StateDirectory#cleanRemovedTasks, we don't rely on last modification time - we check whether the task has been committed. > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Priority: Critical > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > 09:43:25.974 [ks_0_inst-StreamThread-15] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825583#comment-16825583 ] Matthias J. Sax edited comment on KAFKA-5998 at 4/24/19 11:52 PM: -- Sounds reasonable. Not sure why we see this problem only for the checkpoint file. Restore would only be triggered after a rebalance. was (Author: mjsax): Sounds reasonable. Not sure why we see this problem only for the checkpoint file. > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Priority: Critical > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > 09:43:25.974 [ks_0_inst-StreamThread-15] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at >
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16822610#comment-16822610 ] Matthias J. Sax edited comment on KAFKA-5998 at 4/21/19 3:52 AM: - [~mparthas] It there are state stores, on commit, the last written offsets to the changelog topic are stored in the checkpoint files. This way, Kafka Streams can check if a store need to be recovered and if yes from what point in the changelog. > Why isn't that sufficient ? For what? [~yuzhih...@gmail.com] The other thread would be the state-cleaner thread, that only checks if there is old state that should be deleted. This thread might delete the state directory, and later the processing thread that tries to write the checkpoint does not find the checkpoint file any longer. But otherwise, you are right – if there is another thread trying to write to the same checkpoint file, it could be corrupted, too. was (Author: mjsax): [~mparthas] It there are state stores, on commit, the last written offsets to the changelog topic are stored in the checkpoint files. This way, Kafka Streams can check if a store need to be recovered and if yes from what point in the changelog. > Why isn't that sufficient ? For what? [~yuzhih...@gmail.com] The other thread would be the log-cleaner thread, that only checks if there is old state that should be deleted. This thread might delete the state directory, and later the processing thread that tries to write the checkpoint does not find the checkpoint file any longer. But otherwise, you are right – if there is another thread trying to write to the same checkpoint file, it could be corrupted, too. > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Priority: Critical > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at >
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814233#comment-16814233 ] Di Campo edited comment on KAFKA-5998 at 4/10/19 9:13 AM: -- It also happens to me on kafka_2.12-2.1.0. Run in AWS in docker (based in wurstmeister image) over Amazon Linux. {{cat /proc/version}} {{Linux version 4.14.77-69.57.amzn1.x86_64 (mockbuild@gobi-build-60003) (gcc version 7.2.1 20170915 (Red Hat 7.2.1-2) (GCC)) #1 SMP Tue Nov 6 21:32:55 UTC 2018}} was (Author: xmar): It also happens to me on kafka_2.12-2.1.0. Run in AWS in docker (based in wurstmeister image) over Amazon Linux. {{$ cat /proc/version }} {{Linux version 4.14.77-69.57.amzn1.x86_64 (mockbuild@gobi-build-60003) (gcc version 7.2.1 20170915 (Red Hat 7.2.1-2) (GCC)) #1 SMP Tue Nov 6 21:32:55 UTC 2018}} > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Priority: Critical > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > 09:43:25.974 [ks_0_inst-StreamThread-15] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] >
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811611#comment-16811611 ] Mohan Parthasarathy edited comment on KAFKA-5998 at 4/6/19 3:03 PM: Ted, Thanks for your suggestion. We will give it a try. The current default seems to be 10 minutes. Also, have you seen the issue of old messages showing up because of this exception ? I can't possibly understand that as there is no reported lag in the consumer. Also, where is the state kept about these old directories that we can manually cleanup in case we see this error in production ? was (Author: mparthas): Ted, Thanks for your suggestion. We will give it a try. The current default seems to be 10 minutes. Also, have you seen the issue of old messages showing up because of this exception ? I can't possibly understand that as there is no reported lag in the consumer. > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Priority: Major > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > 09:43:25.974 [ks_0_inst-StreamThread-15] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or > directory) > at
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811551#comment-16811551 ] Ted Yu edited comment on KAFKA-5998 at 4/6/19 10:55 AM: Cleanup delay is controlled by: {code} public static final String STATE_CLEANUP_DELAY_MS_CONFIG = "state.cleanup.delay.ms"; {code} Maybe longer value can be specified so that CleanupThread doesn't clean up the task directory prematurely (in case task directory hasn't been modified for a while). was (Author: yuzhih...@gmail.com): Cleanup delay is controlled by: {code} public static final String STATE_CLEANUP_DELAY_MS_CONFIG = "state.cleanup.delay.ms"; {code} Maybe longer value can be specified so that CleanupThread doesn't clean up the task directory prematurely. > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Priority: Major > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > 09:43:25.974 [ks_0_inst-StreamThread-15] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221)
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784126#comment-16784126 ] leibo edited comment on KAFKA-5998 at 3/5/19 6:57 AM: -- It seems that {color:#ff} final File temp = new File(file.getAbsolutePath() + ".tmp");{color} will not create new file on some system sometimes. can we do this as bellow before write offset to checkpoint.tmp file. {code:java} //代码占位符 final File temp = new File(file.getAbsolutePath() + ".tmp"); if (!temp.exists()) { // create .checkpoint.tmp file if it's not exists temp.createNewFile(); } final FileOutputStream fileOutputStream = new FileOutputStream(temp); {code} was (Author: lbdai3190): It seems that {color:#FF} final File temp = new File(file.getAbsolutePath() + ".tmp");{color} will not create new file on some system. {code:java} //代码占位符 final File temp = new File(file.getAbsolutePath() + ".tmp"); if (!temp.exists()) { // create .checkpoint.tmp file if it's not exists temp.createNewFile(); } final FileOutputStream fileOutputStream = new FileOutputStream(temp); {code} > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1 >Reporter: Yogesh BG >Priority: Major > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > 09:43:25.974 [ks_0_inst-StreamThread-15] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745716#comment-16745716 ] Dmitry Minkovsky edited comment on KAFKA-5998 at 1/18/19 2:26 AM: -- I'm getting this as well: {{[2019-01-18 02:10:07,709] WARN (org.apache.kafka.streams.processor.internals.ProcessorStateManager:295) task [2_1] Failed to write offset checkpoint file to /opt/streams-state/user-service/2_1/.checkpoint: java.io.FileNotFoundException: /opt/streams-state/user-service/2_1/.checkpoint.tmp (No such file or directory) }} {{ [2019-01-18 02:10:07,790] WARN (org.apache.kafka.streams.processor.internals.ProcessorStateManager:295) task [2_1] Failed to write offset checkpoint file to /opt/streams-state/user-service/2_1/.checkpoint: java.io.FileNotFoundException: /opt/streams-state/user-service/2_1/.checkpoint.tmp (No such file or directory)}} When I look in the state dir: {quote}root@user-service-659fcb455-2fkxp:/opt/streams-state/user-service# ls -alh total 48K drwxr-xr-x 12 root root 4.0K Jan 18 01:50 . drwxrwxrwx 3 root root 4.0K Jan 18 01:30 .. drwxr-xr-x 4 root root 4.0K Jan 18 02:10 0_0 drwxr-xr-x 4 root root 4.0K Jan 18 01:30 0_1 drwxr-xr-x 4 root root 4.0K Jan 18 02:10 0_2 drwxr-xr-x 4 root root 4.0K Jan 18 01:30 0_3 drwxr-xr-x 4 root root 4.0K Jan 18 01:30 0_4 drwxr-xr-x 3 root root 4.0K Jan 18 01:30 1_0 drwxr-xr-x 3 root root 4.0K Jan 18 02:10 1_1 drwxr-xr-x 3 root root 4.0K Jan 18 01:30 1_2 drwxr-xr-x 3 root root 4.0K Jan 18 01:30 1_3 drwxr-xr-x 3 root root 4.0K Jan 18 01:30 1_4 root@user-service-659fcb455-2fkxp:/opt/streams-state/user-service# {quote} Not sure what that task 2_1. Will have to add some logging to check. was (Author: dminkovsky): I'm getting this as well: {{[2019-01-18 02:10:07,709] WARN (org.apache.kafka.streams.processor.internals.ProcessorStateManager:295) task [2_1] Failed to write offset checkpoint file to /opt/streams-state/user-service/2_1/.checkpoint: java.io.FileNotFoundException: /opt/streams-state/user-service/2_1/.checkpoint.tmp (No such file or directory) }} {{ [2019-01-18 02:10:07,790] WARN (org.apache.kafka.streams.processor.internals.ProcessorStateManager:295) task [2_1] Failed to write offset checkpoint file to /opt/streams-state/user-service/2_1/.checkpoint: java.io.FileNotFoundException: /opt/streams-state/user-service/2_1/.checkpoint.tmp (No such file or directory)}} When I look in the state dir: root@user-service-659fcb455-2fkxp:/opt/streams-state/user-service# ls -alh total 48K drwxr-xr-x 12 root root 4.0K Jan 18 01:50 . drwxrwxrwx 3 root root 4.0K Jan 18 01:30 .. drwxr-xr-x 4 root root 4.0K Jan 18 02:10 0_0 drwxr-xr-x 4 root root 4.0K Jan 18 01:30 0_1 drwxr-xr-x 4 root root 4.0K Jan 18 02:10 0_2 drwxr-xr-x 4 root root 4.0K Jan 18 01:30 0_3 drwxr-xr-x 4 root root 4.0K Jan 18 01:30 0_4 drwxr-xr-x 3 root root 4.0K Jan 18 01:30 1_0 drwxr-xr-x 3 root root 4.0K Jan 18 02:10 1_1 drwxr-xr-x 3 root root 4.0K Jan 18 01:30 1_2 drwxr-xr-x 3 root root 4.0K Jan 18 01:30 1_3 drwxr-xr-x 3 root root 4.0K Jan 18 01:30 1_4 root@user-service-659fcb455-2fkxp:/opt/streams-state/user-service# > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1 >Reporter: Yogesh BG >Priority: Major > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at >
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592383#comment-16592383 ] Ayushi edited comment on KAFKA-5998 at 8/28/18 5:11 AM: [~guozhang] [~mjsax] I ran into this issue after running it for 1-2 hours for the shared application. Load of around 8000 messages/sec. Also, it is not restricted to some particular partitions. There are 50 partitions. was (Author: ayushi0430): [~guozhang] [~mjsax] I ran into this issue after running it for 1-2 hours for the shared application. Load of around 8000 messages/sec. Also, it is not restricted to some particular partitions. There are 50 partitions. As a hack, I could set STATE_CLEANUP_DELAY_MS_CONFIG to a large no but I'm afraid it could cause OOM issues. > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1 >Reporter: Yogesh BG >Priority: Major > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > props.txt, streams.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > 09:43:25.974 [ks_0_inst-StreamThread-15] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) >
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592383#comment-16592383 ] Ayushi edited comment on KAFKA-5998 at 8/25/18 2:32 AM: [~guozhang] [~mjsax] I ran into this issue after running it for 1-2 hours for the shared application. Load of around 8000 messages/sec. Also, it is not restricted to some particular partitions. There are 50 partitions. As a hack, I could set STATE_CLEANUP_DELAY_MS_CONFIG to a large no but I'm afraid it could cause OOM issues. was (Author: ayushi0430): [~guozhang] I ran into this issue after running it for 1-2 hours for the shared application. As a hack, I could set STATE_CLEANUP_DELAY_MS_CONFIG to a large no but I'm afraid it could cause OOM issues. > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1 >Reporter: Yogesh BG >Priority: Major > Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, > properties.txt, properties.txt, props.txt, streams.txt, streamsSnippet.txt, > streamsSnippet.txt, topology.txt, topology.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > 09:43:25.974 [ks_0_inst-StreamThread-15] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method)
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588406#comment-16588406 ] Ayushi edited comment on KAFKA-5998 at 8/22/18 5:33 AM: [~guozhang] This issue is there in 2.0.0 as well. Logs for the same: ```WARN [2018-08-22 03:17:03,745] org.apache.kafka.streams.processor.internals.ProcessorStateManager: task [0_45] Failed to write offset checkpoint file to /tmp/kafka-streams/ /0_45/.checkpoint: {} ! java.nio.file.NoSuchFileException: /tmp/kafka-streams//0_45/.checkpoint.tmp ! at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ! at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409) ! at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262) ! at java.nio.file.Files.move(Files.java:1395) ! at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:786) ! at org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:92) ! at org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:315) ! at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:383) ! at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:368) ! at org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:67) ! at org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:362) ! at org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:352) ! at org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:401) ! at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1035) ! at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:845) ! at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:767) ! at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:736) ! Suppressed: java.nio.file.NoSuchFileException: /tmp/kafka-streams//0_45/.checkpoint.tmp -> /tmp/kafka-streams//0_45/.checkpoint ! at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ! at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396) ! at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262) ! at java.nio.file.Files.move(Files.java:1395) ! at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:783) ! ... 12 common frames omitted``` This exception is thrown consistently after a given interval which I'm guessing is the commit interval for the streams. was (Author: ayushi0430): [~guozhang] This issue is there in 2.0.0 as well > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1 >Reporter: Yogesh BG >Priority: Major > Attachments: 5998.v1.txt, 5998.v2.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) >
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588406#comment-16588406 ] Ayushi edited comment on KAFKA-5998 at 8/22/18 5:33 AM: [~guozhang] This issue is there in 2.0.0 as well. Logs for the same: {{WARN [2018-08-22 03:17:03,745] org.apache.kafka.streams.processor.internals.ProcessorStateManager: task [0_45] Failed to write offset checkpoint file to /tmp/kafka-streams/}} {{ /0_45/.checkpoint: {}}} {{ ! java.nio.file.NoSuchFileException: /tmp/kafka-streams//0_45/.checkpoint.tmp}} {{ ! at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)}} {{ ! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)}} {{ ! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)}} {{ ! at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)}} {{ ! at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)}} {{ ! at java.nio.file.Files.move(Files.java:1395)}} {{ ! at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:786)}} {{ ! at org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:92)}} {{ ! at org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:315)}} {{ ! at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:383)}} {{ ! at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:368)}} {{ ! at org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:67)}} {{ ! at org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:362)}} {{ ! at org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:352)}} {{ ! at org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:401)}} {{ ! at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1035)}} {{ ! at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:845)}} {{ ! at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:767)}} {{ ! at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:736)}} {{ ! Suppressed: java.nio.file.NoSuchFileException: /tmp/kafka-streams//0_45/.checkpoint.tmp -> /tmp/kafka-streams//0_45/.checkpoint}} {{ ! at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)}} {{ ! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)}} {{ ! at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396)}} {{ ! at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)}} {{ ! at java.nio.file.Files.move(Files.java:1395)}} {{ ! at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:783)}} {{ ! ... 12 common frames omitted}} This exception is thrown consistently after a given interval which I'm guessing is the commit interval for the streams. was (Author: ayushi0430): [~guozhang] This issue is there in 2.0.0 as well. Logs for the same: ```WARN [2018-08-22 03:17:03,745] org.apache.kafka.streams.processor.internals.ProcessorStateManager: task [0_45] Failed to write offset checkpoint file to /tmp/kafka-streams/ /0_45/.checkpoint: {} ! java.nio.file.NoSuchFileException: /tmp/kafka-streams//0_45/.checkpoint.tmp ! at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86) ! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ! at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409) ! at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262) ! at java.nio.file.Files.move(Files.java:1395) ! at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:786) ! at org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:92) ! at org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:315) ! at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:383) ! at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:368) ! at org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:67) ! at org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:362) ! at org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:352) ! at org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:401) ! at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1035) ! at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:845) ! at
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16528213#comment-16528213 ] Guozhang Wang edited comment on KAFKA-5998 at 6/29/18 8:50 PM: --- [~yogeshbelur] [~srinivenkat] [~timvanlaer] [~aman1064] I'm investigating this issue along with Ted's patch now. Could you share your experience if possible: 1. What's your file system / OS settings? Are you working with Windows, for example? 2. What's your Kafka Streams version? 3. If you have tried out Ted's patch v1 (e.g. [~yogeshbelur]), did it help fixing your issue already? was (Author: guozhang): [~yogeshbelur] [~srinivenkat] [~timvanlaer] [~aman1064] I'm investigating this issue along with Ted's patch now. Could you share your experience if possible: 1. What's your file system / OS settings? Are you working with Windows, for example? 2. If you have tried out Ted's patch v1 (e.g. [~yogeshbelur]), did it help fixing your issue already? > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1 >Reporter: Yogesh BG >Priority: Major > Attachments: 5998.v1.txt, 5998.v2.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > 09:43:25.974 [ks_0_inst-StreamThread-15] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or > directory) >
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16528213#comment-16528213 ] Guozhang Wang edited comment on KAFKA-5998 at 6/29/18 8:49 PM: --- [~yogeshbelur] [~srinivenkat] [~timvanlaer] [~aman1064] I'm investigating this issue along with Ted's patch now. Could you share your experience if possible: 1. What's your file system / OS settings? Are you working with Windows, for example? 2. If you have tried out Ted's patch v1 (e.g. [~yogeshbelur]), did it help fixing your issue already? was (Author: guozhang): [~yogeshbelur] [~srinivenkat] [~timvanlaer] I'm investigating this issue along with Ted's patch now. Could you share your experience if possible: 1. What's your file system / OS settings? Are you working with Windows, for example? 2. If you have tried out Ted's patch v1 (e.g. [~yogeshbelur]), did it help fixing your issue already? > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0, 0.11.0.1 >Reporter: Yogesh BG >Priority: Major > Attachments: 5998.v1.txt, 5998.v2.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > 09:43:25.974 [ks_0_inst-StreamThread-15] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native
[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception
[ https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16187353#comment-16187353 ] Ted Yu edited comment on KAFKA-5998 at 10/1/17 3:10 PM: Patch v1 has been attached. If the .tmp file cannot be renamed to checkpoint file, the checkpoint is not taken. The patch doesn't change that behavior. But the patch tries to avoid the error shown above was (Author: yuzhih...@gmail.com): Patch v1 has been attached. If the .tmp file cannot be renamed, the checkpoint is not taken. The patch doesn't change that behavior. But the patch tries to avoid the error shown above > /.checkpoint.tmp Not found exception > > > Key: KAFKA-5998 > URL: https://issues.apache.org/jira/browse/KAFKA-5998 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 0.11.0.0 >Reporter: Yogesh BG > Attachments: 5998.v1.txt > > > I have one kafka broker and one kafka stream running... I am running its > since two days under load of around 2500 msgs per second.. On third day am > getting below exception for some of the partitions, I have 16 partitions only > 0_0 and 0_1 gives this error > {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324) > ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > at > org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457) > [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na] > 09:43:25.974 [ks_0_inst-StreamThread-15] WARN > o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint: > java.io.FileNotFoundException: > /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or > directory) > at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:221) > ~[na:1.7.0_111] > at java.io.FileOutputStream.(FileOutputStream.java:171) > ~[na:1.7.0_111] > at > org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73) >