[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-08-28 Thread Harsh Singh (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917669#comment-16917669
 ] 

Harsh Singh edited comment on KAFKA-5998 at 8/28/19 11:36 AM:
--

Hi [~pkleindl] [~vvcephei] , I couldn't locate version 2.3.1 in repo yet. 
Latest I see is 2.3.0, where the fix for 
[https://github.com/apache/kafka/pull/6846] seems missing. Could you please 
confirm if 2.3.1 is released yet? Same is the case with 2.2.2 and 2.4.0.

Also just curious to know, did we figure out the root cause for base issue of 
{{FileNotFoundException}} on checkpoint.write, inspite of having locking 
mechanism in place to avoid cleanup?  This will help in reproducing the issue 
on a small scale dev-ish kind of environment, than waiting for few hours of 
test run.


was (Author: itsharshonly):
Hi [~pkleindl] [~vvcephei] , I couldn't locate version 2.3.1 in repo yet. 
Latest I see is 2.3.0, where the fix for 
[https://github.com/apache/kafka/pull/6846] seems missing. Could you please 
confirm if 2.3.1 is released yet? Same is the case with 2.2.2 and 2.4.0.

Also just curious to know, did we figure out the root cause for base issue of 
{{FileNotFoundException}} on checkpoint.write, inspite of having locking 
mechanism in place to avoid cleanup? 

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
>Reporter: Yogesh BG
>Assignee: John Roesler
>Priority: Critical
> Fix For: 2.2.2, 2.4.0, 2.3.1
>
> Attachments: 5998.v1.txt, 5998.v2.txt, Kafka5998.zip, Topology.txt, 
> exc.txt, props.txt, streams.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-08-28 Thread Harsh Singh (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917669#comment-16917669
 ] 

Harsh Singh edited comment on KAFKA-5998 at 8/28/19 10:56 AM:
--

Hi [~pkleindl] [~vvcephei] , I couldn't locate version 2.3.1 in repo yet. 
Latest I see is 2.3.0, where the fix for 
[https://github.com/apache/kafka/pull/6846] seems missing. Could you please 
confirm if 2.3.1 is released yet? Same is the case with 2.2.2 and 2.4.0.

Also just curious to know, did we figure out the root cause for base issue of 
{{FileNotFoundException}} on checkpoint.write, inspite of having locking 
mechanism in place to avoid cleanup? 


was (Author: itsharshonly):
Hi [~pkleindl] [~vvcephei] , I couldn't locate version 2.3.1 in repo yet. 
Latest I see is 2.3.0, where the fix for 
[https://github.com/apache/kafka/pull/6846] seems missing. Could you please 
confirm if 2.3.1 is released yet? Same is the case with 2.4.0.

Also just curious to know, did we figure out the root cause for base issue of 
{{FileNotFoundException}} on checkpoint.write, inspite of having locking 
mechanism in place to avoid cleanup? 

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
>Reporter: Yogesh BG
>Assignee: John Roesler
>Priority: Critical
> Fix For: 2.2.2, 2.4.0, 2.3.1
>
> Attachments: 5998.v1.txt, 5998.v2.txt, Kafka5998.zip, Topology.txt, 
> exc.txt, props.txt, streams.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457)
>  

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-07-23 Thread Sainath Y (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890932#comment-16890932
 ] 

Sainath Y edited comment on KAFKA-5998 at 7/23/19 12:16 PM:


Team/[~bbejeck]

I took kafka streams 2.3 version but still the issue persists.

Please suggest if there is any way to suppress this warn log as log files are 
growing?? 

2019-07-23 11:59:51.260 10.227.254.31

task [0_45] Failed to write offset checkpoint file to 
[/kafkarest/counting-stream/0_45/.checkpoint],exc.stack=java.io.FileNotFoundException:
 /kafkarest/counting-stream/0_45/.checkpoint.tmp (No such file or 
directory)\n\tat java.io.FileOutputStream.open0(Native Method)\n\tat 
java.io.FileOutputStream.open(FileOutputStream.java:270)\n\tat 
java.io.FileOutputStream.(FileOutputStream.java:213)\n\tat 
java.io.FileOutputStream.(FileOutputStream.java:162)\n\tat 
org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:79)\n\tat
 
org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:347)\n\tat
 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:476)\n\tat
 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:461)\n\tat
 
org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:286)\n\tat
 
org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:412)\n\tat
 
org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1057)\n\tat
 
org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:911)\n\tat
 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805)\n\tat
 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774)\n


was (Author: sainath932):
Team,

I took kafka streams 2.3 version but still the issue persists.

 

2019-07-23 11:59:51.260 10.227.254.31

task [0_45] Failed to write offset checkpoint file to 
[/kafkarest/counting-stream/0_45/.checkpoint],exc.stack=java.io.FileNotFoundException:
 /kafkarest/counting-stream/0_45/.checkpoint.tmp (No such file or 
directory)\n\tat java.io.FileOutputStream.open0(Native Method)\n\tat 
java.io.FileOutputStream.open(FileOutputStream.java:270)\n\tat 
java.io.FileOutputStream.(FileOutputStream.java:213)\n\tat 
java.io.FileOutputStream.(FileOutputStream.java:162)\n\tat 
org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:79)\n\tat
 
org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:347)\n\tat
 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:476)\n\tat
 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:461)\n\tat
 
org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:286)\n\tat
 
org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:412)\n\tat
 
org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1057)\n\tat
 
org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:911)\n\tat
 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805)\n\tat
 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774)\n

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
>Reporter: Yogesh BG
>Assignee: John Roesler
>Priority: Critical
> Fix For: 2.2.2, 2.4.0, 2.3.1
>
> Attachments: 5998.v1.txt, 5998.v2.txt, Kafka5998.zip, Topology.txt, 
> exc.txt, props.txt, streams.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-07-04 Thread Di Campo (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878674#comment-16878674
 ] 

Di Campo edited comment on KAFKA-5998 at 7/4/19 1:55 PM:
-

My topology contains stateless subtopologies as well as stateful, so it may be 
that case. 

One question: 

> The "trigger" for the cleanup is the absence of any traffic on a topic 
> partition which will allow the directory to be deleted

 Can you elaborate on this trigger? What will be cleaned in this cleanup 
process? How long does it wait by default for traffic to determine absence?


was (Author: xmar):
My topology contains stateless subtopologies, so it may be that case. 

One question: 

> The "trigger" for the cleanup is the absence of any traffic on a topic 
> partition which will allow the directory to be deleted

 Can you elaborate on this trigger? What will be cleaned in this cleanup 
process? How long does it wait by default for traffic to determine absence?

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
>Reporter: Yogesh BG
>Assignee: Bill Bejeck
>Priority: Critical
> Attachments: 5998.v1.txt, 5998.v2.txt, Kafka5998.zip, Topology.txt, 
> exc.txt, props.txt, streams.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> 09:43:25.974 [ks_0_inst-StreamThread-15] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint:
> 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-07-04 Thread Patrik Kleindl (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878549#comment-16878549
 ] 

Patrik Kleindl edited comment on KAFKA-5998 at 7/4/19 10:57 AM:


As this is hopefully coming to an end thanks to [~vvcephei] I'll try a little 
writeup so others can compare their setup and hopefully verify that the fix 
covers all situations.

I'll also attach the code I used to reproduce the issue, should be runnable for 
anyone with docker. [^Kafka5998.zip]

Setup:

We have a streams application with two subtopologies where the first one does 
not have state and the second one does.

State directories are created for all tasks, but as the first subtopology does 
not have state it does not get locked internally and is therefor subject to the 
cleanup process.

The "trigger" for the cleanup is the absence of any traffic on a topic 
partition which will allow the directory to be deleted, so this has higher 
probability on any low-volume topics (in our case, CDC from a reference table 
with little to no changes).

After the directory is deleted, the next record received (and any thereafter) 
will trigger the warning because Kafka Streams erroneously tries to write a 
checkpoint because all tasks tried to do this for all restored partitions, not 
only those assigned to them. This is what Johns PR covers.

We have EOS disabled, so it would be good if people could tell if this has 
happened with EOS too.

Or if this warning has occurred for anyone with a different topology situation 
or on a high volume topic.

For reference our topology:
{code:java}
Topologies:
 Sub-topology: 0
 Source: KSTREAM-SOURCE-00 (topics: [inputTopic])
 --> KSTREAM-FILTER-01
 Processor: KSTREAM-FILTER-01 (stores: [])
 --> KSTREAM-KEY-SELECT-02
 <-- KSTREAM-SOURCE-00
 Processor: KSTREAM-KEY-SELECT-02 (stores: [])
 --> KSTREAM-FILTER-05
 <-- KSTREAM-FILTER-01
 Processor: KSTREAM-FILTER-05 (stores: [])
 --> KSTREAM-SINK-04
 <-- KSTREAM-KEY-SELECT-02
 Sink: KSTREAM-SINK-04 (topic: store-repartition)
 <-- KSTREAM-FILTER-05

 Sub-topology: 1
 Source: KSTREAM-SOURCE-06 (topics: [store-repartition])
 --> KSTREAM-REDUCE-03
 Processor: KSTREAM-REDUCE-03 (stores: [store])
 --> KTABLE-TOSTREAM-07
 <-- KSTREAM-SOURCE-06
 Processor: KTABLE-TOSTREAM-07 (stores: [])
 --> KSTREAM-SINK-08
 <-- KSTREAM-REDUCE-03
 Sink: KSTREAM-SINK-08 (topic: outputTopic)
 <-- KTABLE-TOSTREAM-07{code}
 


was (Author: pkleindl):
As this is hopefully coming to an end thanks to [~vvcephei] I'll try a little 
writeup so others can compare their setup and hopefully verify that the fix 
covers all situations.

I'll also attach the code I used to reproduce the issue, should be runnable for 
anyone with docker.

Setup:

We have a streams application with two subtopologies where the first one does 
not have state and the second one does.

State directories are created for all tasks, but as the first subtopology does 
not have state it does not get locked internally and is therefor subject to the 
cleanup process.

The "trigger" for the cleanup is the absence of any traffic on a topic 
partition which will allow the directory to be deleted, so this has higher 
probability on any low-volume topics (in our case, CDC from a reference table 
with little to no changes).

After the directory is deleted, the next record received (and any thereafter) 
will trigger the warning because Kafka Streams erroneously tries to write a 
checkpoint because all tasks tried to do this for all restored partitions, not 
only those assigned to them. This is what Johns PR covers.

We have EOS disabled, so it would be good if people could tell if this has 
happened with EOS too.

Or if this warning has occurred for anyone with a different topology situation 
or on a high volume topic.

For reference our topology:
{code:java}
Topologies:
 Sub-topology: 0
 Source: KSTREAM-SOURCE-00 (topics: [inputTopic])
 --> KSTREAM-FILTER-01
 Processor: KSTREAM-FILTER-01 (stores: [])
 --> KSTREAM-KEY-SELECT-02
 <-- KSTREAM-SOURCE-00
 Processor: KSTREAM-KEY-SELECT-02 (stores: [])
 --> KSTREAM-FILTER-05
 <-- KSTREAM-FILTER-01
 Processor: KSTREAM-FILTER-05 (stores: [])
 --> KSTREAM-SINK-04
 <-- KSTREAM-KEY-SELECT-02
 Sink: KSTREAM-SINK-04 (topic: store-repartition)
 <-- KSTREAM-FILTER-05

 Sub-topology: 1
 Source: KSTREAM-SOURCE-06 (topics: [store-repartition])
 --> KSTREAM-REDUCE-03
 Processor: KSTREAM-REDUCE-03 (stores: [store])
 --> KTABLE-TOSTREAM-07
 <-- KSTREAM-SOURCE-06
 Processor: KTABLE-TOSTREAM-07 (stores: [])
 --> KSTREAM-SINK-08
 <-- 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-07-02 Thread Di Campo (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877136#comment-16877136
 ] 

Di Campo edited comment on KAFKA-5998 at 7/2/19 4:52 PM:
-

Just in case it helps. I just found it today on 2.1.1 (again, I commented here 
some months ago). 
 5 brokers cluster, 3 Kafka Streams instances (2 `num.streams.threads` each). 
AMZN Linux. Docker on ECS.

I've seen that, before the task dies, it prints the following WARNs from one 
task.

Please note that from the 64 partitions, only a few of them fail starting at 
13:17. And the same batch of the same partitions start failing again at 13:42. 
 Why are the same partitions failing? Does it match with your findings?

 

{{[2019-07-02 13:17:01,101] WARN task [2_31] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_31/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:17:01,118] WARN task [2_47] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_47/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:17:01,156] WARN task [2_27] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_27/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:20:12,360] WARN task [2_63] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_63/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:20:12,579] WARN task [2_35] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_35/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:20:13,001] WARN task [2_23] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_23/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:23:18,421] WARN task [2_39] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_39/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
{{[2019-07-02 13:23:18,613] WARN task [2_55] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_55/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}

{{[2019-07-02 13:42:46,366] WARN task [2_31] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_31/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
 {{[2019-07-02 13:42:46,473] WARN task [2_47] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_47/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
 {{[2019-07-02 13:42:46,639] WARN task [2_27] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_27/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
 {{[2019-07-02 13:46:19,888] WARN task [2_63] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_63/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
 {{[2019-07-02 13:46:20,042] WARN task [2_35] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_35/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
 {{[2019-07-02 13:46:20,380] WARN task [2_55] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_55/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
 {{[2019-07-02 13:46:20,384] WARN task [2_23] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_23/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}
 {{[2019-07-02 13:48:07,011] WARN task [2_39] Failed to write offset checkpoint 
file to /data/kafka-streams/stream-processor-0.0.1/2_39/.checkpoint: {} 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager)}}

 

Later, the application died some minutes later, at 13:59:13. In case there is a 
relation, it was killed due to OOM.

 


was (Author: xmar):
Just in case it helps. I just found it today on 2.1.1 (again, I commented here 
some months ago). 
 5 brokers cluster, 3 Kafka Streams instances (2 `num.streams.threads` each). 
AMZN Linux. Docker on ECS.

I've seen that, before the task dies, it prints the following WARNs from one 
task.

Please note that from the 64 partitions, only a few of them fail starting at 
13:17. And the same batch of the same partitions start failing again at 13:42. 
Why are the 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-05-23 Thread Andrew (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16846566#comment-16846566
 ] 

Andrew edited comment on KAFKA-5998 at 5/23/19 11:32 AM:
-

I have just seen this error with kafka 2.2.0 and confluent 5.2.1. We are 
running in docker on kubernetes, and the same container works when run against 
some topics but not others it seems.
{code:java}
java.io.FileNotFoundException: 
/tmp/kafka-streams/kafka-streams-join-to-nearest-spm-draft-position-weather/0_11/.checkpoint.tmp
 (No such file or directory)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(FileOutputStream.java:270)
    at java.io.FileOutputStream.(FileOutputStream.java:213)
    at java.io.FileOutputStream.(FileOutputStream.java:162)
    at 
org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:79)
    at 
org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:325)
    at 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:474)
    at 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:459)
    at 
org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:286)
    at 
org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:412)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1057)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:911)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774)
{code}
When I `exec` into the running container I see this :
{code:java}
/tmp/kafka-streams/kafka-streams-join-to-nearest-spm-draft-position-weather # ls
2_0   2_10  2_12  2_14  2_16  2_18  2_2   2_4   2_6   2_8   3_0   3_10  3_12  
3_14  3_16  3_18  3_2   3_4   3_6   3_8
{code}
Also I see 

{Code}

09:55:49,898 INFO org.apache.kafka.streams.processor.internals.StateDirectory - 
stream-thread 
[kafka-streams-join-to-nearest-spm-draft-position-weather-404a240b-1e95-4587-80bf-686047658254-CleanupThread]
 Deleting obsolete state directory 0_1 for task 0_1 as 600898ms has elapsed 
(cleanup delay is 60ms).
{Code}

 

And I see the following in the StreamsConfig log output :

{Code}
 state.cleanup.delay.ms = 60

state.dir = /tmp/kafka-streams

{Code}

 


was (Author: the4thamigo_uk):
I have just seen this error with kafka 2.2.0 and confluent 5.2.1. We are 
running in docker on kubernetes, and the same container works when run against 
some topics but not others it seems.
{code:java}
java.io.FileNotFoundException: 
/tmp/kafka-streams/kafka-streams-join-to-nearest-spm-draft-position-weather/0_11/.checkpoint.tmp
 (No such file or directory)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(FileOutputStream.java:270)
    at java.io.FileOutputStream.(FileOutputStream.java:213)
    at java.io.FileOutputStream.(FileOutputStream.java:162)
    at 
org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:79)
    at 
org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:325)
    at 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:474)
    at 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:459)
    at 
org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:286)
    at 
org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:412)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1057)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:911)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774)
{code}

When I `exec` into the running container I see this :

{Code}
/tmp/kafka-streams/kafka-streams-join-to-nearest-spm-draft-position-weather # ls
2_0   2_10  2_12  2_14  2_16  2_18  2_2   2_4   2_6   2_8   3_0   3_10  3_12  
3_14  3_16  3_18  3_2   3_4   3_6   3_8
{Code}

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
>Reporter: Yogesh BG
>Priority: Critical
> Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, 
> props.txt, 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-05-23 Thread Andrew (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16846566#comment-16846566
 ] 

Andrew edited comment on KAFKA-5998 at 5/23/19 9:29 AM:


I have just seen this error with kafka 2.2.0 and confluent 5.2.1. We are 
running in docker on kubernetes, and the same container works when run against 
some topics but not others it seems.
{code:java}
java.io.FileNotFoundException: 
/tmp/kafka-streams/kafka-streams-join-to-nearest-spm-draft-position-weather/0_11/.checkpoint.tmp
 (No such file or directory)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(FileOutputStream.java:270)
    at java.io.FileOutputStream.(FileOutputStream.java:213)
    at java.io.FileOutputStream.(FileOutputStream.java:162)
    at 
org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:79)
    at 
org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:325)
    at 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:474)
    at 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:459)
    at 
org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:286)
    at 
org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:412)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1057)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:911)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774)
{code}

When I `exec` into the running container I see this :

{Code}
/tmp/kafka-streams/kafka-streams-join-to-nearest-spm-draft-position-weather # ls
2_0   2_10  2_12  2_14  2_16  2_18  2_2   2_4   2_6   2_8   3_0   3_10  3_12  
3_14  3_16  3_18  3_2   3_4   3_6   3_8
{Code}


was (Author: the4thamigo_uk):
I have just seen this error with kafka 2.2.0 and confluent 5.2.1. We are 
running in docker on kubernetes, and the same container works when run against 
some topics but not others it seems.

{Code}
java.io.FileNotFoundException: 
/tmp/kafka-streams/kafka-streams-join-to-nearest-spm-draft-position-weather/0_11/.checkpoint.tmp
 (No such file or directory)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(FileOutputStream.java:270)
    at java.io.FileOutputStream.(FileOutputStream.java:213)
    at java.io.FileOutputStream.(FileOutputStream.java:162)
    at 
org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:79)
    at 
org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:325)
    at 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:474)
    at 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:459)
    at 
org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:286)
    at 
org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:412)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1057)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:911)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805)
    at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774)
{Code}

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
>Reporter: Yogesh BG
>Priority: Critical
> Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, 
> props.txt, streams.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-04-26 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827123#comment-16827123
 ] 

Ted Yu edited comment on KAFKA-5998 at 4/26/19 5:31 PM:


https://pastebin.com/vyvQ8pkF shows what I mentioned earlier.

In StateDirectory#cleanRemovedTasks, we use both last modification time of the 
directory and whether the task has been committed as criteria. When both are 
satisfied, we clean the directory for the task.


was (Author: yuzhih...@gmail.com):
https://pastebin.com/Y247UZgb shows what I mentioned earlier (incomplete).

In StateDirectory#cleanRemovedTasks, we use both last modification time of the 
directory and whether the task has been committed as criteria. When both are 
satisfied, we clean the directory for the task.

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
>Reporter: Yogesh BG
>Priority: Critical
> Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, 
> props.txt, streams.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> 09:43:25.974 [ks_0_inst-StreamThread-15] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-04-26 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827123#comment-16827123
 ] 

Ted Yu edited comment on KAFKA-5998 at 4/26/19 5:12 PM:


https://pastebin.com/Y247UZgb shows what I mentioned earlier (incomplete).

In StateDirectory#cleanRemovedTasks, we use both last modification time of the 
directory and whether the task has been committed as criteria. When both are 
satisfied, we clean the directory for the task.


was (Author: yuzhih...@gmail.com):
https://pastebin.com/Y247UZgb shows what I mentioned earlier.
In StateDirectory#cleanRemovedTasks, we don't rely on last modification time - 
we check whether the task has been committed.

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
>Reporter: Yogesh BG
>Priority: Critical
> Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, 
> props.txt, streams.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> 09:43:25.974 [ks_0_inst-StreamThread-15] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-04-24 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825583#comment-16825583
 ] 

Matthias J. Sax edited comment on KAFKA-5998 at 4/24/19 11:52 PM:
--

Sounds reasonable. Not sure why we see this problem only for the checkpoint 
file.

Restore would only be triggered after a rebalance.


was (Author: mjsax):
Sounds reasonable. Not sure why we see this problem only for the checkpoint 
file.

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
>Reporter: Yogesh BG
>Priority: Critical
> Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, 
> props.txt, streams.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> 09:43:25.974 [ks_0_inst-StreamThread-15] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-04-20 Thread Matthias J. Sax (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16822610#comment-16822610
 ] 

Matthias J. Sax edited comment on KAFKA-5998 at 4/21/19 3:52 AM:
-

[~mparthas] It there are state stores, on commit, the last written offsets to 
the changelog topic are stored in the checkpoint files. This way, Kafka Streams 
can check if a store need to be recovered and if yes from what point in the 
changelog.

> Why isn't that sufficient ?

For what?

[~yuzhih...@gmail.com] The other thread would be the state-cleaner thread, that 
only checks if there is old state that should be deleted. This thread might 
delete the state directory, and later the processing thread that tries to write 
the checkpoint does not find the checkpoint file any longer. But otherwise, you 
are right – if there is another thread trying to write to the same checkpoint 
file, it could be corrupted, too.


was (Author: mjsax):
[~mparthas] It there are state stores, on commit, the last written offsets to 
the changelog topic are stored in the checkpoint files. This way, Kafka Streams 
can check if a store need to be recovered and if yes from what point in the 
changelog.

> Why isn't that sufficient ?

For what?

[~yuzhih...@gmail.com] The other thread would be the log-cleaner thread, that 
only checks if there is old state that should be deleted. This thread might 
delete the state directory, and later the processing thread that tries to write 
the checkpoint does not find the checkpoint file any longer. But otherwise, you 
are right – if there is another thread trying to write to the same checkpoint 
file, it could be corrupted, too.

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
>Reporter: Yogesh BG
>Priority: Critical
> Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, 
> props.txt, streams.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-04-10 Thread Di Campo (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814233#comment-16814233
 ] 

Di Campo edited comment on KAFKA-5998 at 4/10/19 9:13 AM:
--

It also happens to me on kafka_2.12-2.1.0. Run in AWS in docker (based in 
wurstmeister image) over Amazon Linux.

 

{{cat /proc/version}}
{{Linux version 4.14.77-69.57.amzn1.x86_64 (mockbuild@gobi-build-60003) (gcc 
version 7.2.1 20170915 (Red Hat 7.2.1-2) (GCC)) #1 SMP Tue Nov 6 21:32:55 UTC 
2018}}


was (Author: xmar):
It also happens to me on kafka_2.12-2.1.0. Run in AWS in docker (based in 
wurstmeister image) over Amazon Linux.

 

{{$ cat /proc/version }}
{{Linux version 4.14.77-69.57.amzn1.x86_64 (mockbuild@gobi-build-60003) (gcc 
version 7.2.1 20170915 (Red Hat 7.2.1-2) (GCC)) #1 SMP Tue Nov 6 21:32:55 UTC 
2018}}

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
>Reporter: Yogesh BG
>Priority: Critical
> Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, 
> props.txt, streams.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> 09:43:25.974 [ks_0_inst-StreamThread-15] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-04-06 Thread Mohan Parthasarathy (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811611#comment-16811611
 ] 

Mohan Parthasarathy edited comment on KAFKA-5998 at 4/6/19 3:03 PM:


Ted, Thanks for your suggestion. We will give it a try. The current default 
seems to be 10 minutes. Also, have you seen the issue of old messages showing 
up because of this exception ? I can't possibly understand that as there is no 
reported lag in the consumer.

Also, where is the state kept about these old directories that we can manually 
cleanup in case we see this error in production ?


was (Author: mparthas):
Ted, Thanks for your suggestion. We will give it a try. The current default 
seems to be 10 minutes. Also, have you seen the issue of old messages showing 
up because of this exception ? I can't possibly understand that as there is no 
reported lag in the consumer.

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
>Reporter: Yogesh BG
>Priority: Major
> Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, 
> props.txt, streams.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> 09:43:25.974 [ks_0_inst-StreamThread-15] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or 
> directory)
> at 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-04-06 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811551#comment-16811551
 ] 

Ted Yu edited comment on KAFKA-5998 at 4/6/19 10:55 AM:


Cleanup delay is controlled by:
{code}
public static final String STATE_CLEANUP_DELAY_MS_CONFIG = 
"state.cleanup.delay.ms";
{code}
Maybe longer value can be specified so that CleanupThread doesn't clean up the 
task directory prematurely (in case task directory hasn't been modified for a 
while).


was (Author: yuzhih...@gmail.com):
Cleanup delay is controlled by:
{code}
public static final String STATE_CLEANUP_DELAY_MS_CONFIG = 
"state.cleanup.delay.ms";
{code}
Maybe longer value can be specified so that CleanupThread doesn't clean up the 
task directory prematurely.

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
>Reporter: Yogesh BG
>Priority: Major
> Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, 
> props.txt, streams.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> 09:43:25.974 [ks_0_inst-StreamThread-15] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-03-04 Thread leibo (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784126#comment-16784126
 ] 

leibo edited comment on KAFKA-5998 at 3/5/19 6:57 AM:
--

It seems that {color:#ff} final File temp = new File(file.getAbsolutePath() 
+ ".tmp");{color} will not create new file on some system sometimes. can we do 
this as bellow before write offset to checkpoint.tmp file.
{code:java}
//代码占位符
final File temp = new File(file.getAbsolutePath() + ".tmp");
if (!temp.exists()) {       // create .checkpoint.tmp file if it's not exists
temp.createNewFile();
}
final FileOutputStream fileOutputStream = new FileOutputStream(temp);
{code}


was (Author: lbdai3190):
It seems that {color:#FF} final File temp = new File(file.getAbsolutePath() 
+ ".tmp");{color} will not create new file on some system.
{code:java}
//代码占位符
final File temp = new File(file.getAbsolutePath() + ".tmp");
if (!temp.exists()) {       // create .checkpoint.tmp file if it's not exists
temp.createNewFile();
}
final FileOutputStream fileOutputStream = new FileOutputStream(temp);
{code}

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1, 2.1.1
>Reporter: Yogesh BG
>Priority: Major
> Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, 
> props.txt, streams.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> 09:43:25.974 [ks_0_inst-StreamThread-15] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2019-01-17 Thread Dmitry Minkovsky (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745716#comment-16745716
 ] 

Dmitry Minkovsky edited comment on KAFKA-5998 at 1/18/19 2:26 AM:
--

I'm getting this as well:

{{[2019-01-18 02:10:07,709] WARN 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager:295) task 
[2_1] Failed to write offset checkpoint file to 
/opt/streams-state/user-service/2_1/.checkpoint: java.io.FileNotFoundException: 
/opt/streams-state/user-service/2_1/.checkpoint.tmp (No such file or directory) 
}}
 {{ [2019-01-18 02:10:07,790] WARN 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager:295) task 
[2_1] Failed to write offset checkpoint file to 
/opt/streams-state/user-service/2_1/.checkpoint: java.io.FileNotFoundException: 
/opt/streams-state/user-service/2_1/.checkpoint.tmp (No such file or 
directory)}}

 

When I look in the state dir:

 
{quote}root@user-service-659fcb455-2fkxp:/opt/streams-state/user-service# ls 
-alh

total 48K

drwxr-xr-x 12 root root 4.0K Jan 18 01:50 .

drwxrwxrwx  3 root root 4.0K Jan 18 01:30 ..

drwxr-xr-x  4 root root 4.0K Jan 18 02:10 0_0

drwxr-xr-x  4 root root 4.0K Jan 18 01:30 0_1

drwxr-xr-x  4 root root 4.0K Jan 18 02:10 0_2

drwxr-xr-x  4 root root 4.0K Jan 18 01:30 0_3

drwxr-xr-x  4 root root 4.0K Jan 18 01:30 0_4

drwxr-xr-x  3 root root 4.0K Jan 18 01:30 1_0

drwxr-xr-x  3 root root 4.0K Jan 18 02:10 1_1

drwxr-xr-x  3 root root 4.0K Jan 18 01:30 1_2

drwxr-xr-x  3 root root 4.0K Jan 18 01:30 1_3

drwxr-xr-x  3 root root 4.0K Jan 18 01:30 1_4

root@user-service-659fcb455-2fkxp:/opt/streams-state/user-service#
{quote}
 

Not sure what that task 2_1. Will have to add some logging to check. 


was (Author: dminkovsky):
I'm getting this as well:

{{[2019-01-18 02:10:07,709] WARN 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager:295) task 
[2_1] Failed to write offset checkpoint file to 
/opt/streams-state/user-service/2_1/.checkpoint: java.io.FileNotFoundException: 
/opt/streams-state/user-service/2_1/.checkpoint.tmp (No such file or directory) 
}}
{{ [2019-01-18 02:10:07,790] WARN 
(org.apache.kafka.streams.processor.internals.ProcessorStateManager:295) task 
[2_1] Failed to write offset checkpoint file to 
/opt/streams-state/user-service/2_1/.checkpoint: java.io.FileNotFoundException: 
/opt/streams-state/user-service/2_1/.checkpoint.tmp (No such file or 
directory)}}

 

When I look in the state dir:

 

root@user-service-659fcb455-2fkxp:/opt/streams-state/user-service# ls -alh

total 48K

drwxr-xr-x 12 root root 4.0K Jan 18 01:50 .

drwxrwxrwx  3 root root 4.0K Jan 18 01:30 ..

drwxr-xr-x  4 root root 4.0K Jan 18 02:10 0_0

drwxr-xr-x  4 root root 4.0K Jan 18 01:30 0_1

drwxr-xr-x  4 root root 4.0K Jan 18 02:10 0_2

drwxr-xr-x  4 root root 4.0K Jan 18 01:30 0_3

drwxr-xr-x  4 root root 4.0K Jan 18 01:30 0_4

drwxr-xr-x  3 root root 4.0K Jan 18 01:30 1_0

drwxr-xr-x  3 root root 4.0K Jan 18 02:10 1_1

drwxr-xr-x  3 root root 4.0K Jan 18 01:30 1_2

drwxr-xr-x  3 root root 4.0K Jan 18 01:30 1_3

drwxr-xr-x  3 root root 4.0K Jan 18 01:30 1_4

root@user-service-659fcb455-2fkxp:/opt/streams-state/user-service#

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1
>Reporter: Yogesh BG
>Priority: Major
> Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, 
> props.txt, streams.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2018-08-27 Thread Ayushi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592383#comment-16592383
 ] 

Ayushi edited comment on KAFKA-5998 at 8/28/18 5:11 AM:


[~guozhang] [~mjsax] I ran into this issue after running it for 1-2 hours for 
the shared application.

Load of around 8000 messages/sec.

Also, it is not restricted to some particular partitions. There are 50 
partitions.


was (Author: ayushi0430):
[~guozhang] [~mjsax] I ran into this issue after running it for 1-2 hours for 
the shared application.

Load of around 8000 messages/sec.

Also, it is not restricted to some particular partitions. There are 50 
partitions.

As a hack, I could set STATE_CLEANUP_DELAY_MS_CONFIG to a large no but I'm 
afraid it could cause OOM issues.

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1
>Reporter: Yogesh BG
>Priority: Major
> Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, 
> props.txt, streams.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> 09:43:25.974 [ks_0_inst-StreamThread-15] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2018-08-24 Thread Ayushi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16592383#comment-16592383
 ] 

Ayushi edited comment on KAFKA-5998 at 8/25/18 2:32 AM:


[~guozhang] [~mjsax] I ran into this issue after running it for 1-2 hours for 
the shared application.

Load of around 8000 messages/sec.

Also, it is not restricted to some particular partitions. There are 50 
partitions.

As a hack, I could set STATE_CLEANUP_DELAY_MS_CONFIG to a large no but I'm 
afraid it could cause OOM issues.


was (Author: ayushi0430):
[~guozhang] I ran into this issue after running it for 1-2 hours for the shared 
application.

As a hack, I could set STATE_CLEANUP_DELAY_MS_CONFIG to a large no but I'm 
afraid it could cause OOM issues.

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1
>Reporter: Yogesh BG
>Priority: Major
> Attachments: 5998.v1.txt, 5998.v2.txt, Topology.txt, exc.txt, 
> properties.txt, properties.txt, props.txt, streams.txt, streamsSnippet.txt, 
> streamsSnippet.txt, topology.txt, topology.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> 09:43:25.974 [ks_0_inst-StreamThread-15] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2018-08-21 Thread Ayushi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588406#comment-16588406
 ] 

Ayushi edited comment on KAFKA-5998 at 8/22/18 5:33 AM:


[~guozhang] This issue is there in 2.0.0 as well. Logs for the same:

```WARN [2018-08-22 03:17:03,745] 
org.apache.kafka.streams.processor.internals.ProcessorStateManager: task [0_45] 
Failed to write offset checkpoint file to /tmp/kafka-streams/
/0_45/.checkpoint: {}
! java.nio.file.NoSuchFileException: 
/tmp/kafka-streams//0_45/.checkpoint.tmp
! at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
! at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
! at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
! at java.nio.file.Files.move(Files.java:1395)
! at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:786)
! at 
org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:92)
! at 
org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:315)
! at 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:383)
! at 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:368)
! at 
org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:67)
! at 
org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:362)
! at 
org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:352)
! at 
org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:401)
! at 
org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1035)
! at 
org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:845)
! at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:767)
! at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:736)
! Suppressed: java.nio.file.NoSuchFileException: 
/tmp/kafka-streams//0_45/.checkpoint.tmp -> 
/tmp/kafka-streams//0_45/.checkpoint
! at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
! at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396)
! at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
! at java.nio.file.Files.move(Files.java:1395)
! at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:783)
! ... 12 common frames omitted```

This exception is thrown consistently after a given interval which I'm guessing 
is the commit interval for the streams.


was (Author: ayushi0430):
[~guozhang] This issue is there in 2.0.0 as well

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1
>Reporter: Yogesh BG
>Priority: Major
> Attachments: 5998.v1.txt, 5998.v2.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2018-08-21 Thread Ayushi (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588406#comment-16588406
 ] 

Ayushi edited comment on KAFKA-5998 at 8/22/18 5:33 AM:


[~guozhang] This issue is there in 2.0.0 as well. Logs for the same:

{{WARN [2018-08-22 03:17:03,745] 
org.apache.kafka.streams.processor.internals.ProcessorStateManager: task [0_45] 
Failed to write offset checkpoint file to /tmp/kafka-streams/}}
{{ /0_45/.checkpoint: {}}}
{{ ! java.nio.file.NoSuchFileException: 
/tmp/kafka-streams//0_45/.checkpoint.tmp}}
{{ ! at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)}}
{{ ! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)}}
{{ ! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)}}
{{ ! at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)}}
{{ ! at 
sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)}}
{{ ! at java.nio.file.Files.move(Files.java:1395)}}
{{ ! at 
org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:786)}}
{{ ! at 
org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:92)}}
{{ ! at 
org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:315)}}
{{ ! at 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:383)}}
{{ ! at 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:368)}}
{{ ! at 
org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:67)}}
{{ ! at 
org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:362)}}
{{ ! at 
org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:352)}}
{{ ! at 
org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:401)}}
{{ ! at 
org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1035)}}
{{ ! at 
org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:845)}}
{{ ! at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:767)}}
{{ ! at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:736)}}
{{ ! Suppressed: java.nio.file.NoSuchFileException: 
/tmp/kafka-streams//0_45/.checkpoint.tmp -> 
/tmp/kafka-streams//0_45/.checkpoint}}
{{ ! at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)}}
{{ ! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)}}
{{ ! at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:396)}}
{{ ! at 
sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)}}
{{ ! at java.nio.file.Files.move(Files.java:1395)}}
{{ ! at 
org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:783)}}
{{ ! ... 12 common frames omitted}}

This exception is thrown consistently after a given interval which I'm guessing 
is the commit interval for the streams.


was (Author: ayushi0430):
[~guozhang] This issue is there in 2.0.0 as well. Logs for the same:

```WARN [2018-08-22 03:17:03,745] 
org.apache.kafka.streams.processor.internals.ProcessorStateManager: task [0_45] 
Failed to write offset checkpoint file to /tmp/kafka-streams/
/0_45/.checkpoint: {}
! java.nio.file.NoSuchFileException: 
/tmp/kafka-streams//0_45/.checkpoint.tmp
! at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
! at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
! at sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:409)
! at sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:262)
! at java.nio.file.Files.move(Files.java:1395)
! at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:786)
! at 
org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:92)
! at 
org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:315)
! at 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:383)
! at 
org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:368)
! at 
org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:67)
! at 
org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:362)
! at 
org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:352)
! at 
org.apache.kafka.streams.processor.internals.TaskManager.commitAll(TaskManager.java:401)
! at 
org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:1035)
! at 
org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:845)
! at 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2018-06-29 Thread Guozhang Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16528213#comment-16528213
 ] 

Guozhang Wang edited comment on KAFKA-5998 at 6/29/18 8:50 PM:
---

[~yogeshbelur] [~srinivenkat] [~timvanlaer] [~aman1064] I'm investigating this 
issue along with Ted's patch now. Could you share your experience if possible:

1. What's your file system / OS settings? Are you working with Windows, for 
example?
2. What's your Kafka Streams version?
3. If you have tried out Ted's patch v1 (e.g. [~yogeshbelur]), did it help 
fixing your issue already?


was (Author: guozhang):
[~yogeshbelur] [~srinivenkat] [~timvanlaer] [~aman1064] I'm investigating this 
issue along with Ted's patch now. Could you share your experience if possible:

1. What's your file system / OS settings? Are you working with Windows, for 
example?
2. If you have tried out Ted's patch v1 (e.g. [~yogeshbelur]), did it help 
fixing your issue already?

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1
>Reporter: Yogesh BG
>Priority: Major
> Attachments: 5998.v1.txt, 5998.v2.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> 09:43:25.974 [ks_0_inst-StreamThread-15] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or 
> directory)
>  

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2018-06-29 Thread Guozhang Wang (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16528213#comment-16528213
 ] 

Guozhang Wang edited comment on KAFKA-5998 at 6/29/18 8:49 PM:
---

[~yogeshbelur] [~srinivenkat] [~timvanlaer] [~aman1064] I'm investigating this 
issue along with Ted's patch now. Could you share your experience if possible:

1. What's your file system / OS settings? Are you working with Windows, for 
example?
2. If you have tried out Ted's patch v1 (e.g. [~yogeshbelur]), did it help 
fixing your issue already?


was (Author: guozhang):
[~yogeshbelur] [~srinivenkat] [~timvanlaer] I'm investigating this issue along 
with Ted's patch now. Could you share your experience if possible:

1. What's your file system / OS settings? Are you working with Windows, for 
example?
2. If you have tried out Ted's patch v1 (e.g. [~yogeshbelur]), did it help 
fixing your issue already?

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0, 0.11.0.1
>Reporter: Yogesh BG
>Priority: Major
> Attachments: 5998.v1.txt, 5998.v2.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> 09:43:25.974 [ks_0_inst-StreamThread-15] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native 

[jira] [Comment Edited] (KAFKA-5998) /.checkpoint.tmp Not found exception

2017-10-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16187353#comment-16187353
 ] 

Ted Yu edited comment on KAFKA-5998 at 10/1/17 3:10 PM:


Patch v1 has been attached. 
If the .tmp file cannot be renamed to checkpoint file, the checkpoint is not 
taken. The patch doesn't change that behavior. 

But the patch tries to avoid the error shown above


was (Author: yuzhih...@gmail.com):
Patch v1 has been attached. 
If the .tmp file cannot be renamed, the checkpoint is not taken. The patch 
doesn't change that behavior. 

But the patch tries to avoid the error shown above

> /.checkpoint.tmp Not found exception
> 
>
> Key: KAFKA-5998
> URL: https://issues.apache.org/jira/browse/KAFKA-5998
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.0
>Reporter: Yogesh BG
> Attachments: 5998.v1.txt
>
>
> I have one kafka broker and one kafka stream running... I am running its 
> since two days under load of around 2500 msgs per second.. On third day am 
> getting below exception for some of the partitions, I have 16 partitions only 
> 0_0 and 0_1 gives this error
> {{09:43:25.955 [ks_0_inst-StreamThread-6] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_1/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.ProcessorStateManager.checkpoint(ProcessorStateManager.java:324)
>  ~[rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask$1.run(StreamTask.java:267)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamsMetricsImpl.measureLatencyNs(StreamsMetricsImpl.java:201)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:260)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:254)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks$1.apply(AssignedTasks.java:322)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.applyToRunningTasks(AssignedTasks.java:415)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.AssignedTasks.commit(AssignedTasks.java:314)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:700)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:683)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:523)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:480)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:457)
>  [rtp-kafkastreams-1.0-SNAPSHOT-jar-with-dependencies.jar:na]
> 09:43:25.974 [ks_0_inst-StreamThread-15] WARN  
> o.a.k.s.p.i.ProcessorStateManager - Failed to write checkpoint file to 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint:
> java.io.FileNotFoundException: 
> /data/kstreams/rtp-kafkastreams/0_0/.checkpoint.tmp (No such file or 
> directory)
> at java.io.FileOutputStream.open(Native Method) ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:221) 
> ~[na:1.7.0_111]
> at java.io.FileOutputStream.(FileOutputStream.java:171) 
> ~[na:1.7.0_111]
> at 
> org.apache.kafka.streams.state.internals.OffsetCheckpoint.write(OffsetCheckpoint.java:73)
>