Hi Sachin,

The first exception: is each instance of your streams app on a single
machine running with the same state directory config?
The second exception: I believe this is a bug in 0.10.1 that has been fixed
in 0.10.2. A number of issues have been fixed in this area.
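If the instances do share a machine, giving each its own state directory avoids them contending for the same RocksDB lock files. A minimal sketch of setting it (the class name and the instance path are hypothetical; "state.dir" is the real Kafka Streams `StreamsConfig.STATE_DIR_CONFIG` key):

```java
import java.util.Properties;

public class StreamsStateDirConfig {
    // Build a Streams config with a per-instance state directory.
    // Two JVMs on one machine must NOT share this directory, or they
    // will fight over the same RocksDB LOCK files.
    public static Properties buildConfig(String instanceStateDir) {
        Properties props = new Properties();
        props.setProperty("state.dir", instanceStateDir); // StreamsConfig.STATE_DIR_CONFIG
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical path; use a distinct one per instance on the box.
        Properties p = buildConfig("/data/kafka-streams/instance-1");
        System.out.println(p.getProperty("state.dir"));
    }
}
```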

Thanks,
Damian

On Mon, 6 Feb 2017 at 05:43 Sachin Mittal <sjmit...@gmail.com> wrote:

> Hello All,
> We recently upgraded to kafka_2.10-0.10.1.1.
>
> We have a source topic with replication factor = 3 and partitions = 40.
> We have a streams application running with NUM_STREAM_THREADS_CONFIG = 4 on
> three machines, so 12 threads in total.
>
> What we do is start the same streams application one by one on three
> machines.
>
> After some time we noticed that the streams application on one of the
> machines had crashed. When we inspected the log, we found two sets of
> errors like:
>
> stream-thread [StreamThread-3] Failed to commit StreamTask 0_38 state:
> org.apache.kafka.streams.errors.ProcessorStateException: task [0_38] Failed to flush state store key-table
>     at org.apache.kafka.streams.processor.internals.ProcessorStateManager.flush(ProcessorStateManager.java:331) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.StreamTask.commit(StreamTask.java:267) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.StreamThread.commitOne(StreamThread.java:576) [kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.StreamThread.commitAll(StreamThread.java:562) [kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.StreamThread.maybeCommit(StreamThread.java:538) [kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:456) [kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:242) [kafka-streams-0.10.1.1.jar:na]
> Caused by: org.apache.kafka.streams.errors.ProcessorStateException: Error while executing flush from store key-table-201702052100
>     at org.apache.kafka.streams.state.internals.RocksDBStore.flushInternal(RocksDBStore.java:375) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.state.internals.RocksDBStore.flush(RocksDBStore.java:366) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.state.internals.RocksDBWindowStore.flush(RocksDBWindowStore.java:256) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.state.internals.MeteredWindowStore.flush(MeteredWindowStore.java:116) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.state.internals.CachingWindowStore.flush(CachingWindowStore.java:119) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.ProcessorStateManager.flush(ProcessorStateManager.java:329) ~[kafka-streams-0.10.1.1.jar:na]
>     ... 6 common frames omitted
> Caused by: org.rocksdb.RocksDBException: IO error: /data/kafka-streams/new-part/0_38/key-table/key-table-201702052100/000009.sst: No such file or directory
>     at org.rocksdb.RocksDB.flush(Native Method) ~[rocksdbjni-4.9.0.jar:na]
>     at org.rocksdb.RocksDB.flush(RocksDB.java:1360) ~[rocksdbjni-4.9.0.jar:na]
>     at org.apache.kafka.streams.state.internals.RocksDBStore.flushInternal(RocksDBStore.java:373) ~[kafka-streams-0.10.1.1.jar:na]
>     ... 11 common frames omitted
>
> When we looked for the directory
> /data/kafka-streams/new-part/0_38/key-table/key-table-201702052100/ we
> could not find it.
> So it looks like this directory was deleted earlier by some task while
> another task was still trying to flush it.
> What could be the possible reason for this?
>
> The other set of errors we see is this:
> org.apache.kafka.streams.errors.StreamsException: stream-thread [StreamThread-1] Failed to rebalance
>     at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:410) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:242) ~[kafka-streams-0.10.1.1.jar:na]
> Caused by: org.apache.kafka.streams.errors.ProcessorStateException: Error opening store key-table-201702052000 at location /data/kafka-streams/new-part/0_38/key-table/key-table-201702052000
>     at org.apache.kafka.streams.state.internals.RocksDBStore.openDB(RocksDBStore.java:190) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.state.internals.RocksDBStore.openDB(RocksDBStore.java:159) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.state.internals.RocksDBWindowStore$Segment.openDB(RocksDBWindowStore.java:72) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.state.internals.RocksDBWindowStore.getOrCreateSegment(RocksDBWindowStore.java:388) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.state.internals.RocksDBWindowStore.putInternal(RocksDBWindowStore.java:319) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.state.internals.RocksDBWindowStore.access$000(RocksDBWindowStore.java:51) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.state.internals.RocksDBWindowStore$1.restore(RocksDBWindowStore.java:206) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.ProcessorStateManager.restoreActiveState(ProcessorStateManager.java:235) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.ProcessorStateManager.register(ProcessorStateManager.java:198) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.register(ProcessorContextImpl.java:123) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.state.internals.RocksDBWindowStore.init(RocksDBWindowStore.java:200) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.state.internals.MeteredWindowStore.init(MeteredWindowStore.java:66) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.state.internals.CachingWindowStore.init(CachingWindowStore.java:64) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.AbstractTask.initializeStateStores(AbstractTask.java:81) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.StreamTask.<init>(StreamTask.java:119) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:633) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:660) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.StreamThread.access$100(StreamThread.java:69) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.StreamThread$1.onPartitionsAssigned(StreamThread.java:124) ~[kafka-streams-0.10.1.1.jar:na]
>     at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:228) ~[kafka-clients-0.10.1.1.jar:na]
>     at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:313) ~[kafka-clients-0.10.1.1.jar:na]
>     at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:277) ~[kafka-clients-0.10.1.1.jar:na]
>     at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:259) ~[kafka-clients-0.10.1.1.jar:na]
>     at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1013) ~[kafka-clients-0.10.1.1.jar:na]
>     at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:979) ~[kafka-clients-0.10.1.1.jar:na]
>     at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:407) ~[kafka-streams-0.10.1.1.jar:na]
>     ... 1 common frames omitted
> Caused by: org.rocksdb.RocksDBException: IO error: lock /data/kafka-streams/new-part/0_38/key-table/key-table-201702052000/LOCK: No locks available
>     at org.rocksdb.RocksDB.open(Native Method) ~[rocksdbjni-4.9.0.jar:na]
>     at org.rocksdb.RocksDB.open(RocksDB.java:184) ~[rocksdbjni-4.9.0.jar:na]
>     at org.apache.kafka.streams.state.internals.RocksDBStore.openDB(RocksDBStore.java:183) ~[kafka-streams-0.10.1.1.jar:na]
>     ... 26 common frames omitted
>
> After this the whole application shuts down. We do not understand this
> error, or whether it is somehow related to the first one.
>
> Please let us know what could be causing these errors, whether they have
> been fixed, and whether there is a way to work around them in the current
> release.
>
> Thanks
> Sachin
>
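For context on the setup Sachin describes: with 40 partitions, Streams creates 40 tasks, so 12 threads end up with 3 or 4 tasks each. A rough sketch of that spread (plain round-robin for illustration only; the actual Streams partition assignor is stickier than this):

```java
public class TaskDistribution {
    // Spread `partitions` tasks over `threads` threads round-robin and
    // return how many tasks each thread gets. This only illustrates the
    // counts, not the real Kafka Streams assignment algorithm.
    public static int[] spread(int partitions, int threads) {
        int[] tasksPerThread = new int[threads];
        for (int p = 0; p < partitions; p++) {
            tasksPerThread[p % threads]++;
        }
        return tasksPerThread;
    }

    public static void main(String[] args) {
        // 40 partitions over 12 threads: 4 threads get 4 tasks, 8 get 3.
        for (int c : spread(40, 12)) {
            System.out.print(c + " ");
        }
        System.out.println();
    }
}
```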
