Hi,
We are getting some deadlock state after re-balance and what we found that
it happens after re-balance the partition is assigned to a different thread
in same instance.

Looks like previous thread is not releasing the rocks db lock.

Here are the complete logs spanned across multiple files.

https://drive.google.com/open?id=0B-KsMy4P1bGxLTVVVWlZOGdQZTUzYm9aV3lfOFdubTJ2X2JR

I am just posting the gist here as how we understood this issue.


*===== TestKafkaAdvice.2017-03-01.89.log ========*

53352 INFO 2017-03-01 18:11:56,980 [StreamThread-3]:
org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
Successfully joined group new-part-advice with *generation 96*

53353 INFO 2017-03-01 18:11:56,980 [StreamThread-4]:
org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
Successfully joined group new-part-advice with *generation 96*

53354 INFO 2017-03-01 18:11:56,980 [StreamThread-1]:
org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
Successfully joined group new-part-advice with *generation 96*


53361 INFO 2017-03-01 *18:11:57,084* [StreamThread-1]:
org.apache.kafka.streams.processor.internals.StreamThread -
stream-thread *[StreamThread-1]
New partitions [[advice-stream-9, advice-strea**m-6]]* assigned at the end
of consumer rebalance.

53362 INFO 2017-03-01 18:11:57,084 [StreamThread-3]:
org.apache.kafka.streams.processor.internals.StreamThread -
stream-thread *[StreamThread-3]
New partitions [[advice-stream-7]]* assigned at the end of consumer
rebalance.

53363 DEBUG 2017-03-01 18:11:57,084 [StreamThread-3]:
org.apache.kafka.streams.processor.internals.StreamThread - stream-thread
[StreamThread-3] closing suspended non-assigned task

53364 INFO 2017-03-01 18:11:57,084 [StreamThread-4]:
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Setting
newly assigned partitions [advice-stream-8, advice-stream-2] fo      r
group new-part-advice

53365 INFO 2017-03-01 18:11:57,084 [StreamThread-4]:
org.apache.kafka.streams.processor.internals.StreamThread -
stream-thread *[StreamThread-4]
New partitions [[advice-stream-8, advice-strea**m-2]]* assigned at the end
of consumer rebalance.



54253 INFO 2017-03-01 18:13:19,146 *[StreamThread-4]*:
org.apache.kafka.streams.processor.internals.StreamThread -
stream-thread *[StreamThread-4]
Creating active task 0_2 with assigned partiti**ons [[advice-stream-2]]*


56789 INFO 2017-03-01 18:17:42,311 [StreamThread-1]:
org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
Successfully joined group new-part-advice with *generation 97*

56790 INFO 2017-03-01 18:17:42,311 [StreamThread-3]:
org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
Successfully joined group new-part-advice with *generation 97*


56796 INFO 2017-03-01 *18:17:42,312* [StreamThread-3]:
org.apache.kafka.streams.processor.internals.StreamThread -
stream-thread *[StreamThread-3]
New partitions [[advice-stream-9, advice-strea**m-6]]* assigned at the end
of consumer rebalance.

56805 INFO 2017-03-01 18:17:42,312 [StreamThread-1]:
org.apache.kafka.streams.processor.internals.StreamThread -
stream-thread *[StreamThread-1]
New partitions [[advice-stream-8, advice-strea**m-4]]* assigned at the end
of consumer rebalance.



56823 INFO 2017-03-01 18:17:42,465 *[StreamThread-1]*:
org.apache.kafka.streams.processor.internals.StreamThread -
stream-thread *[StreamThread-1]
Creating active task 0_4 with assigned partiti**ons [[advice-stream-4]]*

56831 DEBUG 2017-03-01 18:17:42,531 *[StreamThread-1]*:
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - Group
new-part-advice *fetching committed offsets for partitions: [advi**ce-stream-8,
advice-stream-4]*



57769 INFO 2017-03-01 *18:20:16,291* [StreamThread-4]:
org.apache.kafka.streams.processor.internals.StreamThread -
stream-thread *[StreamThread-4]
Creating active task 0_8 with assigned partiti**ons [[advice-stream-8]]*




57935 INFO 2017-03-01 18:20:19,181 [StreamThread-4]:
org.apache.kafka.streams.processor.internals.StreamThread -
stream-thread *[StreamThread-4]
Removing all active tasks [[0_2, 0_8]]*



*======== TestKafkaAdvice.2017-03-01.90.log  ============*


 742 INFO 2017-03-01 *18:22:42,697* [StreamThread-3]:
org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
Successfully joined group new-part-advice with *generation 98*

 743 INFO 2017-03-01 18:22:42,697 [StreamThread-4]:
org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
Successfully joined group new-part-advice with *generation 98*


746 INFO 2017-03-01 18:22:42,697 [StreamThread-3]: org.apache.kafka.streams.
processor.internals.StreamThread - stream-thread *[StreamThread-3] New
partitions [[advice-stream-9, advice-stream**-4]]* assigned at the end of
consumer rebalance.

748 INFO 2017-03-01 18:22:42,712 [StreamThread-4]: org.apache.kafka.streams.
processor.internals.StreamThread - stream-thread *[StreamThread-4] New
partitions [[advice-stream-8, advice-stream**-2]]* assigned at the end of
consumer rebalance.

...............

4502 WARN 2017-03-01 *18:26:23,323* *[StreamThread-3]*:
org.apache.kafka.streams.processor.internals.StreamThread - *Could not
create task 0_4*. Will retry.


***************************************  Some time later.......*


4444 INFO 2017-03-01 18:25:44,071 [StreamThread-4]:
org.apache.kafka.streams.processor.internals.StreamThread -
stream-thread *[StreamThread-4]
Removing all active tasks [[0_2, 0_8]]*


************************************
*Task 0_2 (from line # 54253) moved to a different instance's thread-3 and
Task 0_8 got re-assigned to Thread-4 (line # 4584 below)*


4474 INFO 2017-03-01 18:26:13,443 [StreamThread-1]:
org.apache.kafka.streams.processor.internals.StreamThread -
stream-thread [*StreamThread-1]
Creating active task 0_8 with assigned partitio**ns [[advice-stream-8]]*

4475 WARN 2017-03-01 18:26:14,445 *[StreamThread-1]*:
org.apache.kafka.streams.processor.internals.StreamThread - *Could not
create task 0_8*. Will retry.




4581 INFO 2017-03-01 *18:27:43,048* [StreamThread-4]:
org.apache.kafka.clients.consumer.internals.AbstractCoordinator -
Successfully joined group new-part-advice with generation 99


4584 INFO 2017-03-01 18:27:43,048 [StreamThread-4]:
org.apache.kafka.streams.processor.internals.StreamThread -
stream-thread *[StreamThread-4]
New partitions [[advice-stream-8, advice-stream**-4]]* assigned at the end
of consumer rebalance.


*************************************  Task Task 0_4 lock's as per line
#56823 above still seems to be with Thread-1 !!!*


4625 WARN 2017-03-01 *18:27:44,092* *[StreamThread-4]*:
org.apache.kafka.streams.processor.internals.StreamThread - *Could not
create task 0_4.* Will retry.

4646 WARN 2017-03-01 18:27:45,145 *[StreamThread-4]*:
org.apache.kafka.streams.processor.internals.StreamThread - Could not
create task 0_4. Will retry.

4955 WARN 2017-03-01 18:29:37,508 *[StreamThread-4]*:
org.apache.kafka.streams.processor.internals.StreamThread - Could not
create task 0_4. Will retry.

4982 WARN 2017-03-01 18:29:49,125 *[StreamThread-3]*:
org.apache.kafka.streams.processor.internals.StreamThread - Could not
create task 0_4. Will retry.

5073 WARN 2017-03-01 18:31:20,911 *[StreamThread-4]*:
org.apache.kafka.streams.processor.internals.StreamThread - Could not
create task 0_4. Will retry.

5173 WARN 2017-03-01 18:34:46,714 *[StreamThread-4]*:
org.apache.kafka.streams.processor.internals.StreamThread - Could not
create task 0_4. Will retry.

5199 WARN 2017-03-01 18:36:39,729 *[StreamThread-3]*:
org.apache.kafka.streams.processor.internals.StreamThread - Could not
create task 0_4. Will retry.

5248 WARN 2017-03-01 18:41:37,329 *[StreamThread-4]*:
org.apache.kafka.streams.processor.internals.StreamThread - Could not
create task 0_4. Will retry.

5279 WARN 2017-03-01 18:50:19,977 *[StreamThread-3]*:
org.apache.kafka.streams.processor.internals.StreamThread - Could not
create task 0_4. Will retry.


5397 WARN 2017-03-01 *19:19:13,900* *[StreamThread-1]*:
org.apache.kafka.streams.processor.internals.StreamThread - *Could not
create task 0_8*. Will retry.

5721 WARN 2017-03-01 *22:03:07,629 [StreamThread-1]*:
org.apache.kafka.streams.processor.internals.StreamThread - *Could not
create task 0_8*. Will retry.


5698 WARN 2017-03-01 *22:01:32,017* *[StreamThread-3]*:
org.apache.kafka.streams.processor.internals.StreamThread - Could not
create task 0_4. Will retry.

5746 WARN 2017-03-01 *22:06:29,340* *[StreamThread-4]*:
org.apache.kafka.streams.processor.internals.StreamThread - *Could not
create task 0_4.* Will retry.


*======== TestKafkaAdvice.log  ============*


148 WARN 2017-03-02 *01:41:35,831* *[StreamThread-1]*:
org.apache.kafka.streams.processor.internals.StreamThread - *Could not
create task 0_8.* Will retry.

735 WARN 2017-03-02 *08:58:31,234* *[StreamThread-1]*:
org.apache.kafka.streams.processor.internals.StreamThread - *Could not
create task 0_8*. Will retry.


712 WARN 2017-03-02 *08:56:55,810 [StreamThread-3]*:
org.apache.kafka.streams.processor.internals.StreamThread - Could not
create task 0_4. Will retry.

760 WARN 2017-03-02 *09:01:52,945* *[StreamThread-4]*:
org.apache.kafka.streams.processor.internals.StreamThread - *Could not
create task 0_4*. Will retry.


***********************************    Finally all threads reach the
deadlock state where they cannot create task*


Please let us know, why we get such a deadlock state.


Thanks

Sachin

Reply via email to