[ 
https://issues.apache.org/jira/browse/KAFKA-5882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16417158#comment-16417158
 ] 

Ionut-Maxim Margelatu commented on KAFKA-5882:
----------------------------------------------

We have been able to consistently reproduce this issue. In our case, the reason 
is the change of the topology between deployments:
 * the existing deployment uses 2 subtopologies (A->B,B->C)
 * the new deployment uses 3 subtopologies (A->B,A'->B,B->C)

Performing a deployment with an updated topology results in a failure. 
Performing a deployment with the same topology works perfectly.

Here are some logs we added to StreamsThread in order to figure out what was 
going on:
{noformat}
2018-03-28T09:51:32 srv="mepw" [MEP-StreamThread-2] WARN  
org.apache.kafka.streams.processor.internals.StreamTask 
partition=mepw_internal-47 partitionTopic=mepw_internal 
topology=ProcessorTopology:
   hubSource:
      topics:    [mepw_change_event]
      children:  [CDCProcessor]
   CDCProcessor:
      children:  [CDCSink]
   CDCSink:
      topic:    mepw_internal

2018-03-28T09:51:32 srv="mepw" [MEP-StreamThread-2] ERROR 
org.apache.kafka.streams.processor.internals.StreamThread stream-thread 
[MEP-StreamThread-2] Error caught during partition assignment, will abort the 
current process and re-throw at the end of rebalance: null
{noformat}
During the rebalancing the Kafka Streams threads in the new app instances are 
assigned to a sub-topology that doesn't necessarily have a source for the 
partition they were assigned. This is what we could figure out. I hope this 
helps other people.

> NullPointerException in StreamTask
> ----------------------------------
>
>                 Key: KAFKA-5882
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5882
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.11.0.0, 0.11.0.1, 0.11.0.2
>            Reporter: Seweryn Habdank-Wojewodzki
>            Priority: Major
>
> It seems bugfix [KAFKA-5073|https://issues.apache.org/jira/browse/KAFKA-5073] 
> is made, but introduce some other issue.
> In some cases (I am not sure which ones) I got NPE (below).
> I would expect that even in case of FATAL error anythink except NPE is thrown.
> {code}
> 2017-09-12 23:34:54 ERROR ConsumerCoordinator:269 - User provided listener 
> org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener 
> for group streamer failed on partition assignment
> java.lang.NullPointerException: null
>         at 
> org.apache.kafka.streams.processor.internals.StreamTask.<init>(StreamTask.java:123)
>  ~[myapp-streamer.jar:?]
>         at 
> org.apache.kafka.streams.processor.internals.StreamThread.createStreamTask(StreamThread.java:1234)
>  ~[myapp-streamer.jar:?]
>         at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:294)
>  ~[myapp-streamer.jar:?]
>         at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.retryWithBackoff(StreamThread.java:254)
>  ~[myapp-streamer.jar:?]
>         at 
> org.apache.kafka.streams.processor.internals.StreamThread.addStreamTasks(StreamThread.java:1313)
>  ~[myapp-streamer.jar:?]
>         at 
> org.apache.kafka.streams.processor.internals.StreamThread.access$1100(StreamThread.java:73)
>  ~[myapp-streamer.jar:?]
>         at 
> org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned(StreamThread.java:183)
>  ~[myapp-streamer.jar:?]
>         at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:265)
>  [myapp-streamer.jar:?]
>         at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:363)
>  [myapp-streamer.jar:?]
>         at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:310)
>  [myapp-streamer.jar:?]
>         at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:297)
>  [myapp-streamer.jar:?]
>         at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1078)
>  [myapp-streamer.jar:?]
>         at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043) 
> [myapp-streamer.jar:?]
>         at 
> org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:582)
>  [myapp-streamer.jar:?]
>         at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:553)
>  [myapp-streamer.jar:?]
>         at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:527)
>  [myapp-streamer.jar:?]
> 2017-09-12 23:34:54 INFO  StreamThread:1040 - stream-thread 
> [streamer-3a44578b-faa8-4b5b-bbeb-7a7f04639563-StreamThread-1] Shutting down
> 2017-09-12 23:34:54 INFO  KafkaProducer:972 - Closing the Kafka producer with 
> timeoutMillis = 9223372036854775807 ms.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to