[jira] [Commented] (KAFKA-6817) UnknownProducerIdException when writing messages with old timestamps
[ https://issues.apache.org/jira/browse/KAFKA-6817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16534097#comment-16534097 ] Chris Schwarzfischer commented on KAFKA-6817: - Same issue here. Although we think it’s because we have huge messages that fill up the segments for intermediate topics too quickly. We get the exception for every single commit. For now we are reverting to non-EOS mode... > UnknownProducerIdException when writing messages with old timestamps > > > Key: KAFKA-6817 > URL: https://issues.apache.org/jira/browse/KAFKA-6817 > Project: Kafka > Issue Type: Bug > Components: producer >Affects Versions: 1.1.0 >Reporter: Odin Standal >Priority: Major > > We are seeing the following exception in our Kafka application: > {code:java} > ERROR o.a.k.s.p.internals.StreamTask - task [0_0] Failed to close producer > due to the following error: org.apache.kafka.streams.errors.StreamsException: > task [0_0] Abort sending since an error caught with a previous record (key > 22 value some-value timestamp 1519200902670) to topic > exactly-once-test-topic- v2 due to This exception is raised by the broker if > it could not locate the producer metadata associated with the producerId in > question. This could happen if, for instance, the producer's records were > deleted because their retention time had elapsed. Once the last records of > the producerId are removed, the producer's metadata is removed from the > broker, and future appends by the producer will return this exception. at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:125) > at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl.access$500(RecordCollectorImpl.java:48) > at > org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:180) > at > org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1199) > at > org.apache.kafka.clients.producer.internals.ProducerBatch.completeFutureAndFireCallbacks(ProducerBatch.java:204) > at > org.apache.kafka.clients.producer.internals.ProducerBatch.done(ProducerBatch.java:187) > at > org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:627) > at > org.apache.kafka.clients.producer.internals.Sender.failBatch(Sender.java:596) > at > org.apache.kafka.clients.producer.internals.Sender.completeBatch(Sender.java:557) > at > org.apache.kafka.clients.producer.internals.Sender.handleProduceResponse(Sender.java:481) > at > org.apache.kafka.clients.producer.internals.Sender.access$100(Sender.java:74) > at > org.apache.kafka.clients.producer.internals.Sender$1.onComplete(Sender.java:692) > at > org.apache.kafka.clients.ClientResponse.onComplete(ClientResponse.java:101) > at > org.apache.kafka.clients.NetworkClient.completeResponses(NetworkClient.java:482) > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:474) at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:239) at > org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:163) at > java.lang.Thread.run(Thread.java:748) Caused by: > org.apache.kafka.common.errors.UnknownProducerIdException > {code} > We discovered this error when we had the need to reprocess old messages. See > more details on > [Stackoverflow|https://stackoverflow.com/questions/49872827/unknownproduceridexception-in-kafka-streams-when-enabling-exactly-once?noredirect=1#comment86901917_49872827] > We have reproduced the error with a smaller example application. The error > occurs after 10 minutes of producing messages that have old timestamps (type > 1 year old). The topic we are writing to has a retention.ms set to 1 year so > we are expecting the messages to stay there. > After digging through the ProducerStateManager-code in the Kafka source code > we have a theory of what might be wrong. > The ProducerStateManager.removeExpiredProducers() seems to remove producers > from memory erroneously when processing records which are older than the > maxProducerIdExpirationMs (coming from the `transactional.id.expiration.ms` > configuration), which is set by default to 7 days. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (KAFKA-7133) DisconnectException every 5 minutes in single restore consumer thread
Chris Schwarzfischer created KAFKA-7133: --- Summary: DisconnectException every 5 minutes in single restore consumer thread Key: KAFKA-7133 URL: https://issues.apache.org/jira/browse/KAFKA-7133 Project: Kafka Issue Type: Bug Components: streams Affects Versions: 1.1.0 Environment: Kafka Streams application in Kubernetes. Kafka Server in Docker on machine in host mode Reporter: Chris Schwarzfischer One of our streams applications (and only this one) gets a {{org.apache.kafka.common.errors.DisconnectException}} almost exactly every 5 minutes. The application has two of KStream -> KGroupedStream -> KTable -> KGroupedTable -> KTable aggregations. Relevant config is in Streams: {code:java} this.properties.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.AT_LEAST_ONCE); //... this.properties.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 2); this.properties.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 1024 * 1024 * 500 /* 500 MB */ ); this.properties.put(ConsumerConfig.MAX_PARTITION_FETCH_BYTES_CONFIG, 1024 * 1024 * 100 /* 100 MB */); this.properties.put(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, 1024 * 1024 * 50 /* 50 MB */); {code} On the broker: {noformat} KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3 KAFKA_OFFSETS_RETENTION_MINUTES: 108000 KAFKA_MIN_INSYNC_REPLICAS: 2 KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 3 KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 2 KAFKA_TRANSACTIONAL_ID_EXPIRATION_MS: 2147483000 KAFKA_LOG_RETENTION_HOURS: 2688 KAFKA_OFFSETS_RETENTION_CHECK_INTERVAL_MS: 120 KAFKA_ZOOKEEPER_SESSION_TIMEOUT_MS: 12000 {noformat} Logging gives us a single restore consumer thread that throws exceptions every 5 mins: {noformat} July 4th 2018, 15:38:51.560 dockertest032018-07-04T13:38:51,559Z INFO : [testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2][]: FetchSessionHandler::handleError:440 - [Consumer clientId=testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2-restore-consumer, groupId=] Error sending fetch request (sessionId=317141939, epoch=INITIAL) to node 1: org.apache.kafka.common.errors.DisconnectException. July 4th 2018, 15:37:54.833 dockertest032018-07-04T13:37:54,832Z INFO : [testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2][]: FetchSessionHandler::handleError:440 - [Consumer clientId=testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2-restore-consumer, groupId=] Error sending fetch request (sessionId=2064325970, epoch=INITIAL) to node 3: org.apache.kafka.common.errors.DisconnectException. July 4th 2018, 15:37:54.833 dockertest032018-07-04T13:37:54,832Z INFO : [testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2][]: FetchSessionHandler::handleError:440 - [Consumer clientId=testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2-restore-consumer, groupId=] Error sending fetch request (sessionId=1735432619, epoch=INITIAL) to node 2: org.apache.kafka.common.errors.DisconnectException. July 4th 2018, 15:32:26.379 dockertest032018-07-04T13:32:26,378Z INFO : [testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2][]: FetchSessionHandler::handleError:440 - [Consumer clientId=testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2-restore-consumer, groupId=] Error sending fetch request (sessionId=317141939, epoch=INITIAL) to node 1: org.apache.kafka.common.errors.DisconnectException. July 4th 2018, 15:32:01.926 dockertest032018-07-04T13:32:01,925Z INFO : [testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2][]: FetchSessionHandler::handleError:440 - [Consumer clientId=testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2-restore-consumer, groupId=] Error sending fetch request (sessionId=1735432619, epoch=INITIAL) to node 2: org.apache.kafka.common.errors.DisconnectException. July 4th 2018, 15:32:01.926 dockertest032018-07-04T13:32:01,925Z INFO : [testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2][]: FetchSessionHandler::handleError:440 - [Consumer clientId=testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2-restore-consumer, groupId=] Error sending fetch request (sessionId=2064325970, epoch=INITIAL) to node 3: org.apache.kafka.common.errors.DisconnectException. July 4th 2018, 15:26:53.886 dockertest032018-07-04T13:26:53,886Z INFO : [testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2][]: FetchSessionHandler::handleError:440 - [Consumer clientId=testdev-cs9-test-aggregate-udrs-e39ef2d4-452b-4697-b031-26fc1bac8831-StreamThread-2-restor
[jira] [Commented] (KAFKA-6437) Streams does not warn about missing input topics, but hangs
[ https://issues.apache.org/jira/browse/KAFKA-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410940#comment-16410940 ] Chris Schwarzfischer commented on KAFKA-6437: - Yes, I think, the current behavior is not an error. A warning level message would help people on deployment when there is something missing. Making the behavior configurable would be nice, but I‘m not sure anybody would actually use that. > Streams does not warn about missing input topics, but hangs > --- > > Key: KAFKA-6437 > URL: https://issues.apache.org/jira/browse/KAFKA-6437 > Project: Kafka > Issue Type: Improvement > Components: streams >Affects Versions: 1.0.0 > Environment: Single client on single node broker >Reporter: Chris Schwarzfischer >Priority: Minor > Labels: newbie > > *Case* > Streams application with two input topics being used for a left join. > When the left side topic is missing upon starting the streams application, it > hangs "in the middle" of the topology (at …9, see below). Only parts of > the intermediate topics are created (up to …9) > When the missing input topic is created, the streams application resumes > processing. > {noformat} > Topology: > StreamsTask taskId: 2_0 > ProcessorTopology: > KSTREAM-SOURCE-11: > topics: > [mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition] > children: [KTABLE-AGGREGATE-12] > KTABLE-AGGREGATE-12: > states: > [KTABLE-AGGREGATE-STATE-STORE-09] > children: [KTABLE-TOSTREAM-20] > KTABLE-TOSTREAM-20: > children: [KSTREAM-SINK-21] > KSTREAM-SINK-21: > topic: data_udr_month_customer_aggregration > KSTREAM-SOURCE-17: > topics: > [mystreams_app-KSTREAM-MAP-14-repartition] > children: [KSTREAM-LEFTJOIN-18] > KSTREAM-LEFTJOIN-18: > states: > [KTABLE-AGGREGATE-STATE-STORE-09] > children: [KSTREAM-SINK-19] > KSTREAM-SINK-19: > topic: data_UDR_joined > Partitions [mystreams_app-KSTREAM-MAP-14-repartition-0, > mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition-0] > {noformat} > *Why this matters* > The applications does quite a lot of preprocessing before joining with the > missing input topic. This preprocessing won't happen without the topic, > creating a huge backlog of data. > *Fix* > Issue an `warn` or `error` level message at start to inform about the missing > topic and it's consequences. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (KAFKA-6437) Streams does not warn about missing input topics, but hangs
[ https://issues.apache.org/jira/browse/KAFKA-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16320675#comment-16320675 ] Chris Schwarzfischer commented on KAFKA-6437: - Yep, I know it's by design and that doesn't need to change, of course. "It hangs in the middle" means, that the application is actually starting and processing data up to some intermediate topic. This makes it easy to overlook that there are topics missing that prevent the application from running correctly. It would make it a lot easier to spot this error if there was an error messaging saying that the topic is missing instead of simply switching to "RUNNING" as if everything was ok… > Streams does not warn about missing input topics, but hangs > --- > > Key: KAFKA-6437 > URL: https://issues.apache.org/jira/browse/KAFKA-6437 > Project: Kafka > Issue Type: Improvement > Components: streams >Affects Versions: 1.0.0 > Environment: Single client on single node broker >Reporter: Chris Schwarzfischer >Priority: Minor > > *Case* > Streams application with two input topics being used for a left join. > When the left side topic is missing upon starting the streams application, it > hangs "in the middle" of the topology (at …9, see below). Only parts of > the intermediate topics are created (up to …9) > When the missing input topic is created, the streams application resumes > processing. > {noformat} > Topology: > StreamsTask taskId: 2_0 > ProcessorTopology: > KSTREAM-SOURCE-11: > topics: > [mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition] > children: [KTABLE-AGGREGATE-12] > KTABLE-AGGREGATE-12: > states: > [KTABLE-AGGREGATE-STATE-STORE-09] > children: [KTABLE-TOSTREAM-20] > KTABLE-TOSTREAM-20: > children: [KSTREAM-SINK-21] > KSTREAM-SINK-21: > topic: data_udr_month_customer_aggregration > KSTREAM-SOURCE-17: > topics: > [mystreams_app-KSTREAM-MAP-14-repartition] > children: [KSTREAM-LEFTJOIN-18] > KSTREAM-LEFTJOIN-18: > states: > [KTABLE-AGGREGATE-STATE-STORE-09] > children: [KSTREAM-SINK-19] > KSTREAM-SINK-19: > topic: data_UDR_joined > Partitions [mystreams_app-KSTREAM-MAP-14-repartition-0, > mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition-0] > {noformat} > *Why this matters* > The applications does quite a lot of preprocessing before joining with the > missing input topic. This preprocessing won't happen without the topic, > creating a huge backlog of data. > *Fix* > Issue an `warn` or `error` level message at start to inform about the missing > topic and it's consequences. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-6437) Streams does not warn about missing input topics, but hangs
[ https://issues.apache.org/jira/browse/KAFKA-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Schwarzfischer updated KAFKA-6437: Issue Type: Improvement (was: Bug) > Streams does not warn about missing input topics, but hangs > --- > > Key: KAFKA-6437 > URL: https://issues.apache.org/jira/browse/KAFKA-6437 > Project: Kafka > Issue Type: Improvement > Components: streams >Affects Versions: 1.0.0 > Environment: Single client on single node broker >Reporter: Chris Schwarzfischer >Priority: Minor > > *Case* > Streams application with two input topics being used for a left join. > When the left side topic is missing upon starting the streams application, it > hangs "in the middle" of the topology (at …9, see below). Only parts of > the intermediate topics are created (up to …9) > When the missing input topic is created, the streams application resumes > processing. > {noformat} > Topology: > StreamsTask taskId: 2_0 > ProcessorTopology: > KSTREAM-SOURCE-11: > topics: > [mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition] > children: [KTABLE-AGGREGATE-12] > KTABLE-AGGREGATE-12: > states: > [KTABLE-AGGREGATE-STATE-STORE-09] > children: [KTABLE-TOSTREAM-20] > KTABLE-TOSTREAM-20: > children: [KSTREAM-SINK-21] > KSTREAM-SINK-21: > topic: data_udr_month_customer_aggregration > KSTREAM-SOURCE-17: > topics: > [mystreams_app-KSTREAM-MAP-14-repartition] > children: [KSTREAM-LEFTJOIN-18] > KSTREAM-LEFTJOIN-18: > states: > [KTABLE-AGGREGATE-STATE-STORE-09] > children: [KSTREAM-SINK-19] > KSTREAM-SINK-19: > topic: data_UDR_joined > Partitions [mystreams_app-KSTREAM-MAP-14-repartition-0, > mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition-0] > {noformat} > *Why this matters* > The applications does quite a lot of preprocessing before joining with the > missing input topic. This preprocessing won't happen without the topic, > creating a huge backlog of data. > *Fix* > Issue an `warn` or `error` level message at start to inform about the missing > topic and it's consequences. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (KAFKA-6437) Streams does not warn about missing input topics, but hangs
[ https://issues.apache.org/jira/browse/KAFKA-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Schwarzfischer updated KAFKA-6437: Description: *Case* Streams application with two input topics being used for a left join. When the left side topic is missing upon starting the streams application, it hangs "in the middle" of the topology (at …9, see below). Only parts of the intermediate topics are created (up to …9) When the missing input topic is created, the streams application resumes processing. {noformat} Topology: StreamsTask taskId: 2_0 ProcessorTopology: KSTREAM-SOURCE-11: topics: [mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition] children: [KTABLE-AGGREGATE-12] KTABLE-AGGREGATE-12: states: [KTABLE-AGGREGATE-STATE-STORE-09] children: [KTABLE-TOSTREAM-20] KTABLE-TOSTREAM-20: children: [KSTREAM-SINK-21] KSTREAM-SINK-21: topic: data_udr_month_customer_aggregration KSTREAM-SOURCE-17: topics: [mystreams_app-KSTREAM-MAP-14-repartition] children: [KSTREAM-LEFTJOIN-18] KSTREAM-LEFTJOIN-18: states: [KTABLE-AGGREGATE-STATE-STORE-09] children: [KSTREAM-SINK-19] KSTREAM-SINK-19: topic: data_UDR_joined Partitions [mystreams_app-KSTREAM-MAP-14-repartition-0, mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition-0] {noformat} *Why this matters* The applications does quite a lot of preprocessing before joining with the missing input topic. This preprocessing won't happen without the topic, creating a huge backlog of data. *Fix* Issue an `warn` or `error` level message at start to inform about the missing topic and it's consequences. was: *Case* Streams application with two input topics being used for a left join. When the left side topic is missing upon starting the streams application, it hangs "in the middle" of the topology (at …9, see below). Only parts of the intermediate topics are created (up to …9) When the missing input topic is created, the streams application resumes processing. {noformat} Topology: StreamsTask taskId: 2_0 ProcessorTopology: KSTREAM-SOURCE-11: topics: [mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition] children: [KTABLE-AGGREGATE-12] KTABLE-AGGREGATE-12: states: [KTABLE-AGGREGATE-STATE-STORE-09] children: [KTABLE-TOSTREAM-20] KTABLE-TOSTREAM-20: children: [KSTREAM-SINK-21] KSTREAM-SINK-21: topic: faxout_udr_month_customer_aggregration KSTREAM-SOURCE-17: topics: [mystreams_app-KSTREAM-MAP-14-repartition] children: [KSTREAM-LEFTJOIN-18] KSTREAM-LEFTJOIN-18: states: [KTABLE-AGGREGATE-STATE-STORE-09] children: [KSTREAM-SINK-19] KSTREAM-SINK-19: topic: data_UDR_joined Partitions [mystreams_app-KSTREAM-MAP-14-repartition-0, mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition-0] {noformat} *Why this matters* The applications does quite a lot of preprocessing before joining with the missing input topic. This preprocessing won't happen without the topic, creating a huge backlog of data. *Fix* Issue an `warn` or `error` level message at start to inform about the missing topic and it's consequences. > Streams does not warn about missing input topics, but hangs > --- > > Key: KAFKA-6437 > URL: https://issues.apache.org/jira/browse/KAFKA-6437 > Project: Kafka > Issue Type: Bug > Components: streams >Affects Versions: 1.0.0 > Environment: Single client on single node broker >Reporter: Chris Schwarzfischer >Priority: Minor > > *Case* > Streams application with two input topics being used for a left join. > When the left side topic is missing upon starting the streams application, it > hangs "in the mid
[jira] [Created] (KAFKA-6437) Streams does not warn about missing input topics, but hangs
Chris Schwarzfischer created KAFKA-6437: --- Summary: Streams does not warn about missing input topics, but hangs Key: KAFKA-6437 URL: https://issues.apache.org/jira/browse/KAFKA-6437 Project: Kafka Issue Type: Bug Components: streams Affects Versions: 1.0.0 Environment: Single client on single node broker Reporter: Chris Schwarzfischer Priority: Minor *Case* Streams application with two input topics being used for a left join. When the left side topic is missing upon starting the streams application, it hangs "in the middle" of the topology (at …9, see below). Only parts of the intermediate topics are created (up to …9) When the missing input topic is created, the streams application resumes processing. {noformat} Topology: StreamsTask taskId: 2_0 ProcessorTopology: KSTREAM-SOURCE-11: topics: [mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition] children: [KTABLE-AGGREGATE-12] KTABLE-AGGREGATE-12: states: [KTABLE-AGGREGATE-STATE-STORE-09] children: [KTABLE-TOSTREAM-20] KTABLE-TOSTREAM-20: children: [KSTREAM-SINK-21] KSTREAM-SINK-21: topic: faxout_udr_month_customer_aggregration KSTREAM-SOURCE-17: topics: [mystreams_app-KSTREAM-MAP-14-repartition] children: [KSTREAM-LEFTJOIN-18] KSTREAM-LEFTJOIN-18: states: [KTABLE-AGGREGATE-STATE-STORE-09] children: [KSTREAM-SINK-19] KSTREAM-SINK-19: topic: data_UDR_joined Partitions [mystreams_app-KSTREAM-MAP-14-repartition-0, mystreams_app-KTABLE-AGGREGATE-STATE-STORE-09-repartition-0] {noformat} *Why this matters* The applications does quite a lot of preprocessing before joining with the missing input topic. This preprocessing won't happen without the topic, creating a huge backlog of data. *Fix* Issue an `warn` or `error` level message at start to inform about the missing topic and it's consequences. -- This message was sent by Atlassian JIRA (v6.4.14#64029)