Re: Graceful Shutdown on Kafka Streams exception handler

2025-09-03 Thread Matthias J. Sax
able to deliver. There is many configs, so maybe there is a way to tune your app to make it work with EOS? -Matthias On 9/3/25 10:50 AM, Victor Osorio wrote: Hello everyone, We’re currently using Kafka Streams to process transactional data with *exactly-once semantics (EOS)*. However, for some

Graceful Shutdown on Kafka Streams exception handler

2025-09-03 Thread Victor Osorio
Hello everyone, We're currently using Kafka Streams to process transactional data with exactly-once semantics (EOS). However, for some of our workloads, we require higher throughput, which makes EOS impractical. To ensure data integrity, we rely on UncaughtExceptionHandle

Re: Optimizing Kafka Streams Validation for Array Elements

2025-02-27 Thread Paweł Szymczyk
topics source and sink (no >>> additional repartition and changelog), and as I mentioned before we don't >>> like to introduce manual validation in the flatMap (like if null >>> statements) method, >>> we would like to use automatic json schema validation feature f

Re: Optimizing Kafka Streams Validation for Array Elements

2025-02-25 Thread Paweł Szymczyk
>> >>> Ideal solution has two topics, can we somehow do flatMap and change a key >>> without changing creating the internal repartition topic? This will allow >>> us to skip using the internal repartition topic as source for k table. >>> >>> >

Re: Optimizing Kafka Streams Validation for Array Elements

2025-02-25 Thread Paweł Szymczyk
;> napisał(a): >> >>> Ideal solution has two topics, can we somehow do flatMap and change a key >>> without changing creating the internal repartition topic? This will allow >>> us to skip using the internal repartition topic as source for k table. >>>

Re: Optimizing Kafka Streams Validation for Array Elements

2025-02-25 Thread Bruno Cadonna
topic and from the repartition topic also do the validation of the JSON? Regarding point 2 and 3: I agree that the dependency on the name of the repartition is not a good idea. The naming of the repartition topic is rather an implementation detail that should not be leaked. You could improve o

Re: Optimizing Kafka Streams Validation for Array Elements

2025-02-25 Thread Paweł Szymczyk
rom the repartition topic also do the validation of the >> JSON? >> >> Regarding point 2 and 3: >> I agree that the dependency on the name of the repartition is not a good >> idea. The naming of the repartition topic is rather an implementation detail >>

Re: Optimizing Kafka Streams Validation for Array Elements

2025-02-25 Thread Paweł Szymczyk
Then, you can read the >elements again from the output topic and write them into the table. > >Regarding point 1: >Could you do the validation when you write to the elements to the output topic? > >Best, >Bruno > >On 25.02.25 14:20, Paweł Szymczyk wrote: >> Dear Kaf

Re: Optimizing Kafka Streams Validation for Array Elements

2025-02-25 Thread Bruno Cadonna
again from the output topic and write them into the table. Regarding point 1: Could you do the validation when you write to the elements to the output topic? Best, Bruno On 25.02.25 14:20, Paweł Szymczyk wrote: Dear Kafka users, The last few days I spent working with Kafka Streams on some

Optimizing Kafka Streams Validation for Array Elements

2025-02-25 Thread Paweł Szymczyk
Dear Kafka users, The last few days I spent working with Kafka Streams on some tasks which looked very easy at first glance but finally I struggled with the Streams Builder API and did something which I am not proud of. Please help me, I am open to any suggestions. On the input topic we have a

Re: Kafka Streams Consumer Constantly Rebalance over 100k tps

2025-01-22 Thread Matthias J. Sax
the appropriate change (more KS instance, fewer threads, more memory, config change) should help. -Matthias On 1/9/25 7:30 PM, Martinus Elvin wrote: Hello, We have a kafka streams service that performs a left join (KStreams) operation by their message key. The message size is 1 KB more or l

Re: Kafka Streams Consumer Constantly Rebalance over 100k tps

2025-01-21 Thread Martinus Elvin
e have a kafka streams service that performs a left join (KStreams) operation by their message key. The message size is 1 KB more or less. The left topic has around two hundred thousand (200,000) messages per second, while the right topic has around two thousand (2000) message per second. Each t

Re: Kafka Streams Consumer Constantly Rebalance over 100k tps

2025-01-17 Thread Matthias J. Sax
. Both together should give you a starting point to understand what the issue could be, and what the appropriate change (more KS instance, fewer threads, more memory, config change) should help. -Matthias On 1/9/25 7:30 PM, Martinus Elvin wrote: Hello, We have a kafka streams service that

Re: kafka-streams stream-table join with a grace period does not respect passed serializer?

2025-01-10 Thread Matthias J. Sax
error when building/starting the topology. I've put a minimal reproduction below (against kafka-streams 3.7.0, please excuse the Scala). This fails with: org.apache.kafka.streams.errors.StreamsException: failed to initialize processor KSTREAM-LEFTJOIN-

Kafka Streams Consumer Constantly Rebalance over 100k tps

2025-01-09 Thread Martinus Elvin
Hello, We have a kafka streams service that performs a left join (KStreams) operation by their message key. The message size is 1 KB more or less. The left topic has around two hundred thousand (200,000) messages per second, while the right topic has around two thousand (2000) message per

Re: Kafka streams lost messages on repartition topic during rebalancing

2024-12-18 Thread Bill Bejeck
Updating Kafka Streams would be enough. -Bill On Wed, Dec 18, 2024 at 6:50 AM TheKaztek wrote: > Would updating the kafka streams client library be enough ? Or should the > cluster be updated to ? > > wt., 17 gru 2024 o 19:28 Bill Bejeck > napisał(a): > > > Hi, >

Re: Kafka streams lost messages on repartition topic during rebalancing

2024-12-18 Thread TheKaztek
Would updating the kafka streams client library be enough ? Or should the cluster be updated to ? wt., 17 gru 2024 o 19:28 Bill Bejeck napisał(a): > Hi, > > I think you could be hitting KAFKA-17635 > <https://issues.apache.org/jira/browse/KAFKA-17635> which has been fixed >

Re: Kafka streams lost messages on repartition topic during rebalancing

2024-12-17 Thread Bill Bejeck
Hi, I think you could be hitting KAFKA-17635 <https://issues.apache.org/jira/browse/KAFKA-17635> which has been fixed in Kafka Streams v 3.7.2 . It's been released this week, is it possible to upgrade and try it out? -Bill On Tue, Dec 17, 2024 at 4:10 AM TheKaztek wrote: >

Kafka streams lost messages on repartition topic during rebalancing

2024-12-17 Thread TheKaztek
Hi, we have a worrying problem in the project where we use kafka streams. We had an incident where during heavy load on our app (dozens of millions of records in 15 minutes span on an input topic of the stream) we decided to add additional instances of the app (5 -> 10) and some of the alre

Re: AW: Unexpected Tombstone records after kafka streams update 3.7.1

2024-08-30 Thread Matthias J. Sax
ticed something similar or knows more about this. After we updated our Spring Boot Kafka Streams application kafka-streams dependency from 3.6.2 to 3.7.1, we noticed some failing tests. The expected behavior of the streams-processor is, to join two input topics together and produce a single output r

AW: Unexpected Tombstone records after kafka streams update 3.7.1

2024-08-30 Thread Vogel, Kevin, DP EPS, BN, extern, external
An: users@kafka.apache.org Betreff: Re: Unexpected Tombstone records after kafka streams update 3.7.1 Sounds like https://issues.apache.org/jira/browse/KAFKA-16394 -Matthias On 8/29/24 02:35, Vogel, Kevin, DP EPS, BN, extern, external wrote: > Hello there, > > I searched the Apache Jira for a

Re: Unexpected Tombstone records after kafka streams update 3.7.1

2024-08-29 Thread Matthias J. Sax
knows more about this. After we updated our Spring Boot Kafka Streams application kafka-streams dependency from 3.6.2 to 3.7.1, we noticed some failing tests. The expected behavior of the streams-processor is, to join two input topics together and produce a single output record. Then the test inje

Unexpected Tombstone records after kafka streams update 3.7.1

2024-08-29 Thread Vogel, Kevin, DP EPS, BN, extern, external
Hello there, I searched the Apache Jira for a bug report on this topic but couldn't find one. Maybe anyone else has noticed something similar or knows more about this. After we updated our Spring Boot Kafka Streams application kafka-streams dependency from 3.6.2 to 3.7.1, we noticed

Re: Kafka Streams branching return type affects bindings when using multiple output bindings: intentional behaviour?

2024-06-27 Thread Jenny Qiao
iOS<https://aka.ms/o0ukef> From: Matthias J. Sax Sent: Thursday, June 20, 2024 9:25:10 PM To: users@kafka.apache.org Subject: Re: Kafka Streams branching return type affects bindings when using multiple output bindings: intentional behaviour? Jenny, thanks for reach

Re: Kafka Streams branching return type affects bindings when using multiple output bindings: intentional behaviour?

2024-06-20 Thread Matthias J. Sax
mproved the branching return type from KStream<>[] to HashMap. This changes the output binding when using multiple branches in the Kafka Streams binder of Spring Cloud Stream - the bindings are possibly wrong because the map storing the KStream branches is unordered. Before KIP-418, i

Kafka Streams branching return type affects bindings when using multiple output bindings: intentional behaviour?

2024-06-20 Thread Jenny Qiao
Hello, KIP-418 (>=2.8) improved the branching return type from KStream<>[] to HashMap. This changes the output binding when using multiple branches in the Kafka Streams binder of Spring Cloud Stream - the bindings are possibly wrong because the map storing the KStream branches is

Re: Kafka streams stores key in multiple state store instances

2024-05-16 Thread Matthias J. Sax
on count is the same for all input topic). To work around this, you would need to rewrite the program to use either `groupBy((k,v) -> k)` instead of `groupByKey()`, or do a `.repartition().groupByKey()`. Does this make sense? -Matthias On 5/16/24 2:11 AM, Kay Hannay wrote: Hi, we ha

Kafka streams stores key in multiple state store instances

2024-05-16 Thread Kay Hannay
Hi, we have a Kafka streams application which merges (merge, groupByKey, aggretgate) a few topics into one topic. The application is stateful, of course. There are currently six instances of the application running in parallel. We had an issue where one new Topic for aggregation did have

Re: Kafka streams state store return hostname as unavailable when calling queryMetadataForKey method

2024-05-13 Thread Sophie Blee-Goldman
Ah. Well this isn't anything new then since it's been the case since 2.6, but the default task assignor in Kafka Streams will sometimes assign partitions unevenly for a time if it's trying to move around stateful tasks and there's no copy of that task's state on the l

Re: Kafka streams state store return hostname as unavailable when calling queryMetadataForKey method

2024-05-09 Thread Penumarthi Durga Prasad Chowdary
ws and session windows. To > retrieve > > windows efficiently, I've established a remote query mechanism between > > Kafka Streams instances. By leveraging the queryMetadataForKey method on > > streams, I can retrieve the hostname where a specific key

Re: Kafka streams state store return hostname as unavailable when calling queryMetadataForKey method

2024-05-09 Thread Sophie Blee-Goldman
What version did you upgrade from? On Wed, May 8, 2024 at 10:32 PM Penumarthi Durga Prasad Chowdary < prasad.penumar...@gmail.com> wrote: > Hi Team, > I'm utilizing Kafka Streams to handle data from Kafka topics, running > multiple instances with the same applicat

Kafka streams state store return hostname as unavailable when calling queryMetadataForKey method

2024-05-08 Thread Penumarthi Durga Prasad Chowdary
Hi Team, I'm utilizing Kafka Streams to handle data from Kafka topics, running multiple instances with the same application ID. This enables distributed processing of Kafka data across these instances. Furthermore, I've implemented state stores with time windows and session windows. T

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-04-14 Thread Venkatesh Nagarajan
issue (which has most likely been resolved by correcting the METADATA_MAX_AGE_CONFIG setting) from surfacing. Thank you. Kind regards, Venkatesh From: Matthias J. Sax Date: Friday, 5 April 2024 at 3:59 AM To: users@kafka.apache.org Subject: Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems

Re: Can two different kafka clusters be used in kafka streams for consumption from 1 cluster and produce on another cluster

2024-04-11 Thread Bruno Cadonna
Hi Pushkar, unfortunately, cross cluster processing is currently not possible with Kafka Streams. Best, Bruno On 4/11/24 4:13 PM, Pushkar Deole wrote: Hi All, We are using a streams application and currently the application uses a common kafka cluster that is shared along with many other

Can two different kafka clusters be used in kafka streams for consumption from 1 cluster and produce on another cluster

2024-04-11 Thread Pushkar Deole
common cluster but produce the processed the data onto new cluster. Is it possible through kafka streams, wherein the consumer in the streams is consuming from 1 cluster while the sink happens onto another cluster?

Re: Messages disappearing from Kafka Streams topology

2024-04-10 Thread Karsten Stöckmann
Hi Mangat, back to work now. I've configured out Streams applications to use exacly-once semantics, but to no avail. Actually, after some. more investigation I've come to suspect that the issue is somehow related to rebalancing. The initially shown topology lives inside a Quarkus Kaf

kafka-streams stream-table join with a grace period does not respect passed serializer?

2024-04-08 Thread Mickey Donaghy
arting the topology. I've put a minimal reproduction below (against kafka-streams 3.7.0, please excuse the Scala). This fails with: org.apache.kafka.streams.errors.StreamsException: failed to initialize processor KSTREAM-LEFTJOIN-03 at org.apache.kafka.streams.processor.internals.Pro

Re: Fix slow processing rate in Kafka streams

2024-04-05 Thread Matthias J. Sax
Perf tuning is always tricky... 350 rec/sec sounds pretty low though. You would first need to figure out where the bottleneck is. Kafka Streams exposes all kind of metrics: https://kafka.apache.org/documentation/#kafka_streams_monitoring Might be good to inspect them as a first step -- maybe

Fix slow processing rate in Kafka streams

2024-04-04 Thread Nirmal Das
Hi All, My streams application is not processing more than 350 records/sec on a high load of 3milliom records produced every 2-3 minutes. My scenarios are as below - I am on Kafka and streams version of 3.5.1 . My key-value pair is in protobuf format . I do a groupbykey followed by TimeWindow of

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-04-04 Thread Matthias J. Sax
and yourself for the guidance on the stalling issue in the Kafka Streams client. After restoring the default value for the METADATA_MAX_AGE_CONFIG, I haven’t seen the issue happening. Heavy rebalancing (as mentioned before) continues to happen. I will refer to the link which mentions about certa

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-04-03 Thread Venkatesh Nagarajan
Apologies for the delay, Bruno. Thank you so much for the excellent link and for your inputs! Also, I would like to thank Matthias and yourself for the guidance on the stalling issue in the Kafka Streams client. After restoring the default value for the METADATA_MAX_AGE_CONFIG, I haven’t seen

Re: Messages disappearing from Kafka Streams topology

2024-03-28 Thread Karsten Stöckmann
ly-once > > > semantics > > > > already, was unsure whether it would help, and left it aside for once > > > then. > > > > I'll try that immediately when I get back to work. > > > > > > > > About snapshots and deserialization - I

Re: Messages disappearing from Kafka Streams topology

2024-03-27 Thread mangat rai
ork. > > > > > > About snapshots and deserialization - I doubt that the issue is caused > by > > > deserialization failures because: when taking another (i.e. at a later > > > point of time) snapshot of the exact same data, all messages fed into

Re: Messages disappearing from Kafka Streams topology

2024-03-27 Thread Karsten Stöckmann
would help, and left it aside for once > then. > > I'll try that immediately when I get back to work. > > > > About snapshots and deserialization - I doubt that the issue is caused by > > deserialization failures because: when taking another (i.e. at a later > > poin

Re: Messages disappearing from Kafka Streams topology

2024-03-26 Thread mangat rai
- I doubt that the issue is caused by > deserialization failures because: when taking another (i.e. at a later > point of time) snapshot of the exact same data, all messages fed into the > input topic pass the pipeline as expected. > > Logs of both Kafka and Kafka Streams show no signs

Re: Messages disappearing from Kafka Streams topology

2024-03-26 Thread Karsten Stöckmann
ed by deserialization failures because: when taking another (i.e. at a later point of time) snapshot of the exact same data, all messages fed into the input topic pass the pipeline as expected. Logs of both Kafka and Kafka Streams show no signs of notable issues as far as I can tell, apart from these

Re: Messages disappearing from Kafka Streams topology

2024-03-26 Thread mangat rai
Hey Karsten, There could be several reasons this could happen. 1. Did you check the error logs? There are several reasons why the Kafka stream app may drop incoming messages. Use exactly-once semantics to limit such cases. 2. Are you sure there was no error when deserializing the records from `fol

Re: Messages disappearing from Kafka Streams topology

2024-03-26 Thread Karsten Stöckmann
Hi, thanks for getting back. I'll try and illustrate the issue. I've got an input topic 'folderTopicName' fed by a database CDC system. Messages then pass a series of FK left joins and are eventually sent to an output topic like this ('agencies' and 'documents' being KTables): strea

Re: Messages disappearing from Kafka Streams topology

2024-03-25 Thread Bruno Cadonna
Hi, That sounds worrisome! Could you please provide us with a minimal example that shows the issue you describe? That would help a lot! Best, Bruno On 3/25/24 4:07 PM, Karsten Stöckmann wrote: Hi, are there circumstances that might lead to messages silently (i.e. without any logged warnin

Messages disappearing from Kafka Streams topology

2024-03-25 Thread Karsten Stöckmann
Hi, are there circumstances that might lead to messages silently (i.e. without any logged warnings or errors) disappearing from a topology? Specifically, I've got a rather simple topology doing a series of FK left joins and notice severe message loss in case the application is fired up for the fi

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-22 Thread Bruno Cadonna
/19/24 4:14 AM, Venkatesh Nagarajan wrote: Thanks very much for sharing the links and for your important inputs, Bruno! We recommend to use as many stream threads as cores on the compute node where the Kafka Streams client is run. How many Kafka Streams tasks do you have to distribute over

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-18 Thread Venkatesh Nagarajan
Thanks very much for sharing the links and for your important inputs, Bruno! > We recommend to use as many stream threads as cores on the compute node > where the Kafka Streams client is run. How many Kafka Streams tasks do you > have to distribute over the clients? We use 1vCPU (p

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-15 Thread Bruno Cadonna
Hi Venkatesh, As you discovered, in Kafka Streams 3.5.1 there is no stop-the-world rebalancing. Static group member is helpful when Kafka Streams clients are restarted as you pointed out. > ERROR org.apache.kafka.streams.processor.internals.StandbyTask - stream-thread [-StreamThrea

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-14 Thread Venkatesh Nagarajan
Just want to make a correction, Bruno - My understanding is that Kafka Streams 3.5.1 uses Incremental Cooperative Rebalancing which seems to help reduce the impact of rebalancing caused by autoscaling etc.: https://www.confluent.io/blog/incremental-cooperative-rebalancing-in-kafka/ Static

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-14 Thread Venkatesh Nagarajan
? Thank you very much. Kind regards, Venkatesh From: Bruno Cadonna Date: Wednesday, 13 March 2024 at 8:29 PM To: users@kafka.apache.org Subject: Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled Hi Venkatesh, Extending on what Matthias replied, a metadata refresh might

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-13 Thread Bruno Cadonna
u have any suggestions on how to reduce such rebalancing, that will be very helpful. Thank you very much. Kind regards, Venkatesh From: Matthias J. Sax Date: Tuesday, 12 March 2024 at 1:31 pm To: users@kafka.apache.org Subject: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stall

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-12 Thread Venkatesh Nagarajan
: Kafka Streams 3.5.1 based app seems to get stalled Thanks very much for your important inputs, Matthias. I will use the default METADATA_MAX_AGE_CONFIG. I set it to 5 hours when I saw a lot of such rebalancing related messages in the MSK broker logs: INFO [GroupCoordinator 2]: Preparing to

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-12 Thread Venkatesh Nagarajan
, that will be very helpful. Thank you very much. Kind regards, Venkatesh From: Matthias J. Sax Date: Tuesday, 12 March 2024 at 1:31 pm To: users@kafka.apache.org Subject: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled Without detailed logs (maybe even DEBUG) hard to say. But

Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-11 Thread Matthias J. Sax
metadata refresh does not trigger a rebalance. -Matthias On 3/10/24 5:56 PM, Venkatesh Nagarajan wrote: Hi all, A Kafka Streams application sometimes stops consuming events during load testing. Please find below the details: Details of the app: * Kafka Streams Version: 3.5.1

Kafka Streams 3.5.1 based app seems to get stalled

2024-03-10 Thread Venkatesh Nagarajan
Hi all, A Kafka Streams application sometimes stops consuming events during load testing. Please find below the details: Details of the app: * Kafka Streams Version: 3.5.1 * Kafka: AWS MSK v3.6.0 * Consumes events from 6 topics * Calls APIs to enrich events * Sometimes

Re: Kafka Streams: understanding re-key operations for joins

2024-02-23 Thread Karsten Stöckmann
Case closed, behaviour is actually as expected. - The source topic contains multiplied data that gets propagated into the join just as it should. I'm leveraging a stream processor for deduplication now. Best wishes Karsten Vikram Singh schrieb am Fr., 23. Feb. 2024, 12:13: > +Ajit Kharpude > >

Re: Kafka Streams: understanding re-key operations for joins

2024-02-23 Thread Vikram Singh
+Ajit Kharpude On Fri, Feb 23, 2024 at 1:14 PM Karsten Stöckmann < karsten.stoeckm...@gmail.com> wrote: > Hi, > > I am observing somewhat unexpected (from my point of view) behaviour > while ke-key / re-partitioning operations in order to prepare a > KTable-KTable join. > > Assume two (simplifie

Kafka Streams: understanding re-key operations for joins

2024-02-22 Thread Karsten Stöckmann
Hi, I am observing somewhat unexpected (from my point of view) behaviour while ke-key / re-partitioning operations in order to prepare a KTable-KTable join. Assume two (simplified) source data structures from two respective topics: class User { Long id; // PK String name; } class Attribute

Kafka Streams output topic configuration (specifically: Quarkus)

2024-02-14 Thread Karsten Stöckmann
Hi, is anyone here familiar with Quarkus Kafka Streams applications? If so - is there a way to control output topic configuration when streaming aggregate data into a sink like so: KTable aggregate = ...; aggregate.toStream().to("topic", ); -> Can I programmatically (or by appli

Re: What does kafka streams groupBy does internally?

2024-01-30 Thread Matthias J. Sax
I want to understand whether kafka streams can solve this efficiently. Since the number of files is unbound how would kafka manage intermediate topics for groupBy operation? How many partitions will it use etc? Can't find this details in the docs. Also let's say chunk has a flag th

Re: Kafka Streams LEFT JOIN multiple tables

2024-01-24 Thread Slathia p
discussion with Partner Manager. > On 01/24/2024 11:50 PM +08 Dharin Shah wrote: > > > Hi Karsten, > > Before delving deeper into Kafka Streams, it's worth considering if direct > aggregation in the database might be a more straightforward solution, > unless there'

Re: Kafka Streams LEFT JOIN multiple tables

2024-01-24 Thread Slathia p
not yet familiar with size > and performance requirements in Kafka as we are still somewhere at the > beginning of implementing our indexing solution. Initial Debezium snapshots > were quite fast from my point of view, resulting in overall broker disk > usage of 35Gi on 3 replicas ea

Re: Kafka Streams LEFT JOIN multiple tables

2024-01-24 Thread Karsten Stöckmann
n overall broker disk usage of 35Gi on 3 replicas each. The intended Kafka Streams application is based on Quarkus and its Streams extension. To my knowledge, it uses RocksDB internally. Concerning State Store management - I haven't applied any special configuration yet so I guess there are

Re: Kafka Streams LEFT JOIN multiple tables

2024-01-24 Thread Dharin Shah
Hi Karsten, Before delving deeper into Kafka Streams, it's worth considering if direct aggregation in the database might be a more straightforward solution, unless there's a compelling reason to avoid it. Aggregating data at the database level often leads to more efficient and ma

Kafka Streams LEFT JOIN multiple tables

2024-01-24 Thread Karsten Stöckmann
ation and label values (i.e. aggregated as lists), what would be the most elegant solution leveraging Kafka Streams? Note that customers do not necessisarily have any communication or label at all, thus non-key joins are out of the game as far as I understand. Our initial (naive) solution was to r

What does kafka streams groupBy does internally?

2024-01-24 Thread warrior2031
m DSL that might look liketopic('chunks') .groupByKey((fileId, chunk) -> fileId) .sortBy((fileId, chunk) -> chunk.offset) .aggregate((fileId, chunk) -> store.append(fileId, chunk)); I want to understand whether kafka streams can solve this efficiently. Since the number

Re: [Kafka Streams], Processor API for KTable and KGroupedStream

2024-01-17 Thread Matthias J. Sax
no `transform()` because keying must be preserved -- if you want to change the keying you  need to use `KTable#groupBy()` (data needs to be repartitioned if you change the key). HTH. -Matthias On 1/12/24 11:47 AM, Igor Maznitsa wrote: Hello Is there any way in Kafka Streams API to define proces

Re: [Kafka Streams], Processor API for KTable and KGroupedStream

2024-01-13 Thread Igor Maznitsa
no `transform()` because keying must be preserved -- if you want to change the keying you  need to use `KTable#groupBy()` (data needs to be repartitioned if you change the key). HTH. -Matthias On 1/12/24 11:47 AM, Igor Maznitsa wrote: Hello Is there any way in Kafka Streams API to define p

Re: [Kafka Streams], Processor API for KTable and KGroupedStream

2024-01-12 Thread Matthias J. Sax
to change the keying you need to use `KTable#groupBy()` (data needs to be repartitioned if you change the key). HTH. -Matthias On 1/12/24 11:47 AM, Igor Maznitsa wrote: Hello Is there any way in Kafka Streams API to define processors for KTable and KGroupedStream like KStream#transform

[Kafka Streams], Processor API for KTable and KGroupedStream

2024-01-12 Thread Igor Maznitsa
Hello Is there any way in Kafka Streams API to define processors for KTable and KGroupedStream like KStream#transform? How to provide a custom processor for KTable or KGroupedStream which could for instance provide way to not downstream selected events? -- Igor Maznitsa email: rrg4

Re: Kafka Streams reaching ERROR state during rolling upgrade / restart of brokers

2023-11-02 Thread Debraj Manna
ion-handler via > > KafkaStreamssetUncaughtExceptionHandler(...) > > > -Matthias > > On 10/2/23 12:11 PM, Debraj Manna wrote: > > Are you suggesting to check the Kafka broker logs? I do not see any other > > errors logs on the client / application side. > > > > On Fri, 29 Sep, 2023, 22:01 Matthi

Kafka Streams consumer application observing hotspotting during deployments

2023-11-01 Thread Sabit Nepal
Hello, We have a Kafka Streams consumer application running Kafka Streams 3.4, with our Kafka brokers running 2.6. This consumer application consumes from two topics with 250 partitions each, and we co-partition them to ensure each task is consuming from the same partitions in each topic. Some

Re: Kafka Streams reaching ERROR state during rolling upgrade / restart of brokers

2023-10-02 Thread Matthias J. Sax
the Kafka broker logs? I do not see any other errors logs on the client / application side. On Fri, 29 Sep, 2023, 22:01 Matthias J. Sax, wrote: In general, Kafka Streams should keep running. Can you inspect the logs to figure out why it's going into ERROR state to begin with? Maybe you ne

Re: Kafka Streams reaching ERROR state during rolling upgrade / restart of brokers

2023-10-02 Thread Debraj Manna
Are you suggesting to check the Kafka broker logs? I do not see any other errors logs on the client / application side. On Fri, 29 Sep, 2023, 22:01 Matthias J. Sax, wrote: > In general, Kafka Streams should keep running. > > Can you inspect the logs to figure out why it's going in

Re: Kafka Streams reaching ERROR state during rolling upgrade / restart of brokers

2023-09-29 Thread Matthias J. Sax
In general, Kafka Streams should keep running. Can you inspect the logs to figure out why it's going into ERROR state to begin with? Maybe you need to increase/change some timeouts/retries configs. The stack trace you shared, is a symptom, but not the root cause. -Matthias On 9/21/23

Kafka Streams reaching ERROR state during rolling upgrade / restart of brokers

2023-09-21 Thread Debraj Manna
I am using Kafka broker 2.8.1 (from AWS MSK) with Kafka clients and Kafka stream 3.5.1. I am observing that whenever some rolling upgrade is done on AWS MSK our stream application reaches an error state. I get the below exception on trying to query the state store caused by: java.lang.IllegalStat

Re: adding enum value in kafka streams

2023-09-20 Thread M M
t; Bruno > > On 9/20/23 12:22 AM, M M wrote: > > Hello, > > > > This is my first time asking a question on a mailing list, so please > > forgive me any inaccuracies. > > > > I am having a Kafka Streams application with a Punctuator. > > Inside th

Re: adding enum value in kafka streams

2023-09-20 Thread Bruno Cadonna
Kafka Streams application with a Punctuator. Inside the punctuate() I have this code: // fooStore is a state store for (FooKey fooKey : fooKeysRepository.get(Type.FOO)) { Foo foo = fooStore.get(fooKey); // foo != null This returns a Foo which is not null. FooKey has fields: [string(nullable

adding enum value in kafka streams

2023-09-19 Thread M M
Hello, This is my first time asking a question on a mailing list, so please forgive me any inaccuracies. I am having a Kafka Streams application with a Punctuator. Inside the punctuate() I have this code: // fooStore is a state store for (FooKey fooKey : fooKeysRepository.get(Type.FOO)) { Foo

Re: Kafka Streams, read standby time window store

2023-09-07 Thread Bruno Cadonna
ameAndType(TABLE_NAME, queryableStoreType).enableStaleStores() as described in the blog post? 2. Since you are querying a stale store, could it be the the standby hasn't caught up yet? Best, Bruno On 9/6/23 8:32 AM, Igor Maznitsa wrote: Hello 1. I am starting two Kafka Streams applications

Re: Kafka Streams, read standby time window store

2023-09-06 Thread Igor Maznitsa
he standby hasn't caught up yet? Best, Bruno On 9/6/23 8:32 AM, Igor Maznitsa wrote: Hello 1. I am starting two Kafka Streams applications worked in same group with num.standby.replicas=1 2. Application A has active TimeWindow data store and application B has the standby version of the

Re: Kafka Streams, read standby time window store

2023-09-06 Thread Bruno Cadonna
are querying a stale store, could it be the the standby hasn't caught up yet? Best, Bruno On 9/6/23 8:32 AM, Igor Maznitsa wrote: Hello 1. I am starting two Kafka Streams applications worked in same group with num.standby.replicas=1 2. Application A has active TimeWindow data stor

Kafka Streams, read standby time window store

2023-09-05 Thread Igor Maznitsa
Hello 1. I am starting two Kafka Streams applications worked in same group with num.standby.replicas=1 2. Application A has active TimeWindow data store and application B has the standby version of the data store Is there any way to read the standby store time window data in bounds of B

Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-09-05 Thread Matthias J. Sax
ed. That is the reason why you get those incorrect alerts -- Kafka cannot know that you stopped consuming from those topics. (That is what I tried to explain -- seems I did a bad job...) Changing the group.id is tricky because Kafka Streams uses it to identify internal topic names (for repartito

Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-09-05 Thread Pushkar Deole
hose incorrect alerts -- Kafka cannot know > that you stopped consuming from those topics. (That is what I tried to > explain -- seems I did a bad job...) > > Changing the group.id is tricky because Kafka Streams uses it to > identify internal topic names (for repartiton and chagnelog topi

Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-09-04 Thread Matthias J. Sax
Kafka Streams uses it to identify internal topic names (for repartiton and chagnelog topics), and thus your app would start with newly created (and thus empty topics). -- You might want to restart the app with `auto.offset.reset = "earliest"` and reprocess all available input to re-cr

Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-08-19 Thread Pushkar Deole
@matthias what are the alternatives to get rid of this issue? When the lag starts increasing, we have alerts configured on our monitoring system in Datadog which starts sending alerts and alarms to reliability teams. I know in kafka the inactive consumer group is cleared up after 7 days however no

Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-08-16 Thread Matthias J. Sax
Well, it's kinda expected behavior. It's a split brain problem. In the end, you use the same `application.id / group.id` and thus the committed offsets for the removed topics are still in `__consumer_offsets` topics and associated with the consumer group. If a tool inspects lags and compares

Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-08-16 Thread Pushkar Deole
Hi streams Dev community @matthias, @bruno Any inputs on above issue? Is this a bug in the streams library wherein the input topic removed from streams processor topology, the underlying consumer group still reporting lag against those? On Wed, Aug 9, 2023 at 4:38 PM Pushkar Deole wrote: > Hi

kafka streams consumer group reporting lag even on source topics removed from topology

2023-08-09 Thread Pushkar Deole
Hi All, I have a streams application with 3 instances with application-id set to applicationV1. The application uses processor API with reading from source topics, processing the data and writing to destination topic. Currently it consumes from 6 source topics however we don't need to process data

Re: kafka streams re-partitioning on incoming events

2023-07-24 Thread Pushkar Deole
process them in that order"? > >> > >> The order of the events from the same source partition (partition before > >> repartitioning) that have the same call ID (or more generally that end > >> up in the same partition after repartitioning) will be prese

Re: kafka streams re-partitioning on incoming events

2023-07-14 Thread Bruno Cadonna
served but Kafka does not guarantee the order of events from different source partitions. Best, Bruno On 7/9/23 2:45 PM, Pushkar Deole wrote: Hi, We have a kafka streams application that consumes from multiple topic with different keys. Before processing these events in the application, we w

Re: kafka streams re-partitioning on incoming events

2023-07-14 Thread Pushkar Deole
source > partitions. > > Best, > Bruno > > On 7/9/23 2:45 PM, Pushkar Deole wrote: > > Hi, > > > > We have a kafka streams application that consumes from multiple topic > with > > different keys. Before processing these events in the application, we >

Re: kafka streams re-partitioning on incoming events

2023-07-14 Thread Bruno Cadonna
Bruno On 7/9/23 2:45 PM, Pushkar Deole wrote: Hi, We have a kafka streams application that consumes from multiple topic with different keys. Before processing these events in the application, we want to repartition those events on a single key that will ensure related events are process

Re: kafka streams re-partitioning on incoming events

2023-07-13 Thread Pushkar Deole
Hello, *Kafka dev community, @matthiasJsax* Can you comment on below question? It is very important for us since we are getting inconsistencies due to current design On Sun, Jul 9, 2023 at 6:15 PM Pushkar Deole wrote: > Hi, > > We have a kafka streams application that consumes from

  1   2   3   4   5   6   7   8   9   10   >