Re: outerjoin not joining after window

2024-05-22 Thread Matthias J. Sax
of the application will process records? For example, if the input topics each have 6 partitions, and I use the repartition method to set the number of partitions for the streams to 2, how many instances of the application will process records? Thanks, Chad On Wed, May 1, 2024 at 6:47 PM Matthias J. Sax

Re: Request to be added to kafka contributors list

2024-05-21 Thread Matthias J. Sax
From: Matthias J. Sax Sent: Saturday, May 18, 2024 4:06 To: users@kafka.apache.org Subject: Re: Reply: Request to be added to kafka contributors list Did you sign out and sign in again? On 5/17/24 9:49 AM, Yang Fan wrote: Thanks Matthias, I still can't find "Assign to me" button beside A

Re: Fwd: Request to be added to kafka contributors list

2024-05-20 Thread Matthias J. Sax
Done. You should be all set :) -Matthias On 5/20/24 10:10 AM, bou...@ulukai.net wrote: Dear Apache Kafka Team,     I hope to post in the right place: my name is Franck LEDAY, under Apache-Jira ID "handfreezer".     I opened an issue as Improvement KAFKA-16707 but I failed to assigned

Re: Request for contributor list

2024-05-20 Thread Matthias J. Sax
What is your Jira ID? -Matthias On 5/20/24 9:55 AM, Brenden Deluna wrote: Hello, I am requesting to be added to the contributor list to take care of some tickets. Thank you.

Re: Release plan required

2024-05-20 Thread Matthias J. Sax
Zookeeper is already deprecated (since 3.5): https://kafka.apache.org/documentation/#zk_depr It's planned to be fully removed in 4.0 release. It's not confirmed yet, but there is a high probability that there won't be a 3.9 release, and that 4.0 will follow 3.8. -Matthias On 5/20/24 2:11

Re: Reply: Request to be added to kafka contributors list

2024-05-17 Thread Matthias J. Sax
Did you sign out and sign in again? On 5/17/24 9:49 AM, Yang Fan wrote: Thanks Matthias, I still can't find "Assign to me" button beside Assignee and Reporter. Could you help me set it again? Best regards, Fan From: Matthias J. Sax Sent: May 17, 2024

Re: Kafka streams stores key in multiple state store instances

2024-05-16 Thread Matthias J. Sax
Hello Kay, What you describe is "by design" -- unfortunately. The problem is, that when we build the `Topology` we don't know the partition count of the input topics, and thus, StreamsBuilder cannot insert a repartition topic for this case (we always assume that the partition count is the

Re: Request to be added to kafka contributors list

2024-05-16 Thread Matthias J. Sax
Thanks for reaching out Yang. You should be all set. -Matthias On 5/16/24 7:40 AM, Yang Fan wrote: Dear Apache Kafka Team, I hope this email finds you well. My name is Fan Yang, JIRA ID is fanyan, I kindly request to be added to the contributors list for Apache Kafka. Being part of this

Re: Query regarding groupbykey in streams

2024-05-15 Thread Matthias J. Sax
If I read this correctly, your upstream producer which writes into the input topic of your KS app is using a custom partitioner? If you do a `groupByKey()` and change the key upstream, it would result in a repartition step, which would fall back to the default partitioner. If you want to use a

Re: Failed to initialize processor KSTREAM-AGGREGATE-0000000001

2024-05-03 Thread Matthias J. Sax
-3910-4c25-bfad-ea2b98953db3-StreamThread-9 Message: [Consumer clientId=kafka-streams-exec-0-test-store-6d676cf0-3910-4c25-bfad-ea2b98953db3-StreamThread-9-consumer, groupId=kafka-streams-exec-0-test-store ] Node 102 disconnected. On Mon, Apr 22, 2024 at 7:16 AM Matthias J. Sax wrote

Re: outerjoin not joining after window

2024-05-01 Thread Matthias J. Sax
oes process one sided joins after the skipped record. Do you have any docs on the "dropper records" metric? I did a Google search and didn't find many good results for that. Thanks, Chad On Tue, Apr 30, 2024 at 2:49 PM Matthias J. Sax wrote: Thanks for the information. I ran the cod

Re: outerjoin not joining after window

2024-04-30 Thread Matthias J. Sax
n on to tell me what is going on? Basically, I'm looking for some pointers on where I can start looking. Thanks, Chad On Tue, Apr 30, 2024 at 10:26 AM Matthias J. Sax wrote: I expect the join to execute after the 25 with one side of the join containing a record and the other being null Give

Re: outerjoin not joining after window

2024-04-30 Thread Matthias J. Sax
I expect the join to execute after the 25 with one side of the join containing a record and the other being null Given that you also have a grace period of 5 minutes, the result will only be emitted after the grace-period passed and the window is closed (not when window end time is reached).
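The timing described above can be sketched as simple arithmetic. This is a simplified model, not Kafka Streams code: the function name, the millisecond units, and the 25-second window size are assumptions made for illustration (the thread does not state the unit of "25"); only the 5-minute grace period comes from the message.

```python
def earliest_unmatched_emit_time(record_ts_ms: int, window_size_ms: int, grace_ms: int) -> int:
    """Return the stream time at which the join window for a record closes.

    Simplified model: a record without a join partner can only be emitted
    (with null on the other side) once stream time has passed
    window end + grace period, not when the window end itself is reached.
    """
    window_end_ms = record_ts_ms + window_size_ms
    return window_end_ms + grace_ms


# Hypothetical values: a 25-second window and the 5-minute grace period.
close_time = earliest_unmatched_emit_time(0, 25_000, 5 * 60_000)
print(close_time)  # 325000
```

Stream time only advances as new records arrive, so in practice the one-sided result appears only once a later record pushes stream time past this point.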

Re: How to find out the end of the session window

2024-04-29 Thread Matthias J. Sax
Did you look into .windowedBy(...).emitStrategy(...) ? Using emit-final you would get a downstream event only after the window closed. -Matthias On 4/29/24 1:43 AM, Santhoshi Mekala wrote: Hi Team, We have the below requirement: We are processing batch logs in kstreams. Currently, we are

Re: [ANNOUNCE] New committer: Igor Soarez

2024-04-24 Thread Matthias J. Sax
Congrats! On 4/24/24 2:29 PM, Bill Bejeck wrote: Congrats Igor! -Bill On Wed, Apr 24, 2024 at 2:37 PM Tom Bentley wrote: Congratulations Igor! On Thu, 25 Apr 2024 at 6:27 AM, Chia-Ping Tsai wrote: Congratulations, Igor! you are one of the best Kafka developers!!! Mickael Maison on

Re: Failed to initialize processor KSTREAM-AGGREGATE-0000000001

2024-04-21 Thread Matthias J. Sax
Not sure either, but it sounds like a bug to me. Can you reproduce this reliably? What version are you using? It would be best if you could file a Jira ticket and we can take it from there. -Matthias On 4/21/24 5:38 PM, Penumarthi Durga Prasad Chowdary wrote: Hi , I have an issue in

Re: Streams group final result: EmitStrategy vs Suppressed

2024-04-18 Thread Matthias J. Sax
The main difference is the internal implementation. Semantically, both are equivalent. suppress() uses an in-memory buffer, while `emitStrategy()` does not, but modifies the upstream aggregation operator impl, and waits to send results downstream, and thus, it's RocksDB based. -Matthias

Re: Is there any recommendation about header max size?

2024-04-18 Thread Matthias J. Sax
I don't think that there is any specific recommendation. However, there is an overall max-message-size config that you need to keep in mind. -Matthias On 4/16/24 9:42 AM, Gabriel Giussi wrote: I have logic in my service to capture exceptions being thrown during message processing and produce

Re: [ANNOUNCE] New Kafka PMC Member: Greg Harris

2024-04-18 Thread Matthias J. Sax
Congrats Greg! On 4/15/24 10:44 AM, Hector Geraldino (BLOOMBERG/ 919 3RD A) wrote: Congrats! Well deserved From: d...@kafka.apache.org At: 04/13/24 14:42:22 UTC-4:00To: d...@kafka.apache.org Subject: [ANNOUNCE] New Kafka PMC Member: Greg Harris Hi all, Greg Harris has been a Kafka

Re: Fix slow processing rate in Kafka streams

2024-04-05 Thread Matthias J. Sax
Perf tuning is always tricky... 350 rec/sec sounds pretty low though. You would first need to figure out where the bottleneck is. Kafka Streams exposes all kinds of metrics: https://kafka.apache.org/documentation/#kafka_streams_monitoring Might be good to inspect them as a first step -- maybe

Re: outerJoin confusion

2024-04-04 Thread Matthias J. Sax
Yeah, that is some quirk of KS runtime... There is some internal config (for perf reasons) that delays emitting results... An alternative to advancing wall-clock time would be to set this internal config to zero, to disable the delay. Maybe we should disable this config when topology test

Re: [EXTERNAL] Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-04-04 Thread Matthias J. Sax
nce group in state PreparingRebalance with old generation (__consumer_offsets-nn) (reason: Updating metadata for member during Stable; client reason: need to revoke partitions and re-join) (kafka.coordinator.group.GroupCoordinator) I am guessing that the two are unrelated. If you have any

Re: [ANNOUNCE] New committer: Christo Lolov

2024-03-27 Thread Matthias J. Sax
Congrats! On 3/26/24 9:39 PM, Christo Lolov wrote: Thank you everyone! It wouldn't have been possible without quite a lot of reviews and extremely helpful inputs from you and the rest of the community! I am looking forward to working more closely with you going forward :) On Tue, 26 Mar 2024

Re: Kafka Streams 3.5.1 based app seems to get stalled

2024-03-11 Thread Matthias J. Sax
Without detailed logs (maybe even DEBUG) hard to say. But from what you describe, it could be a metadata issue? Why are you setting METADATA_MAX_AGE_CONFIG (consumer and producer): 5 hours in millis (to make rebalances rare) Refreshing metadata has nothing to do with rebalances, and a

Re: Join request

2024-02-24 Thread Matthias J. Sax
To subscribe, please follow instructions from the webpage https://kafka.apache.org/contact -Matthias On 2/23/24 1:15 AM, kashi mori wrote: Hi, please add my email to the mailin list

Re: GlobalKTable with RocksDB - queries before state RUNNING?

2024-02-21 Thread Matthias J. Sax
Filed https://issues.apache.org/jira/browse/KAFKA-16295 Also, for global store support, we do have a ticket already: https://issues.apache.org/jira/browse/KAFKA-13523 It's actually a little bit more involved due to position tracking... I guess we might need a KIP to fix this. And yes: if

Re: EOS date for Kafka 3.5.1

2024-02-12 Thread Matthias J. Sax
https://cwiki.apache.org/confluence/display/KAFKA/Time+Based+Release+Plan#TimeBasedReleasePlan-WhatIsOurEOLPolicy? On 2/11/24 8:08 PM, Sahil Sharma D wrote: Hi team, Can you please share the EOS date for Kafka Version 3.5.1? Regards, Sahil

Re: Re-key by multiple properties without composite key

2024-02-07 Thread Matthias J. Sax
regate person. There are 14 sub topologies... - measuring the e2e latency shows values around 600ms which seems rather high to me. Does that sound crazy? ;) Best wishes Karsten On Thu, Feb 1, 2024 at 19:02, Matthias J. Sax wrote: I see. You need to ensure that you get _all_ Person. For this c

Re: Re-key by multiple properties without composite key

2024-02-01 Thread Matthias J. Sax
nd then using that for three independent re-key-operations is not allowed. Best wishes, Karsten On Thu, Feb 1, 2024 at 02:16, Matthias J. Sax wrote: Thanks for the details. This does make sense. So it seems you can read all topics as tables (ie, builder.table("topic") -- no need to so `b

Re: Re-key by multiple properties without composite key

2024-01-31 Thread Matthias J. Sax
I, as I've specified no extra configuration for them. Best wishes, Karsten On Tue, Jan 30, 2024 at 20:03, Matthias J. Sax wrote: Both fk1 and fk2 point to the PK of another entity (not shown for brevity, of no relevance to the question). Is this two independent FKs, or one two-column FK? Ing

Re: What does kafka streams groupBy does internally?

2024-01-30 Thread Matthias J. Sax
Did reply on SO. -Matthias On 1/24/24 2:18 AM, warrior2...@gmail.com wrote: Let's say there's a topic in which chunks of different files are all mixed up represented by a tuple `(FileId, Chunk)`. Chunks of a same file also can be a little out of order. The task is to aggregate all files and

Re: Re-key by multiple properties without composite key

2024-01-30 Thread Matthias J. Sax
Both fk1 and fk2 point to the PK of another entity (not shown for brevity, of no relevance to the question). Is this two independent FKs, or one two-column FK? Ingesting the topic into a Kafka Streams application, how can I re-key the resulting KTable by both fk1 and fk2? If you read the

Re: [Kafka Streams], Processor API for KTable and KGroupedStream

2024-01-17 Thread Matthias J. Sax
tream here ???/* On 2024-Jan-13 01:22, Matthias J. Sax wrote: `KGroupedStream` is just an "intermediate representation" to get a better flow in the DSL. It's not a "top level" abstraction like KStream/KTable. For `KTable` there is `transformValue()` -- there is no `tra

Re: [PROPOSAL] Add commercial support page on website

2024-01-12 Thread Matthias J. Sax
François, thanks for starting this initiative. Personally, I don't think it's necessarily harmful for the project to add such a new page, however, I share the same concerns others raised already. I understand your motivation that people had issues finding commercial support, but I am not

Re: [Kafka Streams], Processor API for KTable and KGroupedStream

2024-01-12 Thread Matthias J. Sax
`KGroupedStream` is just an "intermediate representation" to get a better flow in the DSL. It's not a "top level" abstraction like KStream/KTable. For `KTable` there is `transformValue()` -- there is no `transform()` because keying must be preserved -- if you want to change the keying you

Re: Kafka Streams reaching ERROR state during rolling upgrade / restart of brokers

2023-10-02 Thread Matthias J. Sax
the Kafka broker logs? I do not see any other errors logs on the client / application side. On Fri, 29 Sep, 2023, 22:01 Matthias J. Sax, wrote: In general, Kafka Streams should keep running. Can you inspect the logs to figure out why it's going into ERROR state to begin with? Maybe you need

Re: Kafka Streams reaching ERROR state during rolling upgrade / restart of brokers

2023-09-29 Thread Matthias J. Sax
In general, Kafka Streams should keep running. Can you inspect the logs to figure out why it's going into ERROR state to begin with? Maybe you need to increase/change some timeouts/retries configs. The stack trace you shared, is a symptom, but not the root cause. -Matthias On 9/21/23 12:56

Re: Can a message avoid loss occur in Kafka

2023-09-29 Thread Matthias J. Sax
For the config you provide, data loss should not happen (as long as you don't allow for unclean leader election, which is disabled by default). But you might be subject to unavailability for some partitions if a broker fails. -Matthias On 9/17/23 7:49 AM, 陈近南 wrote: Hello, Can a

Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-09-05 Thread Matthias J. Sax
t is the case with topics that were consumed previously and not consumed now. Does creation of new consumer group (setting a different application.id) on streams application an option here? On Thu, Aug 17, 2023 at 7:03 AM Matthias J. Sax wrote: Well, it's kinda expected behavior. I

Re: AW: Table updates are not consistent when doing a join with a Stream

2023-09-04 Thread Matthias J. Sax
Your update to the KTable is async when you send data back to the KTable input topic. So your program is subject to race conditions. So switching to the PAPI was the right move: it makes the update to the state store sync and thus fixes the issue. -Matthias On 9/4/23 5:53 AM, Mauricio Lopez

Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-09-04 Thread Matthias J. Sax
e inactive consumer group is cleared up after 7 days however not sure if that is the case with topics that were consumed previously and not consumed now. Does creation of new consumer group (setting a different application.id) on streams application an option here? On Thu, Aug 17, 2023 at 7:03 AM

Re: kafka streams consumer group reporting lag even on source topics removed from topology

2023-08-16 Thread Matthias J. Sax
Well, it's kinda expected behavior. It's a split brain problem. In the end, you use the same `application.id / group.id` and thus the committed offsets for the removed topics are still in the `__consumer_offsets` topic and associated with the consumer group. If a tool inspects lags and compares
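To illustrate why a removed topic keeps reporting lag, here is a minimal sketch of how a monitoring tool might compute lag from committed offsets. The function name and the offset values are hypothetical; the only point taken from the thread is that committed offsets for removed topics stay associated with the consumer group.

```python
def consumer_lag(end_offsets: dict, committed_offsets: dict) -> dict:
    """Compute per-partition lag as end offset minus committed offset.

    Keys are (topic, partition) tuples. Any partition that still has a
    committed offset is reported, even if the application stopped reading it.
    """
    return {
        tp: end_offsets[tp] - committed
        for tp, committed in committed_offsets.items()
        if tp in end_offsets
    }


# Hypothetical offsets: "old-topic" was removed from the topology, but its
# committed offset is still stored for the group, so its lag keeps growing.
committed = {("old-topic", 0): 100, ("new-topic", 0): 500}
end = {("old-topic", 0): 250, ("new-topic", 0): 500}
print(consumer_lag(end, committed))
```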

Re: Consuming an entire partition with control messages

2023-07-27 Thread Matthias J. Sax
"1" in this kind of situation Best regards, Vincent On 13/06/2023 17:27, Matthias J. Sax wrote: Sounds like a bug in aiokafka library to me. If the last message in a topic partition is a tx-marker, the consumer should step over it, and report the correct position after the marker.

Re: [ANNOUNCE] New committer: Greg Harris

2023-07-10 Thread Matthias J. Sax
Congrats! On 7/10/23 8:45 AM, Chris Egerton wrote: Hi all, The PMC for Apache Kafka has invited Greg Harris to become a committer, and we are happy to announce that he has accepted! Greg has been contributing to Kafka since 2019. He has made over 50 commits mostly around Kafka Connect and

Re: Kafka Streaming: RocksDbSessionBytesStoreSupplier seems lost data in Kubernetes

2023-06-29 Thread Matthias J. Sax
The class `RocksDbSessionBytesStoreSupplier` is in package `internal` and thus, you should not use it directly. Instead, you should use the public factory class `org.apache.kafka.streams.state.Stores` However, your usage seems correct in general. Not sure why you pass-in the supplier directly

Re: Consuming an entire partition with control messages

2023-06-13 Thread Matthias J. Sax
Sounds like a bug in aiokafka library to me. If the last message in a topic partition is a tx-marker, the consumer should step over it, and report the correct position after the marker. The official KafkaConsumer (ie, the Java one), does the exact same thing. -Matthias On 5/30/23 8:41 AM,

Re: [VOTE] 3.4.1 RC0

2023-05-22 Thread Matthias J. Sax
Thanks a lot! -Matthias On 5/21/23 7:27 PM, Luke Chen wrote: Hi Matthias, Yes, I agree we should get this hotfix into 3.4.1. I've backported into the 3.4 branch. I'll create a new RC for 3.4.1. Thanks. Luke On Mon, May 22, 2023 at 5:13 AM Matthias J. Sax wrote: Hi Luke, RC0 for 3.4.1

Re: [VOTE] 3.4.1 RC0

2023-05-21 Thread Matthias J. Sax
Hi Luke, RC0 for 3.4.1 includes a fix for https://issues.apache.org/jira/browse/KAFKA-14862. We recently discovered that the fix itself introduces a regression. We already have a PR to fix-forward the regression: https://github.com/apache/kafka/pull/13734 I think we should get the open PR

Re: Some questions on Kafka on order of messages with mutiple partitions

2023-05-12 Thread Matthias J. Sax
Does having 9 partitions with 9 replication factors make sense here? A replication factor of 9 sounds very high. For production, replication factor of 3 is recommended. How many partitions you want/need is a different question, and cannot be answered in a general way. "Yes" to all other

Re: [DISCUSS] Re-visit end of life policy

2023-04-25 Thread Matthias J. Sax
the users about the community's 12 month EOL policy. I will get back on this thread once I have more data to support the proposal. -- Divij Vaidya On Thu, Apr 20, 2023 at 3:52 AM Matthias J. Sax wrote: While I understand the desire, I tend to agree with Ismael. In general, it's a significant

Re: [ANNOUNCE] New PMC chair: Mickael Maison

2023-04-21 Thread Matthias J. Sax
Congrats Mickael! And thanks a lot for taking on this additional task! Glad to have you! -Matthias On 4/21/23 9:40 AM, Viktor Somogyi-Vass wrote: Jun, thank you for all your hard work! Also, congrats Mickael, it is very well deserved :) Best, Viktor On Fri, Apr 21, 2023, 18:15 Adam

Re: can Kafka streams support ordering across 2 different topics when consuming from multiple source topics?

2023-03-21 Thread Matthias J. Sax
In general there is no ordering guarantee between topics. So it might depend a lot on the details of your use case. For example, if you know that it will always be two events, you could buffer the first one in a state store, and wait for the second one to arrive and decide in which order to
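The buffering idea above can be sketched as follows. This is not Kafka Streams code: a plain dict stands in for the state store, the class and method names are hypothetical, and ordering the pair by timestamp is an assumption, since the message is cut off before it says how the order is decided.

```python
class PairBuffer:
    """Buffer the first event per key and release both once the second arrives."""

    def __init__(self):
        # Stands in for a Kafka Streams state store (assumption for illustration).
        self._store = {}

    def on_event(self, key, timestamp, value):
        first = self._store.pop(key, None)
        if first is None:
            # First event for this key: hold it back and emit nothing yet.
            self._store[key] = (timestamp, value)
            return None
        # Second event arrived: emit both, ordered by timestamp (assumed rule).
        return sorted([first, (timestamp, value)], key=lambda event: event[0])


buffer = PairBuffer()
print(buffer.on_event("order-1", 20, "from-topic-B"))  # None
print(buffer.on_event("order-1", 10, "from-topic-A"))
```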

Re: [ANNOUNCE] New Kafka PMC Member: Chris Egerton

2023-03-09 Thread Matthias J. Sax
Congrats! On 3/9/23 2:59 PM, José Armando García Sancio wrote: Congrats Chris. On Thu, Mar 9, 2023 at 2:01 PM Kowshik Prakasam wrote: Congrats Chris! On Thu, Mar 9, 2023 at 1:33 PM Divij Vaidya wrote: Congratulations Chris! I am in awe with the amount of effort you put in code reviews

Re: [ANNOUNCE] New Kafka PMC Member: David Arthur

2023-03-09 Thread Matthias J. Sax
Congrats! On 3/9/23 2:59 PM, José Armando García Sancio wrote: Congrats David! On Thu, Mar 9, 2023 at 2:00 PM Kowshik Prakasam wrote: Congrats David! On Thu, Mar 9, 2023 at 12:09 PM Lucas Brutschy wrote: Congratulations! On Thu, Mar 9, 2023 at 8:37 PM Manikumar wrote: Congrats

Re: Kafka Streams 2.7.1 to 3.3.1 rolling upgrade

2023-02-27 Thread Matthias J. Sax
Hmmm... that's interesting... It seems that Kafka Streams "version probing" does not play well with static group membership... Sounds like a "bug" to me -- well, more like a missing integration. Not sure right now, if/how we could fix it. Can you file a ticket? For now, I don't think you can

Re: Coralogix Logo on Powered By Page

2023-02-01 Thread Matthias J. Sax
Thanks for reaching out. Can you open a PR against https://github.com/apache/kafka-site updating `powered-by.html`? -Matthias On 2/1/23 1:13 AM, Tali Soroker wrote: Hi, I am writing on behalf of Coralogix to request adding us to the Powered By page on the Apache Kafka website. I am

Re: Custom Kafka Streams State Restore Logic

2023-01-23 Thread Matthias J. Sax
| Senior Software Developer | ude...@itrsgroup.com | www.itrsgroup.com From: Matthias J. Sax Date: Wednesday, January 18, 2023 at 12:50 AM To: users@kafka.ap

Re: Custom Kafka Streams State Restore Logic

2023-01-17 Thread Matthias J. Sax
Guess it depends what you actually want to achieve? Also note: `InMemoryWindowStore` is an internal class, and thus might change at any point, and it was never designed to be extended... -Matthias On 1/13/23 2:55 PM, Upesh Desai wrote: Hello all, I am currently working on creating a new

Re: [ANNOUNCE] New committer: Stanislav Kozlovski

2023-01-17 Thread Matthias J. Sax
Congrats! On 1/17/23 1:26 PM, Ron Dagostino wrote: Congratulations, Stan! Ron On Jan 17, 2023, at 12:29 PM, Mickael Maison wrote: Congratulations Stanislav! On Tue, Jan 17, 2023 at 6:06 PM Rajini Sivaram wrote: Congratulations, Stan! Regards, Rajini On Tue, Jan 17, 2023 at 5:04 PM

[ANNOUNCE] New committer: Walker Carlson

2023-01-17 Thread Matthias J. Sax
Dear community, I am pleased to announce Walker Carlson as a new Kafka committer. Walker has been contributing to Apache Kafka since November 2019. He made various contributions including the following KIPs. KIP-671: Introduce Kafka Streams Specific Uncaught Exception Handler KIP-696: Update

Re: [ANNOUNCE] New committer: Edoardo Comar

2023-01-06 Thread Matthias J. Sax
Congrats! On 1/6/23 5:15 PM, Luke Chen wrote: Congratulations, Edoardo! Luke On Sat, Jan 7, 2023 at 7:58 AM Mickael Maison wrote: Congratulations Edo! On Sat, Jan 7, 2023 at 12:05 AM Jun Rao wrote: Hi, Everyone, The PMC of Apache Kafka is pleased to announce a new Kafka committer

Re: [ANNOUNCE] New committer: Justine Olshan

2023-01-03 Thread Matthias J. Sax
Congrats! On 12/29/22 6:47 PM, ziming deng wrote: Congratulations Justine! -- Best, Ziming On Dec 30, 2022, at 10:06, Luke Chen wrote: Congratulations, Justine! Well deserved! Luke On Fri, Dec 30, 2022 at 9:15 AM Ron Dagostino wrote: Congratulations, Justine! Well-deserved, and I’m

Re: Kafka Stream: The state store, wkstore, may have migrated to another instance

2022-12-29 Thread Matthias J. Sax
Sounds like a SpringBoot issue rather than a KS issue. -Matthias On 12/29/22 2:45 AM, Nawal Sah wrote: Hi, My SpringBoot stream application works fine in a fresh start of the clustered environment. But when I restart one of the pods out of two pods, I start getting the below exception from

Re: [ANNOUNCE] New committer: Satish Duggana

2022-12-27 Thread Matthias J. Sax
Congrats! On 12/27/22 10:20 AM, Kirk True wrote: Congrats, Satish! On Fri, Dec 23, 2022, at 10:07 AM, Jun Rao wrote: Hi, Everyone, The PMC of Apache Kafka is pleased to announce a new Kafka committer Satish Duggana. Satish has been a long time Kafka contributor since 2017. He is the main

Re: [ANNOUNCE] New committer: Josep Prat

2022-12-20 Thread Matthias J. Sax
Congrats! On 12/20/22 12:01 PM, Josep Prat wrote: Thank you all! ——— Josep Prat Aiven Deutschland GmbH Immanuelkirchstraße 26, 10405 Berlin Amtsgericht Charlottenburg, HRB 209739 B Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen m: +491715557497 w: aiven.io e: josep.p...@aiven.io On

Re: [ANNOUNCE] New committer: Ron Dagostino

2022-12-16 Thread Matthias J. Sax
Congrats! On 12/15/22 7:09 AM, Rajini Sivaram wrote: Congratulations, Ron! Well deserved!! Regards, Rajini On Thu, Dec 15, 2022 at 11:42 AM Ron Dagostino wrote: Thank you, everyone! Ron On Dec 15, 2022, at 5:09 AM, Bruno Cadonna wrote: Congrats Ron! Best, Bruno On 15.12.22 10:23,

Re: [ANNOUNCE] New committer: Viktor Somogyi-Vass

2022-12-16 Thread Matthias J. Sax
Congrats! On 12/15/22 7:10 AM, Rajini Sivaram wrote: Congratulations, Viktor! Regards, Rajini On Thu, Dec 15, 2022 at 11:41 AM Ron Dagostino wrote: Congrats to you too, Victor! Ron On Dec 15, 2022, at 4:59 AM, Viktor Somogyi-Vass < viktor.somo...@cloudera.com.invalid> wrote: Thank

Re: Thread Safety of Punctuator Functions and Processor Contexts

2022-11-07 Thread Matthias J. Sax
don't think that there is any guarantee that you might "see" concurrent modification (IIRC, RocksDB uses snapshot isolation for iterators). But maybe that's good enough for you? -Matthias On 11/7/22 11:13 AM, Joshua Suskalo wrote: "Matthias J. Sax" writes: In ge

Re: Thread Safety of Punctuator Functions and Processor Contexts

2022-11-07 Thread Matthias J. Sax
used iterator is concurrent, there is no API contract about it. -Matthias On 11/7/22 7:41 AM, Joshua Suskalo wrote: Hello Matthias, thanks for the response! "Matthias J. Sax" writes: Spawning your own thread and calling context.forward() is _not_ safe, and there is currently no w

Re: Thread Safety of Punctuator Functions and Processor Contexts

2022-11-04 Thread Matthias J. Sax
Your observation is correct. The Processor#process() and punctuation callback are executed on a single thread. It's by design to avoid the issue of concurrency (writing thread safe code is hard and we want to avoid putting this burden onto the user). There are currently no plans to make

Re: [ANNOUNCE] New Kafka PMC Member: Bruno Cadonna

2022-11-02 Thread Matthias J. Sax
Congrats! On 11/1/22 7:08 PM, Luke Chen wrote: Congrats Bruno! Well deserved! Luke On Wed, Nov 2, 2022 at 10:07 AM John Roesler wrote: Congratulations, Bruno!!! On Tue, Nov 1, 2022, at 15:16, Lucas Brutschy wrote: Wow, congratulations! On Tue, Nov 1, 2022 at 8:55 PM Chris Egerton

Re: How it is safe to break message ordering but not idempotency after getting an OutOfOrderSequenceException?

2022-06-15 Thread Matthias J. Sax
es include a message being written to topic A, could messages from batch with sn X+1 end up being persisted with an offset lesser than the ones from the batch with sn X? Does this question make sense? On Tue, Jun 7, 2022 at 16:13, Matthias J. Sax wrote: Yes, the broker de-dupes using

Re: How it is safe to break message ordering but not idempotency after getting an OutOfOrderSequenceException?

2022-06-07 Thread Matthias J. Sax
Yes, the broker de-dupes using the sequence number. But for example, if a sequence number is skipped, you could get this exception: the current batch of messages cannot be appended to the log, as one batch is missing, and the producer would need to re-send the previous/missing batch with
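The sequence-number check described above can be modelled in a few lines. This is a deliberately simplified sketch, not the broker's implementation: it tracks one sequence number per record rather than per batch, and the class and exception names are hypothetical.

```python
class OutOfOrderSequence(Exception):
    """Raised when a sequence number is skipped (simplified stand-in)."""


class PartitionLog:
    def __init__(self):
        self.records = []
        self.next_seq = {}  # producer id -> next expected sequence number

    def append(self, producer_id, seq, payload):
        expected = self.next_seq.get(producer_id, 0)
        if seq < expected:
            # Duplicate: already in the log, so it is dropped (de-duplication).
            return False
        if seq > expected:
            # Gap: an earlier batch is missing, so this one cannot be appended.
            raise OutOfOrderSequence(f"expected {expected}, got {seq}")
        self.records.append(payload)
        self.next_seq[producer_id] = expected + 1
        return True


log = PartitionLog()
log.append("p1", 0, "a")
log.append("p1", 0, "a")  # retry of the same record: ignored
try:
    log.append("p1", 2, "c")  # sequence 1 was skipped
except OutOfOrderSequence as err:
    print("rejected:", err)
print(log.records)  # ['a']
```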

Re: Newbie how to get key/value pojo out of a stream?

2022-06-07 Thread Matthias J. Sax
`enable.auto.commit` is a Consumer config and does not apply to Kafka Streams. In Kafka Streams, you basically always have auto commit enabled, and you can control how frequently commits happen via `commit.interval.ms`. Also on `close()` Kafka Streams would commit offsets. -Matthias On

Re: kafka stream - sliding window - getting unexpected output

2022-05-20 Thread Matthias J. Sax
ane wrote: @Matthias J. Sax / All Have added below line : .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded())) Here is the output : (for uuid (*2cbef750-325b-4a2f-ac39-b2c23fa0313f)*, expecting single output but that is not the case here. Which 1 is the final output from tho

Re: kafka stream - sliding window - getting unexpected output

2022-05-18 Thread Matthias J. Sax
Emitting intermediate results is by design. If you don't want to get intermediate results, you can add `suppress()` after the aggregation and configure it to only "emit on window close". -Matthias On 5/17/22 3:20 AM, Shankar Mane wrote: Hi All, Our use case is to use sliding window. (for
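The difference between intermediate and final results can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses tumbling windows instead of the sliding windows from the thread, ignores grace periods, treats the end of the input as the point where all windows close, and the function name is hypothetical.

```python
def windowed_counts(timestamps, window_size, final_only=False):
    """Count records per window, emitting every update or only final results."""
    counts = {}
    emitted = []
    for ts in timestamps:
        window_start = ts - ts % window_size
        counts[window_start] = counts.get(window_start, 0) + 1
        if not final_only:
            # Default behavior: every update is sent downstream.
            emitted.append((window_start, counts[window_start]))
    if final_only:
        # Suppressed behavior: one result per window, after it has closed.
        emitted = sorted(counts.items())
    return emitted


print(windowed_counts([1, 2, 11], 10))                   # [(0, 1), (0, 2), (10, 1)]
print(windowed_counts([1, 2, 11], 10, final_only=True))  # [(0, 2), (10, 1)]
```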

Re: How to achieve high availability in a Kafka Streams app during deployment?

2022-03-05 Thread Matthias J. Sax
Hard to answer from a 10,000ft view. In general, a rolling upgrade (ie, bounce one instance at a time) is recommended. If you have state, you would need to ensure that state is not lost during a bounce. As you are using Kubernetes, using stateful sets that allow you to re-attach disk should

Re: [ANNOUNCE] New committer: Luke Chen

2022-02-09 Thread Matthias J. Sax
Congratulations! Glad to have you onboard, Luke! -Matthias On 2/9/22 16:37, Bill Bejeck wrote: Congrats Luke! Well deserved. -Bill On Wed, Feb 9, 2022 at 7:25 PM Israel Ekpo wrote: Congratulations Luke! Thank you for your service On Wed, Feb 9, 2022 at 6:22 PM Guozhang Wang wrote:

Re: Kafka Streams - one topic moves faster the other one

2022-01-04 Thread Matthias J. Sax
If you observe timestamp-based synchronization issues, you might also consider switching to the 3.0 release, which closes a few more gaps to this end. Cf https://cwiki.apache.org/confluence/display/KAFKA/KIP-695%3A+Further+Improve+Kafka+Streams+Timestamp+Synchronization -Matthias On 12/29/21

Re: How to properly use a clean a TimestampedKeyValueStore

2022-01-04 Thread Matthias J. Sax
Not 100% sure. From what you describe it should work as expected. It seems `delete()` does not delete the key from the store (ie, RocksDB) itself (for unknown reasons)? Are you closing all your iterators correctly? (More or less a wild guess at the moment.) Did you enable caching for the

Re: [ANNOUNCE] New Kafka PMC member: David Jacot

2021-12-18 Thread Matthias J. Sax
Congrats! On 12/17/21 15:46, Bill Bejeck wrote: Congratulations David! Well deserved. -Bill On Fri, Dec 17, 2021 at 6:43 PM José Armando García Sancio wrote: Congrats David! On Fri, Dec 17, 2021 at 3:09 PM Gwen Shapira wrote: Hi everyone, David Jacot has been an Apache Kafka committer

Re: Kafka Streams app process records until certain date

2021-12-08 Thread Matthias J. Sax
Hard to achieve. I guess a naive approach would be to use a `flatMapTransform()` to implement a filter that drops all records that are not in the desired time range. pause() and resume() are not available in Kafka Streams, but only on the KafkaConsumer (The Spring docs you cite are also about

Re: Event order in Kafka Streams after Left Join

2021-12-06 Thread Matthias J. Sax
I had heard when doing a join, the timestamp of the generated message is taken from the message triggering the join or the biggest timestamp of the two. In older versions it was the timestamp of the record that triggered the join. Since 2.3, it is the maximum of both (cf
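The two timestamp rules mentioned above can be written down as a tiny function. The function and parameter names are hypothetical; the rule itself (triggering record's timestamp in older versions, maximum of both since 2.3) is taken from the message.

```python
def join_result_timestamp(trigger_ts: int, other_ts: int, since_2_3: bool = True) -> int:
    """Return the timestamp assigned to a join result record.

    trigger_ts: timestamp of the record that triggered the join.
    other_ts:   timestamp of the matching record from the other side.
    """
    if since_2_3:
        return max(trigger_ts, other_ts)
    return trigger_ts


# A late-arriving record (ts=100) joins with a buffered record (ts=150).
print(join_result_timestamp(100, 150))                   # 150
print(join_result_timestamp(100, 150, since_2_3=False))  # 100
```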

Re: Kafka Streams - left join behavior

2021-12-06 Thread Matthias J. Sax
It's fixed in the upcoming 3.1 release. Cf https://issues.apache.org/jira/browse/KAFKA-10847 A stream-(global)table join has different semantics, so I am not sure if it would help. One workaround would be to apply a stateful `flatTransformValues()` after the join to "buffer" all NULL-results

Re: Pause/Restart a Kafka streams app

2021-11-22 Thread Matthias J. Sax
You can only close() the Kafka Streams client and create a new one to resume (offsets are committed on close() and thus would be picked up on restart). Closing and restarting would result in rebalancing though, so to really pause/resume you would need to close() all instances. There is no

Re: Stream to KTable internals

2021-11-19 Thread Matthias J. Sax
satisfying solution. No matter what time I put in there it seems possible that I will miss a join. Respectfully, Chad On Fri, Nov 5, 2021 at 3:07 PM Matthias J. Sax wrote: The log clearly indicates that you hit enforced processing. We record the metric and log: Cf https://github.com/apache/kafka

Re: Endless loop restoring changelog topic

2021-11-16 Thread Matthias J. Sax
Not sure. Can you enable DEBUG logging on `org.apache.kafka.streams.processor.internals.StoreChangelogReader` to see if restore does make any progress? -Matthias On 7/20/21 5:41 AM, Alessandro Tagliapietra wrote: I've tried to restart the streams application using at_least_once processing

Re: Please add me to JIRA contributor list

2021-11-09 Thread Matthias J. Sax
Done. On 11/9/21 5:06 PM, Liam Clarke-Hutchinson wrote: Hi, My JIRA username is lclarkenz. Many thanks, Liam Clarke-Hutchinson

Re: Stream to KTable internals

2021-11-05 Thread Matthias J. Sax
sk.idle.ms: 2000. Current wall-clock time: 1635881277998. On Wed, Nov 3, 2021 at 6:11 PM Matthias J. Sax wrote: Can you check if the program ever does "enforced processing", ie, `max.task.idle.ms` passed, and we process despite an empty input buffer. Cf https://kafka.apac

Re: Stream to KTable internals

2021-11-03 Thread Matthias J. Sax
se. The table record is written about 20 seconds before the stream record. I’ll crank up the time tomorrow and see what happens. On Tue, Nov 2, 2021 at 6:24 PM Matthias J. Sax wrote: Hard to tell, but as it seems that you can reproduce the issue, it might be worth a try to increase the idle time further.

Re: Kafka streams event deduplication keeping last event in window

2021-11-03 Thread Matthias J. Sax
! :) On 2021/11/02 23:34:33 "Matthias J. Sax" wrote: I did not study your code snippet, but yes, it sounds like a valid approach from your description. How can I be sure that the start of the window will coincide with the Punctuator's scheduled interval? For punctuations, there is a

Re: Kafka streams event deduplication keeping last event in window

2021-11-02 Thread Matthias J. Sax
I did not study your code snippet, but yes, it sounds like a valid approach from your description. How can I be sure that the start of the window will coincide with the Punctuator's scheduled interval? For punctuations, there is always some jitter, because it's not possible to run a

Re: Stream to KTable internals

2021-11-02 Thread Matthias J. Sax
message, would increasing the max.task.idle.ms help? Thanks, Chad On Mon, Nov 1, 2021 at 10:03 PM Matthias J. Sax wrote: Timestamp synchronization is not perfect, and as a matter of fact, we fixed a few gaps in the 3.0.0 release. We actually hope that we closed the last gaps in 3.0.0... *fingers

Re: Producer Timeout issue in kafka streams task

2021-11-01 Thread Matthias J. Sax
The `Producer#send()` call is actually not covered by the KIP because it may result in data loss if we try to handle the timeout directly. -- Kafka Streams does not have a copy of the data in the producer's send buffer and thus we cannot retry the `send()`. -- Instead, it's necessary to

Re: Producer Timeout issue in kafka streams task

2021-11-01 Thread Matthias J. Sax
As the error message suggests, you can increase `max.block.ms` for this case: If a broker is down, it may take some time for the producer to fail over to a different broker (before the producer can fail over, the broker must elect a new partition leader, and only afterward can inform the
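Raising `max.block.ms` could look roughly like the sketch below. It assumes the setting is passed to the producer embedded in Kafka Streams through the `producer.` config prefix; the helper name and the value 120000 are placeholders for illustration, not recommendations from the thread.

```java
import java.util.Properties;

// Sketch: build a config that raises the producer's max.block.ms.
final class ProducerTimeoutConfig {

    // Returns properties with max.block.ms set for the embedded producer.
    // The "producer." prefix scopes the setting to the producer client.
    static Properties withMaxBlockMs(long maxBlockMs) {
        Properties props = new Properties();
        props.put("producer.max.block.ms", Long.toString(maxBlockMs));
        return props;
    }
}
```

These properties would be merged with the rest of the application config, for example `ProducerTimeoutConfig.withMaxBlockMs(120000L)`.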

Re: Stream to KTable internals

2021-11-01 Thread Matthias J. Sax
e KTable record and therefore the join is getting missed? If you don't agree, what do you think is going on? Is there a way to prove this out? Thanks, Chad On Sat, Oct 30, 2021 at 2:09 PM Matthias J. Sax wrote: Yes, a StreamThread has one consumer. The number of StreamThreads per instance is configura

Re: Stream to KTable internals

2021-10-30 Thread Matthias J. Sax
Yes, a StreamThread has one consumer. The number of StreamThreads per instance is configurable via `num.stream.threads`. Partitions are assigned to threads similar to how they are assigned to consumers in a plain consumer group. It seems you run with the default of one thread per instance. As you spin up 12 instances,
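As a rough sketch of the setup described here: `num.stream.threads` is an ordinary config entry, and since each StreamThread has one consumer, the number of consumers taking part in partition assignment is instances times threads. The class and method names below are hypothetical.

```java
import java.util.Properties;

// Sketch: thread-count config plus the resulting number of consumers in the group.
final class StreamThreadsConfig {

    // Returns properties with num.stream.threads set for one application instance.
    static Properties withThreads(int threadsPerInstance) {
        Properties props = new Properties();
        props.put("num.stream.threads", Integer.toString(threadsPerInstance));
        return props;
    }

    // One consumer per StreamThread, so the group has instances * threads consumers.
    static int totalConsumers(int instances, int threadsPerInstance) {
        return instances * threadsPerInstance;
    }
}
```

For the case in the thread, 12 instances with one thread each give 12 consumers sharing the input partitions.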

Re: Contributors list

2021-10-28 Thread Matthias J. Sax
Done. On 10/27/21 11:55 PM, Michael Negodaev wrote: Hello! Please add me to the contributor list in JIRA with username "mnegodaev". Thank you.

Re: Kafka streams - message not materialized in intermediate topics

2021-10-27 Thread Matthias J. Sax
For this case, you can call `aggregate(...).suppress()`. -Matthias On 10/27/21 12:42 PM, Tomer Cohen wrote: Hi Bill, Thanks for the prompt reply. Setting to 0 forces a no collection window, so if I get 10 messages to aggregate for example, it will send 10 updates. But I only want to publish
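To illustrate what suppressing intermediate updates means, here is a plain-Java simulation, not the Kafka Streams API. `FinalResultBuffer` and its methods are made up for the example: every update overwrites the previous aggregate for its key, and only the latest value per key is emitted when the buffer is flushed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simulation of "emit only the final aggregate per key"; not Kafka Streams code.
final class FinalResultBuffer {

    // Latest aggregate per key, in first-seen key order.
    private final Map<String, Long> latest = new LinkedHashMap<>();

    // Records an intermediate aggregate, replacing any earlier one for the same key.
    void update(String key, long aggregate) {
        latest.put(key, aggregate);
    }

    // Emits one "key=value" entry per key and clears the buffer.
    List<String> flush() {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : latest.entrySet()) {
            out.add(e.getKey() + "=" + e.getValue());
        }
        latest.clear();
        return out;
    }
}
```

Ten updates for one key followed by a single flush yield one output entry instead of ten, which is the behavior the question asks for.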
