Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-05-06 Thread Sophie Blee-Goldman
Thanks guys. Updated the error codes in both the code and the explanation under "Public Changes". To sum up, here are the error codes listed in the KIP: enum AssignmentError { NONE, ACTIVE_TASK_ASSIGNED_MULTIPLE_TIMES, ACTIVE_AND_STANDBY_TASK_ASSIGNED_TO_SAME_KAFKASTREAMS,

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-05-03 Thread Matthias J. Sax
117f: Good point by Bruno. We should check for this, and could have an additional `INVALID_STANDBY_TASK` error code? -Matthias On 5/3/24 5:52 AM, Guozhang Wang wrote: Hi Sophie, Re: As for the return type of the TaskAssignmentUtils, I think that makes sense. LGTM. On Fri, May 3, 2024 at

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-05-03 Thread Guozhang Wang
Hi Sophie, Re: As for the return type of the TaskAssignmentUtils, I think that makes sense. LGTM. On Fri, May 3, 2024 at 2:26 AM Bruno Cadonna wrote: > > Hi Sophie, > > 117f: > I think, removing the STATEFUL and STATELESS types is not enough to > avoid the error Guozhang mentioned. The

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-05-03 Thread Bruno Cadonna
Hi Sophie, 117f: I think, removing the STATEFUL and STATELESS types is not enough to avoid the error Guozhang mentioned. The StreamsPartitionAssignor passes the information whether a task is stateless or stateful into the task assignor. However, the task assignor can return a standby task for

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-05-02 Thread Matthias J. Sax
Thanks Sophie. My bad. You are of course right about `TaskAssignment` and the StreamsPartitionAssignor's responsibitliy to map tasks of a instance to consumers. When I wrote my reply, I forgot about this detail. Seems you did not add `UNKNOWN_TASK_ID` error yet as proposed by Guozhang?

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-05-02 Thread Sophie Blee-Goldman
Guozhang: 117. All three additions make sense to me. However, while thinking about how users would actually produce an assignment, I realized that it seems silly to make it their responsibility to distinguish between a stateless and stateful task when they return the assignment. The

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-05-02 Thread Sophie Blee-Goldman
Matthias: Thanks for the naming suggestions for the error codes. I was definitely not happy with my original naming but couldn't think of anything better. I like your proposals though, will update the KIP names. I'll also add a "NONE" option as well -- much better than just passing in null for

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-05-01 Thread Guozhang Wang
Jumping back to the party here :) 107: I agree with the rationale behind this, and `numProcessingThreads` looks good to me as it covers both the current and future scenarios. 117: I agree with Lucas and Bruno, and would add: * 117e: unknown taskID: fail * 117f: inconsistent task types (e.g.

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-30 Thread Matthias J. Sax
I like the idea of error codes. Not sure if the name are ideal? UNKNOWN_PROCESS_ID makes sense, but the other two seems a little bit difficult to understand? Should we be very descriptive (and also try to avoid coupling it to the threading model -- important for the first error code): -

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-30 Thread Sophie Blee-Goldman
One last thing: I added an error code enum to be returned from the #onAssignmentComputed method in case of an invalid assignment. I created one code for each of the invalid cases we described above. The downside is that this means we'll have to go through a deprecation cycle if we want to loosen

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-30 Thread Sophie Blee-Goldman
Thanks guys! I agree with what Lucas said about 117c, we can always loosen a restriction later and I don't want to do anything now that might get in the way of the new threading models. With that I think we're all in agreement on 117. I'll update the KIP to include what we've discussed (and will

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-30 Thread Bruno Cadonna
Hi again, I forgot to ask whether you could add the agreement about handling invalid assignment to the KIP. Best, Bruno On 4/30/24 2:00 PM, Bruno Cadonna wrote: Hi all, I think we are converging! 117 a) fail: Since it is an invalid consumer assignment b) pass: I agree that not assigning a

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-30 Thread Bruno Cadonna
Hi all, I think we are converging! 117 a) fail: Since it is an invalid consumer assignment b) pass: I agree that not assigning a task might be reasonable in some situations c) fail: For the reasons Lucas pointed out. I am missing a good use case here. d) fail: It is invalid Somewhere in

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-30 Thread Lucas Brutschy
Hi, Looks like a great KIP to me! I'm late, so I'm only going to comment on the last open point 117. I'm against any fallbacks like "use the default assignor if the custom assignment is invalid", as it's just going to hide bugs. For the 4 cases mentioned by Sophie: 117a) I'd fail immediately

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-29 Thread Sophie Blee-Goldman
Yeah I think that sums it up well. Either you computed a *possible* assignment, or you returned something that makes it literally impossible for the StreamsPartitionAssignor to decipher/translate into an actual group assignment, in which case it should just fail That's more or less it for the

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-26 Thread Rohan Desai
117: as Sophie laid out, there are two cases here right: 1. cases that are considered invalid by the existing assignors but are still valid assignments in the sense that they can be used to generate a valid consumer group assignment (from the perspective of the consumer group protocol). An

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-25 Thread Matthias J. Sax
Thanks for the details! 107: I like `numProcessingThreads()` proposal. 108(a): it was not really a concern, but it was rather a question if we could/should simplify it, so it's easier to implement a custom task assignor. But if we believe that it's an integral component, I am fine with

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-25 Thread Sophie Blee-Goldman
104. Fair enough -- also happy to defer to Rohan on this (or Bruno if he feels super strongly) 107. That's a good point . Ultimately the task load should reflect the processing capacity, and that's something that will exist in both the new and old threading model. I like #processingCapacity for

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-24 Thread Matthias J. Sax
104: I also don't feel super strong about it. Not sure if `onAssignment()` might overload the name in a confusing way? In the end, when the method is called, we don't assign anything? -- Guess, I am fine with whatever Rohan picks as a name from the suggestions we have so far. 107: Did not

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-24 Thread Sophie Blee-Goldman
Now to respond to Matthias: FYI, I'm following the numbering scheme from your email but added to mark responses with further questions or feedback and/or aren't yet addressed in the KIP and need to be followed up on. You can more or less just skip over the ones without stars to save time

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-24 Thread Sophie Blee-Goldman
Responding to Bruno first: (1) I actually think "KafkaStreams" is exactly right here -- for the reason you said, ultimately this is describing a literal instance of the "KafkaStreams" class. Glad we hashed this out! (I saw that Rohan went with StreamsClient but i also prefer KafkaStreams) (4)

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-19 Thread Matthias J. Sax
One more thing. It might be good to clearly call out, which interfaced a user would implement, vs the other ones Kafka Streams implements and TaskAssignor only uses. My understanding is, that users would implement `TaskAssignor`, `TaskAssignment`, and `StreamsClientAssignment`. For

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-19 Thread Matthias J. Sax
Great KIP. I have some minor comments/questions: 100 The KIP says: "In the future, additional plugins can use the same partition.assignor prefix". What does this mean? 101 (nit) The KIP says: "Note that the thread-level assignment will remain an un-configurable internal implementation

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-19 Thread Rohan Desai
Bruno, I've incorporated your feedback into the KIP document. On Fri, Apr 19, 2024 at 3:55 PM Rohan Desai wrote: > Thanks for the feedback Bruno! For the most part I think it makes sense, > but leaving a couple follow-up thoughts/questions: > > re 4: I think Sophie's point was slightly

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-19 Thread Rohan Desai
Thanks for the feedback Bruno! For the most part I think it makes sense, but leaving a couple follow-up thoughts/questions: re 4: I think Sophie's point was slightly different - that we might want to wrap the return type for `assign` in a class so that its easily extensible. This makes sense to

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-18 Thread Bruno Cadonna
Hi Sophie, Thanks for the clarifications! (1) What about replacing Node* with KafkaStreams* or StreamsClient*? I prefer KafkaStreams* since that class represents the Kafka Streams client. I am also fine with KafkaStreamsClient*. I really would like to avoid introducing a new term in Kafka

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-17 Thread Sophie Blee-Goldman
Thanks Bruno! I can provide a bit of context behind some of these decisions but I just want to say up front that I agree with every single one of your points, though I am going to push back a bit on the first one. [1] The idea here is to help avoid some confusion around the overloaded term

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-17 Thread Bruno Cadonna
Hi, sorry, I am late to the party. I have a couple of comments: (1) I would prefer Client* instead of Node* in the names. In Kafka Streams we do not really have the concept of node but we have the concept of client (admittedly, we sometimes also use instance). I would like to avoid

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-05 Thread Sophie Blee-Goldman
Cool, looks good to me! Seems like there is no further feedback, so maybe we can start to call for a vote? However, since as noted we are setting aside time to discuss this during the sync next Thursday, we can also wait until after that meeting to officially kick off the vote. On Fri, Apr 5,

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-05 Thread Rohan Desai
Thanks for the feedback Sophie! re1: Totally agree. The fact that it's related to the partition assignor is clear from just `task.assignor`. I'll update. re3: This is a good point, and something I would find useful personally. I think its worth adding an interface that lets the plugin observe the

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2024-04-04 Thread Sophie Blee-Goldman
Before I go into my own feedback, which I know not everyone has the time to read, I just want to announce that we plan to go over this KIP during the next Streams project sync on April 11. If you are interested in joining the discussion and need an invitation, just reach out to me in a separate

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2023-11-18 Thread Guozhang Wang
Hi Rohan, I took another look at the updated wiki page and do not have any major questions. Regarding returning a plugin object v.s. configuring a plugin object, I do not have a strong opinion except that the latter seems more consistent with existing patterns. Just curious, any other motivations

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2023-11-09 Thread Rohan Desai
Thanks for the feedback so far! I think pretty much all of it is reasonable. I'll reply to it inline: > 1. All the API logic is granular at the Task level, except the previousOwnerForPartition func. I’m not clear what’s the motivation behind it, does our controller also want to change how the

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2023-11-09 Thread Almog Gavra
Hello Rohan, Thanks for the KIP! Thoughts below (seems I have similar comments to Guozhang, but I had already written this before reading his reply haha!). They're basically all minor suggestions for improvements, I wouldn't consider any of them blocking. 1. For the API, thoughts on changing the

Re: [DISCUSS] KIP-924: customizable task assignment for Streams

2023-11-09 Thread Guozhang Wang
Hello Rohan, Thanks for the KIP! Overall it looks very nice. Just some quick thoughts : 1. All the API logic is granular at the Task level, except the previousOwnerForPartition func. I’m not clear what’s the motivation behind it, does our controller also want to change how the partitions->tasks

[DISCUSS] KIP-924: customizable task assignment for Streams

2023-11-07 Thread Rohan Desai
Hello All, I'd like to start a discussion on KIP-924 ( https://cwiki.apache.org/confluence/display/KAFKA/KIP-924%3A+customizable+task+assignment+for+Streams) which proposes an interface to allow users to plug into the streams partition assignor. The motivation section in the KIP goes into some