Thanks guys. Updated the error codes in both the code and the explanation
under "Public Changes". To sum up, here are the error codes listed in the
KIP:
enum AssignmentError {
NONE,
ACTIVE_TASK_ASSIGNED_MULTIPLE_TIMES,
ACTIVE_AND_STANDBY_TASK_ASSIGNED_TO_SAME_KAFKASTREAMS,
117f: Good point by Bruno. We should check for this, and could have an
additional `INVALID_STANDBY_TASK` error code?
-Matthias
On 5/3/24 5:52 AM, Guozhang Wang wrote:
Hi Sophie,
Re: As for the return type of the TaskAssignmentUtils, I think that
makes sense. LGTM.
On Fri, May 3, 2024 at
Hi Sophie,
Re: As for the return type of the TaskAssignmentUtils, I think that
makes sense. LGTM.
On Fri, May 3, 2024 at 2:26 AM Bruno Cadonna wrote:
>
> Hi Sophie,
>
> 117f:
> I think, removing the STATEFUL and STATELESS types is not enough to
> avoid the error Guozhang mentioned. The
Hi Sophie,
117f:
I think, removing the STATEFUL and STATELESS types is not enough to
avoid the error Guozhang mentioned. The StreamsPartitionAssignor passes
the information whether a task is stateless or stateful into the task
assignor. However, the task assignor can return a standby task for
Thanks Sophie. My bad. You are of course right about `TaskAssignment`
and the StreamsPartitionAssignor's responsibitliy to map tasks of a
instance to consumers. When I wrote my reply, I forgot about this detail.
Seems you did not add `UNKNOWN_TASK_ID` error yet as proposed by Guozhang?
Guozhang:
117. All three additions make sense to me. However, while thinking about
how users would actually produce an assignment, I realized that it seems
silly to make it their responsibility to distinguish between a stateless
and stateful task when they return the assignment. The
Matthias:
Thanks for the naming suggestions for the error codes. I was definitely not
happy with my original naming but couldn't think of anything better. I
like your proposals though, will update the KIP names. I'll also add a
"NONE" option as well -- much better than just passing in null for
Jumping back to the party here :)
107: I agree with the rationale behind this, and
`numProcessingThreads` looks good to me as it covers both the current
and future scenarios.
117: I agree with Lucas and Bruno, and would add:
* 117e: unknown taskID: fail
* 117f: inconsistent task types (e.g.
I like the idea of error codes. Not sure if the name are ideal?
UNKNOWN_PROCESS_ID makes sense, but the other two seems a little bit
difficult to understand?
Should we be very descriptive (and also try to avoid coupling it to the
threading model -- important for the first error code):
-
One last thing: I added an error code enum to be returned from the
#onAssignmentComputed method in case of an invalid assignment. I created
one code for each of the invalid cases we described above. The downside is
that this means we'll have to go through a deprecation cycle if we want to
loosen
Thanks guys! I agree with what Lucas said about 117c, we can always loosen
a restriction later and I don't want to do anything now that might get in
the way of the new threading models.
With that I think we're all in agreement on 117. I'll update the KIP to
include what we've discussed
(and will
Hi again,
I forgot to ask whether you could add the agreement about handling
invalid assignment to the KIP.
Best,
Bruno
On 4/30/24 2:00 PM, Bruno Cadonna wrote:
Hi all,
I think we are converging!
117
a) fail: Since it is an invalid consumer assignment
b) pass: I agree that not assigning a
Hi all,
I think we are converging!
117
a) fail: Since it is an invalid consumer assignment
b) pass: I agree that not assigning a task might be reasonable in some
situations
c) fail: For the reasons Lucas pointed out. I am missing a good use case
here.
d) fail: It is invalid
Somewhere in
Hi,
Looks like a great KIP to me!
I'm late, so I'm only going to comment on the last open point 117. I'm
against any fallbacks like "use the default assignor if the custom
assignment is invalid", as it's just going to hide bugs. For the 4
cases mentioned by Sophie:
117a) I'd fail immediately
Yeah I think that sums it up well. Either you computed a *possible* assignment,
or you returned something that makes it literally impossible for the
StreamsPartitionAssignor to decipher/translate into an actual group
assignment, in which case it should just fail
That's more or less it for the
117: as Sophie laid out, there are two cases here right:
1. cases that are considered invalid by the existing assignors but are
still valid assignments in the sense that they can be used to generate a
valid consumer group assignment (from the perspective of the consumer group
protocol). An
Thanks for the details!
107: I like `numProcessingThreads()` proposal.
108(a): it was not really a concern, but it was rather a question if we
could/should simplify it, so it's easier to implement a custom task
assignor. But if we believe that it's an integral component, I am fine
with
104. Fair enough -- also happy to defer to Rohan on this (or Bruno if he
feels super strongly)
107. That's a good point . Ultimately the task load should reflect the
processing capacity, and that's something that will exist in both the new
and old threading model. I like #processingCapacity for
104: I also don't feel super strong about it. Not sure if
`onAssignment()` might overload the name in a confusing way? In the end,
when the method is called, we don't assign anything? -- Guess, I am fine
with whatever Rohan picks as a name from the suggestions we have so far.
107: Did not
Now to respond to Matthias:
FYI, I'm following the numbering scheme from your email but added to
mark responses with further questions or feedback and/or aren't yet
addressed in the KIP and need to be followed up on. You can more or less
just skip over the ones without stars to save time
Responding to Bruno first:
(1) I actually think "KafkaStreams" is exactly right here -- for the reason
you said, ultimately this is describing a literal instance of the
"KafkaStreams" class. Glad we hashed this out! (I saw that Rohan went with
StreamsClient but i also prefer KafkaStreams)
(4)
One more thing. It might be good to clearly call out, which interfaced a
user would implement, vs the other ones Kafka Streams implements and
TaskAssignor only uses.
My understanding is, that users would implement `TaskAssignor`,
`TaskAssignment`, and `StreamsClientAssignment`.
For
Great KIP. I have some minor comments/questions:
100 The KIP says: "In the future, additional plugins can use the same
partition.assignor prefix". What does this mean?
101 (nit) The KIP says: "Note that the thread-level assignment will
remain an un-configurable internal implementation
Bruno, I've incorporated your feedback into the KIP document.
On Fri, Apr 19, 2024 at 3:55 PM Rohan Desai wrote:
> Thanks for the feedback Bruno! For the most part I think it makes sense,
> but leaving a couple follow-up thoughts/questions:
>
> re 4: I think Sophie's point was slightly
Thanks for the feedback Bruno! For the most part I think it makes sense,
but leaving a couple follow-up thoughts/questions:
re 4: I think Sophie's point was slightly different - that we might want to
wrap the return type for `assign` in a class so that its easily extensible.
This makes sense to
Hi Sophie,
Thanks for the clarifications!
(1)
What about replacing Node* with KafkaStreams* or StreamsClient*? I
prefer KafkaStreams* since that class represents the Kafka Streams
client. I am also fine with KafkaStreamsClient*. I really would like to
avoid introducing a new term in Kafka
Thanks Bruno! I can provide a bit of context behind some of these
decisions but I just want to say up front that I agree with every single one
of your points, though I am going to push back a bit on the first one.
[1] The idea here is to help avoid some confusion around the overloaded
term
Hi,
sorry, I am late to the party.
I have a couple of comments:
(1)
I would prefer Client* instead of Node* in the names. In Kafka Streams
we do not really have the concept of node but we have the concept of
client (admittedly, we sometimes also use instance). I would like to
avoid
Cool, looks good to me!
Seems like there is no further feedback, so maybe we can start to call for
a vote?
However, since as noted we are setting aside time to discuss this during
the sync next Thursday, we can also wait until after that meeting to
officially kick off the vote.
On Fri, Apr 5,
Thanks for the feedback Sophie!
re1: Totally agree. The fact that it's related to the partition assignor is
clear from just `task.assignor`. I'll update.
re3: This is a good point, and something I would find useful personally. I
think its worth adding an interface that lets the plugin observe the
Before I go into my own feedback, which I know not everyone has the time to
read, I just want to announce that we plan to go over this KIP during the
next Streams project sync on April 11. If you are interested in joining the
discussion and need an invitation, just reach out to me in a separate
Hi Rohan,
I took another look at the updated wiki page and do not have any major
questions. Regarding returning a plugin object v.s. configuring a
plugin object, I do not have a strong opinion except that the latter
seems more consistent with existing patterns. Just curious, any other
motivations
Thanks for the feedback so far! I think pretty much all of it is
reasonable. I'll reply to it inline:
> 1. All the API logic is granular at the Task level, except the
previousOwnerForPartition func. I’m not clear what’s the motivation behind
it, does our controller also want to change how the
Hello Rohan,
Thanks for the KIP! Thoughts below (seems I have similar comments to
Guozhang, but I had already written this before reading his reply haha!).
They're basically all minor suggestions for improvements, I wouldn't
consider any of them blocking.
1. For the API, thoughts on changing the
Hello Rohan,
Thanks for the KIP! Overall it looks very nice. Just some quick thoughts :
1. All the API logic is granular at the Task level, except the
previousOwnerForPartition func. I’m not clear what’s the motivation
behind it, does our controller also want to change how the
partitions->tasks
Hello All,
I'd like to start a discussion on KIP-924 (
https://cwiki.apache.org/confluence/display/KAFKA/KIP-924%3A+customizable+task+assignment+for+Streams)
which proposes an interface to allow users to plug into the streams
partition assignor. The motivation section in the KIP goes into some
36 matches
Mail list logo