Hi Bruno,
Thanks for the feedback, that makes sense.
I’ve updated the KIP based on the suggestions [1].
Best,
Levani
[1] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-708%3A+Rack+awarness+for+Kafka+Streams
> On 9. Mar 2021, at 11:48, Bruno Cadonna wrote:
>
> Hi Levani,
>
> The KIP
Hi Levani,
The KIP looks good!
I have two comments:
1. In the example of the ideal standby task distribution, you should
make clear that the algorithm will either choose distributions Node-1,
Node-5, Node-9 or Node-1, Node-6, Node-8, but not both.
2. Could you formulate a bit more generic
Hello all,
Bumping this thread in case there’s any other feedback around KIP-708 [1].
If not, I will start a voting thread sometime this week.
Best,
Levani
[1] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-708%3A+Rack+awarness+for+Kafka+Streams
Hi Bruno,
Thanks a lot for the feedback.
I’ve updated the KIP [1] based on the suggestions.
Regards,
Levani
[1] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-708%3A+Rack+awarness+for+Kafka+Streams
> On 1. Mar 2021, at 22:55, Bruno Cadonna wrote:
>
> clientTagPrefix
Thanks, Levani, for the update on the KIP!
The KIP looks good!
I have just a couple of comments.
1. Could you try to formulate "The ideal distribution means there is no
repeated client dimension amongst clients assigned to the active task
and all standby tasks." a bit differently. I find it a
Thanks Levani for the explanation. I think I understand.
Is "rack" still a useful term in this context? I think my concept of "rack"
made it hard for me to wrap my head around the multiple tags approach. For
example, how can a node be in different racks at the same time? And why
would multiple
Hello Ryanne,
Thanks for the question.
The tag approach gives more flexibility, which would otherwise only be
possible with pluggable custom logic that the Kafka Streams user must provide
(briefly described in the "Rejected Alternatives" section).
For instance, if we append multiple tags to form
I guess I don't understand how multiple tags work together to achieve rack
awareness. I realize I could go look at how Elasticsearch works, but ideally
this would be more plain in the KIP.
In particular I'm not sure how the tag approach is different than appending
multiple tags together, e.g. how
Hi Bruno,
Thanks for the feedback. I think it makes sense.
I’ve updated the KIP [1] and tried to omit implementation details around the
algorithm.
Please let me know if the latest version looks OK.
Regards,
Levani
[1]
Hi Levani,
I discussed your KIP with John the other day and we both think it is a
really interesting KIP and you did a good job in writing it. However, we
think that the KIP exposes too many implementation details. That makes
future changes to the implementation of the distribution algorithm
Hi Bruno,
Thanks for the quick reply.
5. Sorry, maybe I am not making it clear.
What you have described is how it should work, yes. As stated in the KIP,
with the importance semantics in standby.replicas.awareness,
if we have an active task on Node-1 and the first standby task on Node-5, the
Hi Levani,
Thanks for the modifications!
I have some follow up questions/comments:
5. Something is not clear to me. If the active is on Node-1 and the
first replica is on Node-5 (different cluster, different zone), why
would the second replica go to Node-4 that has a different cluster than
Hi Bruno,
Thanks for the feedback. Please check my answers below:
1. No objections; sounds good. Updated KIP
2. No objections; sounds good. Updated KIP
3. Thanks for the information; I can change the KIP to expose only a prefix
method instead of a constant, if that’s the way forward.
4. Done. Updated
Hi Levani,
Thank you for the KIP.
Really interesting!
Here are my comments:
1. To be consistent with the other configs that involve standbys, I
would rename
standby.task.assignment.awareness -> standby.replicas.awareness
2. I would also rename the prefix
instance.tag -> client.tag
3. The
Hello all,
I’ve updated KIP-708 [1] to reflect the latest discussion outcomes.
I’m looking forward to your feedback.
Regards,
Levani
[1] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-708%3A+Rack+awarness+for+Kafka+Streams
> On 2. Feb 2021, at 22:03, Levani Kokhreidze wrote:
>
>
Hi John.
Thanks a lot for this detailed analysis!
Yes, that is what I had in mind as well.
I also like the idea of having a “task.assignment.awareness” configuration
to tell which instance tags can be used for rack awareness.
I may borrow it for this KIP if you don’t mind :)
Thanks again John
Hi Levani,
1. Thanks for the details.
I figured it must be something like this two-dimensional definition of "rack".
It does seem like, if we make the config take a list of tags, we can define
the semantics to be that the system will make a best effort to distribute
the standbys over each rack
Hi John,
1. The main reason was that it seemed an easier change compared to having
multiple tags assigned to each host.
---
To answer your question about what use case I have in mind:
Let’s say we have two Kubernetes clusters running the same Kafka Streams
application.
And each Kubernetes cluster is
Hello Levani,
Thanks for the reply.
1. Interesting; why did you change your mind?
I have a gut feeling that we can achieve pretty much any rack awareness need
that people have by using purely config, which is obviously much easier to use.
But if you had a case in mind where this wouldn’t
Hi John,
Thanks a lot for the thorough feedback, it’s really valuable.
1. Agree with this. Had the same idea initially.
We can set some upper limit in terms of what’s
the max number of tags users can set to make
sure it’s not overused. By default, we can create
standby tasks where tags are
Thanks, Levani!
I was reflecting more on your KIP last night.
One thing I should mention is that I have previously used
the rack awareness feature of Elasticsearch, and found it to
be pretty intuitive and also capable of what we needed in
our AWS clusters. As you look at related work, you might
Hi John
Thanks for the feedback (and for the great work on KIP441 :) ).
Makes sense; I will add a section to the KIP explaining rack awareness at a
high level and how it’s implemented in different distributed systems.
Thanks,
Levani
> On 27. Jan 2021, at 16:07, John Roesler wrote:
>
> Hi
Hi Levani,
Thanks for this KIP! I think this is really high value; it was something I was
disappointed I didn’t get to do as part of KIP-441.
Rack awareness is a feature provided by other distributed systems as well. I
wonder if your KIP could devote a section to summarizing what rack
Hello all,
I’d like to start a discussion on KIP-708 [1], which aims to introduce
rack-aware standby task distribution in Kafka Streams.
In addition to the changes mentioned in the KIP, I’d like to get some ideas on
an additional change I have in mind.
Assuming KIP moves forward, I was wondering if it