Thanks for the background. Was just curious about the details. I agree
that we should not add a new backoff config at this point.
-Matthias
On 12/2/22 4:47 PM, Sophie Blee-Goldman wrote:
I missed the default config values as they were put into comments...
You don't read code comments? (
> I missed the default config values as they were put into comments...
You don't read code comments? (jk...sorry, wasn't sure where the best
place for this would be, suppose I could've just included the full config
definition
About the default timeout: what is the follow up rebalance cadenc
Thanks Sophie.
Good catch on the default partitioner issue!
I missed the default config values as they were put into comments...
About the default timeout: what is the follow-up rebalance cadence (I
thought it would be 10 minutes?). For this case, a default timeout of 15
minutes would imply th
Thanks again for the responses -- just want to say up front that I realized
the concept of a
default partitioner is actually substantially more complicated than I first
assumed due to
key/value typing, so I pulled it from this KIP and filed a ticket for it
for now.
Bruno,
What exactly is the moti
Thanks for updating the KIP Sophie.
I have the same question as Bruno. How can the user use the failure
metric and what actions can be taken to react if the metric increases?
Plus a few more:
(1) Do we assume that the user can reason about `subtopology-parallelism`
metric to figure out if auto
Hi Sophie,
Thanks for the updates!
I also feel the KIP is much cleaner now.
I have one question:
What exactly is the motivation behind the metric num-autoscaling-failures?
Actually, to realise that autoscaling did not work, we only need to
monitor subtopology-parallelism over partition.autoscaling
Thanks for the feedback everyone. I went back to the drawing board with a
different guiding
philosophy: that the users of this feature will generally be fairly
advanced, and we should
give them full flexibility to implement whatever they need while trusting
them to know
what they are doing.
With t
Thanks for the KIP Sophie. Seems there is a lively discussion going on.
I tried to read up on the history and I hope I don't repeat what was
already discussed.
And sorry for the quite long email...
(1) Stateless vs Stateful
I agree that stateless apps should be supported, even if I am not su
Hi Sophie,
Thanks for the KIP. A very useful proposal!
Some questions:
1. The staticPartition method in the interface is commented out.
2. For error handling, as you can imagine, there could be errors happening
during partition expansion. That means the operation could (1) take a long
time to c
Hi Sophie,
Thank you for the KIP!
1.
I do not understand how autoscaling should work with a Streams topology
with a stateful sub-topology that reads from the input topics. The
simplest example is a topology that consists of only one stateful
sub-topology. As far as I understand the upstream p
Thanks all! I'll try to address everything but don't hesitate to call me
out if anything is missed
Colt/Lucas:
Thanks for clarifying, I think I understand your example now. Something I
didn't think to mention
earlier but hopefully clears up how this would be used in practice is that
the partition
Hey Sophie,
This looks like a very nice feature. Going through the comments, I agree
with Bill above that there could be a case for key skew, given that the
earlier partitions would keep the data they already had and also receive
more. Do you think that's a concern/side-effect that this feature coul
Hey Sophie,
Thanks for the KIP. I think this could be useful for a lot of cases. I also
think that this could cause a lot of confusion.
Just to make sure we are doing our best to prevent people from
misusing this feature, I wanted to clarify a couple of things.
1) There will be only an interface
Hi Sophie,
Thanks for the KIP! I think this is a worthwhile feature to add. I have
two main questions about how this new feature will work.
1. You mention that for stateless applications auto-scaling is a stickier
situation. But I was thinking that the auto-scaling would actually benefit
Hi all,
thanks, Sophie, this makes sense. I suppose then the way to help the user
not apply this in the wrong setting is having good documentation and one
or two examples of good use cases.
I think Colt's time-based partitioning is a good example of how to use
this. It actually doesn't have to
Sophie,
Regarding item "3" (my last paragraph from the previous email), perhaps I
should give a more general example now that I've had more time to clarify
my thoughts:
In some stateful applications, certain keys have to be findable without any
information about when the relevant data was created
Thanks for the responses guys! I'll get the easy stuff out of the way first:
1) Fixed the KIP so that StaticStreamPartitioner extends StreamPartitioner
2) I totally agree with you Colt, the record value might have valuable (no
pun) information
in it that is needed to compute the partition without
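To make the interface discussion above concrete, here is a hedged sketch of what a StaticStreamPartitioner extending StreamPartitioner could look like. The stand-in StreamPartitioner interface, the staticPartition method shape, and the region-based example are illustrative assumptions pieced together from this thread, not the final KIP API (Kafka's real interface lives in org.apache.kafka.streams.processor and is not reproduced here).

```java
// Hypothetical sketch only: minimal stand-in for Kafka Streams'
// StreamPartitioner, plus a StaticStreamPartitioner sub-interface as
// discussed in the thread. Names and signatures are assumptions.

// Minimal stand-in for org.apache.kafka.streams.processor.StreamPartitioner.
interface StreamPartitioner<K, V> {
    Integer partition(String topic, K key, V value, int numPartitions);
}

// A "static" partitioner ignores numPartitions, so a key's partition never
// changes when the topic is expanded; the value is passed through because,
// as noted above, it may carry information needed to compute the partition.
interface StaticStreamPartitioner<K, V> extends StreamPartitioner<K, V> {
    int staticPartition(String topic, K key, V value);

    @Override
    default Integer partition(String topic, K key, V value, int numPartitions) {
        return staticPartition(topic, key, value); // numPartitions deliberately unused
    }
}

public class StaticPartitionerSketch {
    public static void main(String[] args) {
        // Example: route by a fixed "region" field in the value, not by key hash.
        StaticStreamPartitioner<String, Integer> byRegion =
            (topic, key, regionId) -> regionId % 4;

        // Same result no matter how many partitions the topic currently has:
        System.out.println(byRegion.partition("orders", "user-1", 6, 4));  // 2
        System.out.println(byRegion.partition("orders", "user-1", 6, 16)); // 2
    }
}
```

The point of the default method is that user code only ever implements the partition-count-independent staticPartition, which is what makes expansion safe.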
Sophie,
Thank you for your detailed response. That makes sense (one partition per
user seems like a lot of extra metadata if you've got millions of users,
but I'm guessing that was just for illustrative purposes).
In this case I'd like to question one small detail in your kip. The
StaticPartition
Hi Sophie,
This looks like a good improvement (given my limited knowledge, at least).
As I understand it, in the subset of use cases where it can be used, it
will make scaling up the #partitions basically frictionless.
Three questions, and forgive me if something doesn't make sense at all:
1) Fr
Thanks for your questions, I would say that your understanding sounds
correct based
on what you described but I'll try to add some clarity. The basic idea is
that, as you said,
any keys that are processed before time T will go to partition 1. All of
those keys should
then continue to be routed to p
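The time-based routing idea described above can be sketched as follows. The epoch table, timestamps, and partition counts are all hypothetical illustrations of the principle that a key routes using the partition count in effect when it was created, so pre-expansion keys never move; none of this is taken from the KIP itself.

```java
import java.util.List;

// Hypothetical sketch of the time-based scheme discussed above: each
// expansion records a (cutoverTime, partitionCount) "epoch". A key routes
// using the partition count that was in effect at its creation time, so keys
// from before an expansion keep their original partition forever. The epoch
// table and its values are illustrative assumptions only.
public class TimeBasedStaticPartitioning {

    // (cutoverTimestamp, partitionCount) pairs, oldest first.
    record Epoch(long cutoverTime, int partitionCount) {}

    static final List<Epoch> EPOCHS = List.of(
        new Epoch(0L, 4),     // app started with 4 partitions
        new Epoch(1_000L, 8)  // expanded to 8 at t=1000
    );

    // Partition on the count that was in effect at the key's creation time.
    static int partitionFor(long keyCreationTime, int keyHash) {
        int count = EPOCHS.get(0).partitionCount();
        for (Epoch e : EPOCHS) {
            if (keyCreationTime >= e.cutoverTime()) count = e.partitionCount();
        }
        return Math.floorMod(keyHash, count);
    }

    public static void main(String[] args) {
        // A key created before the expansion always uses the old count of 4...
        System.out.println(partitionFor(500L, 42));   // 42 % 4 = 2
        // ...while a key created after it can land on any of the 8 partitions.
        System.out.println(partitionFor(2_000L, 42)); // 42 % 8 = 2
        System.out.println(partitionFor(2_000L, 45)); // 45 % 8 = 5
    }
}
```

This is one way to satisfy the static constraint: routing depends only on immutable properties of the key (its creation time and hash), never on the current partition count.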
Sophie,
Thank you for the KIP! Choosing the number of partitions in a Streams app
is a tricky task because of how difficult it is to re-partition; I'm glad
you're working on an improvement. I've got two questions:
First, `StaticStreamsPartitioner` is an interface that we (Streams users)
must impl
Hey all,
I'd like to propose a new autoscaling feature for Kafka Streams
applications which can follow the constraint of static partitioning. For
further details please refer to the KIP document:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-878%3A+Autoscaling+for+Statically+Partitioned+S