Re: [DISCUSS] KIP-759: Unneeded repartition canceling

Matthias J. Sax Fri, 22 Apr 2022 16:43:15 -0700

Ivan,

are you still interested in this KIP? I think it would be a good addition.



-Matthias

On 8/16/21 5:30 PM, Matthias J. Sax wrote:

Your point about the IQ problem is an interesting one. I missed the
point that the "new key" would be a "superkey", and thus, it should
always be possible to compute the original key from the superkey. (As a
matter of fact, for windowed-table the windowed-key is also a superkey...)

I am not sure if we need to follow the "use the head idea" or if we need
a "CompositeKey" interface? It seems we can just allow for any types and
we can be agnostic to it?

KStream<K, V> stream = ...
KStream<SK, V> stream2 =
   stream.selectKey(/*set superkey*/)
         .markAsPartitioned()

We only need a `Function<SK, K>` without any restrictions on the type,
to map the "superkey" to the original "partition key"?


Do you propose to provide the "revers mapper" via the
`markAsPartitioned()` method (or config object), or via the IQ methods?
Not sure which one is better?


However, I am not sure if it would solve the join problem? At least not
easily: if one has two KStream<Tuple,...> and one is properly
partitioned by `Tuple` while the other one is "marked-as-partitoned",
the join would just fail. -- Similar for a stream-table join. -- The
only fix would be to do the re-partitioning anyway, effectively ignoring
the "user hint", but it seems to defeat the purpose? Again, I would
argue that it is ok to not handle this case, but leave it as the
responsibility for the user to not mess it up.


-Matthias

On 8/9/21 2:32 PM, Ivan Ponomarev wrote:

Hi Matthias and Sophie!

==1.  markAsPartitioned vs extending `selectKey(..)` etc. with a config.==

I don't have a strong opinion here, both Sophie's and Matthias' points
look convincing for me.

I think we should estimate the following: what is the probability that
we will ever need to extend `selectKey` etc. with a config for the
purposes other than `markAsPartitioned`?

If we find this probability high, then it's just a refactoring to
deprecate overloads with `Named` and introduce overloads with dedicated
configs, and we should do it this way.

If it's low or zero, maybe it's better not to mess with the existing
APIs and to introduce a single `markAsPartitioned()` method, which
itself can be easily deprecated if we find a better solution later!


==2. The IQ problem==

it then has to be the case that

Partitioner.partition(key) == Partitioner.partition(map(key))



Sophie, you got this wrong, and Matthias already explained why.

The actual required property for the mapping function is:

\forall k1, k2 (map(k1) = map(k2) => partition(k1) = partition(k2))

or, by contraposition law,

\forall k1, k2 (partition(k1) =/= partition(k2) => map(k1) =/= map(k2) )


(look at the whiteboard photo that I attached to the KIP).

There is a big class of such mappings: key -> Tuple(key, anyValue). This
is actually what we often do before aggregation, and this mapping does
not require repartition.

But of course we can extract the original key from Tuple(key, anyValue),
and this can save IQ and joins!

This is what I'm talking about when I talk about 'CompositeKey' idea.

We can do the following:

1. implement a 'partitioner wrapper' that recognizes tuples
(CompositeKeys) and uses only the 'head' to calculate the partition,

2. implement

selectCompositeKey(BiFunction<K, V> tailSelector) {
   selectKey((k, v) -> new CompositeKey(k, tailSelector.apply(k, v));
   //MARK_AS_PARTITIONED call here,
   //but this call is an implementation detail and we do not expose
   //markAsPartitioned publicly!
}

WDYT? (it's just a brainstorming idea)

09.08.2021 2:38, Matthias J. Sax пишет:

Hi,

I originally had a similar thought about `markAsPartitioned()` vs
extending `selectKey()` et al. with a config. While I agree that it
might be conceptually cleaner to use a config object, I did not propose
it as the API impact (deprecating stuff and adding new stuff) is quite
big... If we think it's an acceptable price to pay, I am ok with it
though.

I also do think, that `markAsPartitioned()` could actually be
categorized as an operator... We don't expose it in the API as
first-class citizen atm, but in fact we have two types of `KStream` -- a
"PartitionedKStream" and a "NonPartitionedKStream". Thus,
`markAsPartitioned()` can be seen as a "cast operator" that converts the
one into the other.

I also think that the raised concern about "forgetting to remove
`markAsPartitioned()`" might not be very strong though. If you have
different places in the code that link stuff together, a call to eg.
`selectKey().markAsPartitioned()` must always to together. If you have
some other place in the code that get a `KStream` passed an input, it
would be "invalid" to blindly call `markAsPartitioned()` as you don't
know anything about the upstream code. Of course, it requires some
"coding discipline" to follow this pattern... Also, you can shoot
themselves into the foot if they want with the config object pattern,
too: if you get a `KStream` passed in, you can skip repartitioning via
`selectKey((k,v) -> k, Config.markAsPartitioned())`. -- Thus, I still
slightly prefer to add `markAsPartitioned()` as an operator.

(Maybe we should have expose a `PartitionedKStream` as first class
object to begin with... Hard to introduce now I guess...)


The concern about IQ is interesting -- I did not realize this impact.
Thanks for bringing it up.

a repartition would be a no-op, ie that the stream (and its
partitioning)
would be the same
whether or not a repartition is inserted. For this to be true, it
then has
to be the case that

Partitioner.partition(key) == Partitioner.partition(map(key))


@Sophie: I don't think this statement is correct. A `markAsPartition()`
only means, that the existing partitioning ensure that all messages of
the same new key are still in the same partition. Ie, it cannot happen
that two new keys (that are the same) are in a different partition.

However, if you would physically repartitiong on the new key using the
same hash-function as for the old key, there is no guarantee that the
same partitions would be picked... And that is why IQ breaks downstream.

Btw: using `markAsPartitioned()` could also be an issue for joins for
the same reason... I want to call out, that the Jira tickets that did
raise the concern about unnecessary repartitioning are about downstream
aggregations though...

Last but not least: we actually have a similar situation for
windowed-aggregations: The result of a window aggregation is partitioned
by the "plain key": if we write the result into a topic using the same
partitioning function, we would write to different partitions... (I
guess it was never an issue so far, as we don't have KIP-300 in place
yet...)

It's also not an issue for IQ, because we know the plain key, and thus
can route to the right task.


About a solution: I think it might be ok to say we don't need to solve
this problem, but it's the users responsibility to take IQ into account.
Ie, if they want to use IQ downstream, the need to repartition: for this
case, repartitioning is _NOT_ unnecessary... The same argument seems to
apply for the join case I mentioned above. -- Given that
`markAsPartitioned()` is an advanced feature, it seems ok to leave it to
the user to use correctly (we should of course call it out in the docs!).



-Matthias



On 8/7/21 7:45 PM, Sophie Blee-Goldman wrote:

Before I dive in to the question of IQ and the approaches you
proposed, can
you just
elaborate on the problem itself? By definition, the `markAsPartitioned`
flag means that
a repartition would be a no-op, ie that the stream (and its
partitioning)
would be the same
whether or not a repartition is inserted. For this to be true, it
then has
to be the case that

Partitioner.partition(key) == Partitioner.partition(map(key))

The left-hand side of the above is precisely how we determine the
partition
number that
a key belongs to when using IQ. It shouldn't matter whether the user is
querying a key
in a store upstream of the key-changing operation or a mapped key
downstream of it
-- either way we just apply the given Partitioner.

See StreamsMetadataState#getKeyQueryMetadataForKey
<https://github.com/apache/kafka/blob/6854eb8332d6ef1f1c6216d2f67d6e146b1ef60f/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamsMetadataState.java#L283>

for where this happens


If we're concerned that users might try to abuse the new
`markAsPartitioned` feature,
or accidentally misuse it, then we could add a runtime check that
applies
the Partitioner
associated with that subtopology to the key being processed and the
mapped
key result
to assert that they do indeed match. Imo this is probably overkill, just
putting it out there.

On Sat, Aug 7, 2021 at 1:42 PM Ivan Ponomarev
<[email protected]>
wrote:

Hi Sophie,

thanks for your reply! So your proposal is:

1). For each key-changing operation, deprecate the existing overloads
that accept a Named, and replace them with overloads that take an
operator-specific config object.
2). Add `markAsPartitioned` flag to these configs.

IMO, this looks much better than the original proposal, I like it very
much and I think I will rewrite the KIP soon. I absolutely agree with
your points. Repartition logic is not a part of the public contract,
and
it's much better to give it correct hints instead of telling explicitly
what it should do.

...

Since we're generating such bright ideas, maybe we should also
brainstorm the interactive query problem?

The problem is that interactive queries will not work properly when
`markAsPartitioned` is used. Although original key and mapped key will
be in the same partition, we will no longer be able to guess this
partition given the mapped key only.

The possible approaches are:

1) Give up and don't use interactive queries together with
`markAsPartitioned`. This is what I suppose now. But can we do better?

2) Maybe we should ask the user to provide 'reverse mapping' that will
allow IQ to restore the original key in order to choose the correct
partition. We can place this mapping in our new configuration
object. Of
course, there is no way for KStreams to verify in compile time/startup
time that the this function is actually the reverse mapping that
extract
the old key from the new one. Users will forget to provide this
function. Users will provide wrong functions. This all looks too
fragile.

3) Maybe there can be a completely different approach. Let's
introduce a
new entity -- composite keys, consisting of "head" and "tail". The
partition for the composite key is calculated based on its 'head' value
only. If we provide a key mapping in form key -> CompositeKey(key,
tail), then it's obvious that we do not need a repartition. When an
interactive query needs to guess the partition for CompositeKey, it
just
extracts its head and calculates the correct partition.

We can select CompositeKey before groupByKey() and aggregation
operations, and this will not involve repartition. And IQ will work.

Is it too daring idea, WDYT? My concern: will it cover all the cases
when we want to choose a different key, but also avoid repartition?

Regards,

Ivan



06.08.2021 23:19, Sophie Blee-Goldman пишет:

Hey Ivan

I completely agree that adding it as a config to Grouped/Joined/etc
isn't
much better, I was just
listing it for completeness, and that I would prefer to make it a
configuration of the key-changing
operation itself -- that's what I meant by

a better alternative might be to introduce this ... to the config
object

of

the operator that's actually


doing the key changing operation


I personally believe this is the semantically "correct" way to
approach
this, since "preserves partitioning"
or "does not preserve partitioning" is a property of a key-changing
operation and not an operation on the
stream itself. Also, this way the user need only tell Streams which
operations do or do not preserve the
partitioning, and Streams can figure out where to insert a
repartition in
the topology as it does today.

Otherwise, we're rendering this particularly useful feature of the
DSL --
automatic repartitioning -- pretty
much useless, since the user now has to figure out whether a
repartition

is

needed. On top of that, they
need to have some understanding of where and when this internal
automatic
repartitioning logic is going
to insert that repartition in order to cancel it in the appropriate

place.

Which is pretty unfortunate, since
that logic is not part of the public contract: it can change at any
time,
for example as it did when we introduced
the repartition merging optimization.

All that said, those are valid concerns regarding the expansion of the
API's surface area. Since none of
the key-changing operations currently have a config object like some

other

operations (for example Grouped
or Consumed, etc), this would double the number of overloads. But
maybe
this is a good opportunity to fix
that problem, rather than keep digging ourselves into holes by
trying to
work around it.

It looks like all of those key-changing operations have two
overloads at
the moment, one with no parameters
beyond the operation itself (eg KeyValueMapper for #selectKey) and the
other with an additional Named
parameter, which is itself another kind of configuration. What if we
instead deprecate the existing overloads
that accept a Named, and replace them with overloads that take an
operator-specific config object like we do
elsewhere (eg Grouped for #groupByKey). Then we can have both Named
and
this  `markAsPartitioned` flag
be part of the general config object, which (a) does not expand the
API
surface area at all in this KIP, and (b)
also protects future KIPs from needing to have this same conversation

over

and over, because we can now
stick any additional operator properties into that same config object.

WDYT? By the way, the above idea (introducing a single config
object to
wrap all operator properties) was also
raised by John Roesler a while back. Let's hope he hasn't changed his

mind

since then :)


On Fri, Aug 6, 2021 at 3:01 AM Ivan Ponomarev
<[email protected]

wrote:

Hi Matthias,

Concerning the naming: I like `markAsPartitioned`, because it
describes
what this operation is actually doing!

Hi Sophie,

I see the concern about poor code cohesion. We declare key mapping in
one place of code, then later in another place we say
"markAsPartitioned()". When we change the code six months later, we
might forget to remove markAsPartitioned(), especially if it's
placed in
another method or class. But I don't understand why do you propose to
include this config into Grouped/Joined/StreamJoined, because from
this
point of view it's not a better solution?

The best approach regarding the cohesion might be to to add an extra
'preservePartition' flag to every key-changing operation, that is

1) selectKey
2) map
3) flatMap
4) transform
5) flatTransform

in order to tell if the provided mapping require repartition or not.
Indeed, this is a mapping operation property, not grouping one!
BTW: the
idea of adding extra parameter to `selectKey` was once coined by John
Roesler.

Arguments in favour for this approach: 1) better code cohesion
from the
point of view of the user, 2) 'smarter' code (the decision is taken
depending on metadata provided for all the upstream mappings), 3)
overall safer for the user.

Arguments against: invasive KStreams API change, 5 more method
overloads. Further on, when we add a new key-changing operation to
KStream, we must add an overloaded version with 'preservePartition'.
When we add a new overloaded version for existing operation, we
actually
might need to add two or more overloaded versions. This will soon
become
a mess.

I thought that since `markAsPartitioned` is intended for advanced
users,
they will use it with care. When you're in a position where every
serialization/deserialization round matters for the latency, you're
extremely careful with the topology and you will not thoughtlessly
add
new key-changing operations without controlling how it's going to
change
the overall topology.

By the way, if we later find a better solution, it's way more easy to
deprecate a single `markAsPartitioned` operation than 5 method

overloads.


What do you think?




04.08.2021 4:23, Sophie Blee-Goldman пишет:

Do we really need a whole DSL operator for this? I think the
original

name

for this
operator -- `cancelRepartition()` -- is itself a sign that this
is not

an

operation on the
stream itself but rather a command/request to whichever operator
would

have

otherwise triggered this repartition.

What about instead adding a new field to the

Grouped/Joined/StreamJoined

config
objects that signals them to skip the repartitioning?

The one downside to this specific proposal is that you would then
need

to

specify
this for every stateful operation downstream of the key-changing

operation.

So a
better alternative might be to introduce this `skipRepartition`
field,

or

whatever we
want to call it, to the config object of the operator that's
actually

doing

the key
changing operation which is apparently preserving the partitioning.

Imo this would be more "safe" relative to the current proposal,
as the

user

has to
explicitly consider whether every key changing operation is indeed
preserving the
partitioning. Otherwise you could code up a topology with several
key
changing
operations at the beginning which do require repartitioning. Then
you

get

to the end
of the topology and insert one final key changing operation that

doesn't,

assume
you can just cancel the repartition, and suddenly you're
wondering why

your

results
are all screwed up

On Tue, Aug 3, 2021 at 6:02 PM Matthias J. Sax <[email protected]>

wrote:

Thanks for the KIP Ivan!

I think it's a good feature to give advanced users more control,
and
allow them to build more efficient application.

Not sure if I like the proposed named though (the good old "naming
things" discussion :))

Did you consider alternatives? What about

     - markAsPartitioned()
     - markAsKeyed()
     - skipRepartition()

Not sure if there are other idea on a good name?



-Matthias

On 6/24/21 7:45 AM, Ivan Ponomarev wrote:

Hello,

I'd like to start a discussion for KIP-759:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-759%3A+Unneeded+repartition+canceling



This is an offshoot of the discussion of KIP-655 for a `distinct`
operator, which turned out to be a separate proposal.

The proposal is quite trivial, however, we still might consider
alternatives (see 'Possible Alternatives' section).

Regards,

Ivan

Re: [DISCUSS] KIP-759: Unneeded repartition canceling

Reply via email to