mjsax commented on code in PR #13496:
URL: https://github.com/apache/kafka/pull/13496#discussion_r1163195888
##########
streams/src/main/java/org/apache/kafka/streams/kstream/internals/KTableKTableInnerJoin.java:
##########
@@ -153,11 +154,37 @@ public void init(final ProcessorContext<?, ?> context) {
@Override
public ValueAndTimestamp<VOut> get(final K key) {
- final ValueAndTimestamp<V1> valueAndTimestamp1 = valueGetter1.get(key);
+ return computeJoin(key, valueGetter1::get, valueGetter2::get);
+ }
+
+ @Override
+ public ValueAndTimestamp<VOut> get(final K key, final long asOfTimestamp) {
Review Comment:
> The first option is nice in that now stream-(table-table) and
> (stream-table)-table joins with no intermediate materialization produce the
> same results,
But are both joins really the same if the intermediate table-table result is
not materialized? Semantically, the intermediate table-table result is a
non-versioned store, and thus we cannot do a lookup into the history of it, ie,
we have a stream-tsTable join. The second query is two `stream-vTable` joins so
it seems ok if they produce different results?
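To make the difference between the two lookups concrete, here is a minimal toy sketch. `TsTable` and `VTable` are hypothetical stand-ins invented for illustration, not Kafka Streams classes: the first remembers only the latest value per key, the second keeps the history and can answer an as-of lookup.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: these classes are hypothetical and are not Kafka Streams APIs.
final class LookupSemanticsSketch {

    /** Non-versioned (timestamped) table: remembers only the latest value per key. */
    static final class TsTable {
        private final Map<String, String> latest = new HashMap<>();

        void put(final String key, final String value) {
            latest.put(key, value);
        }

        String get(final String key) {
            return latest.get(key);
        }
    }

    /** Versioned table: keeps the per-key history, ordered by timestamp. */
    static final class VTable {
        private final Map<String, TreeMap<Long, String>> history = new HashMap<>();

        void put(final String key, final String value, final long timestamp) {
            history.computeIfAbsent(key, k -> new TreeMap<>()).put(timestamp, value);
        }

        /** Returns the value that was valid at asOfTimestamp, or null if there is none. */
        String get(final String key, final long asOfTimestamp) {
            final TreeMap<Long, String> versions = history.get(key);
            if (versions == null) {
                return null;
            }
            final Map.Entry<Long, String> entry = versions.floorEntry(asOfTimestamp);
            return entry == null ? null : entry.getValue();
        }
    }

    public static void main(final String[] args) {
        final TsTable tsTable = new TsTable();
        final VTable vTable = new VTable();
        tsTable.put("k", "a");
        vTable.put("k", "a", 10L);
        tsTable.put("k", "b");
        vTable.put("k", "b", 20L);

        // A stream record for key "k" with timestamp 15 sees different table values.
        System.out.println("stream-tsTable lookup: " + tsTable.get("k"));
        System.out.println("stream-vTable lookup:  " + vTable.get("k", 15L));
    }
}
```

With table updates at timestamps 10 and 20, a stream record at timestamp 15 joins against the latest value in the `TsTable` case and against the older value in the `VTable` case, which is the difference in results discussed above.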
> but it's also confusing because stream-(table-table) produces different
> results if the user materializes the result of the table-table join as a
> versioned store (which is wrong).
I don't see it as confusing (it might be very subtle, to be fair...) -- the
intermediate result of a non-materialized t-t-join is semantically a tsTable
(of course, it does not get out-of-order updates, because the _join_ that
computes it has two versioned tables as input and thus drops out-of-order
updates) -- if the intermediate result is materialized as a tsKV-store, semantics
should not change. If one materializes it as a vKV store though, it seems ok that
semantics change, because the semantics of the intermediate result change from
being non-versioned to versioned, and thus the join changes from
`stream-tsTable` to `stream-vTable`.
My point is that for a table-table join, there are 4 entities: both input
tables, the join operator, plus the result table. The two input tables (v-table
vs ts-table) determine what join operator we pick (ie, drop out-of-order
updates yes/no), and the join produces a result that we now feed into the
result table with its own semantics (by default, ts-semantics, not
v-semantics) -- Of course, depending on the used join semantics, we apply
different updates to the table, but we don't change the table semantics itself.
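A rough sketch of that separation, using made-up names (`Semantics`, `Plan`, `plan` are hypothetical, not Kafka Streams code): the input tables pick the join operator's behavior, while the result table's semantics come only from how it is materialized. Treating a mixed versioned/non-versioned input pair as "do not drop" is an assumption of this sketch; only the both-versioned case is described above.

```java
// Toy model only: every name here is hypothetical and not part of Kafka Streams.
final class TableJoinPlanSketch {

    enum Semantics { TIMESTAMPED, VERSIONED }

    /** The two decisions that are made independently of each other. */
    static final class Plan {
        final boolean dropOutOfOrderUpdates; // property of the join operator
        final Semantics resultSemantics;     // property of the result table

        Plan(final boolean dropOutOfOrderUpdates, final Semantics resultSemantics) {
            this.dropOutOfOrderUpdates = dropOutOfOrderUpdates;
            this.resultSemantics = resultSemantics;
        }

        /** Kind of join a downstream stream-table join against the result would be. */
        String downstreamStreamJoin() {
            return resultSemantics == Semantics.VERSIONED ? "stream-vTable" : "stream-tsTable";
        }
    }

    /**
     * @param materializedAs how the user materializes the result, or null if not materialized
     */
    static Plan plan(final Semantics left, final Semantics right, final Semantics materializedAs) {
        // Assumption: only the both-versioned case is modelled as dropping out-of-order updates.
        final boolean drop = left == Semantics.VERSIONED && right == Semantics.VERSIONED;
        // The result table defaults to ts-semantics regardless of the join operator picked.
        final Semantics result = materializedAs == null ? Semantics.TIMESTAMPED : materializedAs;
        return new Plan(drop, result);
    }
}
```

For two versioned inputs, `plan(VERSIONED, VERSIONED, null)` drops out-of-order updates yet still yields a `stream-tsTable` join downstream; only passing `VERSIONED` as `materializedAs` turns it into `stream-vTable`.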
I guess bottom line is: this PR should be fine, but we need to ensure to use
the right version of `get` upstream?
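For reference, a simplified sketch of what the two `get` overloads in the diff imply for callers. This is not the PR's code: `Getter`, `InnerJoinGetter`, and `singleKeyHistory` are hypothetical, and values are plain objects rather than `ValueAndTimestamp`. The join getter just forwards whichever lookup style it was asked for to both inputs, so the choice of overload has to be made by the upstream caller.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiFunction;
import java.util.function.Function;

// Toy model only: loosely mirrors the shape of the diff, not the actual Kafka classes.
final class JoinValueGetterSketch {

    interface Getter<K, V> {
        V get(K key);                     // latest value
        V get(K key, long asOfTimestamp); // value valid at the given timestamp
    }

    static final class InnerJoinGetter<K, V1, V2, VOut> implements Getter<K, VOut> {
        private final Getter<K, V1> left;
        private final Getter<K, V2> right;
        private final BiFunction<V1, V2, VOut> joiner;

        InnerJoinGetter(final Getter<K, V1> left, final Getter<K, V2> right,
                        final BiFunction<V1, V2, VOut> joiner) {
            this.left = left;
            this.right = right;
            this.joiner = joiner;
        }

        @Override
        public VOut get(final K key) {
            return computeJoin(key, left::get, right::get);
        }

        @Override
        public VOut get(final K key, final long asOfTimestamp) {
            return computeJoin(key, k -> left.get(k, asOfTimestamp), k -> right.get(k, asOfTimestamp));
        }

        // Inner join: a result exists only if both sides have a value.
        private VOut computeJoin(final K key, final Function<K, V1> l, final Function<K, V2> r) {
            final V1 v1 = l.apply(key);
            if (v1 == null) {
                return null;
            }
            final V2 v2 = r.apply(key);
            return v2 == null ? null : joiner.apply(v1, v2);
        }
    }

    /** Hypothetical helper: a getter backed by the version history of a single key. */
    static <K, V> Getter<K, V> singleKeyHistory(final K onlyKey, final TreeMap<Long, V> versions) {
        return new Getter<K, V>() {
            @Override
            public V get(final K key) {
                return key.equals(onlyKey) && !versions.isEmpty() ? versions.lastEntry().getValue() : null;
            }

            @Override
            public V get(final K key, final long asOfTimestamp) {
                if (!key.equals(onlyKey)) {
                    return null;
                }
                final Map.Entry<Long, V> entry = versions.floorEntry(asOfTimestamp);
                return entry == null ? null : entry.getValue();
            }
        };
    }
}
```

A caller treating the join result as a tsTable would call `get(key)`, while a caller doing an as-of lookup would call `get(key, asOfTimestamp)`; the two can return different join results for the same key.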