[
https://issues.apache.org/jira/browse/KAFKA-18731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matthias J. Sax updated KAFKA-18731:
------------------------------------
Description:
We introduced "key access" (rich-)functions in the Kafka Streams DSL via
KIP-149:
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-149%3A+Enabling+key+access+in+ValueTransformer%2C+ValueMapper%2C+and+ValueJoiner]
In particular, we added the `ValueJoinerWithKey` interface. While the KIP is not
very specific about it, in the motivation section it says:
{quote}it seems like extending the interface to pass the join key along as well
would be helpful
{quote}
The underlying implementation has `KStreamKTableJoinProcessor`, which is used
for both stream-table and stream-globalTable joins. For stream-table joins, the
stream-key (which is the join-key) is passed into `ValueJoinerWithKey`.
For stream-globalTable joins, we thus also pass the stream-key; however, in
this case the stream-key is not the join-key. Instead, the join-key is computed
on the fly using the provided `keySelector` (a `KeyValueMapper`).
This seems to be a bug and/or bad design (well, I am sure it was just a small
detail that was missed).
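The difference between the two keys can be illustrated with a small stand-alone model. This is only a sketch of the behavior described above, not the code in `KStreamKTableJoinProcessor`: the class `GlobalJoinKeySketch`, the method `joinOne`, the local `ValueJoinerWithKey` stand-in, the use of `BiFunction` in place of `KeyValueMapper`, and the order/customer data are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

public class GlobalJoinKeySketch {

    /** Local stand-in with the same shape as Kafka Streams' ValueJoinerWithKey. */
    @FunctionalInterface
    interface ValueJoinerWithKey<K1, V1, V2, VR> {
        VR apply(K1 readOnlyKey, V1 value1, V2 value2);
    }

    /**
     * Simplified model of one stream-globalTable join step as described in
     * this ticket: the table lookup uses the key computed by keySelector,
     * but the joiner is handed the original stream key.
     */
    static <K, V, GK, GV, VR> VR joinOne(
            K streamKey,
            V streamValue,
            Map<GK, GV> globalTable,
            BiFunction<K, V, GK> keySelector,
            ValueJoinerWithKey<K, V, GV, VR> joiner) {
        GK joinKey = keySelector.apply(streamKey, streamValue);
        GV tableValue = globalTable.get(joinKey);
        if (tableValue == null) {
            return null; // inner join in this model: no match, no output
        }
        // The joiner sees streamKey here, not joinKey.
        return joiner.apply(streamKey, streamValue, tableValue);
    }

    /** Hypothetical data: orders keyed by order id, joined to customers by customer id. */
    static String demo() {
        Map<String, String> customers = new HashMap<>();
        customers.put("customer-7", "Alice");
        return joinOne(
                "order-1", "customer-7", customers,
                (orderId, customerId) -> customerId,           // join-key = customer id
                (key, customerId, name) -> key + ":" + name);  // key is the stream-key
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "order-1:Alice"
    }
}
```

In this model the lookup succeeds through `customer-7`, yet the joiner only ever observes `order-1`.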
Overall, there are three options:
# keep the code as-is (does not seem to be "correct", in the spirit of the KIP)
# declare it as a bug, and just change it
# do a follow-up KIP to change the stream-globalTable join to pass in both the
stream-key and the join-key
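To make options (2) and (3) concrete, here is a rough stand-alone model of what each would hand to the joiner. Everything in it is an assumption for illustration: `OneKeyJoiner` is a local stand-in with the same three-argument shape as `ValueJoinerWithKey`, `TwoKeyJoiner` is a purely hypothetical shape that does not exist in Kafka Streams, `BiFunction` stands in for `KeyValueMapper`, and the data is made up.

```java
import java.util.Map;
import java.util.function.BiFunction;

public class JoinKeyOptionsSketch {

    /** Local stand-in with the same three-argument shape as ValueJoinerWithKey. */
    @FunctionalInterface
    interface OneKeyJoiner<K, V1, V2, VR> {
        VR apply(K key, V1 value1, V2 value2);
    }

    /** Hypothetical shape for option 3; no such interface exists in Kafka Streams. */
    @FunctionalInterface
    interface TwoKeyJoiner<SK, JK, V1, V2, VR> {
        VR apply(SK streamKey, JK joinKey, V1 value1, V2 value2);
    }

    /** Option 2: pass the computed join-key to the joiner instead of the stream-key. */
    static <K, V, GK, GV, VR> VR joinOption2(
            K streamKey, V value, Map<GK, GV> table,
            BiFunction<K, V, GK> keySelector,
            OneKeyJoiner<GK, V, GV, VR> joiner) {
        GK joinKey = keySelector.apply(streamKey, value);
        GV tableValue = table.get(joinKey);
        return tableValue == null ? null : joiner.apply(joinKey, value, tableValue);
    }

    /** Option 3: pass both the stream-key and the join-key to the joiner. */
    static <K, V, GK, GV, VR> VR joinOption3(
            K streamKey, V value, Map<GK, GV> table,
            BiFunction<K, V, GK> keySelector,
            TwoKeyJoiner<K, GK, V, GV, VR> joiner) {
        GK joinKey = keySelector.apply(streamKey, value);
        GV tableValue = table.get(joinKey);
        return tableValue == null ? null : joiner.apply(streamKey, joinKey, value, tableValue);
    }

    static String demoOption2() {
        Map<String, String> customers = Map.of("customer-7", "Alice");
        return joinOption2(1L, "customer-7", customers,
                (orderId, customerId) -> customerId,
                (joinKey, customerId, name) -> joinKey + ":" + name);
    }

    static String demoOption3() {
        Map<String, String> customers = Map.of("customer-7", "Alice");
        return joinOption3(1L, "customer-7", customers,
                (orderId, customerId) -> customerId,
                (streamKey, joinKey, customerId, name) -> streamKey + "/" + joinKey + ":" + name);
    }

    public static void main(String[] args) {
        System.out.println(demoOption2()); // prints "customer-7:Alice"
        System.out.println(demoOption3()); // prints "1/customer-7:Alice"
    }
}
```

Note that in this model, option (2) changes the joiner's key type from the stream-key type (`Long` here) to the join-key type (`String`), while option (3) needs a new joiner shape altogether.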
For other joins, i.e., stream-stream or table-table, there is no such issue, as
the input-key is the join-key (similar to stream-table joins). The only other
exception would be the FK-table-table join, which was added much later; however,
it only takes a `ValueJoiner` and thus does not have this issue. If we go with
option (3) and do a KIP, we should consider supporting `ValueJoinerWithKey`
for FK-table-table joins, too. If we go with option (2), we should file a
follow-up ticket for this idea to extend the FK-table-table join in this way.
> KStream-GlobalKTable ValueJoinerWithKey passed incorrect key
> ------------------------------------------------------------
>
> Key: KAFKA-18731
> URL: https://issues.apache.org/jira/browse/KAFKA-18731
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Reporter: Matthias J. Sax
> Priority: Minor