[ 
https://issues.apache.org/jira/browse/KAFKA-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140328#comment-17140328
 ] 

Bruno Cadonna commented on KAFKA-10179:
---------------------------------------

[~desai.p.rohan] While I find the idea of optimizing the materialization in the 
deserializer intriguing, I do not think it is worth the performance penalty 
that we would pay by deserializing and serializing each record during 
restoration. Additionally, if the optimization is turned on, we would need to 
read the original data from the source topic instead of the projected data from 
the changelog topic during each restoration, which would hurt performance 
further. Of course, we would need experiments to better understand the 
implications. 

An alternative idea would be to allow plugging in a byte-based transformation 
that does not need to deserialize and serialize each record. However, that 
would not solve the issue of having to read the unprojected data during each 
restoration.
 
If you are concerned about the amount of data to materialize, a solution could 
be to optimize at the topology level by introducing a {{map()}} that performs 
the projection, followed by a {{toTable()}} to materialize the data. The data 
read from the input topic would be the unprojected data, but the materialized 
data would be the projected data, and during restoration we would read only 
the projected data. An additional advantage of this method is that you can 
leave the source table optimization turned on, because it would not apply to 
this case.
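
A minimal sketch of such a topology could look like the following. The topic 
name {{input-topic}}, the store name {{projected-store}}, the String serdes, 
and the {{project()}} helper are all hypothetical and only serve as 
illustration. The sketch also assumes that the projection leaves the key 
unchanged, so it uses {{mapValues()}} in place of {{map()}}; if the projection 
needs to change the key, {{map()}} would be used instead.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

public class ProjectedTableSketch {

    // Hypothetical projection: keep only the first comma-separated field.
    static String project(final String fullValue) {
        if (fullValue == null) {
            return null; // pass tombstones through unchanged
        }
        final int comma = fullValue.indexOf(',');
        return comma < 0 ? fullValue : fullValue.substring(0, comma);
    }

    static Topology buildTopology() {
        final StreamsBuilder builder = new StreamsBuilder();

        // Read the unprojected records as a stream, not as a source table,
        // so the source table optimization does not apply here.
        final KTable<String, String> projected = builder
            .stream("input-topic", Consumed.with(Serdes.String(), Serdes.String()))
            // Project each value before anything is materialized.
            .mapValues(value -> project(value))
            // Materialize only the projected values in a named store.
            .toTable(Materialized
                .<String, String, KeyValueStore<Bytes, byte[]>>as("projected-store")
                .withKeySerde(Serdes.String())
                .withValueSerde(Serdes.String()));

        return builder.build();
    }
}
```

Note that {{toTable()}} is only available from 2.5.0 on.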

In summary, the source table optimization was not introduced for the case you 
describe. IMO, it is not even an optimization in that case. 

> State Store Passes Wrong Changelog Topic to Serde for Optimized Source Tables
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-10179
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10179
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.5.0
>            Reporter: Bruno Cadonna
>            Assignee: Bruno Cadonna
>            Priority: Major
>             Fix For: 2.7.0
>
>
> {{MeteredKeyValueStore}} passes the name of the changelog topic of the state 
> store to the state store serdes. Currently, it always passes {{<application 
> ID>-<store name>-changelog}} as the changelog topic name. However, for 
> optimized source tables the changelog topic is the source topic. 
> Most serdes do not use the topic name passed to them. However, if the serdes 
> actually use the topic name for (de)serialization, e.g., when Kafka Streams 
> is used with Confluent's Schema Registry, a 
> {{org.apache.kafka.common.errors.SerializationException}} is thrown.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
