[jira] [Commented] (KAFKA-12453) Guidance on whether a topology is eligible for optimisation

Matthias J. Sax (Jira) Wed, 31 Mar 2021 09:20:08 -0700


    [ 
https://issues.apache.org/jira/browse/KAFKA-12453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312523#comment-17312523
 ]


Matthias J. Sax commented on KAFKA-12453:
-----------------------------------------

{quote}Are we saying that it's safe to enable topology optimisation when we have 
a toTable() operation, regardless of whether it is the first node in a 
sub-topology or if it is preceded by a key changing operator?{quote}
Exactly. (The original worry was that we might not create a dedicated 
changelog topic for the latter case, but we always do, as ensured by our test 
`shouldNotReuseRepartitionTopicAsChangelogs`.)
{quote}If so, does this mean there are no caveats re topology optimisation, and 
it's safe to enable in any topology? 
{quote}
Sounds about right. (At least we are not aware of any caveats...) Of course, 
you still need to ensure that your topics are correctly configured. If you use 
`builder.table()` with optimization enabled, the input topic is reused as the 
table's changelog, so it must be configured with log compaction enabled (not 
with the default config of a topic retention time).
{quote}In which case, why not make it a non-configurable default?
{quote}
One issue (when we introduced the feature) was backward compatibility. 
Optimization rewrites the topology, and if users want to upgrade to the new 
version, we need to ensure that the topology does not change; otherwise, the 
upgrade breaks. Thus, those users need to be able to disable the optimization 
(and to ensure they cannot forget to do so, we disabled it by default and made 
it opt-in).
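
To make the opt-in concrete, here is a minimal sketch of the config side only. 
It uses plain `java.util.Properties` with the string key 
`topology.optimization` and the values `all` and `none`, which are the 
strings behind the `StreamsConfig` optimization constants. The class name, 
method name, application id, and broker address are hypothetical placeholders.

```java
import java.util.Properties;

public class OptimizationConfigSketch {

    // String key behind the StreamsConfig topology optimization setting.
    static final String TOPOLOGY_OPTIMIZATION_KEY = "topology.optimization";

    // Builds a Streams config; optimization stays off unless explicitly requested.
    static Properties streamsProps(boolean optimize) {
        Properties props = new Properties();
        props.setProperty("application.id", "example-app");        // hypothetical app id
        props.setProperty("bootstrap.servers", "localhost:9092");  // hypothetical broker
        // "all" opts in to optimization; "none" is the default (no rewriting).
        props.setProperty(TOPOLOGY_OPTIMIZATION_KEY, optimize ? "all" : "none");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(streamsProps(true).getProperty(TOPOLOGY_OPTIMIZATION_KEY));
        System.out.println(streamsProps(false).getProperty(TOPOLOGY_OPTIMIZATION_KEY));
    }
}
```

Note that the same properties also have to be passed to 
`StreamsBuilder#build(Properties)`; otherwise the topology is built without 
the rewrite even if the config is set on the `KafkaStreams` instance.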

The second issue is for the cases when you cannot control input topic configs: 
assume you have an input topic that you want to read as a table, but it's 
configured with retention time, and you don't own the topic but another team 
does (so you cannot change the topic config). For this case, it's not safe to 
use `builder.table()` and enable the optimization. With `toTable()` (which was 
introduced later) you have a workaround now: you can still enable optimization, 
but instead of `builder.table()` you would use `builder.stream().toTable()`.

Maybe it is still worth adding a section to the docs to explain how 
optimization works in more detail. Will leave the ticket open for now.

> Guidance on whether a topology is eligible for optimisation
> -----------------------------------------------------------
>
>                 Key: KAFKA-12453
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12453
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Patrick O'Keeffe
>            Priority: Major
>
> Since the introduction of KStream.toTable() in Kafka 2.6.x, the decision 
> about whether a topology is eligible for optimisation is no longer a simple 
> one, and is related to whether toTable() operations are preceded by key 
> changing operators.
> This decision requires expert level knowledge, and there are serious 
> implications associated with getting it wrong in terms of fault tolerance.
> Some ideas spring to mind around how to guide developers to make the correct 
> decision:
>  # Topology.describe() could indicate whether this topology is eligible for 
> optimisation
>  # Topologies could be automatically optimised. Note that this may have an 
> impact at deployment time, in that an application reset may be required. The 
> developer would need to be made aware of this and adjust the deployment plan 
> accordingly.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
