Hi all,
Thanks for the KIP and the ongoing discussion. I have a few questions
about the draft:
LB1: How should state store access work in multi-partition mode? Today
getKeyValueStore("myStore") returns the one store owned by the single
task. With multiple partitions, there are multiple store instances,
one per task. Would it make sense to add a getKeyValueStore(name,
partition) overload, and/or have the no-argument form return a
composite read-only view? Related: how should this interact with
mutating operations like put?
LB2: Could the KIP describe how repartition routing works internally?
Today the code re-enqueues records from internal topics back into
partition 0. In multi-partition mode, records written to a repartition
topic would need to be routed through the partitioner on the new key
so they land on the correct downstream task. Since this is the core
scenario the KIP tries to enable, understanding this mechanism would
be helpful.
LB3: How are multi-subtopology topologies handled? The driver today
creates one task per subtopology (currently just one, with TaskId(0,
0)). In multi-partition mode, a topology with N subtopologies and P
partitions would logically need N × P tasks — so the "one task per
partition" framing in the KIP may be understating it. Could the KIP
clarify the task model for multi-subtopology cases, including
processing order across tasks after a single pipeInput?
Thanks,
Lucas
On Thu, Apr 16, 2026 at 11:48 AM Alieh Saeedi via dev
<[email protected]> wrote:
>
> Hi all
>
> Thanks for the KIP and the interesting discussion!
>
> About backward compatibility:
> - What Matthias suggested reminded me of the state machine. It's a very
> clever idea but I think it's too complex for the user and also seems a bit
> error-prone. We must address all edge cases.
> - The alternative of keeping two classes around until AK 5.0 feels like a
> heavy maintenance burden for roughly two years, and it would also require
> significant code changes once AK 5.0 is released. The question is how much
> code reuse can we achieve (to avoid duplication)?
> - I’ve tried to follow the reasoning, but I may be missing the point. My
> question is: why don’t we introduce an explicit multi-partition mode with a
> clean separation between setup and execution phases, while still
> maintaining 100% backward compatibility? We can use the builder pattern
> like `TopologyTestDriver driver = new TopologyTestDriver(topology,
> config).withMultiPartitionMode().declareTopic("input", 3)....build();` for
> multi-partition usages. Create the tasks in `build()` and keep the
> `createInputTopic` as-is and both users (single and multi-partition users)
> can still call it. Inside `createInputTopic`, we check if the topic is in
> the list/map of multi-partition topics, we create it with the specified
> number of partitions and if not, we create it single partition.
>
> nit: It would be great if you could update the KIP with the correct
> discussion thread link:
>
> https://lists.apache.org/thread/7nmfsf3fcgdqdp4t2y14y7npfwgrv0yp (maybe?!).
> I usually follow the discussion in the thread rather than via the mailing
> list:)
>
> Bests,
> Alieh
>
> On Mon, Mar 9, 2026 at 3:13 PM Marie-Laure Momplot <
> [email protected]> wrote:
>
> > Big thanks for your feedback, Matthias — much appreciated !
> >
> > MJS1:
> > Regarding backward compatibility, another option could be to introduce a
> > new class, "MultiPartitionTopologyTestDriver".
> > This would allow us to keep the current behavior where input and output
> > topics must be created before sending records.
> > It would also avoid introducing a multi/active mode in the existing
> > "TopologyTestDriver".
> > The new class could coexist alongside TopologyTestDriver starting in Kafka
> > 4.x (for example 4.3 or 4.4).
> > TopologyTestDriver" could then be deprecated in Kafka 5, while
> > MultiPartitionTopologyTestDriver would become the default and only driver
> > in Kafka 6.
> > In this new class, topics would be multi-partitioned by default, which
> > would align with the intended behavior of a test driver for kafka.
> >
> > MJS2:
> >
> > We agree that using ConsumerRecords-like semantics could be an interesting
> > alternative for reading from multi-partition output topics — it’s closer to
> > the Kafka consumer API and may feel more familiar. As you said, there are
> > trade-offs: it could be more verbose for simple test scenarios, and we
> > would need to decide whether to expose partition-level access directly.
> >
> > I do not think the second example here
> >
> > >> // Optional helper: read directly from a specific partition
> > >> TestOutputTopic<String, String> partition1Topic =
> > outputTopic.partition(1);
> > >> List<TestRecord<String, String>> partition1Records =
> > partition1Topic.readRecordsToList();
> >
> > is really necessary when testing and reading results. With the default
> > partitioner, the first example is sufficient and correct.
> >
> > Thank you for your help.
> >
> > Best regards,
> >
> >
> > Adam, Julien, Sébastien and Marie-Laure
> >