[ https://issues.apache.org/jira/browse/KAFKA-12328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286047#comment-17286047 ]
Matthias J. Sax commented on KAFKA-12328: ----------------------------------------- It's correct that a task may process multiple partitions from different topics; however, if you have two topics A and B with 10 partitions each, task-0 processed tA-p0 and tB-p0, and task-1 process tA-p1 and tB-p1 and so on, ie, there is a 1:1 mapping from partition number to task-id over all input topics. A single task would never process multiple partitions with different partition numbers. Also, if you have one task that processes multiple partitions (note, if a task processes multiple partitions, it's always partitions of different topics) and you have a state store, the task has one state-store shard – you don't have a shard per partition within a task. Thus, for an iterator over a state-store, it's tied to the task itself, but not tied to a single partition within the task (of course it's still tied to a single partition per topic, but it could be multiple partitions from different topics, with all those partitions having the same partition number). > Find out partition of a store iterator > -------------------------------------- > > Key: KAFKA-12328 > URL: https://issues.apache.org/jira/browse/KAFKA-12328 > Project: Kafka > Issue Type: Wish > Components: streams > Reporter: fml2 > Priority: Major > > This question was posted [on > stakoverflow|https://stackoverflow.com/questions/66032099/kafka-streams-how-to-get-the-partition-an-iterartor-is-iterating-over] > and got an answer but the solution is quite complicated hence this ticket. > > In my Kafka Streams application, I have a task that sets up a scheduled (by > the wall time) punctuator. The punctuator iterates over the entries of a > store and does something with them. Like this: > {code:java} > var store = context().getStateStore("MyStore"); > var iter = store.all(); > while (iter.hasNext()) { > var entry = iter.next(); > // ... do something with the entry > } > // Print a summary (now): N entries processed > // Print a summary (wish): N entries processed in partition P > {code} > Is it possible to find out which partition the punctuator operates on? The > java docs for {{ProcessorContext.partition()}} states that this method > returns {{-1}} within punctuators. > I've read [Kafka Streams: Punctuate vs > Process|https://stackoverflow.com/questions/50776987/kafka-streams-punctuate-vs-process] > and the answers there. I can understand that a task is, in general, not tied > to a particular partition. But an iterator should be tied IMO. > How can I find out the partition? > Or is my assumption that a particular instance of a store iterator is tied to > a partion wrong? > What I need it for: I'd like to include the partition number in some log > messages. For now, I have several nearly identical log messages stating that > the punctuator does this and that. In order to make those messages "unique" > I'd like to include the partition number into them. > Since I'm working with a single store here (which might be partitioned), I > assume that every single execution of the punctuator is bound to a single > partition of that store. > > It would be cool if there were a method {{iterator.partition}} (or similar) > to get this information. -- This message was sent by Atlassian Jira (v8.3.4#803005)