[ https://issues.apache.org/jira/browse/KAFKA-7566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16702656#comment-16702656 ]
Guozhang Wang commented on KAFKA-7566:
--------------------------------------

Were you triggering this job via the punctuation function, or did you make it data driven (i.e. checking in the process() call whether it is time to do gc)? In the latter case, you could rely on {{ProcessorContext.topic()/partition()}} and only do the job if the returned topic name is a specific value (in case you have multiple topics as sources) and the partition number is 0 (any number would work, but 0 is always safe without knowing the total number of partitions), so that only one task at a time will be doing this.

On punctuation, though, there is no record context, and {{partition()}} will return `-1`, indicating "not known".

> Add sidecar job to leader (or a random single follower) only
> ------------------------------------------------------------
>
>                 Key: KAFKA-7566
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7566
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Boyang Chen
>            Priority: Minor
>
> Hey there,
> recently we needed to add an archive job to a streaming application. The caveat
> is that we need to make sure only one instance is doing this task to avoid
> potential race conditions, and we also don't want to schedule it as a regular
> stream task, since that would block normal streaming operations.
> Although we could do so with a zk lease, I'm raising the case here since
> this could be a potential use case for streaming jobs as well. For example,
> there are some `leader specific` operations we could schedule in the DSL instead
> of in an ad hoc manner.
> Let me know if you think this makes sense to you, thank you!

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
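
The data-driven check suggested in the comment above could be sketched roughly as follows. This is only an illustration, not code from the ticket: the class name {{SidecarGate}}, the helper {{shouldRunSidecar}} and the topic name "archive-input" are hypothetical, and the topic and partition are passed in as plain values so the sketch runs without Kafka on the classpath. In a real processor they would presumably come from {{ProcessorContext.topic()}} and {{ProcessorContext.partition()}} inside process().

```java
import java.util.Objects;

public class SidecarGate {
    // Partition 0 exists for every topic, so it is a safe choice
    // without knowing the total number of partitions.
    static final int DESIGNATED_PARTITION = 0;

    /**
     * Returns true only for records from the designated topic and partition,
     * so that a single task ends up running the sidecar job.
     * A missing record context (partition -1, and possibly a null topic)
     * never matches.
     */
    static boolean shouldRunSidecar(String topic, int partition, String designatedTopic) {
        return Objects.equals(topic, designatedTopic)
                && partition == DESIGNATED_PARTITION;
    }

    public static void main(String[] args) {
        // Hypothetical topic name, used only for this example.
        String designated = "archive-input";

        System.out.println(shouldRunSidecar("archive-input", 0, designated)); // true
        System.out.println(shouldRunSidecar("archive-input", 3, designated)); // false
        System.out.println(shouldRunSidecar("other-topic", 0, designated));   // false
        System.out.println(shouldRunSidecar(null, -1, designated));           // false
    }
}
```

With a gate like this, the job only runs while records are arriving on partition 0 of the designated topic, so it is driven by data flow rather than by wall-clock time.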