[ 
https://issues.apache.org/jira/browse/KAFKA-6323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293719#comment-16293719
 ] 

Frederic Arno commented on KAFKA-6323:
--------------------------------------

Although I've got very little experience with Kafka (and none at all with 
STREAM_TIME punctuation), Guozhang's proposal sounds good to me.

I think there's something to discuss about managing the gap between 
punctuations. Let's say the last punctuation happened at T0, and the next is 
planned at T1. Let's also consider T2 = T1 + x*interval, where x >= 2.

With STREAM_TIME, if we get no data until T2, the gap is bigger than interval, 
and according to Guozhang's proposal we punctuate only once, at T2. The next 
punctuation then happens at T2+interval.

With WALL_CLOCK_TIME it could also happen that we don't actually punctuate 
before T2 (GC pause, overload, ...). If that happens, should we also punctuate 
only once at T2, and next at T2+interval?
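
To make the question concrete, here is a minimal sketch of the "punctuate once, 
then re-anchor" behaviour described above. It is not Kafka Streams code: the 
class PunctuationGapSketch and its methods are hypothetical names made up for 
illustration, and the sketch only assumes that time is a long and that a 
punctuation is due once the current time reaches the planned timestamp.

```java
// Hypothetical sketch, not Kafka Streams code: models one punctuation schedule.
public class PunctuationGapSketch {
    private final long interval;
    private long next; // timestamp of the next planned punctuation (T1)

    public PunctuationGapSketch(long firstPunctuation, long interval) {
        this.next = firstPunctuation;
        this.interval = interval;
    }

    /**
     * Called whenever time (stream time or wall-clock time) advances to now.
     * Fires at most once, however many intervals were missed, and plans the
     * next punctuation at now + interval. Returns true if it fired.
     */
    public boolean maybePunctuate(long now) {
        if (now < next) {
            return false; // not due yet
        }
        next = now + interval; // re-anchor on the actual punctuation time (T2)
        return true;
    }

    public long nextPunctuation() {
        return next;
    }
}
```

With interval = 10 and T1 = 10, a first call at T2 = 40 (x = 3) fires once and 
plans the next punctuation at 50. The missed punctuations at 20 and 30 are not 
replayed.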

> punctuate with WALL_CLOCK_TIME triggered immediately
> ----------------------------------------------------
>
>                 Key: KAFKA-6323
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6323
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 1.0.0
>            Reporter: Frederic Arno
>            Assignee: Frederic Arno
>             Fix For: 1.1.0, 1.0.1
>
>
> When working on a custom Processor from which I am scheduling a punctuation 
> using WALL_CLOCK_TIME, I've noticed that whatever punctuation interval I 
> set, a call to my Punctuator is always triggered immediately.
> Having had a quick look at kafka-streams' code, I found that all 
> PunctuationSchedule's timestamps are matched against the current time in 
> order to decide whether or not to trigger the punctuator 
> (org.apache.kafka.streams.processor.internals.PunctuationQueue#mayPunctuate). 
> However, I've only seen code that initializes PunctuationSchedule's timestamp 
> to 0, which I guess is what is causing an immediate punctuation.
> At least when using WALL_CLOCK_TIME, shouldn't the PunctuationSchedule's 
> timestamp be initialized to current time + interval?
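
The effect described in the quoted report can be sketched as follows. This is 
an illustration only, not the actual PunctuationQueue code: isDue and 
initialTimestamp are hypothetical helpers, and the sketch assumes that a 
schedule fires as soon as its timestamp is less than or equal to the current 
time.

```java
// Hypothetical sketch, not Kafka Streams code.
public class InitialTimestampSketch {

    // Assumed form of the check: a schedule is due once its timestamp is reached.
    static boolean isDue(long scheduleTimestamp, long now) {
        return scheduleTimestamp <= now;
    }

    // Initial timestamp suggested in the report: current time + interval.
    static long initialTimestamp(long now, long interval) {
        return now + interval;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long interval = 5_000L;

        // A timestamp of 0 is always in the past, so the schedule is due at once.
        System.out.println(isDue(0L, now));

        // With now + interval, the schedule only becomes due one interval later.
        long first = initialTimestamp(now, interval);
        System.out.println(isDue(first, now));
        System.out.println(isDue(first, now + interval));
    }
}
```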



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)