Hello,
I am using low level processor and I set the context.schedule(1),
assuming that punctuate() method is invoked every 10 sec .
I have set
configProperties.put(StreamsConfig.TIMESTAMP_EXTRACTOR_CLASS_CONFIG,
WallclockTimestampExtractor.class.getCanonicalName()) )
Although data is keep co
Your understanding is correct:
Punctuate is not triggered base on wall-clock time, but based in
internally tracked "stream time" that is derived from TimestampExtractor.
Even if you use WallclockTimestampExtractor, "stream time" is only
advance if there are input records.
Not sure why punctuate()
If you are consuming from more than one topic/partition, punctuate is triggered
when the “smallest” time-value changes. So, if there is a partition that
doesn’t have any more messages on it, it will always have the smallest
time-value and that time value won’t change…hence punctuate never gets
Thanks for the comments.
@David: yes, I have a source which is reading data from two topics and one
of them were empty while the second one was loaded with plenty of data.
So what do you suggest to solve this ?
Here is snippet of my code:
StreamsConfig config = new StreamsConfig(configProperties);
I know that the Kafka team is working on a new way to reason about time. My
team's solution was to not use punctuate...but this only works if you have
guarantees that all of the tasks will receive messages..if not all the
partitions. Another solution is to periodically send canaries to all
pa
Yes we are considering to differentiate "process" and "punctuate" function
to be "data-driven" and "time-driven" computations. That is, triggering of
punctuate should NOT be depending on the arrival of messages, or the
message's associated timestamps.
As for now I think periodically inserting the