Thanks for sharing.

As punctuate is called with "streams time" you see the same time value
multiple times. It's again due to the coarse grained advance of "stream
time".

@Thomas: I think, the way we handle it just simplifies the
implementation of punctuations. I don't see any other "advantage".


I will create a JIRA to track this -- we are currently working on some
improvements of punctuation and time management already, and it seems to
be another valuable improvement.


-Matthias


On 5/12/17 10:07 AM, Peter Sinoros Szabo wrote:
> Well, this is also a good question, because it is triggered with the same 
> timestamp 3 times, so in order to create my update for both three seconds, 
> I will have to count the number of punctuations and calculate the missed 
> stream times for myself. It's ok for me to trigger it 3 times, but the 
> timestamp should not be the same in each, but should be increased by the 
> schedule time in each punctuate.
> 
> - Sini
> 
> 
> 
> From:   Thomas Becker <tobec...@tivo.com>
> To:     "users@kafka.apache.org" <users@kafka.apache.org>
> Date:   2017/05/12 18:57
> Subject:        RE: Order of punctuate() and process() in a stream 
> processor
> 
> 
> 
> I'm a bit troubled by the fact that it fires 3 times despite the stream 
> time being advanced all at once; is there a scenario when this is 
> beneficial?
> 
> ________________________________________
> From: Matthias J. Sax [matth...@confluent.io]
> Sent: Friday, May 12, 2017 12:38 PM
> To: users@kafka.apache.org
> Subject: Re: Order of punctuate() and process() in a stream processor
> 
> Hi Peter,
> 
> It's by design. Streams internally tracks time progress (so-called
> "streams time"). "streams time" get advanced *after* processing a record.
> 
> Thus, in your case, "stream time" is still at its old value before it
> processed the first message of you send "burst". After that, "streams
> time" is advanced by 3 seconds, and thus, punctuate fires 3 time.
> 
> I guess, we could change the design and include scheduled punctuations
> when advancing "streams time". But atm, we just don't do this.
> 
> Does this make sense?
> 
> Is this critical for your use case? Or do you just want to understand
> what's happening?
> 
> 
> -Matthias
> 
> 
> On 5/12/17 8:59 AM, Peter Sinoros Szabo wrote:
>> Hi,
>>
>>
>> Let's assume the following case.
>> - a stream processor that uses the Processor API
>> - context.schedule(1000) is called in the init()
>> - the processor reads only one topic that has one partition
>> - using custom timestamp extractor, but that timestamp is just a wall
>> clock time
>>
>>
>> Image the following events:
>> 1., for 10 seconds I send in 5 messages / second
>> 2., does not send any messages for 3 seconds
>> 3., starts the 5 messages / second again
>>
>> I see that punctuate() is not called during the 3 seconds when I do not
>> send any messages. This is ok according to the documentation, because
>> there is not any new messages to trigger the punctuate() call. When the
>> first few messages arrives after a restart the sending (point 3. above) 
> I
>> see the following sequence of method calls:
>>
>> 1., process() on the 1st message
>> 2., punctuate() is called 3 times
>> 3., process() on the 2nd message
>> 4., process() on each following message
>>
>> What I would expect instead is that punctuate() is called first and then
>> process() is called on the messages, because the first message's 
> timestamp
>> is already 3 seconds older then the last punctuate() was called, so the
>> first message belongs after the 3 punctuate() calls.
>>
>> Please let me know if this is a bug or intentional, in this case what is
>> the reason for processing one message before punctuate() is called?
>>
>>
>> Thanks,
>> Peter
>>
>> Péter Sinóros-Szabó
>> Software Engineer
>>
>> Ustream, an IBM Company
>> Andrassy ut 39, H-1061 Budapest
>> Mobile: +36203693050
>> Email: peter.sinoros-sz...@hu.ibm.com
>>
> 
> ________________________________
> 
> This email and any attachments may contain confidential and privileged 
> material for the sole use of the intended recipient. Any review, copying, 
> or distribution of this email (or any attachments) by others is 
> prohibited. If you are not the intended recipient, please contact the 
> sender immediately and permanently delete this email and any attachments. 
> No employee or agent of TiVo Inc. is authorized to conclude any binding 
> agreement on behalf of TiVo Inc. by email. Binding agreements with TiVo 
> Inc. may only be made by a signed written agreement.
> 
> 
> 
> 
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to