Jonathan,

thanks for the KIP. Corner case question:

What happens if an application is stopped and restarted?

 - Should suppress() flush all records (would be _before_ the time elapsed)?
 - Or should it preserve buffered records and reload on restart? For
this case, should the record be flushed on reload (elapsed time is
unknown) or should we reset the timer to zero?


What is unclear to me atm is the use-case you anticipate. If you assume
a live run of an application, event-time and processing-time should be
fairly identical (at least with regard to data rates). Thus, suppress()
on event-time should give you about the same behavior as wall-clock
time? If you disagree, can you elaborate?

This leaves the case of data reprocessing, for which event-time advances
much faster than wall-clock time. Is this the target use-case?


About the implementation: checking wall-clock time is an expensive
system call, so I am a little worried about run-time overhead. This does
not seem to be an implementation detail, and thus it might be worth
including it in the discussion. The question is how strict the guarantee
about when records are flushed should be. Assume you set a timer of 1
second, and you have a data rate of 1000 records per second, with each
record arriving one ms after the other, each with a different key. To
flush this data "correctly" we would need to check wall-clock time every
millisecond... Thoughts?
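
To make the trade-off concrete, here is a minimal sketch of one possible
relaxation: read the clock only once every N records and use the cached
value in between. This is an illustration only, not what the KIP
proposes and not existing Kafka Streams code. The class name
CoarseClockSuppressBuffer, its parameters, and the choice to time each
key from its first buffered update are all assumptions made for the
example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical sketch: none of these names exist in Kafka Streams or KIP-424.
final class CoarseClockSuppressBuffer<K, V> {
    private static final class Entry<T> {
        final T value;
        final long bufferedAtMs; // cached clock value when the key was first buffered
        Entry(T value, long bufferedAtMs) { this.value = value; this.bufferedAtMs = bufferedAtMs; }
    }

    // Insertion order equals timestamp order, because an update keeps the
    // key's original position and original timestamp.
    private final Map<K, Entry<V>> buffer = new LinkedHashMap<>();
    private final long suppressMs;
    private final int checkEveryNRecords;
    private final LongSupplier clock; // injected so the sketch can be tested
    private long cachedNowMs;
    private int recordsSinceCheck = 0;
    private int clockReads = 0;

    CoarseClockSuppressBuffer(long suppressMs, int checkEveryNRecords, LongSupplier clock) {
        this.suppressMs = suppressMs;
        this.checkEveryNRecords = checkEveryNRecords;
        this.clock = clock;
        this.cachedNowMs = clock.getAsLong();
        this.clockReads = 1;
    }

    // Buffers the record and returns the values whose suppression time has
    // elapsed according to the (possibly stale) cached clock.
    List<V> process(K key, V value) {
        boolean refreshed = false;
        if (++recordsSinceCheck >= checkEveryNRecords) {
            cachedNowMs = clock.getAsLong(); // the only "expensive" call
            clockReads++;
            recordsSinceCheck = 0;
            refreshed = true;
        }
        Entry<V> old = buffer.get(key);
        long since = (old == null) ? cachedNowMs : old.bufferedAtMs;
        buffer.put(key, new Entry<>(value, since)); // latest value wins

        List<V> flushed = new ArrayList<>();
        if (refreshed) {
            Iterator<Entry<V>> it = buffer.values().iterator();
            while (it.hasNext()) {
                Entry<V> e = it.next();
                if (cachedNowMs - e.bufferedAtMs < suppressMs) break; // rest is newer
                flushed.add(e.value);
                it.remove();
            }
        }
        return flushed;
    }

    int clockReads() { return clockReads; }
    int size() { return buffer.size(); }

    public static void main(String[] args) {
        long[] now = {0}; // simulated wall clock in ms
        CoarseClockSuppressBuffer<Integer, String> buf =
            new CoarseClockSuppressBuffer<>(1000, 100, () -> now[0]);
        int flushed = 0;
        for (int i = 0; i < 2000; i++) { // one record per ms, distinct keys
            now[0] = i;
            flushed += buf.process(i, "v" + i).size();
        }
        System.out.println("clock reads=" + buf.clockReads()
            + " flushed=" + flushed + " buffered=" + buf.size());
    }
}
```

For the scenario above (1000 records per second, 1 second timer),
checking every 100 records cuts the clock reads from one per record to
roughly ten per second. The price is that a record can be flushed up to
about one check interval (about 100 ms at this rate) early or late,
because both the buffering timestamp and the expiry check use the cached
clock value. A count-based check also never fires if input stops, so a
real implementation would additionally need some timer-driven flush.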

(We don't need to dive into all details, but a high level discussion
about the desired algorithm and guarantees would be good to have IMHO.)





-Matthias


On 1/30/19 12:16 PM, John Roesler wrote:
> Hi Jonathan,
> 
> Thanks for the KIP!
> 
> I think all the reviewers are heads-down right now reviewing code for the
> imminent 2.2 release, so this discussion may not get much traffic over the
> next couple of weeks. You might want to just keep bumping it once a week or
> so until people start finding time to review and respond.
> 
> Also, this message got marked as spam for me (which happens for mailing
> list messages sometimes, for some reason). I'm hoping that this response
> will hoist it into people's inboxes...
> 
> Thanks again for your work on this issue, and I look forward to the
> discussion!
> -John
> 
> On Wed, Jan 30, 2019 at 12:24 AM jonathangor...@newrelic.com <
> jonathangor...@newrelic.com> wrote:
> 
>> Hi all,
>>
>> I just published KIP-424: Allow suppression of intermediate events based
>> on wall clock time
>>
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-424%3A+Allow+suppression+of+intermediate+events+based+on+wall+clock+time
>>
>> I am eager to hear your feedback and concerns. Thanks John Roesler for
>> your guidance in shaping my first KIP!
>>
>> I look forward to working with the Kafka community to see this through,
>>
>> Jonathan
>>
>>
>>
> 
