Hi Aljoscha,

thanks for the speedy reply.

I am processing measurements delivered by smart meters. I use windows to 
gather measurements and calculate values such as average consumption. The key 
is simply the meter ID.

The challenge is that meters may undergo network partitioning, under which 
they fall back to local buffering. The data is then transmitted once 
connectivity has been re-established. I am using event time to obtain 
accurate calculations.

If a specific meter goes offline, and the watermark progresses to the next 
window for an operator instance, then all late data will be discarded once 
that meter is online again, until it has caught up to the event time. This is 
because I am using a custom EventTimeTrigger implementation that discards 
late elements. The reason for that is because Flink would otherwise 
immediately evaluate the window upon receiving a late element, which is a 
problem since my calculations (e.g. the average consumption) depend on 
multiple elements. I cannot calculate averages with that single late element.

Each individual meter guarantees in-order transmission of measurements. If 
watermarks progressed per key, then i would never have late elements because 
of that guarantee. I would be able to accurately calculate averages, with the 
trade-off that my results would arrive sporadically from the same operator 
instance.

I suppose I could bypass the use of windows by implementing a stateful map 
function that mimics windows to a certain degree. I implemented something 
similar in Storm, but the amount of application logic required is 
substantial.

I completely understand why Flink evaluates a window on a late element, since 
there is no other way to know when to evaluate the window as event time has 
already progressed.

Perhaps there is a way to gather/redirect late elements?

Regards
Leon

31. May 2016 13:37 by aljos...@apache.org:


> Hi,> I'm afraid this is impossible with the current design of Flink. Might 
> I ask what you want to achieve with this? Maybe we can come up with a 
> solution.
> -Aljoscha
> On Tue, 31 May 2016 at 13:24 <> leon_mcl...@tutanota.com> > wrote:
>
>>           >> My use case primarily concerns applying transformations per 
>> key, with the keys remaining fixed throughout the topology. I am using 
>> event time for my windows.
>>
>> The problem i am currently facing is that watermarks in windows propagate 
>> per operator instance, meaning the operator event time increases for all 
>> keys that the operator is in charge of. I wish for watermarks to progress 
>> per key, not per operator instance.
>>
>> Is this easily possible? I was unable to find an appropriate solution 
>> based on existing code recipes.
>>
>> Greetings
>> Leon
>>   

Reply via email to