Is there a way I can leverage OperatorState (instead of KeyState) to solve my 
issue?


> On Mar 19, 2018, at 09:00, Fabian Hueske <fhue...@gmail.com> wrote:
> 
> Hi,
> 
> Data is partitioned by key across machines and state is kept per key. It is 
> not possible to interact with two keys at the same time.
> 
> Best, Fabian
> 
> 2018-03-19 14:47 GMT+01:00 Dhruv Kumar <gargdhru...@gmail.com 
> <mailto:gargdhru...@gmail.com>>:
> In other words, while using the Flink streaming APIs, is it possible to take 
> a decision on emitting a particular key based on the state of some other key 
> present in the same window?
> 
> Thanks!
> --------------------------------------------------
> Dhruv Kumar
> PhD Candidate
> Department of Computer Science and Engineering
> University of Minnesota
> www.dhruvkumar.me <http://www.dhruvkumar.me/>
> 
>> On Mar 19, 2018, at 05:11, Dhruv Kumar <gargdhru...@gmail.com 
>> <mailto:gargdhru...@gmail.com>> wrote:
>> 
>> Task 1: I implemented it using a custom Trigger (see attached file). Looks 
>> like it is doing what I want it to. I copied the code from 
>> EventTimeTrigger.java and overwrote the onElement method. 
>> 
>> Task 2: I will need to maintain the state (this will be the LRU cache) for 
>> multiple keys in the same data structure. But it looks like that the Keyed 
>> states are on a per key basis. Should I use OperatorState in some way? Can I 
>> use a data structure not directly managed by Flink? What will happen in the 
>> case of keys across multiple machines?
>> 
>> <LazyAlgoTrigger.java>
>> 
>> 
>> Dhruv Kumar
>> PhD Candidate
>> Department of Computer Science and Engineering
>> University of Minnesota
>> www.dhruvkumar.me <http://www.dhruvkumar.me/>
>> 
>>> On Mar 19, 2018, at 02:04, Jörn Franke <jornfra...@gmail.com 
>>> <mailto:jornfra...@gmail.com>> wrote:
>>> 
>>> How would you start implementing it? Where are you stuck?
>>> 
>>> Did you already try to implement this?
>>> 
>>> On 18. Mar 2018, at 04:10, Dhruv Kumar <gargdhru...@gmail.com 
>>> <mailto:gargdhru...@gmail.com>> wrote:
>>> 
>>>> Hi
>>>> 
>>>> I am a CS PhD student at UMN Twin Cities. I am trying to use Flink for 
>>>> implementing some very specific use-cases: (They may not seem relevant but 
>>>> I need to implement them or I at least need to know if it is possible to 
>>>> implement them in Flink)
>>>> 
>>>> Assumptions:
>>>> 1. Data stream is of the form (key, value). We achieve this by the .key 
>>>> operation provided by Flink API.
>>>> 2. By emitting a key, I mean sending/outputting its aggregated value to 
>>>> any data sink. 
>>>> 
>>>> 1. For each Tumbling window in the Event Time space, for each key, I would 
>>>> like to aggregate its value until it crosses a particular threshold (same 
>>>> threshold for all the keys). As soon as the key’s aggregated value crosses 
>>>> this threshold, I would like to emit this key. At the end of every 
>>>> tumbling window, all the (key, value) aggregated pairs  would be emitted 
>>>> irrespective of whether they have crossed the threshold or not.
>>>> 
>>>> 2. For each Tumbling window in the event time space, I would like to 
>>>> maintain a LRU cache which stores the keys along with their aggregated 
>>>> values and their latest arrival time. The least recently used (LRU) key 
>>>> would be the key whose latest arrival time is earlier than the latest 
>>>> arrival times of all the other keys present in the LRU cache. The LRU 
>>>> cache is of a limited size. So, it is possible that the number of unique 
>>>> keys in a particular window is greater than the size of LRU cache. 
>>>> Whenever any (key, value) pair arrives, if the key already exists, its 
>>>> aggregated value is updated with the value of the newly arrived value and 
>>>> its latest arrival time is updated with the current event time. If the key 
>>>> does not exist and there is some free slot in the LRU cache, it is added 
>>>> into the LRU. As soon as the LRU cache gets occupied fully and a new key 
>>>> comes in which does not exist in the LRU cache, we would like to emit the 
>>>> least recently used key to accommodate the newly arrived key. As in the 
>>>> case of 1, at the end of every tumbling window, all the (key, value) 
>>>> aggregated pairs in the LRU cache would be emitted.  
>>>> 
>>>> Would like to know how can we implement these algorithms using Flink. Any 
>>>> help would be greatly appreciated.
>>>> 
>>>> Dhruv Kumar
>>>> PhD Candidate
>>>> Department of Computer Science and Engineering
>>>> University of Minnesota
>>>> www.dhruvkumar.me <http://www.dhruvkumar.me/>
>> 
> 
> 

Reply via email to