Hi Kien,

Thank you so much for your answer!
Regards,
Nhan

From: Kien Truong [mailto:duckientru...@gmail.com]
Sent: Friday, 25 January 2019 13:47
To: Thanh-Nhan Vo <thanh-nhan...@bleckwen.ai>; user@flink.apache.org
Subject: Re: [Flink 1.6] How to get current total number of processed events


Hi Nhan,

To get a global view over all events, you can use a non-keyed TumblingWindow 
and a ProcessAllWindowFunction.

Inside the ProcessAllWindowFunction, you calculate the min/max/count of the 
elements of that window, compare them to the existing values in the global 
state, then write the updated min/max/count back to the global state for use 
in the next window.

You can also get the min/max/count downstream by emitting them together with 
the window's items.



Do note that a non-keyed window always runs with a parallelism of 1, so it 
might create a hotspot/bottleneck in your stream.
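
A minimal plain-Java sketch of that update step, with no Flink dependency. It is only a model of the logic, not Flink API code: the class name GlobalCountStats and the method onWindow are hypothetical, and an ordinary HashMap stands in for the global state. It is adapted to the per-key message counts from the seven-message example quoted below, so the state holds one count per key rather than a single min/max pair.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of a non-keyed window function that keeps global statistics. */
class GlobalCountStats {
    // Stands in for the global state: number of messages seen so far per key.
    private final Map<String, Long> countByKey = new HashMap<>();
    private long total = 0;

    /** Folds the keys of one window into the state and returns {min, max, total}. */
    long[] onWindow(List<String> keysInWindow) {
        for (String key : keysInWindow) {
            countByKey.merge(key, 1L, Long::sum);
            total++;
        }
        long min = countByKey.isEmpty() ? 0 : Long.MAX_VALUE;
        long max = 0;
        for (long count : countByKey.values()) {
            min = Math.min(min, count);
            max = Math.max(max, count);
        }
        return new long[] {min, max, total};
    }

    public static void main(String[] args) {
        GlobalCountStats stats = new GlobalCountStats();
        // Two consecutive windows covering the seven example messages.
        stats.onWindow(Arrays.asList("k1", "k2", "k1", "k4"));
        long[] result = stats.onWindow(Arrays.asList("k1", "k2", "k7"));
        System.out.println("min=" + result[0] + " max=" + result[1] + " total=" + result[2]);
    }
}
```

The values are only refreshed when a window closes, so a shorter window gives fresher numbers at the cost of more frequent updates in the single non-parallel task.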



Regards,

Kien


On 1/25/2019 3:17 PM, Thanh-Nhan Vo wrote:
Hi Kien,

Thank you for your answer.

Please correct me if I'm wrong. If I understand correctly, when I store the 
max/min value using the value states of a KeyedProcessFunction, this max/min 
value is calculated per key?

Note that in my case, I expect to be able to obtain, at every instant, the 
maximum/minimum number of processed messages over all keys. For example:



- Input datastream: [message1(k1, v1), message2(k2, v2), message3(k1, v3), 
  message4(k4, v4), message5(k1, v5), message6(k2, v6), message7(k7, v7)]

- When processing message7(k7, v7), I expect to obtain:

    o Maximum number of processed messages: 3 (corresponding to key k1)

    o Minimum number of processed messages: 1 (corresponding to keys k4 and k7)

Do you have any idea how to obtain this?

Thank you so much!

Nhan

From: Kien Truong [mailto:duckientru...@gmail.com]
Sent: Thursday, 24 January 2019 12:45
To: Thanh-Nhan Vo <thanh-nhan...@bleckwen.ai>; user@flink.apache.org
Subject: Re: [Flink 1.6] How to get current total number of processed events


Hi Nhan,

You can store the max/min value using the value states of a 
KeyedProcessFunction, or in the global state of a ProcessWindowFunction.



On processing each item, compare its value to the current max/min and update 
the stored value as needed.
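
A minimal plain-Java sketch of that compare-and-update step, with no Flink dependency. The class name PerKeyMinMax and the method processElement are hypothetical, and a HashMap stands in for keyed value state, which Flink scopes to the key of the element being processed. The min/max it returns is therefore per key, not global.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java model of a keyed function that tracks min/max per key. */
class PerKeyMinMax {
    // Stands in for keyed value state: one {min, max} slot per key.
    private final Map<String, double[]> stateByKey = new HashMap<>();

    /** Processes one element and returns the updated {min, max} for its key. */
    double[] processElement(String key, double value) {
        double[] minMax = stateByKey.get(key);
        if (minMax == null) {
            // First element for this key: it is both the min and the max.
            minMax = new double[] {value, value};
            stateByKey.put(key, minMax);
        } else {
            minMax[0] = Math.min(minMax[0], value);
            minMax[1] = Math.max(minMax[1], value);
        }
        return minMax.clone();
    }

    public static void main(String[] args) {
        PerKeyMinMax function = new PerKeyMinMax();
        function.processElement("k1", 5.0);
        function.processElement("k2", 1.0);
        double[] k1 = function.processElement("k1", 3.0);
        System.out.println("k1 min=" + k1[0] + " max=" + k1[1]);
    }
}
```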



Regards,

Kien


On 1/24/2019 12:37 AM, Thanh-Nhan Vo wrote:
Hi Kien Truong,

Thank you for your answer. I have another question.
If I count the number of messages processed for a given key j (denoted c_j), is 
there a way to retrieve max{c_j}, min{c_j}?

Thanks

From: Kien Truong [mailto:duckientru...@gmail.com]
Sent: Wednesday, 23 January 2019 16:04
To: user@flink.apache.org
Subject: Re: [Flink 1.6] How to get current total number of processed events


Hi Nhan,

Logically, the total number of events processed before a given event cannot 
be accurately calculated unless event processing is synchronized.

Synchronizing is not scalable, so I don't think Flink supports it.

However, I suppose you can get an approximate count by using a non-keyed 
TumblingWindow, counting the items inside the window, then using that value in 
the next window.
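
A minimal plain-Java sketch of that approximation, with no Flink dependency. The class name ApproximateTotalCount and the method onWindow are hypothetical. Each call represents one closed window: its events see the total accumulated by earlier windows, and the window's own size is added afterwards. The count an event sees is therefore stale by up to one window.

```java
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of counting per window and reusing the count in the next window. */
class ApproximateTotalCount {
    private long totalBeforeCurrentWindow = 0;

    /** Returns the total visible to this window's events, then adds the window's size. */
    long onWindow(List<String> windowEvents) {
        long visibleTotal = totalBeforeCurrentWindow;
        totalBeforeCurrentWindow += windowEvents.size();
        return visibleTotal;
    }

    /** Total number of events in all windows processed so far. */
    long total() {
        return totalBeforeCurrentWindow;
    }

    public static void main(String[] args) {
        ApproximateTotalCount counter = new ApproximateTotalCount();
        System.out.println(counter.onWindow(Arrays.asList("a", "b", "c")));
        System.out.println(counter.onWindow(Arrays.asList("d", "e")));
        System.out.println(counter.total());
    }
}
```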



Regards,

Kien


On 1/21/2019 9:34 PM, Thanh-Nhan Vo wrote:
Hello all,

I have a question.
I'm using Flink 1.6 to process our data in streaming mode.
I wonder whether, at a given event, there is a way to get the current total 
number of processed events (before this event).
If possible, I want to get this total number of processed events as a value 
state in a KeyedStream.
That is, for a given key in the KeyedStream, I want to retrieve not only the 
total number of processed events for this key but also the total number of 
processed events for all keys.

Is there a way to do this in Flink 1.6?
Best regards,
Nhan
