Re: Question on Pattern Matching
Yes, I am able to process timed-out patterns. Let me give more context: suppose I have an order fulfillment process. I want to know how many fulfillments have missed their SLA, how late they are, and then keep tracking them until they are fulfilled.

From what I tried with samples, once the pattern times out it is discarded, and events that arrive after that are ignored (not applied to the pattern). Is there a better way to do this, perhaps using the Table API? I would like to emit an event (alert) when the timeout happens, have it continue to alert me ("hey, your fulfillment is delayed by 6 hours", "12 hours", and so on), and also know when the order is finally completed.

On Thu, Jul 16, 2020 at 3:08 PM Chesnay Schepler wrote:
> Have you read this part of the documentation?
> https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/libs/cep.html#handling-timed-out-partial-patterns
>
> From what I understand, it provides you hooks for processing matched/timed-out patterns.
>
> On 16/07/2020 20:23, Basanth Gowda wrote:
> > Hello,
> > We have a use case where we want to know when a certain pattern doesn't complete within a given time frame.
> > For example: A -> B -> C -> D (needs to complete in 10 minutes).
> > Now with Flink, if event D doesn't happen in 10 minutes, the pattern is discarded and we can get notified. We also want to track how many of them completed (even if they meet SLA). How do we achieve this with Flink CEP or other mechanisms?
> >
> > thank you,
> > Basanth
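The "keep alerting every 6 hours until fulfilled" part goes beyond the CEP timeout hook, which fires once per timed-out partial match. One way to sketch it (this is an editor's illustration, not code from the thread) is a KeyedProcessFunction with a repeating processing-time timer. "OrderEvent", "isFulfilled()", and the SLA/interval constants are hypothetical names; this assumes the stream is keyed by order id.

```java
// Sketch: fire an alert when the SLA deadline passes, then re-fire every
// 6 hours until a fulfillment event for the same key arrives.
public class SlaMonitor extends KeyedProcessFunction<String, OrderEvent, String> {
    private static final long SLA_MILLIS = 10 * 60 * 1000L;        // assumed SLA
    private static final long REPEAT_MILLIS = 6 * 60 * 60 * 1000L; // 6 hours
    private transient ValueState<Long> pendingTimer;               // next alert time

    @Override
    public void open(Configuration parameters) {
        pendingTimer = getRuntimeContext().getState(
                new ValueStateDescriptor<>("pending-timer", Long.class));
    }

    @Override
    public void processElement(OrderEvent e, Context ctx, Collector<String> out)
            throws Exception {
        if (e.isFulfilled()) {
            // Order completed: cancel the pending alert and clear state.
            Long t = pendingTimer.value();
            if (t != null) {
                ctx.timerService().deleteProcessingTimeTimer(t);
            }
            pendingTimer.clear();
            out.collect("order " + ctx.getCurrentKey() + " fulfilled");
        } else if (pendingTimer.value() == null) {
            // First event of the order: arm the SLA deadline.
            long t = ctx.timerService().currentProcessingTime() + SLA_MILLIS;
            pendingTimer.update(t);
            ctx.timerService().registerProcessingTimeTimer(t);
        }
    }

    @Override
    public void onTimer(long ts, OnTimerContext ctx, Collector<String> out)
            throws Exception {
        // Still not fulfilled: alert and schedule the next escalation.
        out.collect("order " + ctx.getCurrentKey() + " open past SLA at " + ts);
        long next = ts + REPEAT_MILLIS;
        pendingTimer.update(next);
        ctx.timerService().registerProcessingTimeTimer(next);
    }
}
```

Usage would be along the lines of `events.keyBy(OrderEvent::getOrderId).process(new SlaMonitor())`; because all state lives per key, the escalation survives checkpoints and restarts.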
Question on Pattern Matching
Hello,

We have a use case where we want to know when a certain pattern doesn't complete within a given time frame.

For example: A -> B -> C -> D (needs to complete in 10 minutes).

Now with Flink, if event D doesn't happen in 10 minutes, the pattern is discarded and we can get notified. We also want to track how many of them completed (even if they meet SLA). How do we achieve this with Flink CEP or other mechanisms?

thank you,
Basanth
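For reference, the timed-out-partial-pattern hook mentioned in the reply above can be sketched roughly like this against the Flink 1.11 CEP API, with completed matches on the main output and timeouts on a side output. "Event", "getOrderId()", and the step names are placeholder assumptions, and lambdas may need explicit TypeInformation in practice.

```java
// Sketch: A -> B -> C -> D within 10 minutes; timeouts go to a side output.
Pattern<Event, ?> pattern = Pattern.<Event>begin("A")
        .next("B").next("C").next("D")
        .within(Time.minutes(10));

OutputTag<String> timedOut = new OutputTag<String>("timed-out") {};

SingleOutputStreamOperator<String> completed =
        CEP.pattern(events.keyBy(Event::getOrderId), pattern)
           .flatSelect(
               timedOut,
               (PatternFlatTimeoutFunction<Event, String>) (partial, ts, out) ->
                   out.collect("SLA missed at " + ts + ", partial match: " + partial),
               (PatternFlatSelectFunction<Event, String>) (match, out) ->
                   out.collect("completed within SLA: " + match));

// Matches that met the SLA are on the main stream; late ones on the side output.
DataStream<String> lateAlerts = completed.getSideOutput(timedOut);
```

Note this fires exactly once per timeout; repeated "still late" alerts need a stateful operator downstream, as discussed in the reply above.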
Re: Flink CEP questions
Thank you very much, Biplob and Dawid. Thanks, Dawid, for those links; that is exactly what I was looking for.

On Fri, Aug 18, 2017 at 5:16 AM, Dawid Wysakowicz <wysakowicz.da...@gmail.com> wrote:
> Hi Basanth,
>
> Ad. 3: Unfortunately, right now you cannot reset, but there is ongoing work to introduce AfterMatchSkipStrategies (https://issues.apache.org/jira/browse/FLINK-7169?filter=12339990). This will allow the behaviour you described with the SKIP_PAST_LAST strategy.
>
> Ad. 4: If I understand correctly, you would like to trigger the pattern matching just at the end of the window. The CEP library emits matches as soon as they are found, so I don't think you can implement that use case with CEP. You could, though, try to implement it yourself with e.g. a ProcessFunction (https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/process_function.html).
>
> On 18 Aug 2017, at 03:28, Basanth Gowda <basanth.go...@gmail.com> wrote:
> > Hi Kostas,
> >
> > For 3 -> I was able to do the following and it worked perfectly fine. Is there a way we could reset? It looks like the following code behaves more like a sliding count. What I want to do is reset the count once the alert has matched and start the count over. Maybe I will have to keep some state and do that in the filter method myself?
> >
> >     Pattern.begin("rule").where(new IterativeCondition<Event>() {
> >         @Override
> >         public boolean filter(Event event, Context<Event> context) throws Exception {
> >             return event.getCpu() > 80.0;
> >         }
> >     }).times(5).within(Time.of(1, TimeUnit.MINUTES));
> >
> > For 4 -> I wasn't able to do this. Assuming I don't know the frequency of input events, I want to check that every event that came in the one minute matched the pattern. It could be 10 events in one minute, maybe 50 the next minute. If all the events in the window match the pattern, then we want an alert.
> >
> > thank you,
> > Basanth
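The SKIP_PAST_LAST behaviour Dawid describes later became available as AfterMatchSkipStrategy (FLINK-7169 landed in Flink 1.4). A sketch, assuming Flink 1.4+ and the same hypothetical Event POJO with getCpu() used elsewhere in this thread:

```java
// Sketch: restart matching after each complete match instead of sliding by one.
AfterMatchSkipStrategy skip = AfterMatchSkipStrategy.skipPastLastEvent();

Pattern<Event, ?> rule = Pattern.<Event>begin("rule", skip)
        .where(new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event event) {
                return event.getCpu() > 80.0;
            }
        })
        .times(5)
        .within(Time.of(1, TimeUnit.MINUTES));
```

With skipPastLastEvent(), once five events have matched, the next match attempt begins after the last matched event, which is effectively the "reset the count" behaviour asked about above.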
Re: Flink CEP questions
Hi Kostas,

For 3 -> I was able to do the following and it worked perfectly fine. Is there a way we could reset? It looks like the following code behaves more like a sliding count. What I want to do is reset the count once the alert has matched and start the count over. Maybe I will have to keep some state and do that in the filter method myself?

    Pattern.begin("rule").where(new IterativeCondition<Event>() {
        @Override
        public boolean filter(Event event, Context<Event> context) throws Exception {
            return event.getCpu() > 80.0;
        }
    }).times(5).within(Time.of(1, TimeUnit.MINUTES));

For 4 -> I wasn't able to do this. Assuming I don't know the frequency of input events, I want to check that every event that came in the one minute matched the pattern. It could be 10 events in one minute, maybe 50 the next minute. If all the events in the window match the pattern, then we want an alert.

thank you,
Basanth

On Thu, Aug 17, 2017 at 9:46 AM, Kostas Kloudas <k.klou...@data-artisans.com> wrote:
> Hi Basanth,
>
> The documentation page can be found here: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/libs/cep.html
> For:
> 3) you should use the times(N) and the within(TIME) clauses
> 4) if by continuously you mean without stopping, then you should use followedBy() or next()
> (check "Combining Patterns": https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/libs/cep.html#combining-patterns in the docs above)
>
> I am not aware of any examples, but for an overview of the CEP library you can check these slides, or watch the related video:
> https://www.slideshare.net/dataArtisans/kostas-kloudas-complex-event-processing-with-flink-the-state-of-flinkcep
>
> Cheers,
> Kostas
>
> On Aug 17, 2017, at 3:32 PM, Basanth Gowda <basanth.go...@gmail.com> wrote:
> > All,
> > New to Flink, and more so to Flink CEP.
> >
> > I want to write a sample program that does the following. Suppose the data is CPU usage of a given server:
> >
> > 1. Alert when CPU usage is above or below a certain value
> > 2. Alert when CPU usage falls in a range
> > 3. Alert when the above condition matches n times in an interval x (could be seconds, minutes, hours)
> > 4. Alert when the above condition holds continuously for an interval x (could be seconds, minutes, or hours)
> >
> > How would we achieve 3 and 4 in the list above? Any examples I can refer to?
> >
> > thank you,
> > Basanth
Flink CEP questions
All,

New to Flink, and more so to Flink CEP.

I want to write a sample program that does the following. Suppose the data is CPU usage of a given server:

1. Alert when CPU usage is above or below a certain value
2. Alert when CPU usage falls in a range
3. Alert when the above condition matches n times in an interval x (could be seconds, minutes, hours)
4. Alert when the above condition holds continuously for an interval x (could be seconds, minutes, or hours)

How would we achieve 3 and 4 in the list above? Any examples I can refer to?

thank you,
Basanth
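Kostas's reply above answers items 3 and 4 in prose; as sketches against the Flink 1.3-era CEP API, they might look like the following. "CpuReading" and "getCpu()" are assumed names, and item 4 is only an approximation of "holds continuously".

```java
// Shared condition: a reading counts as "high" above 80% CPU.
SimpleCondition<CpuReading> high = new SimpleCondition<CpuReading>() {
    @Override
    public boolean filter(CpuReading r) {
        return r.getCpu() > 80.0;
    }
};

// Item 3: the condition matches n = 5 times within x = 1 minute.
Pattern<CpuReading, ?> nTimesInWindow = Pattern.<CpuReading>begin("high")
        .where(high)
        .times(5)
        .within(Time.minutes(1));

// Item 4 (approximation): a run of strictly consecutive high readings,
// with no non-matching reading allowed in between, in the same window.
Pattern<CpuReading, ?> continuously = Pattern.<CpuReading>begin("high")
        .where(high)
        .oneOrMore()
        .consecutive()
        .within(Time.minutes(1));
```

As noted later in this thread, "every event in the window matched" semantics (evaluated only at window end) is easier to express with a window function or ProcessFunction than with CEP, since CEP emits matches as soon as they complete.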
Re: Aggregation by key hierarchy
Thanks, Nico. As there are two ways to achieve this, which is better?

1st option -> dataStream.flatMap(...): this takes one input and produces N outputs, depending on my key combinations, and the same windowing logic (or the one you suggested) is applied to each output.

2nd option -> use keyBy to create N streams.

With the first option I could use an external config, which lets me change the number of combinations dynamically at runtime. Would that be possible with the 2nd option as well? Can I modify or add a data stream at runtime without restarting?

On Wed, Aug 16, 2017 at 4:37 AM, Nico Kruber <n...@data-artisans.com> wrote:
> [back to the ml...]
>
> also including your other mail's additional content...
>
> > I have been able to do this by writing the following and repeating it for every key + window combination. So in the above case there would be 8 blocks like the one below (4 combinations and 2 window periods per combination).
> >
> >     modelDataStream.keyBy("campaignId", "adId")
> >         .timeWindow(Time.minutes(1))
> >         .trigger(CountTrigger.of(2))
> >         .reduce(..)
>
> As mentioned in my last email, I only see one way of reducing duplication (for the key combinations), but this involves more handling on your side and I'd probably not recommend it. Regarding the different windows, I do not see anything you could do differently here.
>
> Maybe Aljoscha (cc'd) has an idea of how to do this better.
>
> Nico
>
> On Monday, 14 August 2017 19:08:29 CEST Basanth Gowda wrote:
> > Hi Nico,
> > Thank you. This is pretty much what I am doing; I was wondering if there is a better way.
> >
> > If there are 10 dimensions on which I want to aggregate, with 2 windows each, this would become about 20 different combinations.
> >
> > Thank you
> > Basanth
> >
> > On Mon, Aug 14, 2017 at 12:50 PM Nico Kruber <n...@data-artisans.com> wrote:
> > > Hi Basanth,
> > > Let's assume you have records of the form
> > >
> > >     Record = {timestamp, country, state, city, value}
> > >
> > > Then you'd like to create aggregates, e.g. the average, for the following combinations?
> > > 1) avg per country
> > > 2) avg per state and country
> > > 3) avg per city and state and country
> > >
> > > * You could create three streams and aggregate each individually:
> > >
> > >     DataStream<Record> ds = //...
> > >     KeyedStream<Record, ?> ds1 = ds.keyBy("country");
> > >     KeyedStream<Record, ?> ds2 = ds.keyBy("country", "state");
> > >     KeyedStream<Record, ?> ds3 = ds.keyBy("country", "state", "city");
> > >     // + your aggregation per stream ds1, ds2, ds3
> > >
> > > You probably want to do different things for each of the resulting aggregations anyway, so having separate streams is probably right for you.
> > >
> > > * Alternatively, you could go with ds1 only and create the per-state (2) and per-city (3) aggregates in a stateful aggregation function yourself, e.g. in a MapState [1]. At the end of your aggregation window, you could then emit those with different keys to be able to distinguish between them.
> > >
> > > Nico
> > >
> > > [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/state.html
> > >
> > > On Sunday, 13 August 2017 23:13:00 CEST Basanth Gowda wrote:
> > > > For example, this is a sample model from one of the Apache Apex presentations. I would want to aggregate for different combinations, and different time buckets. What is the best way to do this in Flink?
Aggregation on multiple Key combinations and multiple Windows
Hello,

Posted this yesterday, but not sure whether it went through. I am fairly new to Flink. I have a use case which needs aggregation on different combinations of keys, and windowing over different intervals. I searched through but couldn't find anything that could help. I came across this model in a presentation on Apex; it sums up what we are trying to achieve. What is the best way to do this in Flink?

    {"keys": [{"name":"campaignId","type":"integer"},
              {"name":"adId","type":"integer"},
              {"name":"creativeId","type":"integer"},
              {"name":"publisherId","type":"integer"},
              {"name":"adOrderId","type":"integer"}],
     "timeBuckets": ["1h","1d"],
     "values": [{"name":"impressions","type":"integer","aggregators":["SUM"]},
                {"name":"clicks","type":"integer","aggregators":["SUM"]},
                {"name":"revenue","type":"integer"}],
     "dimensions": [{"combination":["campaignId","adId"]},
                    {"combination":["creativeId","campaignId"]},
                    {"combination":["campaignId"]},
                    {"combination":["publisherId","adOrderId","campaignId"],
                     "additionalValues":["revenue:SUM"]}]}

I have been able to do this by writing the following and repeating it for every key + window combination. In the above case there would be 8 blocks like the one below (4 combinations and 2 window periods per combination):

    modelDataStream.keyBy("campaignId", "adId")
        .timeWindow(Time.minutes(1))
        .trigger(CountTrigger.of(2))
        .reduce(..)
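The config-driven "fan-out" alternative discussed in this thread (one input record emitted once per configured dimension combination, all sharing a single keyed window) hinges on building a composite key per combination. The core of that idea can be shown in plain Java; the record is modeled here as a simple Map, which is an editor's simplification, not the thread's actual event type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Config-driven fan-out: for each configured dimension combination, build one
// composite key from the record's values. In Flink this would run inside a
// flatMap emitting (compositeKey, record) pairs, followed by keyBy on the key,
// so a single windowed aggregation serves every combination.
public class DimensionFanOut {
    public static List<String> fanOut(Map<String, String> record,
                                      List<List<String>> combinations) {
        List<String> keys = new ArrayList<>();
        for (List<String> combo : combinations) {
            StringBuilder key = new StringBuilder();
            for (String dim : combo) {
                // "dim=value|" segments keep combinations unambiguous.
                key.append(dim).append('=').append(record.get(dim)).append('|');
            }
            keys.add(key.toString());
        }
        return keys;
    }
}
```

Because the combination list is plain data, it can be loaded from external config at job start; note, though, that changing it while the job runs still requires care (or a restart), since Flink's topology itself is fixed once submitted.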
Re: Aggregation by key hierarchy
For example, this is a sample model from one of the Apache Apex presentations. I would want to aggregate for different combinations, and different time buckets. What is the best way to do this in Flink?

    {"keys": [{"name":"campaignId","type":"integer"},
              {"name":"adId","type":"integer"},
              {"name":"creativeId","type":"integer"},
              {"name":"publisherId","type":"integer"},
              {"name":"adOrderId","type":"integer"}],
     "timeBuckets": ["1h","1d"],
     "values": [{"name":"impressions","type":"integer","aggregators":["SUM"]},
                {"name":"clicks","type":"integer","aggregators":["SUM"]},
                {"name":"revenue","type":"integer"}],
     "dimensions": [{"combination":["campaignId","adId"]},
                    {"combination":["creativeId","campaignId"]},
                    {"combination":["campaignId"]},
                    {"combination":["publisherId","adOrderId","campaignId"],
                     "additionalValues":["revenue:SUM"]}]}

thank you,
B

On Sun, Aug 13, 2017 at 2:09 PM, Basanth Gowda <basanth.go...@gmail.com> wrote:
> Hi,
> I want to aggregate hits by Country, State, City; I would have these as tags in my sample data.
>
> How would I do aggregation at the different levels? The input would be a single record.
>
> Should I do a flatMap transformation first and create 3 records from 1 input record, or is there a better way to do it?
>
> thank you,
> basanth
Aggregation by key hierarchy
Hi,

I want to aggregate hits by Country, State, City; I would have these as tags in my sample data.

How would I do aggregation at the different levels? The input would be a single record.

Should I do a flatMap transformation first and create 3 records from 1 input record, or is there a better way to do it?

thank you,
basanth
Apache beam and Flink
I wasn't able to find much information on Flink and Beam with regard to the CEP and graph functionalities. Are they supported with Apache Beam? Is it possible to mix and match Flink and Beam in a single use case?

thank you,
Basanth
GRPC source in Flink
Hi,

Is there a way to get data from a gRPC source in Flink? If so, how can we guarantee that events are not lost once they are submitted to Flink?

thank you
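Flink does not ship a gRPC connector, so one option is a custom source wrapping a server-streaming call. In the sketch below, "OrderServiceGrpc", "watchOrders", "WatchRequest", and the offset field are all hypothetical, generated by protoc from an assumed schema; the loss guarantee depends on the server being able to replay from a checkpointed offset.

```java
// Sketch: a gRPC-backed source whose resume offset is stored in operator
// state, so after a failure Flink restarts the stream from the last
// checkpointed position (at-least-once, given a replay-capable server).
public class GrpcOrderSource extends RichSourceFunction<OrderUpdate>
        implements CheckpointedFunction {

    private volatile boolean running = true;
    private long lastOffset;                          // last record emitted
    private transient ListState<Long> offsetState;
    private transient OrderServiceGrpc.OrderServiceBlockingStub stub; // assumed stub

    @Override
    public void run(SourceContext<OrderUpdate> ctx) {
        java.util.Iterator<OrderUpdate> stream = stub.watchOrders(
                WatchRequest.newBuilder().setFromOffset(lastOffset).build());
        while (running && stream.hasNext()) {
            OrderUpdate u = stream.next();
            // Emit and advance the offset atomically w.r.t. checkpoints.
            synchronized (ctx.getCheckpointLock()) {
                ctx.collect(u);
                lastOffset = u.getOffset();
            }
        }
    }

    @Override
    public void cancel() { running = false; }

    @Override
    public void snapshotState(FunctionSnapshotContext c) throws Exception {
        offsetState.clear();
        offsetState.add(lastOffset);
    }

    @Override
    public void initializeState(FunctionInitializationContext c) throws Exception {
        offsetState = c.getOperatorStateStore()
                .getListState(new ListStateDescriptor<>("grpc-offset", Long.class));
        for (Long o : offsetState.get()) {
            lastOffset = o;   // restore after failure
        }
    }
}
```

If the server cannot replay, the usual workaround is to put a durable log (e.g. Kafka) between the gRPC service and Flink.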
Flatbuffers and Flink
Hi,

This is 1 of 3 questions I had about Flink. I didn't want to club all of them together, as this might be useful for someone else in the future.

Does Flink have FlatBuffers support? If there is no support, is there a way to implement it? I am trying to see whether we could use the byte[] that has come from upstream without converting it into a POJO or another format.

thank you
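One point worth noting: Flink serializes byte[] natively as a primitive array, so the raw FlatBuffers payload can flow through the pipeline untouched and be decoded lazily inside an operator. In this sketch, "Order" stands for a class generated by flatc from an assumed schema (getRootAs* is the standard generated entry point), and "delayedHours()"/"id()" are hypothetical accessors.

```java
// Sketch: keep byte[] between operators, decode in place where needed.
DataStream<byte[]> raw = env.addSource(upstreamSource);

DataStream<String> alerts = raw
        .flatMap((byte[] bytes, Collector<String> out) -> {
            // FlatBuffers access is a zero-copy view over the buffer: fields
            // are read in place, no intermediate POJO is built.
            Order order = Order.getRootAsOrder(java.nio.ByteBuffer.wrap(bytes));
            if (order.delayedHours() > 0) {
                out.collect(order.id());
            }
        })
        .returns(Types.STRING);  // lambdas lose generic info to erasure
```

Passing the decoded Order object itself between operators is best avoided: it wraps a ByteBuffer and would fall back to generic serialization, which defeats the purpose of keeping the raw bytes.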