Re: Question on Pattern Matching

2020-07-23 Thread Basanth Gowda
Yes, I am able to process the matched and timed-out patterns.

Let's suppose I have an order fulfillment process.

I want to know how many fulfillments have not met the SLA, how late they
are, and to track them until they are fulfilled.

From what I tried with the samples, once the pattern times out, it is discarded
and events that come after that are ignored (not applied to the pattern).

Is there a better way to do this, perhaps using the Table API, where I am able
to emit an event (alert) when the timeout happens, have it continue to alert me
("your fulfillment is delayed by 6 hours", then 12 hours, and so on), and also
know when it is finally completed?
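A minimal sketch of such an alternative, outside CEP: a KeyedProcessFunction
with processing-time timers, assuming the stream is keyed by order id and that
OrderEvent and its isComplete() accessor are illustrative names, not Flink API:

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class SlaMonitor extends KeyedProcessFunction<String, OrderEvent, String> {

    private static final long SLA_MS = 10 * 60 * 1000L;        // initial SLA window
    private static final long REPEAT_MS = 6 * 60 * 60 * 1000L; // re-alert interval

    private transient ValueState<Long> slaDeadline;  // original deadline, for lateness
    private transient ValueState<Long> pendingTimer; // next timer, so completion can cancel it

    @Override
    public void open(Configuration parameters) {
        slaDeadline = getRuntimeContext().getState(
                new ValueStateDescriptor<>("slaDeadline", Long.class));
        pendingTimer = getRuntimeContext().getState(
                new ValueStateDescriptor<>("pendingTimer", Long.class));
    }

    @Override
    public void processElement(OrderEvent event, Context ctx, Collector<String> out)
            throws Exception {
        if (event.isComplete()) {
            // fulfilled: cancel any pending alert and report completion
            Long timer = pendingTimer.value();
            if (timer != null) {
                ctx.timerService().deleteProcessingTimeTimer(timer);
            }
            slaDeadline.clear();
            pendingTimer.clear();
            out.collect("order " + ctx.getCurrentKey() + " fulfilled");
        } else if (slaDeadline.value() == null) {
            // first event for this order: arm a timer at the SLA deadline
            long deadline = ctx.timerService().currentProcessingTime() + SLA_MS;
            slaDeadline.update(deadline);
            pendingTimer.update(deadline);
            ctx.timerService().registerProcessingTimeTimer(deadline);
        }
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out)
            throws Exception {
        // still not fulfilled: alert with the current lateness, then re-arm
        out.collect("order " + ctx.getCurrentKey() + " is late by "
                + (timestamp - slaDeadline.value()) + " ms");
        long next = timestamp + REPEAT_MS;
        pendingTimer.update(next);
        ctx.timerService().registerProcessingTimeTimer(next);
    }
}

Each order arms a timer at its SLA deadline; every firing emits an alert and
re-arms the timer REPEAT_MS later, until a completing event clears the state.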

On Thu, Jul 16, 2020 at 3:08 PM Chesnay Schepler wrote:

> Have you read this part of the documentation?
> <https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/libs/cep.html#handling-timed-out-partial-patterns>
>
> From what I understand, it provides you with hooks for processing matched and
> timed-out patterns.
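The hook in question (Flink 1.11) is the TimedOutPartialMatchHandler mix-in for
PatternProcessFunction. A minimal sketch, assuming Event and Alert are POJOs
defined elsewhere (the Alert(String, Map) constructor is illustrative):

import java.util.List;
import java.util.Map;

import org.apache.flink.cep.functions.PatternProcessFunction;
import org.apache.flink.cep.functions.TimedOutPartialMatchHandler;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class SlaPatternFunction extends PatternProcessFunction<Event, Alert>
        implements TimedOutPartialMatchHandler<Event> {

    // side output for fulfillments that blew through the pattern's within() window
    public static final OutputTag<Alert> TIMED_OUT = new OutputTag<Alert>("timed-out") {};

    @Override
    public void processMatch(Map<String, List<Event>> match, Context ctx,
                             Collector<Alert> out) {
        out.collect(new Alert("completed", match));
    }

    @Override
    public void processTimedOutMatch(Map<String, List<Event>> match, Context ctx) {
        // partial match discarded by the timeout: route it to the side output
        ctx.output(TIMED_OUT, new Alert("SLA missed", match));
    }
}

Applied with patternStream.process(new SlaPatternFunction()); the timed-out
fulfillments are then read via getSideOutput(SlaPatternFunction.TIMED_OUT).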
>
> On 16/07/2020 20:23, Basanth Gowda wrote:
>
> Hello,
>
> We have a use case where we want to know when a certain pattern doesn't
> complete within a given time frame.
>
> For example: A -> B -> C -> D (needs to complete in 10 minutes)
>
> With Flink, if event D doesn't happen within 10 minutes, the pattern is
> discarded and we can get notified. We also want to track how many of them
> completed (even if they miss the SLA). How do we achieve this with Flink CEP
> or other mechanisms?
>
> thank you,
> Basanth
>
>
>


Question on Pattern Matching

2020-07-16 Thread Basanth Gowda
Hello,

We have a use case where we want to know when a certain pattern doesn't
complete within a given time frame.

For example: A -> B -> C -> D (needs to complete in 10 minutes)

With Flink, if event D doesn't happen within 10 minutes, the pattern is
discarded and we can get notified. We also want to track how many of them
completed (even if they miss the SLA). How do we achieve this with Flink CEP or
other mechanisms?
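A minimal sketch of the sequence itself, assuming an Event POJO with a
getType() discriminator; the within() clause is what produces the timed-out
partial matches discussed in the reply above:

import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.windowing.time.Time;

// A, then B, then C, then D (other events may occur in between),
// all within 10 minutes of the A event.
Pattern<Event, Event> abcd = Pattern.<Event>begin("A")
        .where(new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event e) { return "A".equals(e.getType()); }
        })
        .followedBy("B").where(new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event e) { return "B".equals(e.getType()); }
        })
        .followedBy("C").where(new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event e) { return "C".equals(e.getType()); }
        })
        .followedBy("D").where(new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event e) { return "D".equals(e.getType()); }
        })
        .within(Time.minutes(10));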

thank you,
Basanth


Re: Flink CEP questions

2017-08-18 Thread Basanth Gowda
Thank you very much, Biplob and Dawid.

Thanks, Dawid, for those links. That is exactly what I was looking for.




On Fri, Aug 18, 2017 at 5:16 AM, Dawid Wysakowicz <
wysakowicz.da...@gmail.com> wrote:

> Hi Basanth,
>
> Ad.3 Unfortunately, right now you cannot reset, but there is ongoing work to
> introduce AfterMatchSkipStrategies
> (https://issues.apache.org/jira/browse/FLINK-7169?filter=12339990). This will
> allow the behaviour you described with the SKIP_PAST_LAST strategy.
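FLINK-7169 shipped with Flink 1.4; against recent Flink versions, the reset
behaviour described here looks roughly like the following sketch, reusing the
CPU condition from the quoted code below:

import org.apache.flink.cep.nfa.aftermatch.AfterMatchSkipStrategy;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.windowing.time.Time;

// SKIP_PAST_LAST_EVENT: after a match, matching restarts from scratch
// instead of sliding forward one event at a time.
Pattern<Event, Event> resetting =
        Pattern.<Event>begin("rule", AfterMatchSkipStrategy.skipPastLastEvent())
                .where(new SimpleCondition<Event>() {
                    @Override
                    public boolean filter(Event event) {
                        return event.getCpu() > 80.0;
                    }
                })
                .times(5)
                .within(Time.minutes(1));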
>
> Ad.4 If I understand correctly, you would like to trigger the pattern
> matching just at the end of the window. The CEP library emits the matches as
> soon as they are found, so I don't think you can implement that use case with
> CEP. You can, though, try to implement it yourself with e.g. ProcessFunction
> (https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/process_function.html).
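For the "every event in the window must match" requirement, a plain windowed
aggregate is one way to implement it; a minimal sketch against the Flink 1.4+
AggregateFunction interface, with the Event type and the CPU threshold assumed:

import org.apache.flink.api.common.functions.AggregateFunction;

// Accumulates a single boolean: true iff every event in the window
// satisfied the condition. Empty windows never fire in Flink, so the
// initial 'true' is never emitted on its own.
public class AllAboveThreshold implements AggregateFunction<Event, Boolean, Boolean> {

    @Override
    public Boolean createAccumulator() { return true; }

    @Override
    public Boolean add(Event event, Boolean allSoFar) {
        return allSoFar && event.getCpu() > 80.0;
    }

    @Override
    public Boolean getResult(Boolean allSoFar) { return allSoFar; }

    @Override
    public Boolean merge(Boolean a, Boolean b) { return a && b; }
}

Chained as stream.keyBy(...).timeWindow(Time.minutes(1)).aggregate(new
AllAboveThreshold()) and followed by a filter on the boolean, this yields one
alert per fully matching window.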
>
> > On 18 Aug 2017, at 03:28, Basanth Gowda <basanth.go...@gmail.com> wrote:
> >
> > Hi Kostas,
> >
> > For 3 -> I was able to do the following and it worked perfectly fine. Is
> > there a way we could reset? It looks like the following code behaves more
> > like a sliding count. What I want to do is reset the count once the alert
> > has matched, and start the count over. Maybe I will have to keep some state
> > and do that in the filter method myself?
> >
> > Pattern.<Event>begin("rule").where(new IterativeCondition<Event>() {
> >     @Override
> >     public boolean filter(Event event, Context<Event> context) throws Exception {
> >         return event.getCpu() > 80.0;
> >     }
> > }).times(5).within(Time.of(1, TimeUnit.MINUTES));
> >
> >
> > For 4 -> I wasn't able to do this. Assuming I don't know the frequency of
> > input events, I want to check that every event that came in the one-minute
> > window matched the pattern. It could be 10 events in one minute, maybe 50
> > the next minute. If all the events in the window match the pattern, then we
> > want an alert.
> >
> > thank you,
> > Basanth
> >
> > On Thu, Aug 17, 2017 at 9:46 AM, Kostas Kloudas <k.klou...@data-artisans.com> wrote:
> > Hi Basanth,
> >
> > The documentation page can be found here:
> > https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/libs/cep.html
> > For:
> >   3) you should use the times(N) and the within(TIME) clauses
> >   4) if by continuously you mean without stopping, then you should use the
> >      followedBy() or next()
> >      (check "Combining Patterns"
> >      https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/libs/cep.html#combining-patterns
> >      in the docs above)
> >
> > I am not aware of any examples, but you can check these slides:
> > https://www.slideshare.net/dataArtisans/kostas-kloudas-complex-event-processing-with-flink-the-state-of-flinkcep
> > for an overview of the CEP library, or you can watch the related video.
> >
> > Cheers,
> > Kostas
> >
> >> On Aug 17, 2017, at 3:32 PM, Basanth Gowda <basanth.go...@gmail.com> wrote:
> >>
> >> All,
> >> I am new to Flink, and even more so to Flink CEP.
> >>
> >> I want to write a sample program that does the following:
> >>
> >> Let's suppose the data is CPU usage of a given server.
> >>
> >>  • Want to alert when CPU usage is above or below a certain value
> >>  • Want to alert when CPU usage falls in a range
> >>  • Want to alert when the above condition matches n times in x interval
> >>    (could be seconds, minutes, hours)
> >>  • Want to alert when the above condition happens continuously for x
> >>    interval (could be seconds, minutes or hours)
> >> How would we achieve 3 and 4 in the list above? Any examples that I can
> >> refer to?
> >>
> >>
> >> thank you,
> >> Basanth
> >
> >
>
>


Re: Flink CEP questions

2017-08-17 Thread Basanth Gowda
Hi Kostas,

For 3 -> I was able to do the following and it worked perfectly fine. Is there
a way we could reset? It looks like the following code behaves more like a
sliding count. What I want to do is reset the count once the alert has matched,
and start the count over. Maybe I will have to keep some state and do that in
the filter method myself?

Pattern.<Event>begin("rule").where(new IterativeCondition<Event>() {
    @Override
    public boolean filter(Event event, Context<Event> context) throws Exception {
        return event.getCpu() > 80.0;
    }
}).times(5).within(Time.of(1, TimeUnit.MINUTES));



For 4 -> I wasn't able to do this. Assuming I don't know the frequency of input
events, I want to check that every event that came in the one-minute window
matched the pattern. It could be 10 events in one minute, maybe 50 the next
minute. If all the events in the window match the pattern, then we want an
alert.

thank you,
Basanth

On Thu, Aug 17, 2017 at 9:46 AM, Kostas Kloudas <k.klou...@data-artisans.com> wrote:

> Hi Basanth,
>
> The documentation page can be found here:
> https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/libs/cep.html
> For:
>   3) you should use the times(N) and the within(TIME) clauses
>   4) if by continuously you mean without stopping, then you should use the
>      followedBy() or next()
>      (check "Combining Patterns"
>      https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/libs/cep.html#combining-patterns
>      in the docs above)
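The contiguity difference between the two, as a sketch (Event and getCpu()
assumed, as in the code further down):

import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;

// next() demands that the two matching events be strictly adjacent;
// followedBy() tolerates unrelated events in between.
Pattern<Event, Event> combined = Pattern.<Event>begin("high")
        .where(new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event e) { return e.getCpu() > 80.0; }
        })
        .next("stillHigh")
        .where(new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event e) { return e.getCpu() > 80.0; }
        });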
>
> I am not aware of any examples, but you can check these slides:
> https://www.slideshare.net/dataArtisans/kostas-kloudas-complex-event-processing-with-flink-the-state-of-flinkcep
> for an overview of the CEP library, or you can watch the related video.
>
> Cheers,
> Kostas
>
> On Aug 17, 2017, at 3:32 PM, Basanth Gowda <basanth.go...@gmail.com> wrote:
>
> All,
> I am new to Flink, and even more so to Flink CEP.
>
> I want to write a sample program that does the following:
>
> Let's suppose the data is CPU usage of a given server.
>
>
>    1. Want to alert when CPU usage is above or below a certain value
>    2. Want to alert when CPU usage falls in a range
>    3. Want to alert when the above condition matches n times in x interval
>       (could be seconds, minutes, hours)
>    4. Want to alert when the above condition happens continuously for x
>       interval (could be seconds, minutes or hours)
>
> How would we achieve 3 and 4 in the list above? Any examples that I can refer
> to?
>
>
> thank you,
> Basanth
>
>
>


Flink CEP questions

2017-08-17 Thread Basanth Gowda
All,
I am new to Flink, and even more so to Flink CEP.

I want to write a sample program that does the following:

Let's suppose the data is CPU usage of a given server.


   1. Want to alert when CPU usage is above or below a certain value
   2. Want to alert when CPU usage falls in a range
   3. Want to alert when the above condition matches n times in x interval
      (could be seconds, minutes, hours)
   4. Want to alert when the above condition happens continuously for x
      interval (could be seconds, minutes or hours)

How would we achieve 3 and 4 in the list above? Any examples that I can refer
to?


thank you,
Basanth


Re: Aggregation by key hierarchy

2017-08-16 Thread Basanth Gowda
Thanks Nico.

As there are two ways to achieve this, which is better?

1st option -> dataStream.flatMap(...) -> this takes one input record and emits
N outputs, depending on my key combinations. On each of the outputs the same
windowing logic is applied.

or the one you suggested

2nd option -> use keyBy to create N streams

With the first option I would use an external config, which allows me to change
the number of combinations dynamically at runtime. Would that be possible with
the 2nd option as well? Can I modify or add a data stream at runtime without
restarting?
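A sketch of the 1st option's fan-out step (Record and its get(field) accessor
are assumed, and the combinations list would come from the external config):

import java.util.List;
import java.util.stream.Collectors;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

// One input record fans out into N (combinationKey, record) pairs,
// one per configured dimension combination.
public class DimensionFanOut implements FlatMapFunction<Record, Tuple2<String, Record>> {

    private final List<List<String>> combinations;

    public DimensionFanOut(List<List<String>> combinations) {
        this.combinations = combinations;
    }

    @Override
    public void flatMap(Record record, Collector<Tuple2<String, Record>> out) {
        for (List<String> combo : combinations) {
            // build a composite key such as "campaignId=7|adId=42"
            String key = combo.stream()
                    .map(field -> field + "=" + record.get(field))
                    .collect(Collectors.joining("|"));
            out.collect(Tuple2.of(key, record));
        }
    }
}

The result can then be keyed positionally on the composite string, e.g.
dataStream.flatMap(new DimensionFanOut(combos)).keyBy(0).timeWindow(...).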

On Wed, Aug 16, 2017 at 4:37 AM, Nico Kruber <n...@data-artisans.com> wrote:

> [back to the ml...]
>
> also including your other mail's additional content...
> > I have been able to do this with the following, repeating it for every
> > key + window combination. So in the above case there would be 8 blocks like
> > the one below (4 combinations and 2 window periods for each combination).
> >
> > modelDataStream.keyBy("campaignId","adId")
> > .timeWindow(Time.minutes(1))
> > .trigger(CountTrigger.of(2))
> > .reduce(..)
>
> As mentioned in my last email, I only see one way of reducing duplication
> (for the key combinations), but this involves more handling on your side and
> I'd probably not recommend it. Regarding the different windows, I do not see
> anything you could do differently here.
>
> Maybe Aljoscha (cc'd) has an idea of how to do this better.
>
>
> Nico
>
> On Monday, 14 August 2017 19:08:29 CEST Basanth Gowda wrote:
> > Hi Nico,
> > Thank you. This is pretty much what I am doing; I was wondering if there
> > is a better way.
> >
> > If there are 10 dimensions on which I want to aggregate, with 2 windows,
> > this would become about 20 different combinations.
> >
> > Thank you
> > Basanth
> >
> > On Mon, Aug 14, 2017 at 12:50 PM Nico Kruber <n...@data-artisans.com>
> wrote:
> > > Hi Basanth,
> > > Let's assume you have records of the form
> > > Record = {timestamp, country, state, city, value}
> > > Then you'd like to create aggregates, e.g. the average, for the following
> > > combinations?
> > > 1) avg per country
> > > 2) avg per state and country
> > > 3) avg per city and state and country
> > >
> > > * You could create three streams and aggregate each individually:
> > > DataStream<Record> ds = //...
> > > KeyedStream<Record, Tuple> ds1 = ds.keyBy("country");
> > > KeyedStream<Record, Tuple> ds2 = ds.keyBy("country", "state");
> > > KeyedStream<Record, Tuple> ds3 = ds.keyBy("country", "state", "city");
> > > // + your aggregation per stream ds1, ds2, ds3
> > >
> > > You probably want to do different things for each of the resulting
> > > aggregations anyway, so having separate streams is probably right for you.
> > >
> > > * Alternatively, you could go with ds1 only and create the aggregates of
> > > the per-state (2) and per-city (3) ones in a stateful aggregation function
> > > yourself, e.g. in a MapState [1]. At the end of your aggregation window,
> > > you could then emit those with different keys to be able to distinguish
> > > between them.
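A simplified sketch of that alternative, using a ProcessWindowFunction with an
in-window map instead of MapState (Record and its accessors are assumed):

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

// Keyed by country only; emits one (compositeKey, average) pair per
// country, state, and city seen in the window.
public class HierarchicalAvg
        extends ProcessWindowFunction<Record, Tuple2<String, Double>, String, TimeWindow> {

    @Override
    public void process(String country, Context ctx, Iterable<Record> records,
                        Collector<Tuple2<String, Double>> out) {
        Map<String, double[]> sums = new HashMap<>(); // key -> {sum, count}
        for (Record r : records) {
            String[] keys = {
                    country,
                    country + "/" + r.getState(),
                    country + "/" + r.getState() + "/" + r.getCity()};
            for (String key : keys) {
                double[] agg = sums.computeIfAbsent(key, k -> new double[2]);
                agg[0] += r.getValue();
                agg[1] += 1;
            }
        }
        sums.forEach((key, agg) -> out.collect(Tuple2.of(key, agg[0] / agg[1])));
    }
}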
> > >
> > >
> > > Nico
> > >
> > > [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/state.html
> > > On Sunday, 13 August 2017 23:13:00 CEST Basanth Gowda wrote:
> > > > For example - this is a sample model from one of the Apache Apex
> > > > presentation.
> > > >
> > > > I would want to aggregate for different combinations, and different
> time
> > > > buckets. What is the best way to do this in Flink ?
> > > >
> > > > {"keys":[{"name":"campaignId","type":"integer"},
> > > >  {"name":"adId","type":"integer"},
> > > >  {"name":"creativeId","type":"integer"},
> > > >  {"name":"publisherId","type":"integer"},
> > > >  {"name":"adOrderId","type":"integer"}],
> > > >  "timeBuckets":["1h","1d"],
> > > >

Aggregation on multiple Key combinations and multiple Windows

2017-08-14 Thread Basanth Gowda
Hello,
Posted this yesterday, but not sure if it went through or not.

I am fairly new to Flink. I have a use case which needs aggregation on
different combinations of keys and windowing for different intervals. I
searched through but couldn't find anything that could help.


I came across this model in a presentation on Apache Apex. It sums up what we
are trying to achieve. What is the best way to do this in Flink?



{"keys": [{"name": "campaignId", "type": "integer"},
          {"name": "adId", "type": "integer"},
          {"name": "creativeId", "type": "integer"},
          {"name": "publisherId", "type": "integer"},
          {"name": "adOrderId", "type": "integer"}],
 "timeBuckets": ["1h", "1d"],
 "values": [{"name": "impressions", "type": "integer", "aggregators": ["SUM"]},
            {"name": "clicks", "type": "integer", "aggregators": ["SUM"]},
            {"name": "revenue", "type": "integer"}],
 "dimensions": [{"combination": ["campaignId", "adId"]},
                {"combination": ["creativeId", "campaignId"]},
                {"combination": ["campaignId"]},
                {"combination": ["publisherId", "adOrderId", "campaignId"],
                 "additionalValues": ["revenue:SUM"]}]}


I have been able to do this with the following, repeating it for every
key + window combination. So in the above case there would be 8 blocks like the
one below (4 combinations and 2 window periods for each combination).

modelDataStream.keyBy("campaignId","adId")
.timeWindow(Time.minutes(1))
.trigger(CountTrigger.of(2))
.reduce(..)
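A sketch that generates those 8 blocks in a loop instead of copy-pasting them
(the merge() used in the reduce is an assumed method on the model type):

import java.util.Arrays;
import java.util.List;

import org.apache.flink.streaming.api.windowing.time.Time;

// 4 key combinations x 2 time buckets = 8 keyBy/window/reduce blocks,
// generated rather than written out by hand. modelDataStream is the
// stream from the snippet above.
List<String[]> combinations = Arrays.asList(
        new String[]{"campaignId", "adId"},
        new String[]{"creativeId", "campaignId"},
        new String[]{"campaignId"},
        new String[]{"publisherId", "adOrderId", "campaignId"});
List<Time> timeBuckets = Arrays.asList(Time.hours(1), Time.days(1));

for (String[] combination : combinations) {
    for (Time bucket : timeBuckets) {
        modelDataStream
                .keyBy(combination)            // keyBy takes varargs field names
                .timeWindow(bucket)
                .reduce((a, b) -> a.merge(b)); // assumed merge on the model type
    }
}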


Re: Aggregation by key hierarchy

2017-08-13 Thread Basanth Gowda
For example, this is a sample model from one of the Apache Apex presentations.

I would want to aggregate for different combinations and different time
buckets. What is the best way to do this in Flink?

{"keys": [{"name": "campaignId", "type": "integer"},
          {"name": "adId", "type": "integer"},
          {"name": "creativeId", "type": "integer"},
          {"name": "publisherId", "type": "integer"},
          {"name": "adOrderId", "type": "integer"}],
 "timeBuckets": ["1h", "1d"],
 "values": [{"name": "impressions", "type": "integer", "aggregators": ["SUM"]},
            {"name": "clicks", "type": "integer", "aggregators": ["SUM"]},
            {"name": "revenue", "type": "integer"}],
 "dimensions": [{"combination": ["campaignId", "adId"]},
                {"combination": ["creativeId", "campaignId"]},
                {"combination": ["campaignId"]},
                {"combination": ["publisherId", "adOrderId", "campaignId"],
                 "additionalValues": ["revenue:SUM"]}]}


thank you,
B

On Sun, Aug 13, 2017 at 2:09 PM, Basanth Gowda <basanth.go...@gmail.com> wrote:

> Hi,
> I want to aggregate hits by Country, State, City. I would have these as tags
> in my sample data.
>
> How would I do aggregation at different levels? The input would be a single
> record.
>
> Should I do a flatMap transformation first and create 3 records from 1 input
> record, or is there a better way to do it?
>
> thank you,
> basanth
>


Aggregation by key hierarchy

2017-08-13 Thread Basanth Gowda
Hi,
I want to aggregate hits by Country, State, City. I would have these as tags in
my sample data.

How would I do aggregation at different levels? The input would be a single
record.

Should I do a flatMap transformation first and create 3 records from 1 input
record, or is there a better way to do it?

thank you,
basanth


Apache beam and Flink

2017-08-13 Thread Basanth Gowda
I wasn't able to find much info on Flink and Beam with regard to the CEP and
Graph functionality.

Is it supported in Apache Beam?

Is it possible to mix and match Flink and Beam in a single use case?

thank you,
Basanth


GRPC source in Flink

2017-07-31 Thread Basanth Gowda
Hi,

Is there a way to get data from a gRPC source in Flink? If we can, how do we
guarantee that events are not lost once submitted to Flink?

thank you


Flatbuffers and Flink

2017-07-31 Thread Basanth Gowda
Hi,
This is one of three questions I had about Flink. I didn't want to club them
all together, as each might be useful for someone else in the future.

Do we have FlatBuffers support in Flink? If there is no support, is there a way
to implement it?

I am trying to see if we could use the byte[] that has come from upstream
without converting it into a POJO or another format.
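One possibility, as a minimal sketch against recent Flink versions: keep the
bytes untouched with a pass-through DeserializationSchema and apply the
flatc-generated accessor lazily where the fields are needed (MyTable is a
hypothetical generated class, not a Flink API):

import org.apache.flink.api.common.serialization.AbstractDeserializationSchema;

// Ships the upstream bytes through Flink unchanged; no POJO conversion.
public class PassThroughSchema extends AbstractDeserializationSchema<byte[]> {
    @Override
    public byte[] deserialize(byte[] message) {
        return message;
    }
}

// At the point of use, wrap the bytes with the FlatBuffers accessor, e.g.
// MyTable table = MyTable.getRootAsMyTable(java.nio.ByteBuffer.wrap(bytes));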


thank you