Re: Anomaly detection Apache Flink

2020-04-07 Thread Salvador Vigo
Ok, thanks for the clarification.




RE: Anomaly detection Apache Flink

2020-04-07 Thread Nienhuis, Ryan
Vigo,

I mean that the algorithm is a standalone piece of code. There are no examples 
that I am aware of for running it using Flink.

Ryan



Re: Anomaly detection Apache Flink

2020-04-04 Thread Salvador Vigo
Thanks for the answers.

@Marta, regarding the videos in your first answer [1], [2]: it was
interesting to see these two different approaches, although I was looking
for a more specific implementation. As for link [3], I didn't know about
Kinesis, so it could be useful for benchmarking and comparing my results
with the Kinesis results. As for CEP, I am very familiar with the topic,
since my current work is based on implementing a CEP pipeline for
monitoring. The only problem I see is that you need a predefined pattern in
advance. But it is worth a try.

@Ryan, the random cut forest algorithm seems closer to what I am looking
for. What do you mean when you say that it doesn't help with getting it
working with Flink?

Best,
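
The limitation mentioned above (CEP needs a pattern defined up front) is one
reason to also consider data-driven detectors. As a minimal sketch, not taken
from any of the linked resources and not using the Flink API, a rolling
z-score flags values that deviate strongly from recent history. The class
name `RollingZScore` and its parameters are hypothetical.

```python
import math
from collections import deque


class RollingZScore:
    """Flags values far from the mean of the last `window` observations."""

    def __init__(self, window: int = 30, threshold: float = 3.0, warmup: int = 5):
        self.values = deque(maxlen=window)  # most recent observations only
        self.threshold = threshold          # allowed deviation, in standard deviations
        self.warmup = warmup                # observations needed before flagging

    def is_anomaly(self, value: float) -> bool:
        anomalous = False
        if len(self.values) >= self.warmup:
            mean = sum(self.values) / len(self.values)
            variance = sum((v - mean) ** 2 for v in self.values) / len(self.values)
            std = math.sqrt(variance)
            anomalous = std > 0 and abs(value - mean) > self.threshold * std
        # The value joins the window either way, so a flagged spike still
        # influences the statistics used for the next observations.
        self.values.append(value)
        return anomalous


detector = RollingZScore(window=30, threshold=3.0, warmup=5)
readings = [10.0, 10.5, 9.8, 10.2, 10.1, 9.9, 10.3, 42.0, 10.0]
flags = [detector.is_anomaly(x) for x in readings]
```

No pattern is declared here; the threshold is relative to whatever the stream
has looked like recently. In a streaming job the same logic would presumably
live in per-key state, which this sketch does not attempt to show.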



Re: Anomaly detection Apache Flink

2020-04-03 Thread Marta Paes Moreira
Forgot to mention that you might also want to have a look into Flink CEP
[1], Flink's library for Complex Event Processing.

It allows you to define and detect event patterns over streams, which can
come in pretty handy for anomaly detection.

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/libs/cep.html
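
To make "define and detect event patterns" concrete, here is a minimal
sketch in plain Python. It does not use the Flink CEP API; the `Event` type,
the `detect_bursts` function, and the chosen pattern (three consecutive
readings above a threshold within five seconds) are assumptions made for
illustration only.

```python
from __future__ import annotations

from collections import deque
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    timestamp: float  # seconds
    value: float


def detect_bursts(events: Iterable[Event], threshold: float,
                  times: int, within: float) -> Iterator[tuple[Event, ...]]:
    """Yield each run of `times` consecutive events above `threshold`
    whose first and last timestamps are at most `within` seconds apart."""
    run: deque[Event] = deque(maxlen=times)
    for event in events:
        if event.value > threshold:
            run.append(event)
            if len(run) == times and run[-1].timestamp - run[0].timestamp <= within:
                yield tuple(run)
                run.clear()  # start fresh after a match
        else:
            run.clear()  # a normal event breaks the consecutive run


stream = [Event(0, 10), Event(1, 95), Event(2, 97), Event(3, 99),
          Event(4, 12), Event(5, 96), Event(20, 98), Event(21, 97)]
matches = list(detect_bursts(stream, threshold=90, times=3, within=5))
```

With this input the only match is the run at timestamps 1, 2, 3; the later
three high readings span 16 seconds and so fall outside the time limit. A CEP
library lets you declare this kind of condition instead of hand-coding the
state handling.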



RE: Anomaly detection Apache Flink

2020-04-03 Thread Nienhuis, Ryan
I would also have a look at the random cut forest algorithm. This is the base
algorithm that is used for anomaly detection in several AWS services
(QuickSight, Kinesis Data Analytics, etc.). It doesn’t help with getting
anything working with Flink, but it may be a good starting point for an
algorithm.

https://github.com/aws/random-cut-forest-by-aws

Ryan
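
For readers unfamiliar with the idea, the following is a simplified sketch of
random-cut-style scoring in plain Python. It is not the code from the linked
repository and does not reproduce its scoring: it only borrows the idea of
choosing the cut dimension in proportion to its range, and it scores a point
by how quickly random cuts isolate it (as isolation forests do). All function
names and parameters are hypothetical.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def isolation_depth(point: Point, sample: List[Point], rng: random.Random,
                    depth: int = 0, max_depth: int = 32) -> int:
    """Number of random cuts needed to separate `point` from `sample`."""
    if not sample or depth >= max_depth:
        return depth
    box = list(sample) + [point]
    dims = len(point)
    lows = [min(p[d] for p in box) for d in range(dims)]
    highs = [max(p[d] for p in box) for d in range(dims)]
    ranges = [hi - lo for lo, hi in zip(lows, highs)]
    if sum(ranges) == 0:  # every remaining point coincides with `point`
        return depth
    # Pick the cut dimension with probability proportional to its range,
    # then cut uniformly at random inside that range.
    dim = rng.choices(range(dims), weights=ranges)[0]
    cut = rng.uniform(lows[dim], highs[dim])
    # Keep only the sample points on the same side of the cut as `point`.
    same_side = [p for p in sample if (p[dim] <= cut) == (point[dim] <= cut)]
    return isolation_depth(point, same_side, rng, depth + 1, max_depth)


def anomaly_score(point: Point, data: List[Point], trees: int = 50,
                  sample_size: int = 64, seed: int = 0) -> float:
    """Higher means more anomalous: points isolated after few cuts score high."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trees):
        sample = rng.sample(data, min(sample_size, len(data)))
        total += isolation_depth(point, sample, rng)
    return 1.0 / (1.0 + total / trees)


data_rng = random.Random(42)
normal = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
inlier_score = anomaly_score((0.1, -0.2), normal)
outlier_score = anomaly_score((8.0, 8.0), normal)
```

A point far from the bulk of the data is usually cut off after very few
random cuts, so its score comes out higher than that of a point in the middle
of the cluster. Running something like this inside Flink would still require
deciding where the sample lives and how it is updated, which is the part the
message above says has no ready-made example.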



Re: Anomaly detection Apache Flink

2020-04-03 Thread Marta Paes Moreira
Hi, Salvador.

You can find some more examples of real-time anomaly detection with Flink
in these presentations from Microsoft [1] and Salesforce [2] at Flink
Forward. This blogpost [3] also describes how to build that kind of
application using Kinesis Data Analytics (based on Flink).

Let me know if these resources help!

[1] https://www.youtube.com/watch?v=NhOZ9Q9_wwI
[2] https://www.youtube.com/watch?v=D4kk1JM8Kcg
[3] https://towardsdatascience.com/real-time-anomaly-detection-with-aws-c237db9eaa3f



Anomaly detection Apache Flink

2020-04-03 Thread Salvador Vigo
Hi there,
I am working on some experiments related to real-time anomaly detection
with Apache Flink. I would like to know whether there are already any open
issues on this topic in the community.
The only example I found was the one from Scott Kidder
<https://mux.com/team/scott-kidder> and the Mux platform, from 2017. If
anyone is already working on this topic or knows of related work or
publications, I would be grateful to hear about it.
Best,