Thanks for the answers.

@Marta, regarding your first answer: I watched videos [1] and [2]. It was interesting to see these two different approaches, although I was looking for a more specific implementation. As for link [3], I didn't know that Kinesis existed, so it could be useful for benchmarking and for comparing my results with the Kinesis results. Then there is the CEP approach. I am very familiar with this topic, since my current work is based on implementing a CEP pipeline for monitoring. The only problem I see here is that you need to define a pattern in advance. But it is worth a try.

@Ryan, the random cut forest algorithm looks closer to what I am looking for. What do you mean when you say that it doesn't help with getting it working with Flink?

Best,

On Fri, Apr 3, 2020 at 8:47 PM Marta Paes Moreira <ma...@ververica.com> wrote:

> Forgot to mention that you might also want to have a look into Flink CEP
> [1], Flink's library for Complex Event Processing.
>
> It allows you to define and detect event patterns over streams, which can
> come in pretty handy for anomaly detection.
>
> [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/libs/cep.html
>
> On Fri, Apr 3, 2020 at 6:08 PM Nienhuis, Ryan <nienh...@amazon.com> wrote:
>
>> I would also have a look at the random cut forest algorithm. This is the
>> base algorithm that is used for anomaly detection in several AWS services
>> (Quicksight, Kinesis Data Analytics, etc.). It doesn’t help with getting it
>> working with Flink, but may be a good place to start for an algorithm.
>>
>> https://github.com/aws/random-cut-forest-by-aws
>>
>> Ryan
>>
>> *From:* Marta Paes Moreira <ma...@ververica.com>
>> *Sent:* Friday, April 3, 2020 5:25 AM
>> *To:* Salvador Vigo <salvador...@gmail.com>
>> *Cc:* user <user@flink.apache.org>
>> *Subject:* RE: [EXTERNAL] Anomaly detection Apache Flink
>>
>> Hi, Salvador.
>>
>> You can find some more examples of real-time anomaly detection with Flink
>> in these presentations from Microsoft [1] and Salesforce [2] at Flink
>> Forward. This blogpost [3] also describes how to build that kind of
>> application using Kinesis Data Analytics (based on Flink).
>>
>> Let me know if these resources help!
>>
>> [1] https://www.youtube.com/watch?v=NhOZ9Q9_wwI
>> [2] https://www.youtube.com/watch?v=D4kk1JM8Kcg
>> [3] https://towardsdatascience.com/real-time-anomaly-detection-with-aws-c237db9eaa3f
>>
>> On Fri, Apr 3, 2020 at 11:37 AM Salvador Vigo <salvador...@gmail.com> wrote:
>>
>> Hi there,
>>
>> I am working in an approach to make some experiments related with anomaly
>> detection in real time with Apache Flink. I would like to know if there are
>> already some open issues in the community.
>>
>> The only example I found was the one of Scott Kidder
>> <https://mux.com/team/scott-kidder> and the Mux platform, 2017. If any
>> one is already working in this topic or know some related work or
>> publication I will be grateful.
>>
>> Best,