Thank you, Risto. Such an overview can serve as a starting point when designing
something more useful than static correlation configurations.

I am still not sure what the optimal goal should be; I need to analyze and
study these topics further.

I believe I will follow up on the information you provided and further
develop this thread in the future.

On Thu, 23 Jan 2020 at 18:12, Risto Vaarandi <risto.vaara...@gmail.com> wrote:

> hi Richard,
>
>> The next step would be integrating AI (machine learning) with SEC somehow, so
>> that the user won't need to configure correlations statically; instead, the
>> correlations would configure and optimize themselves automatically. (Some
>> input from the user could still be needed, but the system would also be able
>> to react to changing log traffic and self-evolve.)
>>
>> Something like what ELK+AI offers in the log monitoring area.
>>
>> Maybe some integration with MXNet?
>>
>> http://blogs.perl.org/users/sergey_kolychev/2017/02/machine-learning-in-perl.html
>>
>> Does anybody have any experience in this area, and could you explain a more
>> or less theoretical or practical setup for AI-generated SEC rules? (I am
>> pretty sure this is out of scope for SEC itself, and SEC wouldn't know that
>> AI is dynamically generating its rules in the background; probably nobody
>> has a working solution yet, but maybe we could invent something together.)
>>
>>
> Machine learning is a very wide area with a large number of different
> methods and algorithms around. These methods and algorithms are usually
> divided into two large classes:
> *) supervised algorithms which assume that you provide labeled data for
> learning (for example, a log file where some messages are labeled as
> "normal" and some messages as "system_fault"), so that the algorithm can
> learn from labeled examples how to distinguish normal messages from errors
> (note that in this simplified example, only two labels were used, but in
> more complex cases you could have more labels in play)
> *) unsupervised algorithms which are able to distinguish anomalous or
> abnormal messages without any previous training with labeled data
> So my first question is -- what is your actual setup, and do you have the
> opportunity of using training data for supervised methods, or are
> unsupervised methods a better choice? After answering this question, you
> can start studying the most promising methods more closely.
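>
> For illustration, here is a minimal sketch of the supervised case in Python
> (a toy example with invented log lines and labels, assuming scikit-learn is
> available; this is not part of SEC or of any existing SEC integration):
>
>   # classify log messages as "normal" or "system_fault" from labeled
>   # examples; the messages and labels below are invented for illustration
>   from sklearn.feature_extraction.text import TfidfVectorizer
>   from sklearn.linear_model import LogisticRegression
>
>   messages = [
>       "sshd[1201]: Accepted password for alice from 10.0.0.5",
>       "sshd[1202]: Accepted password for bob from 10.0.0.6",
>       "kernel: EXT4-fs error (device sda1): unable to read inode block",
>       "kernel: I/O error, dev sda, sector 12345",
>   ]
>   labels = ["normal", "normal", "system_fault", "system_fault"]
>
>   # turn messages into word-based feature vectors and train a classifier
>   vectorizer = TfidfVectorizer()
>   clf = LogisticRegression().fit(vectorizer.fit_transform(messages), labels)
>
>   # classify a previously unseen message
>   new = ["kernel: I/O error, dev sdb, sector 999"]
>   print(clf.predict(vectorizer.transform(new)))  # likely ['system_fault']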
>
> Secondly, what is your actual goal? Do you want to:
> 1) detect an individual anomalous message or a time frame containing
> anomalous messages from event logs,
> 2) produce a warning if the number of messages from a specific class (e.g.
> login failures) per N minutes increases suddenly to an unexpectedly large
> value,
> 3) use some tool for (semi)automated mining of new SEC rules,
> 4) something else?
>
> For achieving the first goal, there is no silver bullet, but perhaps I can
> provide a few pointers to relevant research papers (note that there are
> many other papers in this area):
> https://ieeexplore.ieee.org/document/4781208
> https://ieeexplore.ieee.org/document/7367332
> https://dl.acm.org/doi/10.1145/3133956.3134015
>
> For achieving the second goal, you could consider using time series
> analysis methods. You could begin with a very simple moving average based
> method like the one described here:
>
> https://machinelearnings.co/data-science-tricks-simple-anomaly-detection-for-metrics-with-a-weekly-pattern-2e236970d77
> or you could employ more complex forecasting methods (before starting, it
> is probably a good idea to read this book on forecasting:
> https://otexts.com/fpp2/)
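>
> To make the moving average idea concrete, here is a tiny Python sketch of a
> simplified variant (not the exact method from the article above; the counts
> are invented and would in practice be per-N-minute counts of e.g. login
> failure messages):
>
>   # flag an interval whose event count deviates strongly from the
>   # moving average of the preceding intervals
>   import statistics
>
>   counts = [12, 9, 14, 11, 10, 13, 12, 95]  # last interval looks anomalous
>   window = 5       # number of past intervals to average over
>   threshold = 3.0  # how many standard deviations count as anomalous
>
>   for i in range(window, len(counts)):
>       past = counts[i - window:i]
>       mean = statistics.mean(past)
>       stdev = statistics.stdev(past) or 1.0  # avoid division by zero
>       score = (counts[i] - mean) / stdev
>       if score > threshold:
>           print("interval %d: count=%d score=%.1f" % (i, counts[i], score))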
>
> If you want to mine new rules or knowledge for SEC (or for other tools)
> from event logs, I have actually done some previous research in this
> domain. Perhaps I can point you to a log mining utility called LogCluster (
> https://ristov.github.io/logcluster/) which allows for mining line
> patterns and outliers from textual event logs. Also, a couple of years ago,
> an experimental system was created which was using LogCluster in a fully
> automated way for creating SEC Suppress rules, where these rules were
> essentially matching normal (expected) messages. Any message not matching
> these rules was considered an anomaly and was logged separately for manual
> review. Here is the paper that provides an overview of this system:
> https://ristov.github.io/publications/noms18-log-anomaly-web.pdf
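>
> To give a rough idea of the approach, here is a toy Python sketch that writes
> SEC Suppress rules for a couple of invented "normal" message patterns (the
> real system described in the paper derives such patterns with LogCluster and
> is considerably more involved; see the sec(1) man page for the exact rule
> field names):
>
>   # write Suppress rules for message patterns considered normal; any
>   # message not matched by such rules would be treated as an anomaly
>   normal_patterns = [
>       r"sshd\[\d+\]: Accepted password for \S+ from \S+",
>       r"CRON\[\d+\]: pam_unix\(cron:session\): session opened for user \S+",
>   ]
>
>   with open("suppress.rules", "w") as out:
>       for i, pattern in enumerate(normal_patterns):
>           out.write("type=Suppress\n")
>           out.write("ptype=RegExp\n")
>           out.write("pattern=%s\n" % pattern)
>           out.write("desc=suppress normal message pattern %d\n\n" % i)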
>
> Hopefully these pointers will offer you some guidance on what your precise
> research question could be and what the most promising avenue for continuing
> it is. My apologies if my answer raised new questions, but machine learning
> is a very wide area with a large number of methods for many different goals.
>
> kind regards,
> risto
>
>
