Hi everyone, I would like to ask for guidance on my log analyzer platform idea. I have an Elasticsearch system that collects logs from different platforms and creates alerts. The system writes the alerts to an index on ES. The alerts are also stored in a folder as JSON files (multi-line format).
The goals:

1. Read the JSON folder or the ES index as a stream (pick up new entries within 5 minutes).
2. Select only the alerts that I want to work on (alert.id = 100, status = true, ...).
3. Create a DataFrame with a 10-minute window.
4. Run a query on that DataFrame, grouping by IP (if the same IP gets 3 alerts, show me the result).
5. All the coding should be in Python.

That is the general idea. My question is how I should proceed with this task, and which technologies I should use. Can *Apache Spark + Python + PySpark + Koalas* handle this?

Best regards,
*Suat Toksoz*
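
P.S. To make goals 2 to 4 more concrete, here is a rough plain-Python sketch of the logic I have in mind. It is not Spark code and it does not do any streaming; it only shows the filter, the 10-minute window, and the per-IP count on a small in-memory list. The field names `src_ip` and `timestamp` (epoch seconds), the nested `alert.id` layout, and the helper names are my assumptions for illustration; the real alert schema may differ.

```python
from collections import Counter

WINDOW_SECONDS = 10 * 60  # 10-minute tumbling window (goal 3)
THRESHOLD = 3             # report an IP once it reaches 3 alerts (goal 4)


def is_wanted(alert):
    """Goal 2: keep only the alerts of interest (field names are assumed)."""
    return alert.get("alert", {}).get("id") == 100 and alert.get("status") is True


def window_start(ts):
    """Round an epoch timestamp in seconds down to the start of its window."""
    return ts - (ts % WINDOW_SECONDS)


def noisy_ips(alerts):
    """Return {(window_start, ip): count} for IPs with at least THRESHOLD alerts."""
    counts = Counter(
        (window_start(a["timestamp"]), a["src_ip"])
        for a in alerts
        if is_wanted(a)
    )
    return {key: n for key, n in counts.items() if n >= THRESHOLD}


# Made-up sample alerts, only to exercise the functions above.
sample = [
    {"alert": {"id": 100}, "status": True, "src_ip": "10.0.0.1", "timestamp": 0},
    {"alert": {"id": 100}, "status": True, "src_ip": "10.0.0.1", "timestamp": 60},
    {"alert": {"id": 100}, "status": True, "src_ip": "10.0.0.1", "timestamp": 120},
    {"alert": {"id": 100}, "status": True, "src_ip": "10.0.0.2", "timestamp": 30},
    {"alert": {"id": 100}, "status": True, "src_ip": "10.0.0.2", "timestamp": 90},
    {"alert": {"id": 200}, "status": True, "src_ip": "10.0.0.2", "timestamp": 100},
    {"alert": {"id": 100}, "status": True, "src_ip": "10.0.0.1", "timestamp": 600},
]

result = noisy_ips(sample)
print(result)
```

In this sample only 10.0.0.1 reaches three matching alerts inside one window; 10.0.0.2 has two matching alerts plus one with a different alert id, which is filtered out. What I would like to know is whether the same filter, window, and group-by can be expressed on a streaming source in Spark.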