Pieter,
Can you create a JIRA with your use case? It is important to capture. We
have some outstanding JIRAs around graph support.


On January 2, 2019 at 04:40:23, Stefan Kupstaitis-Dunkler (
stefan....@gmail.com) wrote:

Hi Pieter,



Happy new year!



I believe that always depends on a lot of factors, as it does for any kind
of visualization problem with large amounts of data:

   - How quickly do you need the visualizations to be available?
   - How up-to-date do they need to be?
   - How complex?
   - How polished or customized do they need to be?
   - How familiar are you with these frameworks? (This could be a reason not
   to use a library if the options are otherwise equal in capabilities.)



It sounds like you want to create a simple histogram across the full
history of stored data, so I’ll throw in another option that is commonly
used for such use cases:

   - Zeppelin notebook:
      - Access data stored in HDFS via Hive.
      - A bit of preparation in Hive is required (and can be scheduled),
      e.g. creating external tables and converting the data into a more
      efficient format, such as ORC; a rough sketch follows below.
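
Just to illustrate (not tested against your setup; the HDFS path, table
names, and field names below are assumptions, so adjust them to your
deployment), the preparation could look roughly like this in a Zeppelin
%pyspark paragraph:

from pyspark.sql import SparkSession

# Zeppelin %pyspark paragraph; table/field names and the HDFS path are
# assumptions, adjust them to your deployment.
spark = (SparkSession.builder
         .appName("metron-connection-counts")
         .enableHiveSupport()
         .getOrCreate())

# External table over the JSON documents sitting in HDFS
spark.sql("""
  CREATE EXTERNAL TABLE IF NOT EXISTS metron_flows_json (
    ip_src_addr STRING,
    ip_dst_addr STRING,
    `timestamp` BIGINT
  )
  ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
  LOCATION '/apps/metron/indexing/indexed/yaf'
""")

# One-off (or scheduled) conversion into a more efficient columnar format
spark.sql("""
  CREATE TABLE IF NOT EXISTS metron_flows_orc STORED AS ORC AS
  SELECT ip_src_addr, ip_dst_addr, `timestamp` FROM metron_flows_json
""")

# Connection counts per source/destination pair
counts = spark.sql("""
  SELECT ip_src_addr, ip_dst_addr, COUNT(*) AS connections
  FROM metron_flows_orc
  GROUP BY ip_src_addr, ip_dst_addr
  ORDER BY connections DESC
""")
counts.show(20)

Once the ORC table exists, the last query can also be run from a %sql
paragraph and visualized with Zeppelin’s built-in charts.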



Best,

Stefan



*From: *Pieter Baele <pieter.ba...@gmail.com>
*Reply-To: *"user@metron.apache.org" <user@metron.apache.org>
*Date: *Wednesday, 2 January 2019 at 07:50
*To: *"user@metron.apache.org" <user@metron.apache.org>
*Subject: *Graphs based on Metron or PCAP data



Hi,



(and a happy New Year to all as well!)



What would you consider the easiest approach to create a graph based
primarily on ip_dst and ip_src addresses and the number of connections
between them?



I was thinking:

- the graph functionality in the Elastic stack, but that seems limited (e.g.
only recent data in one index?)

- interfacing with Neo4J

- GraphX using Spark?

- using R on data stored in HDFS?

- using Python: plotly? pandas? (a rough sketch of what I mean follows below)
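
To make it a bit more concrete, this is roughly what I have in mind with
the pandas/plotly option (only a sketch; the export file and the
ip_src/ip_dst field names are just placeholders):

import pandas as pd
import plotly.graph_objs as go
from plotly.offline import plot

# Placeholder export: one JSON document per line with ip_src/ip_dst fields
events = pd.read_json("flows.json", lines=True)

# Number of connections per (source, destination) pair
pairs = (events.groupby(["ip_src", "ip_dst"])
               .size()
               .reset_index(name="connections")
               .sort_values("connections", ascending=False)
               .head(25))

# Simple bar chart of the busiest pairs
labels = pairs["ip_src"] + " -> " + pairs["ip_dst"]
fig = go.Figure(data=[go.Bar(x=labels, y=pairs["connections"])])
plot(fig, filename="connection_counts.html")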







Sincerely

Pieter
