To be honest, the primary use case is just a fancy representation of some data. A lot of people like nice graphs (in both senses of the word). I am also not yet sure which functionality is most important (so as to create a ticket).
I saw an example, written in a C# framework, on a subset of a security-related dataset. The data backend in that case really can't be used for large-scale analysis. I also saw a demo movie of a Metron-based distribution.

Thanks a lot for the responses!

On Wed, Jan 2, 2019 at 1:37 PM Simon Elliston Ball <si...@simonellistonball.com> wrote:

> Graph enables a number of interesting use cases, and it really depends on what you're after as to which tech makes sense.
>
> Spark GraphX is a strong contender for analytics of things like betweenness and community linkage on HDFS-indexed data. That would tend to be batch, run through something like Zeppelin. The very latest Zeppelin also supports a network visualisation method, which gives a graph-like visual option.
>
> For more interactive, streaming graph work and alerting on graphs, an actual graph database makes more sense. I've seen some work done around Metron stacks with JanusGraph, which leans on Solr and HBase and so avoids adding too much complexity. Janus is not an Apache project, but should be includable. At present I've only seen that used in Metron-based distributions rather than in Metron core.
>
> Simon
>
>
> On 2 Jan 2019, at 11:59, Otto Fowler <ottobackwa...@gmail.com> wrote:
>
> Pieter,
> Can you create a JIRA with your use case? It is important to capture. We have some outstanding JIRAs around graph support.
>
>
> On January 2, 2019 at 04:40:23, Stefan Kupstaitis-Dunkler (stefan....@gmail.com) wrote:
>
> Hi Pieter,
>
> Happy new year!
>
> I believe that always depends on a lot of factors, and that applies to any kind of visualization problem with big amounts of data:
>
> - How fast do you need the visualisations available?
> - How up-to-date do they need to be?
> - How complex?
> - How beautiful/custom-modified?
> - How familiar are you with these frameworks?
> (This could be a reason not to use a lib if they are otherwise equal in capabilities.)
>
> It sounds like you want to create a simple histogram across the full history of stored data. So I'll throw in another option that is commonly used for such use cases:
>
> - Zeppelin notebook:
>   - Access the data stored in HDFS via Hive.
>   - A bit of preparation in Hive is required (and can be scheduled), e.g. creating external tables and converting the data into a more efficient format, such as ORC.
>
> Best,
> Stefan
>
> From: Pieter Baele <pieter.ba...@gmail.com>
> Reply-To: "user@metron.apache.org" <user@metron.apache.org>
> Date: Wednesday, 2. January 2019 at 07:50
> To: "user@metron.apache.org" <user@metron.apache.org>
> Subject: Graphs based on Metron or PCAP data
>
> Hi,
>
> (and a good New Year to all as well!)
>
> What would you consider the easiest approach to create a graph based primarily on ip_dst and ip_src addresses and the number of connections between them?
>
> I was thinking:
>
> - graph functionality in the Elastic stack, but limited (e.g. only recent data in one index?)
> - interfacing with Neo4j
> - GraphX using Spark?
> - using R on data stored in HDFS?
> - using Python: plotly? pandas?
>
> Sincerely,
> Pieter
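To make the original question concrete: the graph I have in mind starts from a weighted edge list, i.e. connection counts per (ip_src, ip_dst) pair. A minimal plain-Python sketch is below; the field names ip_src_addr/ip_dst_addr follow Metron's usual telemetry naming, and the records are made up for illustration rather than pulled from a real index.

```python
# Minimal sketch: build a weighted edge list (connections per src/dst pair)
# from flow records. Field names and data are illustrative assumptions.
from collections import Counter

# Stand-in for flow records fetched from the Metron index or HDFS.
flows = [
    {"ip_src_addr": "10.0.0.1", "ip_dst_addr": "10.0.0.9"},
    {"ip_src_addr": "10.0.0.1", "ip_dst_addr": "10.0.0.9"},
    {"ip_src_addr": "10.0.0.2", "ip_dst_addr": "10.0.0.9"},
    {"ip_src_addr": "10.0.0.1", "ip_dst_addr": "10.0.0.2"},
]

# (src, dst) -> number of connections; this is the edge list any of the
# mentioned tools (GraphX, Neo4j, plotly, ...) would consume.
edge_counts = Counter((f["ip_src_addr"], f["ip_dst_addr"]) for f in flows)

for (src, dst), n in edge_counts.most_common():
    print(f"{src} -> {dst}: {n}")
```

The same aggregation is a one-line groupby in pandas or Hive once the data is accessible there; the point is only the shape of the intermediate result.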
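The betweenness analysis Simon mentions can also be prototyped in pure Python on a small edge list before committing to Spark GraphX. The sketch below uses Brandes' algorithm on invented IPs; it is my illustration of the technique, not anything taken from the Metron stack, and at real data volumes this is exactly the job for GraphX.

```python
# Rough sketch: betweenness centrality (Brandes' algorithm) on a toy,
# undirected connection graph. IPs and edges are invented for illustration.
from collections import defaultdict, deque

edges = [
    ("10.0.0.1", "10.0.0.9"),
    ("10.0.0.2", "10.0.0.9"),
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.9", "10.0.0.3"),
]

adj = defaultdict(set)
for src, dst in edges:
    adj[src].add(dst)
    adj[dst].add(src)

def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, q = [], deque([s])
        pred = {v: [] for v in adj}          # predecessors on shortest paths
        sigma = dict.fromkeys(adj, 0)        # number of shortest s->v paths
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1, 0
        while q:                             # BFS from s
            v = q.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    pred[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:                         # accumulate in reverse BFS order
            w = stack.pop()
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: c / 2 for v, c in bc.items()}  # undirected: halve the counts

scores = betweenness(adj)
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

Here 10.0.0.9 sits between the other hosts and scores highest, which is the kind of "broker host" signal that makes betweenness interesting on security data.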