To be honest, the primary use case is just a fancy representation of some
data; a lot of people like nice graphs (in both senses of the word).
I am also not yet sure which functionality is most important (in order to
create a ticket).

I saw an example, written in a C# framework, on a subset of a
security-related dataset.
The data backend in that case really can't be used for large-scale analysis.
I also saw a demo video of a Metron-based distribution.

Thanks a lot for the responses!


On Wed, Jan 2, 2019 at 1:37 PM Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> Graph enables a number of interesting use cases, and it really depends on
> what you’re after as to which tech makes sense.
>
> Spark GraphX is a strong contender for analytics of things like
> betweenness and community linkage on HDFS-indexed data. That would tend to
> be batch, and through something like Zeppelin. The very latest Zeppelin also
> supports a network visualisation method which gives a graph-like visual
> option.
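(The kind of community linkage GraphX computes can be illustrated in a few
lines of pure Python. This is only a sketch of the idea, not GraphX's Scala
API: connected components over an (ip_src, ip_dst) edge list via union-find,
with invented IP addresses.)

```python
# Sketch: community linkage as connected components over an edge list.
# GraphX's connectedComponents() does this at scale; this pure-Python
# union-find only illustrates the idea. All IPs are made up.

def communities(edges):
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for src, dst in edges:
        union(src, dst)

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())

edges = [
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.2", "10.0.0.3"),
    ("192.168.1.5", "192.168.1.6"),
]
print(sorted(len(c) for c in communities(edges)))  # [2, 3]: two communities
```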
>
> For more interactive, streaming graph work and alerting on graphs, an
> actual graph database makes more sense. I’ve seen some work done around
> Metron stacks with JanusGraph, which leans on Solr and HBase and so avoids
> adding too much complexity. Janus is not an Apache project, but should be
> includable. At present I’ve only seen that used in Metron-based
> distributions rather than Metron core.
>
> Simon
>
>
> On 2 Jan 2019, at 11:59, Otto Fowler <ottobackwa...@gmail.com> wrote:
>
> Pieter,
> Can you create a JIRA with your use case?  It is important to capture.  We
> have some outstanding JIRAs around graph support.
>
>
> On January 2, 2019 at 04:40:23, Stefan Kupstaitis-Dunkler (
> stefan....@gmail.com) wrote:
>
> Hi Pieter,
>
>
>
> Happy new year!
>
>
>
> I believe that always depends on a lot of factors, and the same applies to
> any kind of visualization problem with large amounts of data:
>
>    - How fast do you need the visualisations available?
>    - How up-to-date do they need to be?
>    - How complex?
>    - How beautiful/custom modified?
>    - How familiar are you with these frameworks? (could be a reason not
>    to use a lib if they are otherwise equal in capabilities)
>
>
>
> It sounds like you want to create a simple histogram across the full
> history of stored data. So I’ll throw in another option that is commonly
> used for such use cases:
>
>    - Zeppelin notebook:
>       - Access data stored in HDFS via Hive.
>       - A bit of preparation in Hive is required (and can be scheduled),
>       e.g. creating external tables and converting data into a more efficient
>       format, such as ORC.
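(As a rough sketch of that preparation step, assuming JSON output under an
invented HDFS path and a simplified schema, not Metron's actual one, the
Hive side might look like:)

```sql
-- Hypothetical example: expose indexed output stored in HDFS as an
-- external table, then convert it to ORC for faster Zeppelin queries.
CREATE EXTERNAL TABLE IF NOT EXISTS netflow_raw (
  ip_src_addr STRING,
  ip_dst_addr STRING,
  event_time  TIMESTAMP
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION '/apps/metron/indexing/indexed/netflow';

CREATE TABLE IF NOT EXISTS netflow_orc
STORED AS ORC
AS SELECT * FROM netflow_raw;
```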
>
>
>
> Best,
>
> Stefan
>
>
>
> *From: *Pieter Baele <pieter.ba...@gmail.com>
> *Reply-To: *"user@metron.apache.org" <user@metron.apache.org>
> *Date: *Wednesday, 2. January 2019 at 07:50
> *To: *"user@metron.apache.org" <user@metron.apache.org>
> *Subject: *Graphs based on Metron or PCAP data
>
>
>
> Hi,
>
>
>
> (and good New Year to all as well!)
>
>
>
> What would you consider the easiest approach to create a graph based
> primarily on ip_dst and ip_src addresses and the number of connections
> between them?
>
>
>
> I was thinking:
>
> - graph functionality in the Elastic stack, but limited (e.g. only recent
> data in one index?)
>
> - interfacing with Neo4J
>
> - GraphX using Spark?
>
> - using R on data stored in HDFS?
>
> - using Python: plotly? Pandas?
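(Whichever of these is chosen, the underlying aggregation is the same:
count connections per (source, destination) pair and use the counts as edge
weights. A minimal pure-Python sketch; the field names follow Metron's
ip_src_addr/ip_dst_addr convention and the records are invented:)

```python
from collections import Counter

# Count connections per (source, destination) pair; the resulting
# weighted edge list is what any of the graph tools above would ingest.
# Records are invented for illustration.
records = [
    {"ip_src_addr": "10.0.0.1", "ip_dst_addr": "10.0.0.9"},
    {"ip_src_addr": "10.0.0.1", "ip_dst_addr": "10.0.0.9"},
    {"ip_src_addr": "10.0.0.2", "ip_dst_addr": "10.0.0.9"},
]

edge_weights = Counter(
    (r["ip_src_addr"], r["ip_dst_addr"]) for r in records
)

for (src, dst), count in edge_weights.most_common():
    print(f"{src} -> {dst}: {count}")
# 10.0.0.1 -> 10.0.0.9: 2
# 10.0.0.2 -> 10.0.0.9: 1
```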
>
>
>
>
>
>
>
> Sincerely
>
> Pieter
>
>
