The graph can certainly be used for processing event data, though
whether it suits your case depends on the details. We have used it
for this, so I can discuss a few points.

The event stream is just a linear chain and can be modeled as such in
the graph (e.g. with NEXT relationships between event nodes). However,
this does not bring much advantage over the original flat file, which
already has an implicit next (the next line, assuming the file is
time-ordered). You could instead use a TimeLineIndex to manage the
order, and then you would have an advantage over unordered original
data. Durations between events can be new nodes with START and END
relationships to the individual events, with the time difference
optionally added as a property on the duration node.
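
To make the structure concrete, here is a minimal in-memory sketch in
Python. It is not the Neo4j API: the Graph and Node classes and the
load_events helper are hypothetical stand-ins, and a plain sort by
timestamp stands in for what a TimeLineIndex would give you. It only
shows the shape of the model: Event nodes chained by NEXT, plus a
Duration node per adjacent pair with START and END relationships.

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)  # simple id generator for the illustrative nodes


@dataclass
class Node:
    label: str
    props: dict = field(default_factory=dict)
    id: int = field(default_factory=lambda: next(_ids))


class Graph:
    """Hypothetical in-memory stand-in for a graph database."""

    def __init__(self):
        self.nodes = []
        self.rels = []  # tuples of (start_node, rel_type, end_node)

    def add_node(self, label, **props):
        node = Node(label, props)
        self.nodes.append(node)
        return node

    def relate(self, start, rel_type, end):
        self.rels.append((start, rel_type, end))


def load_events(graph, events):
    """events: iterable of (timestamp_seconds, name), in any order."""
    # Sorting stands in for an index that keeps events in time order.
    ordered = sorted(events, key=lambda e: e[0])
    nodes = [graph.add_node("Event", time=t, name=n) for t, n in ordered]
    for prev, nxt in zip(nodes, nodes[1:]):
        graph.relate(prev, "NEXT", nxt)
        # One Duration node per adjacent pair, with the time
        # difference stored as a property.
        seconds = nxt.props["time"] - prev.props["time"]
        duration = graph.add_node("Duration", seconds=seconds)
        graph.relate(duration, "START", prev)
        graph.relate(duration, "END", nxt)
    return nodes


g = Graph()
load_events(g, [(130, "finish"), (100, "start"), (112, "process")])
durations = [n.props["seconds"] for n in g.nodes if n.label == "Duration"]
next_count = sum(1 for _, rel_type, _ in g.rels if rel_type == "NEXT")
```

In a real database the nodes and relationships would of course be
created through the database's own API inside a transaction.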

One nice thing about the graph is that you can keep adding data and
structure as you go, sometimes much later. So the things you asked
about, such as the server and the number of items processed, can be
added later, at your convenience.

When grouping events together and getting statistics, some things can
be added incrementally, like max/min/count/total. But percentile is
not so trivial. Consider the case where you want to know the
statistics for each hour of events. If you have an hour node connected
to all event nodes in that hour, you can update the
max/min/count/total values as new event data enters the database. But
a percentile can only be calculated once all events in the hour have
arrived, since it depends on the full set of values. This can be
handled at the application level.
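
As a rough illustration of that split, here is a small Python sketch
of the application-level bookkeeping. The HourStats class and the
record function are hypothetical names, the hour bucket is a plain
dictionary rather than an hour node, and the percentile uses the
nearest-rank definition, which is only one of several in common use.

```python
import math
from collections import defaultdict


class HourStats:
    """Running statistics for one hour of events (illustrative only)."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.min = math.inf
        self.max = -math.inf
        self.values = []  # kept only so a percentile can be computed later

    def add(self, value):
        # These four can be updated as each event arrives.
        self.count += 1
        self.total += value
        self.min = min(self.min, value)
        self.max = max(self.max, value)
        self.values.append(value)

    def percentile(self, p):
        """Nearest-rank percentile; call once the hour is complete."""
        if not self.values:
            raise ValueError("no events in this hour")
        ordered = sorted(self.values)
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]


hours = defaultdict(HourStats)  # hour number -> running stats


def record(timestamp_seconds, value):
    hours[timestamp_seconds // 3600].add(value)


for ts, v in [(10, 5.0), (20, 1.0), (3000, 9.0), (3700, 4.0)]:
    record(ts, v)

first_hour = hours[0]
median = first_hour.percentile(50)
```

The incremental values would map to properties on the hour node,
while the percentile would be written once the hour is closed.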
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
