Hi Gareth

As you have identified, there are certainly differences in performance and
feature set when working with Neo4j from different programming languages.
Depending on your background, constraints and integration needs, you could
consider a hybrid approach whereby you continue working with Python for your
main application and build anything that requires serious performance as a
server extension in Java. Neo4j's plugin support is pretty comprehensive: for
example, my server extension load2neo <http://nigelsmall.com/load2neo>
provides a facility for bulk loading data and also has direct support from my
Python driver, py2neo <http://py2neo.org/>. This approach is somewhat
analogous to writing a C extension for a Python application and could be
carried out as an optimisation step once you have built your end-to-end
application logic.
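
To give a flavour of that hybrid approach: a server extension written in Java
can expose its own HTTP endpoint on the Neo4j server, which your Python
application then calls for the heavy traversal work. The sketch below assumes
a purely hypothetical extension mounted at /myapp/betweenness that accepts a
node id and a maximum depth; the path, payload and field names are
illustrative only and are not part of Neo4j or load2neo.

    import json
    import requests

    # Hypothetical endpoint exposed by a custom Java server extension.
    # The path and the JSON payload shape are illustrative only.
    EXTENSION_URL = "http://localhost:7474/myapp/betweenness"

    def depth_limited_betweenness(node_id, max_depth=2):
        """Ask the server-side extension to compute a depth-limited
        betweenness score for one node and return the parsed JSON."""
        response = requests.post(
            EXTENSION_URL,
            data=json.dumps({"nodeId": node_id, "maxDepth": max_depth}),
            headers={"Content-Type": "application/json"},
        )
        response.raise_for_status()
        return response.json()

    print(depth_limited_betweenness(42))

The traversal itself then runs inside the JVM, close to the data, while the
Python side only handles orchestration and post-processing.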

Bear in mind also that Cypher is very powerful these days. It would
certainly be worth exploring some of its more recent capabilities before
choosing an architectural path as you may find there is little that cannot
already be achieved purely with Cypher. If this is the case, your choice of
application language could then become far less critical.
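
As an illustration of how far plain Cypher can take you, the query below
gathers, for a single parent node, everything within two steps together with
a simple weighted path distance, which is roughly the depth-limited
neighbourhood you describe. It is only a sketch: the :Parent label, the
:CONNECTED_TO relationship type and the weight property stand in for your own
model, and the driver call assumes a py2neo 2.x-style API (the exact method
name differs between releases).

    from py2neo import Graph  # assumes a py2neo 2.x-style API

    graph = Graph("http://localhost:7474/db/data/")

    # Everything reachable from one parent node within two hops, with the
    # number of steps and the summed edge weights along each path.
    query = """
    MATCH p = (a:Parent {name: 'a'})-[:CONNECTED_TO*1..2]-(b:Parent)
    RETURN b.name AS neighbour,
           length(p) AS steps,
           reduce(d = 0, r IN relationships(p) | d + r.weight) AS weighted_distance
    """

    for record in graph.cypher.execute(query):
        print(record)

If queries of this shape cover your analysis, the driver is doing very little
work and the choice of client language matters much less.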

I'd suggest beginning with a prototype in a language you are comfortable
with. Then build the suite of queries you need to run and identify any
bottlenecks or missing features. Once you have a list of these, you can make
an informed decision on which pieces to optimise.

Kind regards
Nigel


On 17 June 2014 15:42, Shongololo <garethsim...@gmail.com> wrote:

> I am preparing a Neo4j database on which I would like to do some network
> analysis. It is a representation of a weakly connected and static physical
> system, and will have in the region of 50 million nodes where, let's say,
> about 50 nodes will connect to a parent node, which in turn is linked
> (think streets and intersections) to a network of other parent nodes.
>
> For most of the analysis, I will be using a weighted distance decay, so
> analysis of things like "betweenness" or "centrality" will be computed for
> the parent node network, but only to a limited extent. So, for example, if
> (a)--(b)--(c)--(d)--(e), then the computation will only be based on nodes
> up to, say, two steps away. So (a) will consider (b) and (c), whereas (c) will
> consider two steps in either direction.
>
> My question is a conceptual and strategic one: What is the best approach
> for doing this kind of analysis with neo4j?
>
> I currently work with Python, but it appears that for speed, flexibility,
> and use of complex graph algorithms, I am better off working with the
> embedded Java API for direct and powerful access to the graph? Or is an
> approach using something like Bulbflow with Gremlin also feasible? How
> do the power and flexibility of the different embedded tools compare,
> e.g. Python embedded vs. Java vs. Node.js?
>
> Thanks.
>
