Re: [Neo4j] I want a query which returns a subgraph...

'Michael Hunger' via Neo4j Sat, 20 Aug 2016 03:33:26 -0700

Hey,


Sorry for the delay. You could also use allShortestPaths.

Shortest-path matching also takes into account the predicates that follow
it in the WHERE clause.
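
As a rough sketch of what that can look like (the parameter names
{startName} and {endName} are made up for illustration; the :Node label and
the nodetype/designation properties are borrowed from your other query):

```cypher
// look up both endpoints first (hypothetical parameters)
MATCH (n:Node {designation: {startName}}), (m:Node {designation: {endName}})
WHERE n <> m
// all shortest paths instead of a single one
MATCH path = allShortestPaths( (n)-[*]-(m) )
// predicate on the path, placed right after the shortest-path match
WHERE ALL(x IN nodes(path) WHERE x.nodetype IN {nodetypes})
RETURN path
```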

I don't know how you got this query to run, because UNWIND is not a
function but a clause:

match path = shortestPath( (n)-[*]->(m) )
return collect(distinct unwind(nodes(path))), collect(distinct
unwind(rels(path)))


For your other query, the initial lookup of those nodes will be slow
(it has to scan the full graph) because they have no label, so no index
can be used to find them.
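
A minimal sketch of what I mean, assuming the generic :Node label and the
designation property from the query further down, and the schema syntax of
the Neo4j 2.x/3.0 releases (newer versions may use a different form):

```cypher
// one-time schema setup: index the property used for the lookup
CREATE INDEX ON :Node(designation);

// label plus indexed property lets the planner look the node up
// through the index instead of scanning every node
MATCH (start:Node) WHERE start.designation = {name}
RETURN start;
```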

You probably also want to add a direction to the relationship pattern (if
it makes sense).


Sorry, I wrote the original statement off the top of my head; here is the
fixed version:

// bind n, m, limit rel-type(s)
match path = shortestPath( (n)-[*]->(m) )
// optionally WHERE with predicates for the shortest paths
unwind nodes(path) as node
with collect(distinct node) as nodes, collect(path) as paths
unwind paths as path
unwind rels(path) as r
with collect(distinct r) as relationships, nodes
return count(*)

// enable use of indexes, even if "Node" is just a generic label (I still
// think that using label sets instead of nodetype is better)
        MATCH (start:Node) WHERE start.nodetype = 'Drone' AND start.designation = {name}
        MATCH (m:Node) WHERE m.nodetype IN {nodetypes}
        MATCH p = shortestPath( (start)-[:%s*]-(m) )
// this builds a cross product between ALL nodes and ALL rels of the
// paths, which can potentially become huge
// you can check it by returning count(*), count(distinct n),
// count(distinct r) after the two unwinds
        UNWIND nodes(p) AS n
        UNWIND rels(p) AS r
        RETURN [x in COLLECT(DISTINCT n) WHERE x.nodetype in {nodetypes}] AS nodes,
        COLLECT(DISTINCT r) AS relationships
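
One way to avoid that cross product is to apply the collect-then-unwind
pattern from the fixed version above to this query as well. This is only a
sketch: the names allNodes and paths are mine, the relationship types are
left out (restrict them as in your original), and I added m <> start on the
assumption that a path from a node to itself is not wanted.

```cypher
MATCH (start:Node) WHERE start.nodetype = 'Drone' AND start.designation = {name}
MATCH (m:Node) WHERE m.nodetype IN {nodetypes} AND m <> start
MATCH p = shortestPath( (start)-[*]-(m) )
// first pass: unwind nodes only, keeping the paths for the second pass
UNWIND nodes(p) AS n
WITH collect(DISTINCT n) AS allNodes, collect(DISTINCT p) AS paths
// second pass: unwind relationships only, so nodes and rels never multiply
UNWIND paths AS path
UNWIND rels(path) AS r
RETURN [x IN allNodes WHERE x.nodetype IN {nodetypes}] AS nodes,
       collect(DISTINCT r) AS relationships
```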

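The two-step idea from your mail quoted below (take the nodes from the
shortest paths, then fetch every relationship between those nodes) could be
sketched roughly like this. It is untested; a|b|c|d are the placeholder
relationship types from your mail, and subgraphNodes, x and y are names I
picked for illustration.

```cypher
// step 1: shortest paths only provide the node set
MATCH (start:Node) WHERE start.nodetype = 'Drone' AND start.designation = {name}
MATCH (m:Node) WHERE m.nodetype IN {nodetypes} AND m <> start
MATCH p = shortestPath( (start)-[:a|b|c|d*]-(m) )
UNWIND nodes(p) AS n
WITH collect(DISTINCT n) AS subgraphNodes
// step 2: one hop between nodes of that set picks up the relationships
// the shortest paths skipped
UNWIND subgraphNodes AS x
OPTIONAL MATCH (x)-[r:a|b|c|d]-(y)
WHERE y IN subgraphNodes
RETURN subgraphNodes AS nodes, collect(DISTINCT r) AS relationships
```

The OPTIONAL MATCH is there so that the node set still comes back when no
relationship connects its members; collect() skips the resulting nulls.
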
On Mon, Aug 8, 2016 at 12:40 PM, Alan Robertson <al...@unix.sh> wrote:

> I had written up my problem here too:
>      http://assimilationsystems.com/2016/07/03/assimilation-subgraph-queries/
>
> Here's a thought which occurred to me while sleeping last night...
> The two ways I've done it are:
>
>    1. shortest paths from initial node set
>    2. all paths
>
> As noted in the original email thread below, the problems I ran into are:
>
>    1. Shortest paths produces all nodes, but misses some relationships I
>    care about.
>    2. All paths produces all nodes and relationships but takes a really
>    long time and often times out.
>
> A way which occurred to me while sleeping tonight was to create a 2-step
> query like this:
>
> Compute the set of shortest paths to get the set of nodes (as in (1)
> above). Ignore the relationships produced.
> From that set of nodes (n) compute (n)-[:a|b|c|d]-(n) and return 'n',
> relationships.
>
> If Cypher won't let me compute from 'n' to 'n', then compute from
> n-[:a|b|c|d]-m WHERE m.nodetype is in the same set of node types as was in
> the original constraint. Then return n, relationships. [Although it's
> obvious, I'll mention that the difference is that the second query only
> follows a single level of relationship].
>
> It still might need a little experimentation...
>
> Any thoughts?
>
> I'm going back to bed now. Will look into it some more in the morning...
> ;-)
>
>
>
> On 07/21/2016 02:09 PM, Alan Robertson wrote:
>
> On 04/28/2016 03:32 PM, Alan Robertson wrote:
>
> The query isn't quite right. It has an error in it (variable n already
> used) and something about the order of operations causes it to return more
> than one row - with duplicates ;-)
>
> But this one seems to do the trick:
>
> match path = shortestPath( (n)-[*]->(m) )
> return collect(distinct unwind(nodes(path))), collect(distinct 
> unwind(rels(path)))
>
> That query is quite fast -- but...
>
> I don't actually want *only* the shortest paths, I want all the paths
> that involve those relationships.
>
> But if I delete the shortestPath() function call it runs so long I get an
> HTTP timeout on many of these queries. I restrict the types of
> relationships with [:rel1type|:rel2type|:etctype*]. And even though the
> result only includes one or two additional paths, it runs like 100 times
> slower - and often just times out via REST (via Py2neo).
>
> That's really disappointing.
>
> Does anyone have a suggestion on how to make this faster - or at least not
> time out?
>
>
>
> --
>
> Alan Robertson / CTO
> al...@assimilationsystems.com / +1 303.947.7999
>
> Assimilation Systems Limited
> http://AssimilationSystems.com
>
>
