Re: [Neo4j] allSimplePaths performance
Hey folks, just wanted to let you know that I tested it with Neo4j Community 1.6M01 and can't reproduce the problem. It seems that it got fixed somewhere along the way. :) Iterating through the depth levels and calling pathsWithLength is still faster than allSimplePaths for depths 5 for me, so I'm going to stick to it. Even wrote a small plugin for my server to do that. Thanks a lot for the great work! Best regards, Petar. On Mon, Nov 28, 2011 at 12:16 PM, Mattias Persson matt...@neotechnology.com wrote: I think I'd need your dataset to be able to reproduce and fix it, would that be possible? 2011/11/25 Petar Dobrev petar.dob...@myphilanthropedia.org On Fri, Nov 25, 2011 at 7:59 PM, Mattias Persson matt...@neotechnology.comwrote: Correct, it finds paths on that depth only. If other paths are encountered along the way they aren't returned. However, in the example I provided it misses a path that is present on the desired depth, that's what I was actually wondering about. In this example: The source of the testing program is here: https://gist.github.com/1391654 Output before adding the additional relationship: https://gist.github.com/1391668 Output after adding the additional relationship: https://gist.github.com/1391661 shortestPath misses one path (at depth 3) unless an additional relationship is present, which is unrelated to the paths between the nodes. Petar ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Petar Dobrev Engineer Philanthropedia http://www.myphilanthropedia.org ___ NOTICE: THIS MAILING LIST IS BEING SWITCHED TO GOOGLE GROUPS, please register and consider posting at https://groups.google.com/forum/#!forum/neo4j Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] allSimplePaths performance
Sure, it's based on the shortestPath plugin example from the docs and is quite simple: https://gist.github.com/1472918 Best regards, Petar On Tue, Dec 13, 2011 at 5:58 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote: Cool. Got that plugin somewhere? Cheers, /peter neubauer TC CEO of the year - vote for Emil Eifrém! http://crunchies2011.techcrunch.com/nominate/ Google:neubauer.peter Skype:peter.neubauer Phone: +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter @peterneubauer brew install neo4j neo4j start heroku addons:add neo4j On Tue, Dec 13, 2011 at 4:52 PM, Petar Dobrev petar.dob...@myphilanthropedia.org wrote: Hey folks, just wanted to let you know that I tested it with Neo4j Community 1.6M01 and can't reproduce the problem. It seems that it got fixed somewhere along the way. :) Iterating through the depth levels and calling pathsWithLength is still faster than allSimplePaths for depths 5 for me, so I'm going to stick to it. Even wrote a small plugin for my server to do that. Thanks a lot for the great work! Best regards, Petar. On Mon, Nov 28, 2011 at 12:16 PM, Mattias Persson matt...@neotechnology.com wrote: I think I'd need your dataset to be able to reproduce and fix it, would that be possible? 2011/11/25 Petar Dobrev petar.dob...@myphilanthropedia.org On Fri, Nov 25, 2011 at 7:59 PM, Mattias Persson matt...@neotechnology.comwrote: Correct, it finds paths on that depth only. If other paths are encountered along the way they aren't returned. However, in the example I provided it misses a path that is present on the desired depth, that's what I was actually wondering about. In this example: The source of the testing program is here: https://gist.github.com/1391654 Output before adding the additional relationship: https://gist.github.com/1391668 Output after adding the additional relationship: https://gist.github.com/1391661 shortestPath misses one path (at depth 3) unless an additional relationship is present, which is unrelated to the paths between the nodes. Petar ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Petar Dobrev Engineer Philanthropedia http://www.myphilanthropedia.org ___ NOTICE: THIS MAILING LIST IS BEING SWITCHED TO GOOGLE GROUPS, please register and consider posting at https://groups.google.com/forum/#!forum/neo4j Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ NOTICE: THIS MAILING LIST IS BEING SWITCHED TO GOOGLE GROUPS, please register and consider posting at https://groups.google.com/forum/#!forum/neo4j Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Petar Dobrev Engineer Philanthropedia http://www.myphilanthropedia.org ___ NOTICE: THIS MAILING LIST IS BEING SWITCHED TO GOOGLE GROUPS, please register and consider posting at https://groups.google.com/forum/#!forum/neo4j Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] allSimplePaths performance
Sure, I sent you a link where you can download a snapshot of the dataset in a separate email. Best regards, Petar On Mon, Nov 28, 2011 at 12:16 PM, Mattias Persson matt...@neotechnology.com wrote: I think I'd need your dataset to be able to reproduce and fix it, would that be possible? 2011/11/25 Petar Dobrev petar.dob...@myphilanthropedia.org On Fri, Nov 25, 2011 at 7:59 PM, Mattias Persson matt...@neotechnology.comwrote: Correct, it finds paths on that depth only. If other paths are encountered along the way they aren't returned. However, in the example I provided it misses a path that is present on the desired depth, that's what I was actually wondering about. In this example: The source of the testing program is here: https://gist.github.com/1391654 Output before adding the additional relationship: https://gist.github.com/1391668 Output after adding the additional relationship: https://gist.github.com/1391661 shortestPath misses one path (at depth 3) unless an additional relationship is present, which is unrelated to the paths between the nodes. Petar ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Petar Dobrev Engineer Philanthropedia http://www.myphilanthropedia.org ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] allSimplePaths performance
Thanks for the explanation, Mattias! So, if I understand this correctly, due to some of the optimizations, shortestPath algo might miss some paths when used with findPathsOnMaxDepthOnly. And that's by design so to speak, it's not a bug or something, correct? Thanks! Petar On Thu, Nov 24, 2011 at 8:36 PM, Mattias Persson matt...@neotechnology.comwrote: There are several optimizations that the shortest path algo does that allSimplePaths doesn't, f.ex: * Traversal is bidirectional (it starts from the start AND the end simultaneously, although in the same thread) which means that the deeper the traversal goes the more it gains compared to a single directional traversal * It stops on the depth it finds the first hit on Shortest path algo is implemented from scratch to be optimized for just that, but allSimplePaths uses traversal framework which doesn't support bidirectional traversals yet, although there have been some experiments with that too so perhaps soon! 2011/11/24 Peter Neubauer peter.neuba...@neotechnology.com Petar, very cool if this worked out. Maybe you could write up a testcase that verifies that the results are the same, and then put this as a fork to the graphalgo package? Sounds like a great addition if this works out? Cheers, /peter neubauer GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - NOSQL for the Enterprise. http://startupbootcamp.org/- Öresund - Innovation happens HERE. On Thu, Nov 24, 2011 at 11:42 AM, Petar Dobrev peter.dob...@gmail.com wrote: Hi guys, I have a graph of about 2.5M nodes and 8M relationships and I am trying to find all simple paths between two nodes with maximum depth of 8. The allSimplePaths graph algo works well for maximum depth of 5, but for 8 it runs really long (I didn't even wait for it to finish). So I thought it's just that the graph is too complicated and the search operation is very expensive. On the other hand I noticed that shortestPath and pathsWithLength both work fast. So I tried this experiment: - Run shortestPath and record the shortest length - Iterate from the shortest length to max_depth - Run pathsWithLength and append the results - And it turns out to be working really well.. much, much faster than the allSimplePaths solution, which I found quite baffling, since the latter solution should be doing more work to accomplish the same task. Maybe it's just with my graph, but it's still weird. Best regards, Petar ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Petar Dobrev Engineer Philanthropedia http://www.myphilanthropedia.org ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
[Neo4j] allSimplePaths performance
Hi guys, I have a graph of about 2.5M nodes and 8M relationships and I am trying to find all simple paths between two nodes with maximum depth of 8. The allSimplePaths graph algo works well for maximum depth of 5, but for 8 it runs really long (I didn't even wait for it to finish). So I thought it's just that the graph is too complicated and the search operation is very expensive. On the other hand I noticed that shortestPath and pathsWithLength both work fast. So I tried this experiment: - Run shortestPath and record the shortest length - Iterate from the shortest length to max_depth - Run pathsWithLength and append the results - And it turns out to be working really well.. much, much faster than the allSimplePaths solution, which I found quite baffling, since the latter solution should be doing more work to accomplish the same task. Maybe it's just with my graph, but it's still weird. Best regards, Petar ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] allSimplePaths performance
Hi guys, I played around some more with alternative approach and it turns out the results are not equivalent. Since I cannot test it with the desirable max depth of 8 on my setup, I ran a test with max depth of 4. For a given test case, allSimplePaths returned 2 paths of length 2 and 3: (2624016)--[User_relation_6,9067449]--(2161113)--[PERSON_PERSON,7879807]--(2161112) (2624016)--[User_relation_6,9067448]--(142023)--[PERSON_ORG,1982010]--(2161113)--[PERSON_PERSON,7879807]--(2161112) The alternative approach found only the first one. (2624016)--[User_relation_6,9067449]--(2161113)--[PERSON_PERSON,7879807]--(2161112) Interestingly, if I add another relationship to the source node, which does not change the number of paths between the two nodes, the pathsWithLength approach finds also the second one. Am I doing something wrong or is it that pathsWithLength will not always return all paths with that length due to some greedy approach? The documentation for ShortestPath states that the algorithm will try to find a path: @param findPathsOnMaxDepthOnly if {@code true} then it will only try to find paths on that particular depth ({@code maxDepth}). Does that mean that it can also fail to find a path on the particular depth? The source of the testing program is here: https://gist.github.com/1391654 Output before adding the additional relationship: https://gist.github.com/1391668 Output after adding the additional relationship: https://gist.github.com/1391661 Neo4j version: 1.5 Thanks! Best regards, Petar On Thu, Nov 24, 2011 at 12:57 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote: Petar, very cool if this worked out. Maybe you could write up a testcase that verifies that the results are the same, and then put this as a fork to the graphalgo package? Sounds like a great addition if this works out? Cheers, /peter neubauer GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - NOSQL for the Enterprise. http://startupbootcamp.org/- Öresund - Innovation happens HERE. On Thu, Nov 24, 2011 at 11:42 AM, Petar Dobrev peter.dob...@gmail.com wrote: Hi guys, I have a graph of about 2.5M nodes and 8M relationships and I am trying to find all simple paths between two nodes with maximum depth of 8. The allSimplePaths graph algo works well for maximum depth of 5, but for 8 it runs really long (I didn't even wait for it to finish). So I thought it's just that the graph is too complicated and the search operation is very expensive. On the other hand I noticed that shortestPath and pathsWithLength both work fast. So I tried this experiment: - Run shortestPath and record the shortest length - Iterate from the shortest length to max_depth - Run pathsWithLength and append the results - And it turns out to be working really well.. much, much faster than the allSimplePaths solution, which I found quite baffling, since the latter solution should be doing more work to accomplish the same task. Maybe it's just with my graph, but it's still weird. Best regards, Petar ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Petar Dobrev Engineer Philanthropedia http://www.myphilanthropedia.org ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Limiting the number of returned paths or finding next shortest path
Hi Peter, thanks for the swift response. Just a quick question about the ShortestPath path finder -- does it return paths of only the same (shortest) length or does it sort all simple paths according to their length. I am almost certain it does the former, but if it's the latter then that's exactly what I need. If not then I think this effect can be achieved by first running shortestPath and then pathsWithLength for all the lengths up to maxDepth. Not optimal, but would do the job. However, pathsWithLength is also not exposed via the API. Otherwise this looks exactly like what I'm looking for. Would be great to have them as part of the API. The API call I am using now is: POST http://localhost:7474/db/data/node/nodeid/paths Data: { to: http://localhost:7474/db/data/node/nodeid, max_depth: 8, relationships: [{ type: reltype }], algorithm: shortestPath } I created a feature request at: https://github.com/neo4j/community/issues/99 Thanks a lot! Best regards, Petar On Mon, Nov 14, 2011 at 10:11 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote: Mmh, there is an implementation for this in https://github.com/neo4j/community/blob/master/graph-algo/src/main/java/org/neo4j/graphalgo/GraphAlgoFactory.java#L93if that is what you want, we then should expose this as part of the REST API right? What REST call are you using right now? Please raise a feature request on this at https://github.com/neo4j/community/issues, would be great. Cheers, /peter neubauer GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - NOSQL for the Enterprise. http://startupbootcamp.org/- Öresund - Innovation happens HERE. On Mon, Nov 14, 2011 at 6:27 PM, pdobrev peter.dob...@gmail.com wrote: Hi guys, I have a decent size graph (2.5M nodes, 8M relationships) and am interested in finding paths between two nodes. I am using the REST API since the app I am working on is developed in .NET. Due to the nature of the graph the paths can be quite long, so I am using 8 for the max depth parameter. Searching for the shortest path between two nodes is quite fast. On the other hand, searching for allSimplePaths is really slow. Up to a max depth value of 5 it is still ok, but going further makes it very slow and as I mentioned already, I can have meaningful connections of depth up to 8. I think the reason might be because there may be too many paths between the nodes. Is there any way I can tell the API to return not all paths, but rather just the first 3-4-5 it finds? I think that might speed up things significantly and I don't really need to show more paths than that. Alternatively, if I can ask Neo4j for the next shortest path, after the one I already got, that would also be a great solution. However, I don't expect that there's an easy way to do that. Thanks a lot in advance!! Best regards, Petar -- View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Limiting-the-number-of-returned-paths-or-finding-next-shortest-path-tp3507449p3507449.html Sent from the Neo4j Community Discussions mailing list archive at Nabble.com. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Petar Dobrev Engineer Philanthropedia http://www.myphilanthropedia.org ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user