Thanks for your ideas, Peter and Mattias!

We will work on them and hopefully have some results we can post back here
soon.


On Mon, Apr 11, 2011 at 21:18, Mattias Persson <matt...@neotechnology.com>wrote:

> 2011/4/11 Peter Neubauer <peter.neuba...@neotechnology.com>:
> > Mmh,
> > you might be right in that the ShortestPath is not taking that much
> > context info into account. In that case, I guess you should hack it to
> > be even smarter about how to expand things. Right now, if you look at
> >
> https://github.com/neo4j/graphdb/blob/master/graph-algo/src/main/java/org/neo4j/graphalgo/impl/path/ShortestPath.java#L267
> ,
> > the relationship expander is only getting a node as the context to
> > decide what relationships to return.
> >
> > This probably could be changed to include a path as the context, or
> > you fork ShortestPath and make it smarter for your case, or implement
> > your own RelationshipExpander that is a bit smarter?
>
> Yes, I think two things would need to be done here (as you also noticed
> Peter):
>
> 1) Have RelationshipExpander have its expand method accept a Path
> instead of a Node.
> 2) Implement your own RelationshipExpander which can make smart
> decisions based on those Paths it gets as input.
>
> Having everything centered around Paths instead of Nodes/Relationships
> is good for just about everything, and I think all aspects of
> traversal will be geared towards that in a near future.
>
> To solve your problem now you might need a way to differentiate your
> user/company nodes so that you can immediately tell if a Node is a
> user or company, like a property or single relationship to a reference
> node or similar. If there is then you can still implement your own
> RelationshipExpander and have that look at each node and make decision
> based on that. If not then you might need to roll your own
> implementation, based on f.ex. ShortestPath.
>
> Any other suggestions, anyone?
>
> >
> > Cheers,
> >
> > /peter neubauer
> >
> > GTalk:      neubauer.peter
> > Skype       peter.neubauer
> > Phone       +46 704 106975
> > LinkedIn   http://www.linkedin.com/in/neubauer
> > Twitter      http://twitter.com/peterneubauer
> >
> > http://www.neo4j.org               - Your high performance graph
> database.
> > http://startupbootcamp.org/    - Ă–resund - Innovation happens HERE.
> > http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
> >
> >
> >
> > On Mon, Apr 11, 2011 at 4:14 PM, Stephan Hagemann
> > <stephan.hagem...@googlemail.com> wrote:
> >> Hi all,
> >>
> >> the reason I asked the question about counting the number of visited
> nodes
> >> earlier is that we are running into performance issues when working with
> >> different expanders.
> >>
> >> Our graph contains *user* and *company* nodes. There are a lot more
> users
> >> than companies. Users are connected through *contact* relationships,
> users
> >> are connected to companies as *employees*, companies aren't connected to
> >> each other directly. For paths among users we only want to traverse
> contact
> >> edges. For paths from users to companies we traverse user edges and one
> >> employee edge at the end (to get to a company).
> >>
> >> We are using Neo's shortest path algorithm to find connections between
> users
> >> and companies. The path requirements from above can be formalized into
> these
> >> *expanders*:
> >>
> >>    Map<RelationshipType, Direction> userToCompanyRelations = new
> >> HashMap<RelationshipType, Direction>();
> >>    userToCompanyRelations.put(Relationship.contact, Direction.BOTH);
> >>    userToCompanyRelations.put(Relationship.employee,
> Direction.OUTGOING);
> >>    USER_TO_COMPANY = new DirectedExpander(userToCompanyRelations);
> >>
> >>    Map<RelationshipType, Direction> userToUserRelations = new
> >> HashMap<RelationshipType, Direction>();
> >>    userToUserRelations.put(Relationship.contact, Direction.BOTH);
> >>    USER_TO_USER = new DirectedExpander(userToUserRelations);
> >>
> >> For the *query* we essentially do
> >>
> >>    GraphAlgoFactory.shortestPath(expander, 5).findAllPaths(fromNode,
> >> toNode);
> >>
> >> As you can see from the attached screenshot: we are doing it wrong. The
> path
> >> from a user to a company is almost 14 times slower than the path to
> another
> >> user! And it gets worse when more types of edges are added. The result
> for
> >> the user-company path is acceptably slow.
> >>
> >> What's going on?
> >> If I could output the number of visited nodes for a query, I could tell
> you
> >> exactly... Here is my intuition: the expanders only specify which edges
> can
> >> be traversed. In the user-user path query this is ok: all the fully
> expanded
> >> paths lead from one user to another user (so could technically be the
> path
> >> we're looking for). In the user-company case most fully expanded paths
> will
> >> lead from user to user also, for the query they will thus be unusable!
> We
> >> could do better in the latter case if we had the possibility of
> specifying
> >> in the expander that (a) there should be no attempt to expand after
> having
> >> just passed over an employee edge (since now we are at a company and
> can't
> >> find any more potential results on this path) and (b) there should be no
> >> attempt to expand along a contact edge in step five (because that will
> only
> >> lead to a user).
> >>
> >> Did I just miss how I can put this information into the ShortestPath
> >> algorithm or do I have to go one level deeper and implement a specific
> >> ShortestPath algorithm for each of these queries? Does anyone else face
> this
> >> problem? Anyone seeing similar kinds of queries?
> >>
> >> Thanks
> >> Stephan
> >>
> >> _______________________________________________
> >> Neo4j mailing list
> >> User@lists.neo4j.org
> >> https://lists.neo4j.org/mailman/listinfo/user
> >>
> >>
> > _______________________________________________
> > Neo4j mailing list
> > User@lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user
> >
>
>
>
> --
> Mattias Persson, [matt...@neotechnology.com]
> Hacker, Neo Technology
> www.neotechnology.com
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user

Reply via email to