Just use a small dataset that you can reason about and check whether the queries return the results you expect.
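One possible way to run such a check, sketched in Python rather than Cypher: build a tiny friendship graph by hand, compute the expected answer independently, and compare it with what the Cypher query returns on the same data. Everything here is hypothetical and not from the thread: the `graph` data, the user names, and the `friends_at_depth` helper. The helper mirrors the filters used in the quoted queries (no stepping straight back to the node two hops earlier, the result is not the start user and not a direct friend) and assumes a directed graph without self-loops or parallel edges.

```python
def friends_at_depth(graph, start, depth):
    """Users at the end of a walk of exactly `depth` Friend hops from `start`.

    Mirrors the filters of the quoted queries: a hop may not return to the
    node visited two hops earlier, and the final user must be neither the
    start user nor one of the start user's direct friends.
    """
    # Each state is (previous node, current node). Keeping the previous node
    # lets us reject a hop that goes straight back (FFU<>U, FFFU<>FU, ...).
    states = {(None, start)}
    for _ in range(depth):
        states = {
            (cur, nxt)
            for prev, cur in states
            for nxt in graph.get(cur, ())
            if nxt != prev
        }
    direct = set(graph.get(start, ()))
    return {cur for _, cur in states if cur != start and cur not in direct}


# Hypothetical five-user dataset, small enough to verify by hand.
graph = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["d"],
    "d": ["e", "a"],
    "e": [],
}

print(sorted(friends_at_depth(graph, "a", 2)))  # ['d']
print(sorted(friends_at_depth(graph, "a", 4)))  # ['e']
```

Loading the same five users and seven Friend relationships into Neo4j and running the depth-two and depth-four queries for user "a" should give the same names; a mismatch points at the query (or at an assumption in this sketch) rather than at the data size.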
Hard for me to be the consistency checker on your queries :)

In general if you really want to do these deep traversals you might be better off (in terms of performance) using the traversal-API with an appropriate uniqueness constraint, like node-path.

On Mon, Mar 31, 2014 at 1:09 PM, Rio Eduardo <rioeduard...@gmail.com> wrote:
> Hello again Michael.
>
> I just want to make sure that my query is correct to find friends of
> friends at depth of four and five. Please help me by checking my query.
>
> Query at depth of four:
> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
> WHERE U.user_id=1
> WITH DISTINCT U, FU, FFU
> WHERE FFU<>U
> WITH DISTINCT U, FU, FFU
> MATCH (FFU:User)-[FFF:Friend]->(FFFU:User)
> WHERE FFFU<>FU
> WITH DISTINCT U, FFU, FFFU
> MATCH (FFFU:User)-[FFFF:Friend]->(FFFFU:User)
> WHERE FFFFU<>FFU AND FFFFU<>U AND NOT (U)-[:Friend]->(FFFFU)
> RETURN DISTINCT FFFFU.username;
>
> Query at depth of five:
> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
> WHERE U.user_id=1
> WITH DISTINCT U, FU, FFU
> WHERE FFU<>U
> WITH DISTINCT U, FU, FFU
> MATCH (FFU:User)-[FFF:Friend]->(FFFU:User)
> WHERE FFFU<>FU
> WITH DISTINCT U, FFU, FFFU
> MATCH (FFFU:User)-[FFFF:Friend]->(FFFFU:User)
> WHERE FFFFU<>FFU
> WITH DISTINCT U, FFFU, FFFFU
> MATCH (FFFFU:User)-[FFFFF:Friend]->(FFFFFU:User)
> WHERE FFFFFU<>FFFU AND FFFFFU<>U AND NOT (U)-[:Friend]->(FFFFFU)
> RETURN DISTINCT FFFFFU.username;
>
> I need your help so much.
> Thank you.
>
>
> On Sunday, March 30, 2014 7:42:27 PM UTC+7, Michael Hunger wrote:
>
>> Split it up into one more intermediate step. The intermediate steps are
>> there to get the cardinality down, so it doesn't have to match billions of
>> paths, only millions or 100k.
>>
>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)-[FFF:Friend]->(FFFU:User)
>> WHERE U.user_id=1
>> WITH DISTINCT U, FU, FFU
>> WHERE FFU<>U
>> WITH DISTINCT U, FFU
>> MATCH (FFU:User)-[FFF:Friend]->(FFFU:User)
>> WHERE NOT (U)-[:Friend]->(FFFU)
>> RETURN distinct FFFU.username;
>>
>>
>> On Sun, Mar 30, 2014 at 1:29 PM, Rio Eduardo <rioedu...@gmail.com> wrote:
>>
>>> Please help me again Michael.
>>>
>>> You once said:
>>>
>>> I would also change:
>>>
>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>> WHERE U.user_id=1 AND FFU.user_id<>U.user_id AND NOT (U)-[:Friend]->(FFU)
>>> RETURN FFU.username
>>>
>>> to
>>>
>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>> WHERE U.user_id=1
>>> WITH distinct U, FFU
>>> WHERE FFU<>U AND NOT (U)-[:Friend]->(FFU)
>>> RETURN FFU.username
>>>
>>> The query above finds friends of friends at depth two. I would now
>>> like to find friends of friends at depth three. When I follow the model of
>>> your query, it takes longer than mine and returns many more rows than
>>> mine. Here is the model of your query at depth three:
>>>
>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)-[FFF:Friend]->(FFFU:User)
>>> WHERE U.user_id=1
>>> WITH DISTINCT U, FU, FFU, FFFU
>>> WHERE FFU<>U AND FFFU<>FU AND NOT (U)-[:Friend]->(FFFU)
>>> RETURN FFFU.username;
>>>
>>> ...
>>>
>>> 118858 rows
>>> 20090 ms
>>>
>>> Mine:
>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)-[FFF:Friend]->(FFFU:User)
>>> WHERE U.user_id=1 AND FFU<>U AND FFFU<>FU AND NOT (U)-[:Friend]->(FFFU)
>>> RETURN DISTINCT FFFU.username;
>>>
>>> ...
>>>
>>> 950 rows
>>> 18133 ms
>>>
>>> Please help me. Why does the model of your query take longer than mine and
>>> return many more rows than mine?
>>>
>>> Thank you.
>>>
>>>
>>> On Friday, March 28, 2014 8:30:20 PM UTC+7, Michael Hunger wrote:
>>>
>>>> Rio,
>>>>
>>>> was this your first run of both statements? If so, please run them for
>>>> a second time.
>>>> And did you create an index or constraint for :User(user_id) ?
>>>>
>>>> MATCH (U:User) RETURN COUNT(U);
>>>>
>>>> I would also change:
>>>>
>>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>>> WHERE U.user_id=1 AND FFU.user_id<>U.user_id AND NOT (U)-[:Friend]->(FFU)
>>>> RETURN FFU.username
>>>>
>>>> to
>>>>
>>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>>> WHERE U.user_id=1
>>>> WITH distinct U, FFU
>>>> WHERE FFU<>U AND NOT (U)-[:Friend]->(FFU)
>>>> RETURN FFU.username
>>>>
>>>> I quickly created a dataset on my machine:
>>>>
>>>> cypher 2.0 foreach (i in range(1,1000) | create (:User {id:i}));
>>>>
>>>> create constraint on (u:User) assert u.id is unique;
>>>>
>>>> match (u1:User),(u2:User) with u1,u2 where rand() < 0.1 create (u1)-[:Friend]->(u2);
>>>>
>>>> Relationships created: 99974
>>>>
>>>> 778 ms
>>>>
>>>> match (u:User) return count(*);
>>>>
>>>> +----------+
>>>> | count(*) |
>>>> +----------+
>>>> | 1000     |
>>>> +----------+
>>>> 1 row
>>>> *4 ms*
>>>>
>>>>
>>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>>> WHERE U.id=1
>>>> WITH distinct U, FFU
>>>> WHERE FFU<>U AND NOT (U)-[:Friend]->(FFU)
>>>> RETURN FFU.id;
>>>>
>>>> ...
>>>>
>>>> 910 rows
>>>>
>>>> 101 ms
>>>>
>>>> but even your query takes only
>>>>
>>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>>> WHERE U.id=1 AND FFU.id<>U.id AND NOT (U)-[:Friend]->(FFU)
>>>> RETURN FFU.id;
>>>>
>>>> ...
>>>>
>>>> 8188 rows
>>>>
>>>> 578 ms
>>>>
>>>>
>>>> On Fri, Mar 28, 2014 at 2:08 PM, Lundin <lundin....@gmail.com> wrote:
>>>> >
>>>> > ms, it is milliseconds.
>>>> >
>>>> > What is the corresponding result for a SQL db?
>>>> > MATCH (n:User)-[:Friend*3]-(FoFoF) return FoFoF;
>>>> >
>>>> > Albeit a valid search, is it something useful? I would think finding
>>>> > a specific person's FoFoF at either end, as a starting point or end point,
>>>> > would be a very realistic scenario. Add an index on User:name, query
>>>> > for a User with name:Rio, and try to find his FoFoF.
>>>> >
>>>> > Yes, Neo4j has been kind and exposed various functions, like
>>>> > shortestPath, in Cypher:
>>>> > http://docs.neo4j.org/refcard/2.0/
>>>> >
>>>> > Also look at some gist examples:
>>>> > https://github.com/neo4j-contrib/graphgist/wiki
>>>> >
>>>> > On Friday, March 28, 2014 at 05:00:22 UTC+1, Rio Eduardo wrote:
>>>> >>
>>>> >> Thank you so much for the reply, Lundin. I really appreciate it.
>>>> >> Yesterday I ran my experiment again, and the result was not what I
>>>> >> had expected. Before testing 1M users, I reduced the number of users
>>>> >> to 1000 and tested not in my social network but directly in the
>>>> >> database (Neo4j Shell), to rule out the performance of the PC as the
>>>> >> cause. Returning 1000 users took 200ms for 1 row, and returning
>>>> >> friends at depth two took 85000ms for 2500 rows. Are 200ms and
>>>> >> 85000ms fast to you? And what does ms stand for? Is it milliseconds
>>>> >> or microseconds?
>>>> >>
>>>> >> The query I use for returning 1000 users is
>>>> >> MATCH (U:User) RETURN COUNT(U);
>>>> >>
>>>> >> and the query I use for returning friends at depth two is
>>>> >> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>>> >> WHERE U.user_id=1 AND FFU.user_id<>U.user_id AND NOT (U)-[:Friend]->(FFU)
>>>> >> RETURN FFU.username
>>>> >>
>>>> >> Please note that I tested with the default configuration of Neo4j. I
>>>> >> created 1000 random user nodes and 50000 random friend relationships
>>>> >> (1 user has 50 friends). Each relationship has a label Friend and no
>>>> >> properties on it. Each node has a label User and 4 properties:
>>>> >> user_id, username, password and profile_picture. Each property has a
>>>> >> value of 1-60 characters. Average of characters of user_id=1-1000
>>>> >> characters, all usernames have 10 random characters, all passwords
>>>> >> have 60 characters because I MD5 them, and profile_picture has 1-60
>>>> >> characters.
>>>> >>
>>>> >> And about your statement "Otherwise if you really need to present
>>>> >> that many "things" just paging the result with SKIP,LIMIT. It has
>>>> >> never made sense to present 1M of anything at a time for a user.": I
>>>> >> already did what you suggest, but it is still the same. Neo4j returns
>>>> >> the result more slowly.
>>>> >>
>>>> >> And I'm wondering whether Neo4j already applies any graph algorithms
>>>> >> (shortest path, Dijkstra, A*, etc.) in its system or not.
>>>> >>
>>>> >> Thank you.
>>>> >>
>>>> >>
>>>> >> On Friday, March 28, 2014 3:43:49 AM UTC+7, Lundin wrote:
>>>> >>>
>>>> >>> Rio, any version will do. They can all handle a million nodes on
>>>> >>> common hardware, no magic at all. When it is hundreds of millions or
>>>> >>> billions, then we might need to look into the specification in more
>>>> >>> detail. But in that case, with that kind of data, there are other
>>>> >>> bottlenecks for a social network or any web app that need to be
>>>> >>> taken care of as well.
>>>> >>>
>>>> >>> you said:
>>>> >>>>
>>>> >>>> Given any two persons chosen at random, is there a path that
>>>> >>>> connects them that is at most five relationships long? For a social
>>>> >>>> network containing 1,000,000 people, each with approximately 50
>>>> >>>> friends, the results strongly suggest that graph databases are the
>>>> >>>> best choice for connected data. And graph database can still work
>>>> >>>> 150 times faster than relational database at third degree and 1000
>>>> >>>> times faster at fourth degree
>>>> >>>
>>>> >>>
>>>> >>> I fail to see how this is connected to your attempt to list 1M
>>>> >>> users in one go on the first page. You would want to check whether
>>>> >>> there is a relationship between users and return that path. You need
>>>> >>> two start nodes and seek a path by traversing the relationships
>>>> >>> rather than scanning tables, and that would be the comparison.
>>>> >>> Otherwise, if you really need to present that many "things", just
>>>> >>> page the result with SKIP,LIMIT. It has never made sense to present
>>>> >>> 1M of anything at a time for a user. Again, that wouldn't really
>>>> >>> serve your experiment much good to prove graph theory.
>>>> >>>
>>>> >>> What is the result of MATCH(U:User) RETURN count(U); ?
>>>> >>>
>>>> >>> Also, when you do your test, make sure to account for the warm/cold
>>>> >>> cache effect (better/worse performance).
>>>> >>>
>>>> >>> On Thursday, March 27, 2014 at 17:57:10 UTC+1, Rio Eduardo wrote:
>>>> >>>>
>>>> >>>> I just learned about memory allocation and just read Server
>>>> >>>> Performance Tuning of Neo4j. neo4j.properties:
>>>> >>>> # Default values for the low-level graph engine
>>>> >>>>
>>>> >>>> #neostore.nodestore.db.mapped_memory=25M
>>>> >>>> #neostore.relationshipstore.db.mapped_memory=50M
>>>> >>>> #neostore.propertystore.db.mapped_memory=90M
>>>> >>>> #neostore.propertystore.db.strings.mapped_memory=130M
>>>> >>>> #neostore.propertystore.db.arrays.mapped_memory=130M
>>>> >>>>
>>>> >>>> Should I change this to get high performance?
>>>> >>>> If yes, please advise me.
>>>> >>>>
>>>> >>>> And I just learned about the Neo4j licenses: Community, Personal,
>>>> >>>> Startups, Business and Enterprise. All features are explained on
>>>> >>>> the Neo4j website. So which Neo4j should I use for my case, which
>>>> >>>> has millions of nodes and relationships?
>>>> >>>>
>>>> >>>> Please answer. I need your help so much.
>>>> >>>>
>>>> >>>> Thanks.
>>>> >>>>
>>>> >>>> On Tuesday, March 25, 2014 12:03:58 AM UTC+7, Rio Eduardo wrote:
>>>> >>>>>
>>>> >>>>> I'm testing my thesis, which is about transforming a relational
>>>> >>>>> database into a graph database. After the transformation, I will
>>>> >>>>> compare their performance in terms of query response time and
>>>> >>>>> throughput. For the relational database I use MySQL, and for the
>>>> >>>>> graph database I use Neo4j. I will have more than 3 million nodes
>>>> >>>>> and more than 6 million relationships. But when I had added just
>>>> >>>>> 60000 nodes, my Neo4j was already dead. When I tried to return all
>>>> >>>>> 60000 nodes, it returned unknown. I did the same in MySQL: I added
>>>> >>>>> 60000 records, and it could return all 60000 records. It's weird
>>>> >>>>> because it goes against the papers I read, which told me a graph
>>>> >>>>> database is faster than a relational database. So why is Neo4j
>>>> >>>>> slower (totally dead) on a lower-specification PC/notebook while
>>>> >>>>> MySQL is not? And what PC/notebook specification should I use to
>>>> >>>>> get the best performance when testing with millions of nodes and
>>>> >>>>> relationships?
>>>> >>>>>
>>>> >>>>> Thank you.
>>>> >
>>>> > --
>>>> > You received this message because you are subscribed to the Google Groups "Neo4j" group.
>>>> > To unsubscribe from this group and stop receiving emails from it, send an email to neo4j+un...@googlegroups.com.
>>>> > For more options, visit https://groups.google.com/d/optout.