Hi Michael,

You said, "In general if you really want to do these deep traversals you might be better off (in terms of performance) using the traversal-API with an appropriate uniqueness constraint, like node-path". Could you point me to any references so I can learn it? Or do you mean that I should use Gremlin?
Thank you.

On Monday, March 31, 2014 8:09:32 PM UTC+7, Michael Hunger wrote:
>
> Just use a dataset that you can reason about and check if they work correctly.
>
> Hard for me to be the consistency checker on your queries :)
>
> In general if you really want to do these deep traversals you might be better off (in terms of performance) using the traversal-API with an appropriate uniqueness constraint, like node-path.
>
> On Mon, Mar 31, 2014 at 1:09 PM, Rio Eduardo <rioedu...@gmail.com> wrote:
>
>> Hello again Michael.
>>
>> I just want to make sure that my query is correct to find friends of friends at depth of four and five. Please help me by checking my query.
>>
>> Query at depth of four:
>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>> WHERE U.user_id=1
>> WITH DISTINCT U, FU, FFU
>> WHERE FFU<>U
>> WITH DISTINCT U, FU, FFU
>> MATCH (FFU:User)-[FFF:Friend]->(FFFU:User)
>> WHERE FFFU<>FU
>> WITH DISTINCT U, FFU, FFFU
>> MATCH (FFFU:User)-[FFFF:Friend]->(FFFFU:User)
>> WHERE FFFFU<>FFU AND FFFFU<>U AND NOT (U)-[:Friend]->(FFFFU)
>> RETURN DISTINCT FFFFU.username;
>>
>> Query at depth of five:
>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>> WHERE U.user_id=1
>> WITH DISTINCT U, FU, FFU
>> WHERE FFU<>U
>> WITH DISTINCT U, FU, FFU
>> MATCH (FFU:User)-[FFF:Friend]->(FFFU:User)
>> WHERE FFFU<>FU
>> WITH DISTINCT U, FFU, FFFU
>> MATCH (FFFU:User)-[FFFF:Friend]->(FFFFU:User)
>> WHERE FFFFU<>FFU
>> WITH DISTINCT U, FFFU, FFFFU
>> MATCH (FFFFU:User)-[FFFFF:Friend]->(FFFFFU:User)
>> WHERE FFFFFU<>FFFU AND FFFFFU<>U AND NOT (U)-[:Friend]->(FFFFFU)
>> RETURN DISTINCT FFFFFU.username;
>>
>> I need your help so much.
>> Thank you.
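To make the "uniqueness constraint, like node-path" idea above more concrete, here is a minimal Python sketch. It is not Neo4j's traversal API and not Gremlin; it is a plain in-memory walk, written under the assumption that node-path uniqueness means a node may not appear twice on the same path. The function name `friends_at_depth`, the adjacency-dict layout, and the toy graph are all hypothetical. Like the Cypher queries above, it leaves out the start user's direct friends.

```python
def friends_at_depth(graph, start, depth):
    """Return users reachable from `start` over exactly `depth` Friend hops.

    graph: dict mapping a user id to an iterable of the ids it points to
    (a hypothetical stand-in for outgoing :Friend relationships).
    """
    found = set()
    on_path = {start}  # nodes on the path currently being walked

    def walk(node, remaining):
        if remaining == 0:
            found.add(node)
            return
        for nxt in graph.get(node, ()):
            if nxt in on_path:
                continue  # assumed node-path rule: no node twice on one path
            on_path.add(nxt)
            walk(nxt, remaining - 1)
            on_path.remove(nxt)

    walk(start, depth)
    # Mirror NOT (U)-[:Friend]->(X): drop the start user's direct friends.
    return found - set(graph.get(start, ()))


# Toy graph (assumed data): 1 -> 2, 3; 2 -> 4; 3 -> 4, 1; 4 -> 5, 2
toy = {1: [2, 3], 2: [4], 3: [4, 1], 4: [5, 2], 5: []}
print(friends_at_depth(toy, 1, 2))
print(friends_at_depth(toy, 1, 3))
```

The walk visits every qualifying path, so its cost grows with the number of paths rather than the number of users, which is the same cardinality problem the intermediate WITH DISTINCT steps in the Cypher queries try to reduce.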
>>
>> On Sunday, March 30, 2014 7:42:27 PM UTC+7, Michael Hunger wrote:
>>
>>> Split it up in one more intermediate step. The intermediate steps are there to get the cardinality down, so it doesn't have to match billions of paths, only millions or 100k.
>>>
>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>> WHERE U.user_id=1
>>> WITH DISTINCT U, FU, FFU
>>> WHERE FFU<>U
>>> WITH DISTINCT U, FFU
>>> MATCH (FFU:User)-[FFF:Friend]->(FFFU:User)
>>> WHERE NOT (U)-[:Friend]->(FFFU)
>>> RETURN distinct FFFU.username;
>>>
>>> On Sun, Mar 30, 2014 at 1:29 PM, Rio Eduardo <rioedu...@gmail.com> wrote:
>>>
>>>> Please help me again Michael.
>>>>
>>>> You once said:
>>>>
>>>> I would also change:
>>>>
>>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>>> WHERE U.user_id=1 AND FFU.user_id<>U.user_id AND NOT (U)-[:Friend]->(FFU)
>>>> RETURN FFU.username
>>>>
>>>> to
>>>>
>>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>>> WHERE U.user_id=1
>>>> WITH distinct U, FFU
>>>> WHERE FFU<>U AND NOT (U)-[:Friend]->(FFU)
>>>> RETURN FFU.username
>>>>
>>>> The query above finds friends of friends at depth two. I would like to find friends of friends at depth three, but when I follow the model of your query, it takes longer than mine and returns many more rows. Here is the model of your query at depth three:
>>>>
>>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)-[FFF:Friend]->(FFFU:User)
>>>> WHERE U.user_id=1
>>>> WITH DISTINCT U, FU, FFU, FFFU
>>>> WHERE FFU<>U AND FFFU<>FU AND NOT (U)-[:Friend]->(FFFU)
>>>> RETURN FFFU.username;
>>>>
>>>> ...
>>>>
>>>> 118858 rows
>>>> 20090 ms
>>>>
>>>> Mine:
>>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)-[FFF:Friend]->(FFFU:User)
>>>> WHERE U.user_id=1 AND FFU<>U AND FFFU<>FU AND NOT (U)-[:Friend]->(FFFU)
>>>> RETURN DISTINCT FFFU.username;
>>>>
>>>> ...
>>>>
>>>> 950 rows
>>>> 18133 ms
>>>>
>>>> Please help me: why does the model of your query take longer than mine and return many more rows than mine?
>>>>
>>>> Thank you.
>>>>
>>>> On Friday, March 28, 2014 8:30:20 PM UTC+7, Michael Hunger wrote:
>>>>
>>>>> Rio,
>>>>>
>>>>> was this your first run of both statements? If so, please run them for a second time.
>>>>> And did you create an index or constraint for :User(user_id) ?
>>>>>
>>>>> MATCH (U:User) RETURN COUNT(U);
>>>>>
>>>>> I would also change:
>>>>>
>>>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>>>> WHERE U.user_id=1 AND FFU.user_id<>U.user_id AND NOT (U)-[:Friend]->(FFU)
>>>>> RETURN FFU.username
>>>>>
>>>>> to
>>>>>
>>>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>>>> WHERE U.user_id=1
>>>>> WITH distinct U, FFU
>>>>> WHERE FFU<>U AND NOT (U)-[:Friend]->(FFU)
>>>>> RETURN FFU.username
>>>>>
>>>>> I quickly created a dataset on my machine:
>>>>>
>>>>> cypher 2.0 foreach (i in range(1,1000) | create (:User {id:i}));
>>>>>
>>>>> create constraint on (u:User) assert u.id is unique;
>>>>>
>>>>> match (u1:User),(u2:User) with u1,u2 where rand() < 0.1 create (u1)-[:Friend]->(u2);
>>>>>
>>>>> Relationships created: 99974
>>>>>
>>>>> 778 ms
>>>>>
>>>>> match (u:User) return count(*);
>>>>>
>>>>> +----------+
>>>>> | count(*) |
>>>>> +----------+
>>>>> | 1000     |
>>>>> +----------+
>>>>> 1 row
>>>>> *4 ms*
>>>>>
>>>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>>>> WHERE U.id=1
>>>>> WITH distinct U, FFU
>>>>> WHERE FFU<>U AND NOT (U)-[:Friend]->(FFU)
>>>>> RETURN FFU.id;
>>>>>
>>>>> ...
>>>>>
>>>>> 910 rows
>>>>>
>>>>> 101 ms
>>>>>
>>>>> but even your query takes only
>>>>>
>>>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>>>> WHERE U.id=1 AND FFU.id<>U.id AND NOT (U)-[:Friend]->(FFU)
>>>>> RETURN FFU.id;
>>>>>
>>>>> ...
>>>>>
>>>>> 8188 rows
>>>>>
>>>>> 578 ms
>>>>>
>>>>> On Fri, Mar 28, 2014 at 2:08 PM, Lundin <lundin....@gmail.com> wrote:
>>>>> >
>>>>> > ms, it is milliseconds.
>>>>> >
>>>>> > What is the corresponding result for a SQL db?
>>>>> > MATCH (n:User)-[:Friend*3]-(FoFoF) return FoFoF;
>>>>> >
>>>>> > Albeit a valid search, is it something useful? I would think finding a specific person's FoFoF in either end, as a starting point or end point, would be a very realistic scenario. Add an index on User:name, query for a User with name:Rio, and try to find his FoFoF.
>>>>> >
>>>>> > Yes, neo4j has been kind and exposed various functions, like shortestPath, in Cypher
>>>>> > http://docs.neo4j.org/refcard/2.0/
>>>>> >
>>>>> > Also look at some gist examples
>>>>> > https://github.com/neo4j-contrib/graphgist/wiki
>>>>> >
>>>>> > On Friday, March 28, 2014 at 05:00:22 UTC+1, Rio Eduardo wrote:
>>>>> >>
>>>>> >> Thank you so much for the reply, Lundin. I really appreciate it. Yesterday I ran my experiment again, and the result was not what I expected. Before testing 1M users, I reduced the number of users to 1000 and tested not in my social network but directly in the database (Neo4j Shell), to rule out the performance of the PC as the cause. Returning the count of 1000 users took 200 ms (1 row), and returning friends at depth two took 85000 ms (2500 rows). Are 200 ms and 85000 ms fast to you? And what does ms stand for: milliseconds or microseconds?
>>>>> >>
>>>>> >> The query I use for returning 1000 users is
>>>>> >> MATCH (U:User) RETURN COUNT(U);
>>>>> >>
>>>>> >> and the query I use for returning friends at depth two is
>>>>> >> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>>>> >> WHERE U.user_id=1 AND FFU.user_id<>U.user_id AND NOT (U)-[:Friend]->(FFU)
>>>>> >> RETURN FFU.username
>>>>> >>
>>>>> >> Please note that I tested with the default configuration of Neo4j, with 1000 randomly generated user nodes and 50000 random friend relationships (each user has 50 friends). Each relationship has the type Friend and no properties on it. Each node has the label User and 4 properties: user_id, username, password and profile_picture. Each property has a value of 1-60 characters: user_id holds values from 1 to 1000, all usernames are 10 random characters, all passwords have 60 characters because I MD5 them, and profile_picture has 1-60 characters.
>>>>> >>
>>>>> >> About your statement "Otherwise if you really need to present that many "things" just page the result with SKIP, LIMIT. It has never made sense to present 1M of anything at a time for a user.": I already did that, but it is still the same; Neo4j returns the result more slowly.
>>>>> >>
>>>>> >> And I'm wondering whether Neo4j already applies one of the graph algorithms (shortest path, Dijkstra, A*, etc.) in its system or not.
>>>>> >>
>>>>> >> Thank you.
>>>>> >>
>>>>> >> On Friday, March 28, 2014 3:43:49 AM UTC+7, Lundin wrote:
>>>>> >>>
>>>>> >>> Rio, any version will do. They can all handle millions of nodes on common hardware, no magic at all. With hundreds of millions or billions, we might need to look into the specification in more detail. But in that case, with that kind of data, there are other bottlenecks for a social network or any web app that need to be taken care of as well.
>>>>> >>>
>>>>> >>> you said:
>>>>> >>>>
>>>>> >>>> Given any two persons chosen at random, is there a path that connects them that is at most five relationships long? For a social network containing 1,000,000 people, each with approximately 50 friends, the results strongly suggest that graph databases are the best choice for connected data. And graph database can still work 150 times faster than relational database at third degree and 1000 times faster at fourth degree
>>>>> >>>
>>>>> >>> I fail to see how this is connected to your attempt to list 1M users in one go on the first page. You would want to check whether there is a relationship and return that path between users. You need two start nodes and seek a path by traversing the relationships rather than scanning tables, and that would be the comparison.
>>>>> >>> Otherwise if you really need to present that many "things" just page the result with SKIP, LIMIT. It has never made sense to present 1M of anything at a time for a user. Again, that wouldn't really serve your experiment much good to prove graph theory.
>>>>> >>>
>>>>> >>> What is the result of MATCH(U:User) RETURN count(U); ?
>>>>> >>>
>>>>> >>> Also when you do your test make sure to account for the warm/cold cache effect (better/worse performance)
>>>>> >>>
>>>>> >>> On Thursday, March 27, 2014 at 17:57:10 UTC+1, Rio Eduardo wrote:
>>>>> >>>>
>>>>> >>>> I just learned about memory allocation and just read Server Performance Tuning of Neo4j.
>>>>> >>>> neo4j.properties:
>>>>> >>>> # Default values for the low-level graph engine
>>>>> >>>>
>>>>> >>>> #neostore.nodestore.db.mapped_memory=25M
>>>>> >>>> #neostore.relationshipstore.db.mapped_memory=50M
>>>>> >>>> #neostore.propertystore.db.mapped_memory=90M
>>>>> >>>> #neostore.propertystore.db.strings.mapped_memory=130M
>>>>> >>>> #neostore.propertystore.db.arrays.mapped_memory=130M
>>>>> >>>>
>>>>> >>>> Should I change these to get high performance? If yes, please suggest values.
>>>>> >>>>
>>>>> >>>> I also just learned about the Neo4j licenses: Community, Personal, Startups, Business and Enterprise. All their features are explained on the Neo4j website. So which Neo4j edition should I use for my case, which has millions of nodes and relationships?
>>>>> >>>>
>>>>> >>>> Please answer. I need your help so much.
>>>>> >>>>
>>>>> >>>> Thanks.
>>>>> >>>>
>>>>> >>>> On Tuesday, March 25, 2014 12:03:58 AM UTC+7, Rio Eduardo wrote:
>>>>> >>>>>
>>>>> >>>>> I'm testing my thesis, which is about transforming a relational database into a graph database. After the transformation, I will compare their performance in terms of query response time and throughput. For the relational database I use MySQL, and for the graph database I use Neo4j. I will have more than 3 million nodes and more than 6 million relationships. But when I had added just 60000 nodes, my Neo4j was already dead. When I tried to return all 60000 nodes, it returned unknown. I did the same in MySQL: I added 60000 records, and it could return all 60000 records. It's weird, because it contradicts the papers I read, which told me graph databases are faster than relational databases. So why is Neo4j slower (totally dead) on a lower-specification PC/notebook while MySQL is not? And what PC/notebook specification should I use to get the best performance when testing with millions of nodes and relationships?
>>>>> >>>>>
>>>>> >>>>> Thank you.
>>>>> >
>>>>> > --
>>>>> > You received this message because you are subscribed to the Google Groups "Neo4j" group.
>>>>> > To unsubscribe from this group and stop receiving emails from it, send an email to neo4j+un...@googlegroups.com.
>>>>> > For more options, visit https://groups.google.com/d/optout.