Thank you so much for your help and your explanation, Michael. It really helps.
On Saturday, March 29, 2014 6:35:06 PM UTC+7, Michael Hunger wrote:
>
> Probably a faster CPU on my machine?
>
> A constraint also guarantees uniqueness and creates an index automatically,
> but adds more cost on insertion.
> An index optimizes lookups.
>
> It depends on your needs: one property might be unique, so you want a
> constraint; other properties you might want to search by, so you add an index.
>
>
> On Sat, Mar 29, 2014 at 3:33 AM, Rio Eduardo <rioedu...@gmail.com> wrote:
>
>> Thank you for the reply, Michael. Yes, and I already tried it again a
>> second time.
>> I just realized that was my mistake. I always thought that the new
>> Labels feature already applied an index or constraint, so I had never
>> created an index or constraint when I was using Cypher.
>>
>> And after I created a constraint for :User(user_id), I got the result I
>> expected:
>>
>> match (u:User) return count(*);
>> +----------+
>> | count(*) |
>> +----------+
>> | 1000     |
>> +----------+
>> 1 row
>> 7 ms
>>
>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>> WHERE U.user_id=1
>> WITH distinct U, FFU
>> WHERE FFU<>U AND NOT (U)-[:Friend]->(FFU)
>> RETURN FFU.user_id;
>>
>> ...
>> 879 rows
>> 187 ms
>>
>> Now that my Cypher is faster than before, I have a question again:
>> why is the execution time different between you and me?
>>
>> Yours:
>> +----------+
>> | count(*) |
>> +----------+
>> | 1000     |
>> +----------+
>> 1 row
>> 4 ms
>>
>> ...
>> 910 rows
>> 101 ms
>>
>> Mine:
>> +----------+
>> | count(*) |
>> +----------+
>> | 1000     |
>> +----------+
>> 1 row
>> 7 ms
>>
>> ...
>> 879 rows
>> 187 ms
>>
>> Is it because the size and the number of the properties on each node are
>> different?
>>
>> And what is the difference between an index and a constraint?
>> Should I create both of them?
>> If I already created an index, should I create a constraint as well?
>> Or if I already created a constraint, should I create an index as well?
>>
>> Thank you.
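For reference, the two statements Michael contrasts look like this in Neo4j 2.0 Cypher (a sketch using the thread's :User(user_id) schema; you would run one or the other, not both on the same property):

```cypher
// Index: speeds up lookups by :User(user_id), but does NOT enforce uniqueness
CREATE INDEX ON :User(user_id);

// Constraint: enforces uniqueness AND creates a backing index automatically,
// so no separate index on the same property is needed (at some insertion cost)
CREATE CONSTRAINT ON (u:User) ASSERT u.user_id IS UNIQUE;
```

So, answering the question in the quoted message: if a constraint already exists on a property, creating an index on that same property is redundant.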
>>
>> On Friday, March 28, 2014 8:30:20 PM UTC+7, Michael Hunger wrote:
>>
>>> Rio,
>>>
>>> was this your first run of both statements? If so, please run them a
>>> second time.
>>> And did you create an index or constraint for :User(user_id)?
>>>
>>> MATCH (U:User) RETURN COUNT(U);
>>>
>>> I would also change:
>>>
>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>> WHERE U.user_id=1 AND FFU.user_id<>U.user_id AND NOT (U)-[:Friend]->(FFU)
>>> RETURN FFU.username
>>>
>>> to
>>>
>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>> WHERE U.user_id=1
>>> WITH distinct U, FFU
>>> WHERE FFU<>U AND NOT (U)-[:Friend]->(FFU)
>>> RETURN FFU.username
>>>
>>> I quickly created a dataset on my machine:
>>>
>>> cypher 2.0 foreach (i in range(1,1000) | create (:User {id:i}));
>>>
>>> create constraint on (u:User) assert u.id is unique;
>>>
>>> match (u1:User),(u2:User) with u1,u2 where rand() < 0.1
>>> create (u1)-[:Friend]->(u2);
>>>
>>> Relationships created: 99974
>>> 778 ms
>>>
>>> match (u:User) return count(*);
>>> +----------+
>>> | count(*) |
>>> +----------+
>>> | 1000     |
>>> +----------+
>>> 1 row
>>> *4 ms*
>>>
>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>> WHERE U.id=1
>>> WITH distinct U, FFU
>>> WHERE FFU<>U AND NOT (U)-[:Friend]->(FFU)
>>> RETURN FFU.id;
>>>
>>> ...
>>> 910 rows
>>> 101 ms
>>>
>>> but even your query takes only
>>>
>>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>> WHERE U.id=1 AND FFU.id<>U.id AND NOT (U)-[:Friend]->(FFU)
>>> RETURN FFU.id;
>>>
>>> ...
>>> 8188 rows
>>> 578 ms
>>>
>>> On Fri, Mar 28, 2014 at 2:08 PM, Lundin <lundin....@gmail.com> wrote:
>>> >
>>> > ms, it is milliseconds.
>>> >
>>> > What is the corresponding result for a SQL db?
>>> > MATCH (n:User)-[:Friend*3]-(FoFoF) return FoFoF;
>>> >
>>> > Albeit a valid search, is it something useful?
>>> > I would think finding a specific person's FoFoF from either end, as a
>>> > starting point or an end point, would be a very realistic scenario.
>>> > Add an index on :User(name) and, querying for a User with name "Rio",
>>> > try to find his FoFoF.
>>> >
>>> > Yes, Neo4j has been kind and exposed various functions, like
>>> > shortestPath in Cypher:
>>> > http://docs.neo4j.org/refcard/2.0/
>>> >
>>> > Also look at some GraphGist examples:
>>> > https://github.com/neo4j-contrib/graphgist/wiki
>>> >
>>> > On Friday, March 28, 2014 at 05:00:22 UTC+1, Rio Eduardo wrote:
>>> >>
>>> >> Thank you so much for the reply, Lundin. I really appreciate it.
>>> >> Okay, yesterday I tested my experiment again, and the result was not
>>> >> what I had imagined and expected. Before testing 1M users, I reduced
>>> >> the number of users to 1000 and tested not in my social network but
>>> >> directly in the database (Neo4j Shell), to rule out the PC's
>>> >> performance as the cause. Counting the 1000 users took 200 ms
>>> >> (1 row), and returning friends at a depth of two took 85000 ms
>>> >> (2500 rows). Are 200 ms and 85000 ms fast to you? And what does ms
>>> >> stand for? Is it milliseconds or microseconds?
>>> >>
>>> >> The query I use for counting the 1000 users is
>>> >> MATCH (U:User) RETURN COUNT(U);
>>> >>
>>> >> and the query I use for returning friends at a depth of two is
>>> >> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>>> >> WHERE U.user_id=1 AND FFU.user_id<>U.user_id
>>> >>   AND NOT (U)-[:Friend]->(FFU)
>>> >> RETURN FFU.username
>>> >>
>>> >> Please note that I tested with the default configuration of Neo4j,
>>> >> created 1000 random user nodes, and created 50000 random friend
>>> >> relationships (each user has 50 friends). Each relationship has the
>>> >> type Friend and no properties on it. Each node has the label User and
>>> >> 4 properties: user_id, username, password and profile_picture.
>>> >> Each property has a value of 1-60 characters: user_id values range
>>> >> from 1 to 1000, all usernames have 10 random characters, all
>>> >> passwords have 60 characters because I MD5 them, and profile_picture
>>> >> has 1-60 characters.
>>> >>
>>> >> And about your statement "Otherwise if you really need to present
>>> >> that many "things" just paging the result with SKIP,LIMIT. It has
>>> >> never made sense to present 1M of anything at a time for a user.",
>>> >> I already did what you suggested, but it is still the same: Neo4j
>>> >> returns the result slower.
>>> >>
>>> >> And I'm wondering whether Neo4j has already applied any of the graph
>>> >> algorithms (shortest path, Dijkstra, A*, etc.) in its system or not.
>>> >>
>>> >> Thank you.
>>> >>
>>> >>
>>> >> On Friday, March 28, 2014 3:43:49 AM UTC+7, Lundin wrote:
>>> >>>
>>> >>> Rio, any version will do. They can all handle a million nodes on
>>> >>> common hardware, no magic at all. When we get to hundreds of
>>> >>> millions or billions, then we might need to look into the
>>> >>> specification in more detail. But in that case, with that kind of
>>> >>> data, there are other bottlenecks for a social network or any web
>>> >>> app that need to be taken care of as well.
>>> >>>
>>> >>> You said:
>>> >>>>
>>> >>>> Given any two persons chosen at random, is there a path that
>>> >>>> connects them that is at most five relationships long? For a
>>> >>>> social network containing 1,000,000 people, each with
>>> >>>> approximately 50 friends, the results strongly suggest that graph
>>> >>>> databases are the best choice for connected data. And a graph
>>> >>>> database can still work 150 times faster than a relational
>>> >>>> database at the third degree and 1000 times faster at the fourth
>>> >>>> degree.
>>> >>>
>>> >>> I fail to see how this is connected to your attempt to list 1M
>>> >>> users in one go on the first page. You would want to seek whether
>>> >>> there is a relationship and return that path between users.
>>> >>> You need two start nodes, and you seek a path by traversing the
>>> >>> relationships rather than scanning tables; that would be the
>>> >>> comparison.
>>> >>> Otherwise, if you really need to present that many "things", just
>>> >>> page the result with SKIP and LIMIT. It has never made sense to
>>> >>> present 1M of anything at a time to a user. Again, that wouldn't
>>> >>> really do your experiment much good in proving graph theory.
>>> >>>
>>> >>> What is the result of MATCH (U:User) RETURN count(U); ?
>>> >>>
>>> >>> Also, when you do your test, make sure to account for the warm/cold
>>> >>> cache effect (better/worse performance).
>>> >>>
>>> >>> On Thursday, March 27, 2014 at 17:57:10 UTC+1, Rio Eduardo wrote:
>>> >>>>
>>> >>>> I just learned about memory allocation and just read the Server
>>> >>>> Performance Tuning section of the Neo4j docs. neo4j.properties:
>>> >>>>
>>> >>>> # Default values for the low-level graph engine
>>> >>>>
>>> >>>> #neostore.nodestore.db.mapped_memory=25M
>>> >>>> #neostore.relationshipstore.db.mapped_memory=50M
>>> >>>> #neostore.propertystore.db.mapped_memory=90M
>>> >>>> #neostore.propertystore.db.strings.mapped_memory=130M
>>> >>>> #neostore.propertystore.db.arrays.mapped_memory=130M
>>> >>>>
>>> >>>> Should I change this to get high performance? If yes, please
>>> >>>> advise me.
>>> >>>>
>>> >>>> And I just learned about the Neo4j licenses: Community, Personal,
>>> >>>> Startups, Business and Enterprise. All the features are explained
>>> >>>> on the Neo4j website. So which Neo4j edition should I use for my
>>> >>>> case, which has millions of nodes and relationships?
>>> >>>>
>>> >>>> Please answer. I need your help so much.
>>> >>>>
>>> >>>> Thanks.
>>> >>>>
>>> >>>> On Tuesday, March 25, 2014 12:03:58 AM UTC+7, Rio Eduardo wrote:
>>> >>>>>
>>> >>>>> I'm testing my thesis, which is about transforming a relational
>>> >>>>> database into a graph database. After the transformation, I will
>>> >>>>> test their performance in terms of query response time and
>>> >>>>> throughput.
>>> >>>>> For the relational database I use MySQL, while for the graph
>>> >>>>> database I use Neo4j. I will have 3 million-plus nodes and
>>> >>>>> 6 million-plus relationships. But when I had added just 60000
>>> >>>>> nodes, my Neo4j was already dead. When I tried to return all
>>> >>>>> 60000 nodes, it returned unknown. I did the same with MySQL: I
>>> >>>>> added 60000 records, and it could return all 60000 records. It's
>>> >>>>> weird, because it goes against the papers I read, which told me a
>>> >>>>> graph database is faster than a relational database. So why is
>>> >>>>> Neo4j slower (totally dead) on a lower-specification PC/notebook
>>> >>>>> while MySQL is not? And what specification of PC/notebook should
>>> >>>>> I use to get the best performance when testing with millions of
>>> >>>>> nodes and relationships?
>>> >>>>>
>>> >>>>> Thank you.
>>> >
>>> > --
>>> > You received this message because you are subscribed to the Google
>>> > Groups "Neo4j" group.
>>> > To unsubscribe from this group and stop receiving emails from it,
>>> > send an email to neo4j+un...@googlegroups.com.
>>> > For more options, visit https://groups.google.com/d/optout.