Re: [Neo4j] Re: Why is Neo4j slower(totally dead) with many nodes and relationships in lower specification of pc/notebook while MySQL is not?

Michael Hunger Fri, 28 Mar 2014 06:31:30 -0700

Rio,

Was this your first run of both statements? If so, please run them a
second time so that the caches are warm.
Also, did you create an index or constraint for :User(user_id)?
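
For reference, a minimal sketch of the two options in Neo4j 2.0 syntax,
assuming the lookup property is called user_id as in your queries. Use one
or the other: a uniqueness constraint is backed by its own index, so a
separate index on the same property is not needed alongside it.

```cypher
// Option 1: plain schema index on the lookup property
CREATE INDEX ON :User(user_id);

// Option 2 (instead of option 1): uniqueness constraint, backed by an index
CREATE CONSTRAINT ON (u:User) ASSERT u.user_id IS UNIQUE;
```

With either in place, a predicate such as U.user_id=1 can be answered by an
index lookup instead of a scan over all :User nodes.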


MATCH (U:User) RETURN COUNT(U);

I would also change:

MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
WHERE U.user_id=1 AND FFU.user_id<>U.user_id AND NOT (U)-[:Friend]->(FFU)
RETURN FFU.username

to

MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
WHERE U.user_id=1
WITH distinct U, FFU
WHERE FFU<>U AND NOT (U)-[:Friend]->(FFU)
RETURN FFU.username

I quickly created a dataset on my machine:

cypher 2.0 foreach (i in range(1,1000) | create (:User {id:i}));

create constraint on (u:User) assert u.id is unique;

match (u1:User),(u2:User) with u1,u2 where rand() < 0.1 create
(u1)-[:Friend]->(u2);

Relationships created: 99974

778 ms

match (u:User) return count(*);

+----------+
| count(*) |
+----------+
| 1000     |
+----------+
1 row
4 ms


MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
WHERE U.id=1
WITH distinct U, FFU
WHERE FFU<>U AND NOT (U)-[:Friend]->(FFU)
RETURN FFU.id;

...

910 rows

101 ms

But even your original query takes only 578 ms on this dataset:

MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
WHERE U.id=1 AND FFU.id<>U.id AND NOT (U)-[:Friend]->(FFU)
RETURN FFU.id;

...

8188 rows

578 ms
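
The gap between 8188 and 910 rows arises because the two-hop pattern
returns one row per path, so an FFU that is reachable through several FU
nodes shows up several times. A minimal sketch to see that directly on the
generated dataset (it uses the id property from my test data, not user_id):

```cypher
// Count two-hop paths from user 1 versus the distinct users they end at.
MATCH (U:User)-[:Friend]->(:User)-[:Friend]->(FFU:User)
WHERE U.id = 1
RETURN count(*) AS paths, count(DISTINCT FFU) AS distinct_fof;
```

The WITH distinct U, FFU step collapses those duplicates before the
NOT (U)-[:Friend]->(FFU) check, so the check runs once per distinct user
instead of once per path.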


On Fri, Mar 28, 2014 at 2:08 PM, Lundin <lundin.codei...@gmail.com> wrote:
>
> ms, it is milliseconds.
>
> What is the corresponding result for a SQL db ?
> MATCH (n:User)-[:Friend*3]-(FoFoF) return FoFoF;
>
> Although it is a valid search, is it something useful? I would think that
> finding a specific person's FoFoF, with that person as either the starting
> point or the end point, would be a very realistic scenario. Add an index on
> User:name, query for the User with name Rio, and try to find his FoFoF.
>
> Yes, Neo4j kindly exposes various functions in Cypher, such as
> shortestPath:
> http://docs.neo4j.org/refcard/2.0/
>
> Also look at some gist examples
> https://github.com/neo4j-contrib/graphgist/wiki
>
> On Friday, 28 March 2014 at 05:00:22 UTC+1, Rio Eduardo wrote:
>>
>> Thank you so much for the reply, Lundin. I really appreciate it.
>> Yesterday I ran my experiment again, and the result was not what I
>> expected. Before testing with 1M users, I reduced the number of users to
>> 1000 and ran the test directly against the database (Neo4j Shell) rather
>> than through my social network, to find out whether the slowness was
>> caused by the performance of the PC. Returning the count of 1000 users
>> took 200 ms (1 row), and returning friends at depth two took 85000 ms
>> (2500 rows). Are 200 ms and 85000 ms fast to you? And what does ms stand
>> for: milliseconds or microseconds?
>>
>> the query I use for returning 1000 users is
>> MATCH (U:User) RETURN COUNT(U);
>>
>> and the query I use for returning friends at depth of two is
>> MATCH (U:User)-[F:Friend]->(FU:User)-[FF:Friend]->(FFU:User)
>> WHERE U.user_id=1 AND FFU.user_id<>U.user_id AND NOT (U)-[:Friend]->(FFU)
>> RETURN FFU.username
>>
>> Please note that I tested with the default configuration of Neo4j. I
>> created 1000 random User nodes and 50000 random Friend relationships
>> (each user has 50 friends). Each relationship has the type Friend and no
>> properties. Each node has the label User and 4 properties: user_id,
>> username, password and profile_picture. Each property value is 1-60
>> characters long: user_id is 1-1000, all usernames are 10 random
>> characters, all passwords are 60 characters because I MD5 them, and
>> profile_picture is 1-60 characters.
>>
>> As for your statement "Otherwise if you really need to present that many
>> "things" just paging the result with SKIP,LIMIT. It has never made sense
>> to present 1M of anything at a time for a user.", I already did that, but
>> the result is the same: Neo4j still returns results more slowly.
>>
>> I am also wondering whether Neo4j already implements any graph
>> algorithms (shortest path, Dijkstra, A*, etc.) itself.
>>
>> Thank you.
>>
>>
>> On Friday, March 28, 2014 3:43:49 AM UTC+7, Lundin wrote:
>>>
>>> Rio, any version will do. They can all handle millions of nodes on
>>> common hardware, no magic at all. When it comes to hundreds of millions
>>> or billions, we might need to look into the specification in more
>>> detail. But with that kind of data there are other bottlenecks for a
>>> social network or any web app that need to be taken care of as well.
>>>
>>> you said:
>>>>
>>>> Given any two persons chosen at random, is there a path that connects
>>>> them that is at most five relationships long? For a social network
>>>> containing 1,000,000 people, each with approximately 50 friends, the
>>>> results strongly suggest that graph databases are the best choice for
>>>> connected data. And graph database can still work 150 times faster
>>>> than relational database at third degree and 1000 times faster at
>>>> fourth degree
>>>
>>>
>>> I fail to see how this is connected to your attempt to list 1M users in
>>> one go on the first page. You would want to check whether there is a
>>> relationship between users and return that path. You need two start
>>> nodes and to seek a path by traversing the relationships rather than
>>> scanning tables, and that would be the comparison.
>>> Otherwise, if you really need to present that many "things", just page
>>> the result with SKIP and LIMIT. It has never made sense to present 1M of
>>> anything at a time to a user. Again, that wouldn't do your experiment
>>> much good in proving graph theory.
>>>
>>> What is the result of MATCH(U:User) RETURN count(U); ?
>>>
>>> Also, when you run your test, make sure to account for the warm/cold
>>> cache effect (better/worse performance).
>>>
>>> On Thursday, 27 March 2014 at 17:57:10 UTC+1, Rio Eduardo wrote:
>>>>
>>>> I just learned about memory allocation and read Server Performance
>>>> Tuning of Neo4j. neo4j.properties:
>>>> # Default values for the low-level graph engine
>>>>
>>>> #neostore.nodestore.db.mapped_memory=25M
>>>> #neostore.relationshipstore.db.mapped_memory=50M
>>>> #neostore.propertystore.db.mapped_memory=90M
>>>> #neostore.propertystore.db.strings.mapped_memory=130M
>>>> #neostore.propertystore.db.arrays.mapped_memory=130M
>>>>
>>>> Should I change these to get better performance? If so, please suggest
>>>> values.
>>>>
>>>> I also just learned about the Neo4j licenses: Community, Personal,
>>>> Startups, Business and Enterprise. All their features are explained on
>>>> the Neo4j website. So which Neo4j should I use for my case, which has
>>>> millions of nodes and relationships?
>>>>
>>>> Please answer. I need your help so much.
>>>>
>>>> Thanks.
>>>>
>>>> On Tuesday, March 25, 2014 12:03:58 AM UTC+7, Rio Eduardo wrote:
>>>>>
>>>>> I'm testing my thesis, which is about transforming a relational
>>>>> database into a graph database. After the transformation, I will
>>>>> compare their performance in terms of query response time and
>>>>> throughput. I use MySQL as the relational database and Neo4j as the
>>>>> graph database. I will have 3 million or more nodes and 6 million or
>>>>> more relationships. But when I had added just 60000 nodes, my Neo4j
>>>>> was already dead. When I tried to return all 60000 nodes, it returned
>>>>> unknown. I did the same in MySQL: I added 60000 records, and it could
>>>>> return all 60000 records. It's weird, because it goes against the
>>>>> papers I read, which told me that a graph database is faster than a
>>>>> relational database. So why is Neo4j slower (totally dead) on a
>>>>> lower-specification PC/notebook while MySQL is not? And what
>>>>> specification of PC/notebook should I use to get the best performance
>>>>> when testing with millions of nodes and relationships?
>>>>>
>>>>> Thank you.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to neo4j+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
