Re: [Neo4j] Neo4j performance with 400million nodes

2011-11-06 Thread algecya
Anders, thank you very much for reporting back and looking into it!
Good luck fixing the bug, then.
--
alican

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-performance-with-400million-nodes-tp3467806p3486237.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Neo4j performance with 400million nodes

2011-11-02 Thread algecya
Hi Anders,
I appreciate your offer very much! It is good to know that the Neo4j
community is very active and involved.

http://neo4j-community-discussions.438527.n3.nabble.com/file/n3472966/BatchImportData.groovy
BatchImportData.groovy 

Here is the import script. It is a stripped-down version of the graph I used
for testing. If you need more data, just increase the variable 'amountTypeA'
at line 26.

--
alican


--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-performance-with-400million-nodes-tp3467806p3472966.html


[Neo4j] Neo4j performance with 400million nodes

2011-10-31 Thread algecya
Hello everyone,

We are relatively new to Neo4j and are evaluating some test scenarios in
order to decide whether to use Neo4j in production systems. We used the
latest stable release, 1.4.2.

I wrote an import script and generated some random data with the following
tree structure:
http://neo4j-community-discussions.438527.n3.nabble.com/file/n3467806/neo4j_nodes.png
 

Nodes Summary:
Nodes with Type A: 1
Nodes with Type B: 100
Nodes with Type C: 50'000 (100x500)
Nodes with Type D: 500'000 (50'000x10)
Nodes with Type E: 25'000'000 (500'000x50)
Nodes with Type F: 375'000'000 (25'000'000x15)
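
As a quick sanity check on the figures above, here is a minimal sketch that
recomputes the per-type counts from the fan-out factors in the summary and
adds them up. The variable and function names are only illustrative, and the
relationship count assumes a pure tree with exactly one parent relationship
per non-root node, which the post does not state explicitly.

```python
# Fan-out per level, read off the node summary above:
# each entry is (node type, children per parent node); type A is the root.
fan_out = [("A", 1), ("B", 100), ("C", 500), ("D", 10), ("E", 50), ("F", 15)]


def level_counts(levels):
    """Return [(type, node_count)] by multiplying fan-outs down the tree."""
    counts = []
    running = 1
    for name, children_per_parent in levels:
        running *= children_per_parent
        counts.append((name, running))
    return counts


counts = level_counts(fan_out)
total_nodes = sum(n for _, n in counts)

# Assumption: a pure tree, so every node except the root has exactly one
# relationship to its parent.
total_relationships = total_nodes - 1

for name, n in counts:
    print(f"Type {name}: {n:,}")
print(f"Total nodes: {total_nodes:,}")
print(f"Parent-child relationships (assumed): {total_relationships:,}")
```

The total comes to slightly more than 400 million nodes, which matches the
subject line of this thread.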

This all worked quite well; the import took approx. 30 hours using the
batch import.
We have multiple indexes, and one of them indexes all nodes.

My first question would be: does it make sense to put all nodes into the
same index?

If I list all nodes with the property type:type E, the query is quite slow
the first time (~270 s). The second time it is fast (~0.5 s). I know this is
normal and most likely fixed in the current milestone version, but I am not
sure how long the query results will stay cached in memory. Are there any
configuration settings I should be concerned about?

We also used the hardware sizing calculator. See the result here:
http://neo4j-community-discussions.438527.n3.nabble.com/file/n3467806/neo4j_hardware.png
 

Are these results realistic? I guess 128 GB of RAM and 12 TB of SSD storage
might be rather expensive.
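
For comparison with the calculator output, a rough back-of-the-envelope
sketch of the raw node and relationship store sizes follows. The record
sizes (9 bytes per node, 33 bytes per relationship) are assumptions based on
the fixed-size store records described for Neo4j 1.x and should be checked
against the manual for the exact version. The relationship count assumes a
pure tree. Properties, indexes and logs are deliberately left out, so this
is a lower bound, not a sizing recommendation.

```python
# Assumed fixed record sizes for Neo4j 1.x store files (not taken from this
# thread; verify against the manual for your version).
NODE_RECORD_BYTES = 9
REL_RECORD_BYTES = 33

# Sum of the per-type node counts listed in the node summary.
total_nodes = 1 + 100 + 50_000 + 500_000 + 25_000_000 + 375_000_000

# Assumption: pure tree, one parent relationship per non-root node.
total_relationships = total_nodes - 1

node_store_bytes = total_nodes * NODE_RECORD_BYTES
rel_store_bytes = total_relationships * REL_RECORD_BYTES

GIB = 1024 ** 3
print(f"Node store:         {node_store_bytes / GIB:.2f} GiB")
print(f"Relationship store: {rel_store_bytes / GIB:.2f} GiB")
print(f"Combined (no properties/indexes): "
      f"{(node_store_bytes + rel_store_bytes) / GIB:.2f} GiB")
```

Under these assumptions the graph structure itself is small compared with
the calculator's figures, so most of the estimated space would have to come
from properties, indexes and headroom.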

Are there any reference applications with this number of nodes and
relationships?

Also, Neoclipse no longer starts or connects to the database with this
amount of data.
Am I missing some configuration for Neoclipse?

Best regards
-- 
alican


--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-performance-with-400million-nodes-tp3467806p3467806.html