Hello everyone,
We are relatively new to Neo4j and are evaluating some test scenarios to
decide whether to use Neo4j in production systems. We used the latest
stable release, 1.4.2.
I wrote an import script and generated some random data with the following
tree structure:
http://neo4j-community-discussions.438527.n3.nabble.com/file/n3467806/neo4j_nodes.png
Nodes Summary:
Nodes with Type A: 1
Nodes with Type B: 100
Nodes with Type C: 50'000 (100x500)
Nodes with Type D: 500'000 (50'000x10)
Nodes with Type E: 25'000'000 (500'000x50)
Nodes with Type F: 375'000'000 (25'000'000x15)
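The counts above follow from multiplying the fan-out at each level. A minimal sketch that reproduces them is shown here; the names (FAN_OUT, nodes_per_type) are only illustrative, and the relationship total assumes a strict tree in which every node except the root has exactly one parent relationship.

```python
# Fan-out per level as listed in the summary above (A is the single root).
FAN_OUT = [("A", 1), ("B", 100), ("C", 500), ("D", 10), ("E", 50), ("F", 15)]


def nodes_per_type(fan_out):
    """Return {type: node count} by multiplying each level's fan-out by its parent count."""
    counts = {}
    parents = 1
    for node_type, children_per_parent in fan_out:
        parents *= children_per_parent
        counts[node_type] = parents
    return counts


counts = nodes_per_type(FAN_OUT)
total_nodes = sum(counts.values())
# Assumption: strict tree, so relationships = nodes - 1.
total_relationships = total_nodes - 1

for node_type, count in counts.items():
    print(f"Type {node_type}: {count:,}")
print(f"Total nodes: {total_nodes:,}")
print(f"Total relationships (tree assumption): {total_relationships:,}")
```

This gives a little over 400 million nodes in total.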
This all worked quite well; the import took approx. 30 hours using the
batch import.
We have multiple indexes, but we also have one index in which all nodes are
indexed.
My first question: does it make sense to index all nodes in the same index?
When I list all nodes with the property type:type E, it is quite slow the
first time (~270 s). The second time it is fast (~0.5 s). I know this is
normal and most likely fixed in the current milestone version, but I am not
sure how long the query will stay cached in memory. Are there any
configuration settings I should be aware of?
We also used the hardware sizing calculator. See the result here:
http://neo4j-community-discussions.438527.n3.nabble.com/file/n3467806/neo4j_hardware.png
Are these results realistic? I guess 128 GB of RAM and 12 TB of SSD storage
might be rather expensive.
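As a rough cross-check, here is a back-of-the-envelope sketch of the raw node and relationship store sizes. The record sizes (9 bytes per node, 33 bytes per relationship) are my assumption for the 1.x store format and are not taken from the calculator; properties and Lucene indexes are deliberately left out and will likely dominate the real footprint. The relationship count again assumes a strict tree.

```python
# Assumed fixed record sizes in bytes for the 1.x store files (not verified here).
NODE_RECORD_BYTES = 9
RELATIONSHIP_RECORD_BYTES = 33

# Node counts per type, taken from the summary in this post.
TOTAL_NODES = 1 + 100 + 50_000 + 500_000 + 25_000_000 + 375_000_000
# Assumption: strict tree, so every node except the root has one relationship.
TOTAL_RELATIONSHIPS = TOTAL_NODES - 1


def store_bytes(nodes, relationships,
                node_bytes=NODE_RECORD_BYTES,
                rel_bytes=RELATIONSHIP_RECORD_BYTES):
    """Estimate node + relationship store size; properties and indexes are excluded."""
    return nodes * node_bytes + relationships * rel_bytes


estimate = store_bytes(TOTAL_NODES, TOTAL_RELATIONSHIPS)
print(f"Node + relationship stores: {estimate / 1024**3:.1f} GiB (excluding properties and indexes)")
```

The record sizes are parameters, so the estimate can be redone if the real values differ.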
Are there any reference applications with this number of nodes and
relationships?
Also, Neoclipse no longer starts or connects to the database with this
amount of data.
Am I missing some configuration for Neoclipse?
Best regards
--
alican