Re: [Neo4j] Node creation limit
Thanks.

Regarding scaling, the 1.0 and 1.1 releases have a limit of 4 billion records per store file, so if you need to store 4 billion strings you have to make sure every string fits in a single block. This limit will be increased to 32 billion or more in the 1.2 release.

Any timeline guidance on release 1.2?

We would like to learn about any implementation supporting the following claim on the main neo4j page. Does anyone know about sharding schemes, and how would a traverser work with a distributed graph?

- massive scalability. Neo4j can handle graphs of several *billion* nodes/relationships/properties on a single machine and can be sharded to scale out across multiple machines.

-Johan

On Mon, Jun 7, 2010 at 4:27 PM, Biren Gandhi biren.gan...@gmail.com wrote:

Similar issue on my side as well. Test data is ok, but production data (100 million+ objects, 200 relationships per object and 10 properties per object, with multi-million queries per day about search and traversal) would need clear disk sizing calculations due to IOPS and other hardware limits in a monolithic storage model. Has anyone been able to use neo4j successfully with scaling needs similar to those mentioned above?

-b
___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Node creation limit
Similar issue on my side as well. Test data is ok, but production data (100 million+ objects, 200 relationships per object and 10 properties per object, with multi-million queries per day about search and traversal) would need clear disk sizing calculations due to IOPS and other hardware limits in a monolithic storage model. Has anyone been able to use neo4j successfully with scaling needs similar to those mentioned above?

-b

On Jun 7, 2010, at 4:45 AM, Craig Taverner cr...@amanzi.com wrote:

Is there a specific constraint on disk space? Normally disk space isn't a problem... it's cheap and there's usually loads of it.

Actually for most of my use cases the disk space has been fine. Except for one data source, which surprised me by expanding from less than a gig of original binary data to a database of over 20GB. While this too can be managed, it was just a sample, and so I have yet to see what the customer's 'real data' will do to the database (several hundred times larger, I'm expecting). When we get to that point we will need to decide how to deal with it. Currently we 'solve' the issue by allowing the user to filter out data on import, so we don't store everything. This will not satisfy all users, however.
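The workload figures in this message can be held against the limit of 4 billion records per store file that this thread quotes for the 1.0 and 1.1 releases. The following is a rough sketch only. It assumes one record per node, per relationship and per property, each kind in its own store file; that assumption, and the class and method names, are illustrative and not taken from the thread. It counts records and says nothing about bytes or IOPS.

```java
// Rough record-count check for the workload described in this thread.
// Assumption: one record per node, per relationship and per property,
// each kind kept in its own store file. All names here are illustrative.
class StoreLimitCheck {
    // "4 billion records / store file", as quoted for the 1.0 and 1.1 releases.
    static final long RECORD_LIMIT = 4_000_000_000L;

    // True if the given number of records stays within the quoted limit.
    static boolean fits(long records) {
        return records <= RECORD_LIMIT;
    }

    public static void main(String[] args) {
        long nodes = 100_000_000L;                // "100 million+ objects"
        long relsUpperBound = nodes * 200L;       // 200 per object, none shared
        long relsIfShared = relsUpperBound / 2L;  // each relationship counted at both ends
        long properties = nodes * 10L;            // 10 properties per object

        System.out.println("nodes fit:                  " + fits(nodes));
        System.out.println("relationships (upper) fit:  " + fits(relsUpperBound));
        System.out.println("relationships (shared) fit: " + fits(relsIfShared));
        System.out.println("properties fit:             " + fits(properties));
    }
}
```

Under these assumptions only the relationship count runs past the quoted limit, whichever way the 200 relationships per object are counted; the node and property counts stay below it.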
Re: [Neo4j] Node creation limit
Thanks. Big transactions were indeed problematic. Splitting them down into smaller chunks did the trick.

I'm still disappointed by the on-disk size of a minimal node without any relationships or attributes. For 500K nodes, it is taking 80MB of space (160 bytes/node) and for 1M objects it is consuming 160MB (again 160 bytes/node). Is this normal?

4.0K  active_tx_log
12K   lucene
12K   lucene-fulltext
4.0K  neostore
4.0K  neostore.id
4.4M  neostore.nodestore.db
4.0K  neostore.nodestore.db.id
12M   neostore.propertystore.db
4.0K  neostore.propertystore.db.arrays
4.0K  neostore.propertystore.db.arrays.id
4.0K  neostore.propertystore.db.id
4.0K  neostore.propertystore.db.index
4.0K  neostore.propertystore.db.index.id
4.0K  neostore.propertystore.db.index.keys
4.0K  neostore.propertystore.db.index.keys.id
64M   neostore.propertystore.db.strings
4.0K  neostore.propertystore.db.strings.id
4.0K  neostore.relationshipstore.db
4.0K  neostore.relationshipstore.db.id
4.0K  neostore.relationshiptypestore.db
4.0K  neostore.relationshiptypestore.db.id
4.0K  neostore.relationshiptypestore.db.names
4.0K  neostore.relationshiptypestore.db.names.id
4.0K  nioneo_logical.log.active
4.0K  tm_tx_log.1
80M   total

On Wed, Jun 2, 2010 at 12:17 AM, Mattias Persson matt...@neotechnology.com wrote:

Exactly, the problem is most likely that you try to insert all your stuff in one transaction. All data for a transaction is kept in memory until committed, so for really big transactions it can fill your entire heap. Try to group 10k operations or so for big insertions, or use the batch inserter.

Links:
http://wiki.neo4j.org/content/Transactions#Big_transactions
http://wiki.neo4j.org/content/Batch_Insert

2010/6/2, Laurent Laborde kerdez...@gmail.com:

On Wed, Jun 2, 2010 at 3:50 AM, Biren Gandhi biren.gan...@gmail.com wrote:

Is there any limit on number of nodes that can be created in a neo4j instance? Any other tips?

I created hundreds of millions of nodes without problems, but it was split into many transactions.

--
Laurent ker2x Laborde
Sysadmin DBA at http://www.over-blog.com/

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
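The advice in this message to group about 10k operations per transaction can be sketched as a small loop. This is a stand-alone illustration, not code from the thread: the `Tx` and `Store` interfaces and the `countingStore` helper are hypothetical stand-ins so that the example runs without Neo4j on the classpath. In the Neo4j 1.x embedded API the corresponding calls would be `graphDb.beginTx()`, `tx.success()` and `tx.finish()`.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins for a transactional store; this is not the Neo4j API.
interface Tx {
    void success(); // mark the transaction as successful
    void finish();  // commit (or roll back) and release resources
}

interface Store {
    Tx beginTx();
    void createNode(String name);
}

class BatchedInsert {
    // Creates `total` nodes and commits after every `batchSize` operations,
    // so uncommitted state never grows beyond one batch.
    // Returns the number of transactions that were opened.
    static int insertInBatches(Store store, int total, int batchSize) {
        int transactions = 0;
        int done = 0;
        while (done < total) {
            Tx tx = store.beginTx();
            transactions++;
            try {
                int end = Math.min(done + batchSize, total);
                for (; done < end; done++) {
                    store.createNode("Node-" + done);
                }
                tx.success();
            } finally {
                tx.finish();
            }
        }
        return transactions;
    }

    // In-memory fake that only counts created nodes (for demonstration).
    static Store countingStore(AtomicInteger created) {
        return new Store() {
            public Tx beginTx() {
                return new Tx() {
                    public void success() { }
                    public void finish() { }
                };
            }
            public void createNode(String name) {
                created.incrementAndGet();
            }
        };
    }

    public static void main(String[] args) {
        AtomicInteger created = new AtomicInteger();
        int txCount = insertInBatches(countingStore(created), 25_000, 10_000);
        System.out.println(created.get() + " nodes in " + txCount + " transactions");
    }
}
```

The batch size is a parameter because "10k or so" is a starting point from the thread rather than a fixed rule; a smaller heap or larger properties per node may call for smaller batches.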
Re: [Neo4j] Node creation limit
There is only 1 property - n (to store the name of the node) - used as follows:

    Node node = graphDb.createNode();
    node.setProperty( NAME_KEY, username );

And the values of username are Node-1, Node-2 etc.

On Wed, Jun 2, 2010 at 3:14 PM, Mattias Persson matt...@neotechnology.com wrote:

Only 4.4 MB out of those 80 is consumed by nodes, so you must be storing some properties somewhere. Would you mind sharing your code so that it would be easier to get a better insight into your problem?

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
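Dividing the larger entries of the 500K-node listing from this thread by the node count gives a rough per-node breakdown. The sketch below is plain arithmetic on those reported sizes. It assumes `du` reports binary units (1M = 1024 * 1024 bytes), and the class and method names are illustrative only.

```java
// Per-node disk usage derived from the du listing for 500K nodes.
// Assumption: "M" in the listing means MiB (1024 * 1024 bytes).
class DiskPerNode {
    static final long MIB = 1024L * 1024L;

    // Average bytes per node for a store file of the given size, rounded.
    static long bytesPerNode(double sizeMiB, long nodes) {
        return Math.round(sizeMiB * MIB / nodes);
    }

    public static void main(String[] args) {
        long nodes = 500_000L;
        System.out.println("nodestore:        " + bytesPerNode(4.4, nodes));
        System.out.println("propertystore:    " + bytesPerNode(12, nodes));
        System.out.println("property strings: " + bytesPerNode(64, nodes));
        System.out.println("total:            " + bytesPerNode(80, nodes));
    }
}
```

On these numbers the string store accounts for roughly 134 of about 168 bytes per node, with about 25 bytes in the property store and about 9 in the node store, which fits the observation that only a small part of the 80 MB belongs to the nodes themselves.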
Re: [Neo4j] Node creation limit
Here is some content from neostore.propertystore.db.strings - another huge file. What is the maximum number of nodes/relationships that people have tried with Neo4j so far? Can someone share disk space usage characteristics?

od -N 1000 -x -c neostore.propertystore.db.strings
000 8500
    \0 \0 \0 205 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
020
    \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
200 0100 0c00
    \0 \0 \0 \0 \0 001 377 377 377 377 \0 \0 \0 \f 377 377
220 4e00 6f00 6400 6500 2d00 3000
    377 377 \0 N \0 o \0 d \0 e \0 - \0 0 \0 \0
240
    \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
400 ff01 00ff
    \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 001 377 377 377 377 \0
420 ff0c 00ff 004e 006f 0064 0065
    \0 \0 \f 377 377 377 377 \0 N \0 o \0 d \0 e \0
440 002d 0031
    - \0 1 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
460
    \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
600 0100
    \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 001
620 0c00 4e00 6f00
    377 377 377 377 \0 \0 \0 \f 377 377 377 377 \0 N \0 o
640 6400 6500 2d00 3200
    \0 d \0 e \0 - \0 2 \0 \0 \0 \0 \0 \0 \0 \0
660
    \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
0001020 ff01 00ff ff0c
    \0 \0 \0 \0 001 377 377 377 377 \0 \0 \0 \f 377 377 377
0001040 00ff 004e 006f 0064 0065 002d 0033
    377 \0 N \0 o \0 d \0 e \0 - \0 3 \0 \0 \0
0001060
    \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
0001220 0100
    \0 \0 \0 \0 \0 \0 \0 \0 \0 001 377 377 377 377 \0 \0
0001240 0c00 4e00 6f00 6400 6500 2d00
    \0 \f 377 377 377 377 \0 N \0 o \0 d \0 e \0 -
0001260 3400
    \0 4 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
0001300
    \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
0001420 ff01
    \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 001 377
0001440 00ff ff0c 00ff 004e 006f
    377 377 377 \0 \0 \0 \f 377 377 377 377 \0 N \0 o \0
0001460 0064 0065 002d 0035
    d \0 e \0 - \0 5 \0 \0 \0 \0 \0 \0 \0 \0 \0
0001500
    \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*

On Wed, Jun 2, 2010 at 3:50 PM, Biren Gandhi biren.gan...@gmail.com wrote:

There is only 1 property - n (to store the name of the node) - used as follows:

    Node node = graphDb.createNode();
    node.setProperty( NAME_KEY, username );

And the values of username are Node-1, Node-2 etc.
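One possible reading of the od dump in this message, offered as an interpretation rather than a statement about the store format: each record starts with a 001 byte, consecutive records begin 133 bytes apart (octal offsets 205 and 412), and a name such as Node-0 is stored at two bytes per character with a length field of 12 (the \f byte). The arithmetic below follows from that reading. The 133-byte spacing and the class name are assumptions taken from the dump, not from Neo4j documentation.

```java
import java.nio.charset.StandardCharsets;

// Compares the bytes a short name needs with the space one record takes,
// assuming (as read off the od dump) that records are 133 bytes apart.
class StringBlockUsage {
    static final int ASSUMED_BLOCK_BYTES = 133;

    // Bytes needed for the characters alone, at two bytes per character.
    static int payloadBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // Total bytes used if every string occupies exactly one block.
    static long storeBytes(long strings) {
        return strings * ASSUMED_BLOCK_BYTES;
    }

    public static void main(String[] args) {
        double mib = 1024.0 * 1024.0;
        System.out.println("payload for Node-0: " + payloadBytes("Node-0") + " bytes");
        System.out.println("500K strings: " + storeBytes(500_000L) / mib + " MiB");
        System.out.println("100K strings: " + storeBytes(100_000L) / mib + " MiB");
    }
}
```

Under that assumption 500K names come to about 63.4 MiB and 100K names to about 12.7 MiB, close to the 64M and 13M reported for neostore.propertystore.db.strings in the listings in this thread, while the characters of each name take only 12 of those 133 bytes.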
[Neo4j] Node creation limit
While trying to perform a create-only stress test for nodes, I'm constantly getting an Out of Memory error even while running with these params (with default config - as no searching/optimizations are being exercised just yet):

EXTRA_JVM_ARGUMENTS=-d64 -server -Xms256m -Xmx1024m

I'm able to create 200K nodes (without any attributes - bare-bones nodes only) successfully, but with anything above that Java gives up. One strange observation is that the disk size gets capped at 116K in all of the cases while trying to create 100K, 200K, 300K nodes on a fresh DB.

Is there any limit on the number of nodes that can be created in a neo4j instance? Any other tips?
Re: [Neo4j] Node creation limit
Correction - the disk size of 116K is applicable only in failure cases. Here are the numbers for 100K node inserts (takes up 17MB):

4.0K  active_tx_log
12K   lucene
12K   lucene-fulltext
4.0K  neostore
4.0K  neostore.id
884K  neostore.nodestore.db
4.0K  neostore.nodestore.db.id
2.4M  neostore.propertystore.db
4.0K  neostore.propertystore.db.arrays
4.0K  neostore.propertystore.db.arrays.id
4.0K  neostore.propertystore.db.id
4.0K  neostore.propertystore.db.index
4.0K  neostore.propertystore.db.index.id
4.0K  neostore.propertystore.db.index.keys
4.0K  neostore.propertystore.db.index.keys.id
13M   neostore.propertystore.db.strings
4.0K  neostore.propertystore.db.strings.id
4.0K  neostore.relationshipstore.db
4.0K  neostore.relationshipstore.db.id
4.0K  neostore.relationshiptypestore.db
4.0K  neostore.relationshiptypestore.db.id
4.0K  neostore.relationshiptypestore.db.names
4.0K  neostore.relationshiptypestore.db.names.id
4.0K  nioneo_logical.log.active
4.0K  tm_tx_log.1
17M   total

On Tue, Jun 1, 2010 at 6:50 PM, Biren Gandhi biren.gan...@gmail.com wrote:

While trying to perform a create-only stress test for nodes, I'm constantly getting an Out of Memory error even while running with these params (with default config - as no searching/optimizations are being exercised just yet):

EXTRA_JVM_ARGUMENTS=-d64 -server -Xms256m -Xmx1024m

I'm able to create 200K nodes (without any attributes - bare-bones nodes only) successfully, but with anything above that Java gives up. One strange observation is that the disk size gets capped at 116K in all of the cases while trying to create 100K, 200K, 300K nodes on a fresh DB.

Is there any limit on the number of nodes that can be created in a neo4j instance? Any other tips?