I think it would be very helpful to the Neo4j community if the Neo4j team did some testing and benchmarking on a variety of commonly used OS platforms and published specific tuning suggestions and "known issues". It seems that platform-specific I/O, threading and memory issues are what lead to performance concerns and to differences in reliability under heavy load. I'm sure it isn't an easy problem to solve, but the reality is that Neo4j will be deployed on at least Windows and the main Linux flavors.
Thoughts?

----- Reply message -----
From: "Marco Gerber" <mger...@junisphere.net>
Date: Mon, May 30, 2011 10:59 am
Subject: [Neo4j] performance issues with ubuntu
To: "Neo4j user discussions" <user@lists.neo4j.org>

Hello everybody

After profiling, I found the bottleneck in my application. Each traversal runs within a transaction, and with a likelihood of 50% a property gets written on every node on the traversal's path. Note that the traversal's path has a depth of ~8 nodes. This results in ~4 writes inside one transaction before commit. When I disable updating properties, the traversal execution time is much, much better (thousands of times faster - really nice).

The second thing I found is the following: when I set up the database (always the same one) on a ramdisk, the traversal terminates around 300 times faster than on physical disk.

These two things bring me to the conclusion that there is a lot of optimization possible in how I configure Linux. One of the things I did was to disable log rotation as described in [1], but without the expected outcome.

What else can I do to enhance the throughput to the underlying disk? Any suggestions?

[1] http://wiki.neo4j.org/content/Performance_Guide#Write_performance

Cheers,
Marco

-----Original Message-----
From: user-boun...@lists.neo4j.org on behalf of Marco Gerber
Sent: Mon 30.05.2011 11:47
To: Neo4j user discussions
Subject: Re: [Neo4j] performance issues with ubuntu

Hi Peter

Thank you for your answer. A node contains +/- 4 Strings with on average 50 characters. I ran the tests around 50 times without interruption. I also ran the Linux performance benchmark as described here [1]. After setting the sysctl values as described, the benchmark reached half of the values listed in [1] (before doing this, the benchmark reached one tenth of the values). But even after doing this, nothing changed for my application.
I'm a little curious about the low CPU usage in all the test cases. I conclude from this that the time is mostly spent somewhere else. In short: my application sleeps most of the time instead of sweating :-).

I have also started moving my application from Ubuntu to CentOS 5.5, on comparable hardware. The result was that my application ran about 5 times faster than on Ubuntu, but it is still not where it should be.

What kind of profiling software do you use to test Neo4j? Maybe I'm able to find something with this approach. I already tried the Eclipse Profiler Plugin from [2], but it fails because of a version mismatch with STS 2.6.RELEASE, and it seems that it is no longer maintained.

[1] http://wiki.neo4j.org/content/Linux_Performance_Guide
[2] http://eclipsecolorer.sourceforge.net/index_profiler.html

Cheers,
Marco

-----Original Message-----
From: user-boun...@lists.neo4j.org on behalf of Peter Neubauer
Sent: Sat 28.05.2011 20:00
To: Neo4j user discussions
Subject: Re: [Neo4j] performance issues with ubuntu

Mmh, this is very strange. How much data is associated with the nodes? The amount of data you are dealing with is very small, so I don't think the underlying IO has anything to do with it here, unless you are running a lot of cold tests without warming up. In your case, the whole dataset fits easily into your JVM heap, so please make sure you are running the tests several times - that is your production scenario. Is that giving better performance?

Cheers,

/peter neubauer

GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer

http://www.neo4j.org - Your high performance graph database.
http://startupbootcamp.org/ - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
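Peter's advice to run the tests several times, so that the measurements reflect a warm cache rather than cold starts, can be sketched as a small timing harness. This is a minimal illustration in plain Java, not code from the thread: the class name `WarmupTimer`, the helpers `measure` and `median`, and the placeholder workload are all hypothetical, and the real task would be one of the traversals under test.

```java
import java.util.Arrays;

class WarmupTimer {

    /**
     * Runs the task warmupRuns times without timing it, then returns the
     * wall-clock nanoseconds of each of the measuredRuns that follow.
     */
    static long[] measure(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // cold runs: fill caches and let the JIT compile hot paths
        }
        long[] nanos = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    /** The median is less sensitive than the mean to a single GC pause or disk stall. */
    static long median(long[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no measurements");
        }
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    public static void main(String[] args) {
        // Hypothetical placeholder workload; replace with one traversal.
        Runnable task = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i % 7;
            }
            if (sum < 0) {
                throw new IllegalStateException("unreachable");
            }
        };
        long[] runs = measure(task, 10, 50);
        System.out.println("median ns per run: " + median(runs));
    }
}
```

Comparing the median of the measured runs with the time of the very first run gives a rough idea of how much of a slowdown is a cold-start effect and how much remains once everything is warm.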
On Fri, May 27, 2011 at 4:35 PM, Marco Gerber <mger...@junisphere.net> wrote:
> Hello everybody
>
> I have heavy performance issues with the following setup:
>
> 1010 Nodes in use
> 1052 RelationshipIds in use
>
> The depth of the graph is set to 8 (from factory nodes on). The traversal
> of the graph starts from around 300 nodes synchronously (one traversal at
> a time). On average, 21 such traversals/s are possible, and on average
> 77 nodes/s are traversed. This happens even if I provide more heap
> memory (Xmx2G) to the executing JVM or enlarge the memory mapping sizes. The
> rest of my setup is listed under [1].
>
> The same code, running on a Windows 7 64-bit installation (with similar CPU
> power and installed memory), is on average 30 times faster. What could be the
> root of such performance differences?
>
> Thanks, and let me know if you need more information to help me solve this
> problem,
> Marco
>
> [1]:
> Cpu(s):  6.4%us,  3.1%sy,  0.2%ni, 66.6%id, 23.6%wa,  0.0%hi,  0.1%si,  0.0%st
> Mem:   5980156k total,  5719900k used,   260256k free,   251656k buffers
> Swap: 15624188k total,        0k used, 15624188k free,  3649388k cached
>
> - Ubuntu 10.10
> - java version: "1.6.0_24" ; Java(TM) SE Runtime Environment (build
>   1.6.0_24-b07) ; Java HotSpot(TM) Server VM (build 19.1-b02, mixed mode)
> - VM arguments: -server -Xmx2G
> - neo4j properties vary between default and:
>   neostore.nodestore.db.mapped_memory=500M
>   neostore.relationshipstore.db.mapped_memory=500M
>   neostore.propertystore.db.mapped_memory=500M
>   neostore.propertystore.db.index.mapped_memory=1M
>   neostore.propertystore.db.index.keys.mapped_memory=1M
>   neostore.propertystore.db.strings.mapped_memory=130M
>   neostore.propertystore.db.arrays.mapped_memory=130M
>   use_adaptive_cache=YES
>   adaptive_cache_heap_ratio=0.77
>   adaptive_cache_manager_decrease_ratio=1.15
>   adaptive_cache_manager_increase_ratio=1.1
>   adaptive_cache_worker_sleep_time=3000
>   min_node_cache_size=1000
>   min_relationship_cache_size=0
>   max_node_cache_size=1500
>   max_relationship_cache_size=10500
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
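Marco's two observations earlier in the thread (disabling property writes, or moving the store to a ramdisk, each make the traversals far faster) are consistent with the time going into forcing small writes to the physical disk at each commit, though the thread does not confirm this. The following is a minimal sketch in plain Java, not Neo4j code, for checking how expensive frequent forced writes are on a given OS and filesystem. The class `FsyncCost`, the method `writeRecords`, the 64-byte record size and the batch sizes are all assumptions chosen for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class FsyncCost {

    /**
     * Writes `records` 64-byte records to `file` and forces them to the
     * storage device after every `recordsPerForce` records (and once more at
     * the end if records remain). Returns the number of force() calls issued.
     */
    static int writeRecords(Path file, int records, int recordsPerForce) throws IOException {
        if (recordsPerForce < 1) {
            throw new IllegalArgumentException("recordsPerForce must be >= 1");
        }
        int forces = 0;
        ByteBuffer record = ByteBuffer.allocate(64); // illustrative record size
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            for (int i = 1; i <= records; i++) {
                record.clear();
                while (record.hasRemaining()) {
                    ch.write(record);
                }
                if (i % recordsPerForce == 0 || i == records) {
                    ch.force(false); // flush file content (not metadata) to the device
                    forces++;
                }
            }
        }
        return forces;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("fsync-cost", ".bin");
        try {
            // 1 = force after every record, 4 = roughly one traversal's writes, 100 = large batch
            for (int batch : new int[] {1, 4, 100}) {
                long start = System.nanoTime();
                int forces = writeRecords(file, 1000, batch);
                long ms = (System.nanoTime() - start) / 1_000_000;
                System.out.println(batch + " records per force: " + forces + " forces, " + ms + " ms");
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Running this on the Ubuntu, CentOS and Windows machines, and once with the temporary directory on a ramdisk, would show whether the platforms differ mainly in the cost of a forced write. If they do, the filesystem, its mount options and the disk's write cache may be better places to look than the JVM or cache settings.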