If you're using ETS, then yes, it makes sense that the majority of the time would be CPU (there's no disk to wait on!). One thing - when you say 60% of available CPU, is that 60% of one core, or 60% of all 12 cores (sometimes reported as 720%)? The latter seems highly unlikely.
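To make the two readings of "60%" concrete, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and it assumes a tool that counts one fully busy core as 100%:

```python
def per_core_percent(percent_of_machine: float, cores: int) -> float:
    """Convert a share of the whole machine (in percent) to the per-core
    scale, where one fully busy core counts as 100%."""
    return percent_of_machine * cores


def percent_of_machine(per_core_pct: float, cores: int) -> float:
    """Convert a per-core-scale figure back to a share of the whole machine."""
    return per_core_pct / cores


if __name__ == "__main__":
    cores = 12  # cores per machine, as described in the thread
    # Reading 1: 60% of the whole 12-core machine, on the per-core scale.
    print(per_core_percent(60, cores))
    # Reading 2: 60% of a single core, as a share of the whole machine.
    print(percent_of_machine(60, cores))
```

Under the first reading the box is busy on roughly seven cores' worth of work; under the second it is almost entirely idle, so it matters which one your monitoring tool is reporting.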
There are some generic things you can do to tweak Erlang and Riak for multicore computers, many of them referenced on this page: http://erlang.org/doc/man/erl.html

Of special note are:

+A - how many OS threads will be in the async thread pool (handles IO, some system calls)
+S - the number of scheduler threads to create (usually 1 thread per CPU)
+sct - specify the CPU topology (useful for NUMA processors like Nehalem)

Sean Cribbs <[email protected]>
Developer Advocate
Basho Technologies, Inc.
http://basho.com/

On Dec 9, 2010, at 5:40 AM, David Dawson wrote:

> Sean,
>
> We have used the ETS backend, to rule out Disk IO as much as possible
> and we are pretty sure that our memory IO is fine, this leaves either Network
> IO being a factor or the erlang VM itself ( highly unlikely ). In further
> testing we also discovered that just using 1 node in the ring gave us 80% cpu
> / 0% wait / 17% idle / 3% sys, which could suggest that network chatter
> between the ring could be the factor, so we are planning to setup the RIAK
> ring so the handoff and chatter traffic is done over a separate NIC, in your
> experiences will this help?
>
> Dave
>
>
> On 8 Dec 2010, at 17:57, Sean Cribbs wrote:
>
>> David,
>>
>> It's expected that, especially when running basho_bench, your cluster will
>> not use that much CPU. Riak is mostly I/O-bound, which is a good thing (vs.
>> wasting CPU not delivering data to you). I would definitely expect network
>> and disk (and possibly RAM) to be the primary limiting factors.
>>
>> Sean Cribbs <[email protected]>
>> Developer Advocate
>> Basho Technologies, Inc.
>> http://basho.com/
>>
>> On Dec 8, 2010, at 12:36 PM, David Dawson wrote:
>>
>>> We are currently running a ring of 3 machines each machine with 12 cores,
>>> and have noticed that we seem to only be using 60% of available CPU ( other
>>> 40% is spent idle ) when running basho bench against the cluster. We have
>>> tried changing the 'n_val' from 3 to 1 and also the backend from bitcask to
>>> ets, and it seems to make no difference. Is this expected?
>>>
>>> Also are there any suggested tuning parameters that should be used when
>>> running riak on machines with lots of cores? and if not what would you
>>> expect the bottleneck to be in a cluster? ( cpu / disk io/ memory / network
>>> io )
>>>
>>> Dave
>>>
>>>
>>>
>>> _______________________________________________
>>> riak-users mailing list
>>> [email protected]
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>
>
