We have a problem where all the work is given to ONE host in our cluster.
That box then goes to 100% CPU while the other boxes sit idle, waiting for
work.

We have an ActiveMQ setup where we create 16 connections to the broker (one
per core), and then one session per thread with a prefetch size > 0.
Right now it's set to 1, but it has been higher in the past.

We run about 200 threads per box, so that's 200 sessions spread across all
16 connections.

So I'm pretty sure what's happening is that all the messages are getting
read into the prefetch buffers on ONE host, and then no other work is
available. This host then just SITS on this work, choking out the consumers
on other hosts.
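
To put rough numbers on that hunch, here is a minimal Java sketch of the
arithmetic. It assumes one consumer per session and that the prefetch size
caps how many unacknowledged messages the broker hands to a consumer at
once. The class and method names are hypothetical, and the code does not
talk to ActiveMQ at all; it only computes an upper bound.

```java
// Hypothetical helper, not part of ActiveMQ: back-of-the-envelope math only.
class PrefetchCapacity {

    // Upper bound on the messages one host can hold at a time, assuming
    // one consumer per session and a per-consumer prefetch limit.
    static long hostCapacity(int sessionsPerHost, int prefetchPerSession) {
        return (long) sessionsPerHost * prefetchPerSession;
    }

    // True when the whole backlog is small enough to fit on a single host,
    // which is the situation where other hosts could be left with nothing.
    static boolean fitsOnOneHost(long backlog, int sessionsPerHost, int prefetchPerSession) {
        return backlog <= hostCapacity(sessionsPerHost, prefetchPerSession);
    }

    public static void main(String[] args) {
        int sessionsPerHost = 200; // from the setup described above
        for (int prefetch : new int[] {1, 10, 100}) {
            System.out.println("prefetch=" + prefetch
                    + " -> up to " + hostCapacity(sessionsPerHost, prefetch)
                    + " messages held by one host");
        }
    }
}
```

With 200 sessions and a prefetch of 1, that bound is 200 messages, so any
backlog of 200 or fewer could in principle end up entirely on one box.
Whether it actually does depends on how the broker dispatches across
consumers, which this sketch does not model.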

Is there a way to flatten this out, either by slowing down specific hosts
or by rate limiting the total number of messages given out per host?

This would help spread work throughout the cluster.

Kevin

-- 

We’re hiring if you know of any awesome Java Devops or Linux Operations
Engineers!

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
