We have a problem where all work is given to ONE host in our cluster. That box then goes to 100% CPU while the other boxes sit idle, starved for work.
We have an ActiveMQ setup where we create 16 connections to ActiveMQ (one per core), and then one session per thread with a prefetch size > 0. Right now it's set to 1, but it has been higher in the past. We run about 200 threads per box, so that's 200 sessions across all 16 connections.

So I'm pretty sure what's happening is that all the messages are getting read into the prefetch on ONE host, and then no other work is available. This host then just SITS on this work, choking out other consumers on other hosts.

Is there a way to flatten this out, either by slowing specific hosts or by rate limiting the total messages given out per host? This would help spread work throughout the cluster.

Kevin

Founder/CEO Spinn3r.com
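
To make the arithmetic behind this suspicion concrete, here is a rough sketch in Java. It assumes each session has one consumer that can hold up to the prefetch limit in undelivered or unacknowledged messages, so one box can hold up to sessions × prefetch messages at a time. The class name, the method names, the host name, and the backlog figure of 150 are made up for illustration. The `jms.prefetchPolicy.queuePrefetch` URI option is the standard ActiveMQ way to set the queue prefetch on a connection, and 61616 is the default OpenWire port; whether it matches this exact setup is an assumption.

```java
public class PrefetchMath {
    // Upper bound on messages one host can be holding at once, assuming
    // one consumer per session and each consumer buffering up to `prefetch`.
    public static int maxHeldPerHost(int sessionsPerHost, int prefetch) {
        return sessionsPerHost * prefetch;
    }

    // True when a single host's prefetch buffers are large enough to
    // absorb the whole backlog, leaving nothing for the other hosts.
    public static boolean oneHostCanHoldAll(int backlog, int sessionsPerHost, int prefetch) {
        return backlog <= maxHeldPerHost(sessionsPerHost, prefetch);
    }

    // Builds a broker URI that sets the queue prefetch explicitly.
    // The host is a placeholder; 61616 is the default OpenWire port.
    public static String brokerUrl(String host, int prefetch) {
        return "tcp://" + host + ":61616?jms.prefetchPolicy.queuePrefetch=" + prefetch;
    }

    public static void main(String[] args) {
        int sessions = 200;  // threads (and sessions) per box, from the setup above
        int prefetch = 1;    // current prefetch setting
        int backlog = 150;   // hypothetical number of messages waiting on the queue

        System.out.println("max held per host: " + maxHeldPerHost(sessions, prefetch));
        System.out.println("one host can hold all: " + oneHostCanHoldAll(backlog, sessions, prefetch));
        System.out.println("broker url: " + brokerUrl("broker.example.com", prefetch));
    }
}
```

Even with prefetch at 1, a box running 200 sessions can hold up to 200 messages, so any backlog smaller than that could in principle land on a single host if its consumers are served first.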