Re: Thousands of workers waiting to receive jobs but ActiveMQ isn’t sending them.

Kevin Burton Thu, 12 Mar 2015 02:22:42 -0700

The problem is that this is happening in production but not locally.

Setting prefetchPolicy to zero fixed it for one of our tasks, but another
one of our tasks is still waiting for work and I can’t figure out why.
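
For reference, a sketch of the two usual client-side places to set a queue prefetch of zero, using the option names from the ActiveMQ prefetch documentation. The host, port and queue name are placeholders, not values from this setup:

```
Connection URI (applies to every queue consumer on the connection):
    tcp://broker-host:61616?jms.prefetchPolicy.queuePrefetch=0

Destination name (applies only to consumers of that queue):
    WORK.QUEUE?consumer.prefetchSize=0
```

With a prefetch of zero the consumer polls the broker for one message at a time instead of having messages pushed to it, so it may be worth confirming that the task which is still stuck actually picked up the setting.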


It has 6000 consumers / threads, 200k messages waiting to be processed, but
none are being executed.

The workers themselves are all waiting for ActiveMQ to send the messages.

And none of our connections are marked slow (I think, still verifying but
it takes forever).

On Wed, Mar 11, 2015 at 5:33 AM, Tim Bain <tb...@alumni.duke.edu> wrote:

> I'd definitely set breakpoints and step through with a debugger to try to
> figure out what's going on.
> On Mar 10, 2015 8:19 PM, "Kevin Burton" <bur...@spinn3r.com> wrote:
>
> > This is exceedingly bizarre.  Now ActiveMQ is refusing to deliver ANY
> > messages to my workers.
> >
> > This is very bizarre, no code has changed.  Nothing.  It’s just refusing
> > to give work.
> >
> > If I set the prefetch to 0 or 1, it does work for a few moments, then
> > halts.
> >
> > I’m 99% certain I’m committing all my messages.  If I weren’t, it would
> > make sense that nothing could be processed after that, of course.
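
One possible reading of the "works for a few moments, then halts" symptom, sketched as a toy model rather than a diagnosis: the prefetch limit caps how many dispatched-but-unacknowledged messages a consumer may hold, so a consumer that never commits stops receiving once that window is full. The class and method names below are made up for illustration, this is not ActiveMQ's dispatch code, and it only models a prefetch of 1 or more (a prefetch of 0 switches the consumer to pulling).

```java
/** Toy model of a prefetch window; an illustration only, not ActiveMQ's dispatch code. */
final class PrefetchWindow {
    private final int prefetch;  // max dispatched-but-unacknowledged messages
    private int backlog;         // messages still waiting on the broker
    private int unacked;         // delivered to the consumer, not yet committed

    PrefetchWindow(int prefetch, int backlog) {
        this.prefetch = prefetch;
        this.backlog = backlog;
    }

    /** Broker side: hand over one message only while the window has room. */
    boolean tryDispatch() {
        if (backlog == 0 || unacked >= prefetch) {
            return false;
        }
        backlog--;
        unacked++;
        return true;
    }

    /** Consumer side: a commit acknowledges everything delivered so far. */
    void commit() {
        unacked = 0;
    }

    /**
     * Returns how many messages are delivered before dispatch stops.
     * commitEvery == 0 models a consumer that never commits.
     */
    static int delivered(int prefetch, int backlog, int commitEvery) {
        PrefetchWindow window = new PrefetchWindow(prefetch, backlog);
        int count = 0;
        while (window.tryDispatch()) {
            count++;
            if (commitEvery > 0 && count % commitEvery == 0) {
                window.commit();
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(delivered(1, 3000, 0));  // never commits
        System.out.println(delivered(1, 3000, 1));  // commits after every message
    }
}
```

With a prefetch of 1 and a backlog of 3000, the run that never commits prints 1 and the run that commits after every message prints 3000.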
> >
> > Kevin
> >
> >
> > On Tue, Mar 10, 2015 at 6:53 PM, Tim Bain <tb...@alumni.duke.edu> wrote:
> >
> > > If you make a single consumer, you'll only get one message at a time by
> > > default (so only one thread will be doing any work).  You'd have to use
> > > client acknowledgement or selective acknowledgement to get more than one
> > > message at a time.  I'd probably leave many consumers but tune down your
> > > prefetch buffers to something relatively small to ensure that the
> > > workload is evenly spread and you don't have some consumers with a large
> > > prefetch buffer worth of backlog while others sit around idle.
> > >
> > > But if you're seeing the broker report pending messages, then that means
> > > that having unbalanced workloads due to large prefetch buffers isn't your
> > > problem...  Pending messages on the broker only occur when those messages
> > > can't be dispatched to any consumer because all of their prefetch buffers
> > > are full, which would mean that you don't have unbalanced workloads.
> > > You'll get one or the other, not both.
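
The "one or the other, not both" argument can be put in numbers with a small toy model. This is only a sketch: the class name and the fill-one-consumer-at-a-time ordering are assumptions chosen as a worst case for balance, not how ActiveMQ actually orders dispatch, and it assumes a prefetch of at least 1. The 200k messages and 6000 consumers are the figures from the latest message in this thread, and 1000 is the documented default queue prefetch.

```java
/** Toy model of prefetch-limited dispatch; not ActiveMQ's real dispatch algorithm. */
final class PrefetchModel {

    /**
     * Fills consumers one after another up to the prefetch limit (a worst case
     * for balance) and returns {consumers holding messages, idle consumers,
     * messages left pending on the broker}.
     */
    static int[] dispatch(int messages, int consumers, int prefetch) {
        int remaining = messages;
        int busy = 0;
        for (int c = 0; c < consumers && remaining > 0; c++) {
            remaining -= Math.min(prefetch, remaining);
            busy++;
        }
        return new int[] {busy, consumers - busy, remaining};
    }

    public static void main(String[] args) {
        int[] large = dispatch(200_000, 6_000, 1_000);
        int[] small = dispatch(200_000, 6_000, 10);
        System.out.printf("prefetch 1000: busy=%d idle=%d pending=%d%n",
                large[0], large[1], large[2]);
        System.out.printf("prefetch 10:   busy=%d idle=%d pending=%d%n",
                small[0], small[1], small[2]);
    }
}
```

In this model a prefetch of 1000 leaves 200 consumers holding everything, 5800 idle and nothing pending on the broker, while a prefetch of 10 leaves no consumer idle and 140000 messages pending. Pending messages only show up once every buffer is full.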
> > >
> > > On Tue, Mar 10, 2015 at 7:42 PM, Kevin Burton <bur...@spinn3r.com> wrote:
> > >
> > > > I’m actually wondering if this is my issue.  I’m creating one session
> > > > per thread.  So perhaps some of the threads have work to do, but they’re
> > > > each prefetching a bunch of work, when in reality a better strategy might
> > > > be to have one master listener and then dispatch messages to each thread.
> > > >
> > > > On Tue, Mar 10, 2015 at 6:37 PM, Kevin Burton <bur...@spinn3r.com> wrote:
> > > >
> > > > > OK.  That’s good to know.  I have a large number of connections so I
> > > > > have to look at each one.  I wonder if this could also be the issue.
> > > > > AKA too many connections.
> > > > >
> > > > > On Tue, Mar 10, 2015 at 6:19 PM, Tim Bain <tb...@alumni.duke.edu> wrote:
> > > > >
> > > > >> You should be able to confirm that the prefetch buffers are empty by
> > > > >> inspecting the JMX MBeans on the broker.  Look at the consumers for
> > > > >> the destination, and for each one look at its DispatchedQueueSize
> > > > >> attribute.
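
With thousands of consumers, clicking through each MBean by hand is slow, so the check described above could be scripted with the JDK's own JMX client. The following is a minimal sketch under several assumptions: the broker has remote JMX enabled, the consumer MBeans follow the `org.apache.activemq:type=Broker,...,endpoint=Consumer,...` naming used by recent 5.x brokers, and each consumer MBean exposes `PrefetchSize` next to the `DispatchedQueueSize` attribute mentioned above. The class name, the `clientId` key and the command-line arguments are illustrative.

```java
import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

/** Prints per-consumer prefetch usage for one queue via the broker's JMX MBeans. */
final class ConsumerBufferDump {

    /** Pattern matching every consumer MBean of the queue (name layout is an assumption). */
    static ObjectName consumerPattern(String queue) {
        try {
            return new ObjectName("org.apache.activemq:type=Broker,brokerName=*,"
                    + "destinationType=Queue,destinationName=" + queue
                    + ",endpoint=Consumer,*");
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("bad queue name: " + queue, e);
        }
    }

    /** Reads the two attributes from every matching consumer MBean and prints them. */
    static void dump(MBeanServerConnection mbeans, String queue) throws Exception {
        Set<ObjectName> consumers = mbeans.queryNames(consumerPattern(queue), null);
        for (ObjectName consumer : consumers) {
            Object dispatched = mbeans.getAttribute(consumer, "DispatchedQueueSize");
            Object prefetch = mbeans.getAttribute(consumer, "PrefetchSize");
            System.out.printf("%s dispatched=%s prefetch=%s%n",
                    consumer.getKeyProperty("clientId"), dispatched, prefetch);
        }
        System.out.println(consumers.size() + " consumers on " + queue);
    }

    public static void main(String[] args) throws Exception {
        // args[0] = host:port of the broker's JMX connector, args[1] = queue name
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + args[0] + "/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            dump(connector.getMBeanServerConnection(), args[1]);
        }
    }
}
```

If every consumer reports a dispatched count of zero while the queue still shows pending messages, the messages are stuck on the broker side rather than sitting in client prefetch buffers.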
> > > > >>
> > > > >> Keep in mind that slow consumers are identified *ONLY* if you
> > > > >> configure one of the abort strategies.  If you didn't set that up,
> > > > >> don't expect any slow consumer identification log lines.  And if you
> > > > >> did, I've never seen a situation where a consumer went slow and a log
> > > > >> line didn't happen (using the AbortSlowConsumerStrategy; I haven't
> > > > >> used AbortSlowAckConsumerStrategy and can't vouch for it); we get
> > > > >> those log lines pretty frequently.  So if you're not seeing
> > > > >> broker-side log lines about consumers being identified as slow and
> > > > >> then aborted, I'd bet it's simply not happening.
> > > > >>
> > > > >> On Tue, Mar 10, 2015 at 7:10 PM, Kevin Burton <bur...@spinn3r.com> wrote:
> > > > >>
> > > > >> > The broker.  I’ll assume the prefetch buffers are empty.  I’m
> > > > >> > looking into debugging that now but I don’t have tools to introspect.
> > > > >> >
> > > > >> > The broker has thousands of messages.
> > > > >> >
> > > > >> > I just confirmed that a restart DOES improve the situation.
> > > > >> >
> > > > >> > It’s possible that they’re being marked as slow consumers but not
> > > > >> > *logged* as such so I’m trying to use JMX to dump the sessions.
> > > > >> >
> > > > >> > On Tue, Mar 10, 2015 at 5:58 PM, Tim Bain <tb...@alumni.duke.edu> wrote:
> > > > >> >
> > > > >> > > Are the messages getting hung up in the broker or in the client?
> > > > >> > > (Do the consumers have empty or full prefetch buffers?)
> > > > >> > >
> > > > >> > > On Tue, Mar 10, 2015 at 6:47 PM, Kevin Burton <bur...@spinn3r.com> wrote:
> > > > >> > >
> > > > >> > > > I’m still trying to track down some issues with ActiveMQ …
> > > > >> > > >
> > > > >> > > > One is that I have 5 ActiveMQ servers now, and each one has
> > > > >> > > > about 3000 messages pending.  So 15000 messages in queues.
> > > > >> > > >
> > > > >> > > > These are non-persistent queues, plenty of memory and plenty
> > > > >> > > > of CPU, but the workers are just blocked waiting to receive work.
> > > > >> > > >
> > > > >> > > > I had a hypothesis that this could be slow workers, but after
> > > > >> > > > tuning some things I no longer receive any errors about slow
> > > > >> > > > workers.
> > > > >> > > >
> > > > >> > > > Restarting the daemons doesn’t fix things either.  Anything
> > > > >> > > > else it could be?  I’m a bit stumped unfortunately.
> > > > >> > > >
> > > > >> > > > --
> > > > >> > > >
> > > > >> > > > Founder/CEO Spinn3r.com
> > > > >> > > > Location: *San Francisco, CA*
> > > > >> > > > blog: http://burtonator.wordpress.com
> > > > >> > > > … or check out my Google+ profile
> > > > >> > > > <https://plus.google.com/102718274791889610666/posts>
> > > > >> > > > <http://spinn3r.com>
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > >
> >
> >
> >
> >
>



-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>
