Re: Thousands of workers waiting to receive jobs but ActiveMQ isn’t sending them.

Kevin Burton Wed, 11 Mar 2015 13:31:57 -0700

Yes.  I’m sorry. I meant the stock configuration so no slow consumer
strategy!



On Wed, Mar 11, 2015 at 12:56 PM, Tim Bain <tb...@alumni.duke.edu> wrote:

> There isn't a stock slow consumer strategy.  If you didn't configure it,
> you don't have one.
>
> On Wed, Mar 11, 2015 at 1:49 PM, Kevin Burton <bur...@spinn3r.com> wrote:
>
> > I didn’t configure one… so we’re running the stock one. I’m trying to
> > verify if it’s just a specific task not committing messages but for some
> > reason it processes them in batches.
> >
> > If it wasn’t committing the messages I would expect it to lock up
> > permanently.
> >
> > One thing that could be nice for ActiveMQ is a way to easily write a
> > report on potential problems with the stack.
> >
> > So for example, if you have slow consumers or consumers that have been
> > pending an ack for a LONG time this would be printed in the report.
> >
> > This way if the report comes back with *nothing* you can be fairly
> > certain it’s a bug in your code.
> >
> > But right now what happens is that ActiveMQ refuses to deliver messages
> > and I have no easy way to tell *why*.
> >
> >
> >
> > On Wed, Mar 11, 2015 at 12:40 PM, Tim Bain <tb...@alumni.duke.edu>
> wrote:
> >
> > > Out of curiosity, which slow consumer strategy did you configure?
> > >
> > > On Wed, Mar 11, 2015 at 11:45 AM, Kevin Burton <bur...@spinn3r.com>
> > wrote:
> > >
> > > > The problem is that this is happening in production but not locally.
> > > >
> > > > Setting prefetchPolicy to zero fixed it for one of our tasks, but
> > > > another one of our tasks is still waiting for work and I can’t
> > > > figure out why.
> > > >
> > > > It has 6000 consumers / threads, 200k messages waiting to be
> > > > processed, but none are being executed.
> > > >
> > > > The workers themselves are all waiting for ActiveMQ to send the
> > > > messages.
> > > >
> > > > And none of our connections are marked slow (I think, still verifying
> > > > but it takes forever).
> > > >
> > > > On Wed, Mar 11, 2015 at 5:33 AM, Tim Bain <tb...@alumni.duke.edu>
> > wrote:
> > > >
> > > > > I'd definitely set breakpoints and step through with a debugger to
> > > > > try to figure out what's going on.
> > > > > On Mar 10, 2015 8:19 PM, "Kevin Burton" <bur...@spinn3r.com>
> wrote:
> > > > >
> > > > > > This is exceedingly bizarre.  Now ActiveMQ is refusing to deliver
> > > > > > ANY messages to my workers.
> > > > > >
> > > > > > This is very bizarre, no code has changed.  Nothing.  It’s just
> > > > > > refusing to give work.
> > > > > >
> > > > > > If I set the prefetch to 0 or 1, it does work for a few moments,
> > > > > > then halts.
> > > > > >
> > > > > > 99% certain I’m committing all my messages.  As it would make
> > > > > > sense that nothing could be processed after that of course.
> > > > > >
> > > > > > Kevin
> > > > > >
> > > > > >
> > > > > > On Tue, Mar 10, 2015 at 6:53 PM, Tim Bain <tb...@alumni.duke.edu
> >
> > > > wrote:
> > > > > >
> > > > > > > If you make a single consumer, you'll only get one message at a
> > > > > > > time by default (so only one thread will be doing any work).
> > > > > > > You'd have to use client acknowledgement or selective
> > > > > > > acknowledgement to get more than one message at a time.  I'd
> > > > > > > probably leave many consumers but tune down your prefetch
> > > > > > > buffers to something relatively small to ensure that the
> > > > > > > workload is evenly spread and you don't have some consumers
> > > > > > > with a large prefetch buffer worth of backlog while others sit
> > > > > > > around idle.
> > > > > > >
> > > > > > > But if you're seeing the broker report pending messages, then
> > > > > > > that means that having unbalanced workloads due to large
> > > > > > > prefetch buffers isn't your problem...  Pending messages on the
> > > > > > > broker only occur when those messages can't be dispatched to
> > > > > > > any consumer because all of their prefetch buffers are full,
> > > > > > > which would mean that you don't have unbalanced workloads.
> > > > > > > You'll get one or the other, not both.
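
The "one or the other, not both" argument above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not ActiveMQ's actual dispatch logic: the backlog is already queued, consumers subscribe one after another, each is filled up to its prefetch limit as it subscribes, and nobody acks during the pass. The class and method names (PrefetchSimulation, dispatch, idleConsumers) are made up for illustration.

```java
import java.util.Arrays;

public class PrefetchSimulation {

    /** Outcome of one dispatch pass in the toy model. */
    public static final class Result {
        public final int[] buffered;       // messages sitting in each consumer's prefetch buffer
        public final int pendingOnBroker;  // messages the broker could not hand to anyone

        Result(int[] buffered, int pendingOnBroker) {
            this.buffered = buffered;
            this.pendingOnBroker = pendingOnBroker;
        }

        /** Consumers that received nothing at all. */
        public int idleConsumers() {
            int idle = 0;
            for (int b : buffered) {
                if (b == 0) {
                    idle++;
                }
            }
            return idle;
        }
    }

    /**
     * Toy model (an assumption, not broker code): consumers subscribe in order
     * and each one is filled up to its prefetch limit from the existing backlog.
     */
    public static Result dispatch(int messages, int consumers, int prefetch) {
        int[] buffered = new int[consumers];
        int remaining = messages;
        for (int c = 0; c < consumers; c++) {
            int take = Math.min(prefetch, remaining);
            buffered[c] = take;
            remaining -= take;
        }
        return new Result(buffered, remaining);
    }

    public static void main(String[] args) {
        // Large prefetch: the backlog is swallowed by the first consumers.
        Result large = dispatch(3000, 6, 1000);
        System.out.println("prefetch=1000 buffered=" + Arrays.toString(large.buffered)
                + " pending=" + large.pendingOnBroker + " idle=" + large.idleConsumers());

        // Small prefetch: every buffer is full and the rest waits on the broker.
        Result small = dispatch(3000, 6, 10);
        System.out.println("prefetch=10 buffered=" + Arrays.toString(small.buffered)
                + " pending=" + small.pendingOnBroker + " idle=" + small.idleConsumers());
    }
}
```

With 3000 messages and 6 consumers, a prefetch of 1000 leaves three consumers idle and nothing pending on the broker, while a prefetch of 10 leaves no consumer idle and 2940 messages pending. In this model idle consumers and a broker-side backlog never show up together, which is the point being made above.
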
> > > > > > >
> > > > > > > On Tue, Mar 10, 2015 at 7:42 PM, Kevin Burton <
> > bur...@spinn3r.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > > I’m actually wondering if this is my issue. I’m creating one
> > > > > > > > session per thread.  So perhaps some of the threads have work
> > > > > > > > to do, but they’re each prefetching a bunch of work when in
> > > > > > > > reality a better strategy might be to have one master listener
> > > > > > > > and then dispatch messages to each thread.
> > > > > > > >
> > > > > > > > On Tue, Mar 10, 2015 at 6:37 PM, Kevin Burton <
> > > bur...@spinn3r.com>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > OK.  That’s good to know.  I have a large number of
> > > > > > > > > connections so I have to look at each one.  I wonder if this
> > > > > > > > > could also be the issue.  AKA too many connections.
> > > > > > > > >
> > > > > > > > > On Tue, Mar 10, 2015 at 6:19 PM, Tim Bain <
> > > tb...@alumni.duke.edu
> > > > >
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > >> You should be able to confirm that the prefetch buffers are
> > > > > > > > > >> empty by inspecting the JMX MBeans on the broker.  Look at the
> > > > > > > > > >> consumers for the destination, and for each one look at its
> > > > > > > > > >> DispatchedQueueSize attribute.
> > > > > > > > >>
> > > > > > > > > >> Keep in mind that slow consumers are identified *ONLY* if you
> > > > > > > > > >> configure one of the abort strategies.  If you didn't set that
> > > > > > > > > >> up, don't expect any slow consumer identification log lines.
> > > > > > > > > >> And if you did, I've never seen a situation where a consumer
> > > > > > > > > >> went slow and a log line didn't happen (using the
> > > > > > > > > >> AbortSlowConsumerStrategy; I haven't used
> > > > > > > > > >> AbortSlowAckConsumerStrategy and can't vouch for it); we get
> > > > > > > > > >> those log lines pretty frequently.  So if you're not seeing
> > > > > > > > > >> broker-side log lines about consumers being identified as slow
> > > > > > > > > >> and then aborted, I'd bet it's simply not happening.
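
With thousands of consumers, clicking through the MBeans by hand is slow, so the JMX check described above can be scripted. What follows is a minimal sketch using only the JDK's javax.management classes. It assumes the broker exposes remote JMX, that the broker is named "localhost", that the queue is called "jobs", and that the consumer MBeans follow the object-name layout used by recent ActiveMQ 5.x releases (type=Broker, destinationType, destinationName, endpoint=Consumer). The URL, broker name, queue name, and class name are placeholders to adjust, and the name layout should be verified in jconsole first.

```java
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class DispatchedQueueSizeDump {

    /** Pattern matching every consumer MBean on one queue (name layout is an assumption). */
    public static ObjectName consumerPattern(String brokerName, String queueName) throws Exception {
        return new ObjectName("org.apache.activemq:type=Broker,brokerName=" + brokerName
                + ",destinationType=Queue,destinationName=" + queueName
                + ",endpoint=Consumer,*");
    }

    /** Reads DispatchedQueueSize from every MBean matching the pattern, keyed by MBean name. */
    public static Map<String, Object> dispatchedSizes(MBeanServerConnection conn, ObjectName pattern)
            throws Exception {
        Map<String, Object> sizes = new TreeMap<>();
        for (ObjectName name : conn.queryNames(pattern, null)) {
            sizes.put(name.getCanonicalName(), conn.getAttribute(name, "DispatchedQueueSize"));
        }
        return sizes;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder endpoint: pass your broker's JMX service URL as the first argument.
        String url = args.length > 0 ? args[0]
                : "service:jmx:rmi:///jndi/rmi://localhost:1099/jmxrmi";
        try (JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(url))) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            // "localhost" and "jobs" are placeholder broker and queue names.
            dispatchedSizes(conn, consumerPattern("localhost", "jobs"))
                    .forEach((name, size) -> System.out.println(size + "\t" + name));
        } catch (IOException e) {
            System.err.println("Could not reach the broker's JMX endpoint: " + e.getMessage());
        }
    }
}
```

If every consumer reports a DispatchedQueueSize of 0 while the queue still shows a backlog, the messages are stuck on the broker side. If the values sit at the prefetch limit, the consumers are holding messages they have not acknowledged.
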
> > > > > > > > >>
> > > > > > > > >> On Tue, Mar 10, 2015 at 7:10 PM, Kevin Burton <
> > > > bur...@spinn3r.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > > >>
> > > > > > > > > >> > The broker.  I’ll assume the prefetch buffers are empty.  I’m
> > > > > > > > > >> > looking into debugging that now but I don’t have tools to
> > > > > > > > > >> > introspect.
> > > > > > > > > >> >
> > > > > > > > > >> > The broker has thousands of messages.
> > > > > > > > > >> >
> > > > > > > > > >> > I just confirmed that a restart DOES improve the situation.
> > > > > > > > > >> >
> > > > > > > > > >> > It’s possible that they’re being marked as slow consumers but
> > > > > > > > > >> > not *logged* as such so I’m trying to use JMX to dump the
> > > > > > > > > >> > sessions.
> > > > > > > > >> >
> > > > > > > > >> > On Tue, Mar 10, 2015 at 5:58 PM, Tim Bain <
> > > > > tb...@alumni.duke.edu>
> > > > > > > > >> wrote:
> > > > > > > > >> >
> > > > > > > > > >> > > Are the messages getting hung up in the broker or in the
> > > > > > > > > >> > > client?  (Do the consumers have empty or full prefetch
> > > > > > > > > >> > > buffers?)
> > > > > > > > >> > >
> > > > > > > > >> > > On Tue, Mar 10, 2015 at 6:47 PM, Kevin Burton <
> > > > > > bur...@spinn3r.com
> > > > > > > >
> > > > > > > > >> > wrote:
> > > > > > > > >> > >
> > > > > > > > > >> > > > I’m still trying to track down some issues with ActiveMQ …
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > One is that I have 5 ActiveMQ servers now, and each one has
> > > > > > > > > >> > > > about 3000 messages pending.  So 15000 messages in queues.
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > These are non-persistent queues, plenty of memory and plenty
> > > > > > > > > >> > > > of CPU, but the workers are just blocked waiting to receive
> > > > > > > > > >> > > > work.
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > I had a hypothesis that this could be slow workers, but after
> > > > > > > > > >> > > > tuning some things I no longer receive any errors about slow
> > > > > > > > > >> > > > workers.
> > > > > > > > > >> > > >
> > > > > > > > > >> > > > Restarting the daemons doesn’t fix things either.  Anything
> > > > > > > > > >> > > > else it could be?  I’m a bit stumped unfortunately.
> > > > > > > > >> > > >
> > > > > > > > >> > > > --
> > > > > > > > >> > > >
> > > > > > > > >> > > > Founder/CEO Spinn3r.com
> > > > > > > > >> > > > Location: *San Francisco, CA*
> > > > > > > > >> > > > blog: http://burtonator.wordpress.com
> > > > > > > > >> > > > … or check out my Google+ profile
> > > > > > > > >> > > > <
> https://plus.google.com/102718274791889610666/posts>
> > > > > > > > >> > > > <http://spinn3r.com>
> > > > > > > > >> > > >
> > > > > > > > >> > >
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >> > --
> > > > > > > > >> >
> > > > > > > > >> > Founder/CEO Spinn3r.com
> > > > > > > > >> > Location: *San Francisco, CA*
> > > > > > > > >> > blog: http://burtonator.wordpress.com
> > > > > > > > >> > … or check out my Google+ profile
> > > > > > > > >> > <https://plus.google.com/102718274791889610666/posts>
> > > > > > > > >> > <http://spinn3r.com>
> > > > > > > > >> >
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > >
> > > > > > > > > Founder/CEO Spinn3r.com
> > > > > > > > > Location: *San Francisco, CA*
> > > > > > > > > blog: http://burtonator.wordpress.com
> > > > > > > > > … or check out my Google+ profile
> > > > > > > > > <https://plus.google.com/102718274791889610666/posts>
> > > > > > > > > <http://spinn3r.com>
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > >
> > > > > > > > Founder/CEO Spinn3r.com
> > > > > > > > Location: *San Francisco, CA*
> > > > > > > > blog: http://burtonator.wordpress.com
> > > > > > > > … or check out my Google+ profile
> > > > > > > > <https://plus.google.com/102718274791889610666/posts>
> > > > > > > > <http://spinn3r.com>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Founder/CEO Spinn3r.com
> > > > > > Location: *San Francisco, CA*
> > > > > > blog: http://burtonator.wordpress.com
> > > > > > … or check out my Google+ profile
> > > > > > <https://plus.google.com/102718274791889610666/posts>
> > > > > > <http://spinn3r.com>
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Founder/CEO Spinn3r.com
> > > > Location: *San Francisco, CA*
> > > > blog: http://burtonator.wordpress.com
> > > > … or check out my Google+ profile
> > > > <https://plus.google.com/102718274791889610666/posts>
> > > > <http://spinn3r.com>
> > > >
> > >
> >
> >
> >
> > --
> >
> > Founder/CEO Spinn3r.com
> > Location: *San Francisco, CA*
> > blog: http://burtonator.wordpress.com
> > … or check out my Google+ profile
> > <https://plus.google.com/102718274791889610666/posts>
> > <http://spinn3r.com>
> >
>



-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>
