Hi Keith,

Thanks so much for the update and for putting together the new version with
enhancements to flowToDisk. Just to confirm: to get a broker with these
fixes that also supports MultiQueueConsumer, I should take the 6.0.7
release and apply the QPID-7462 patch on top of it?

For the longer-term fix (periodically coalescing byte buffers), QPID-7753,
is it possible to also backport that to the 6.0.x branch? We are working on
migrating all of our monitoring to use the REST API; however, if it's
possible to put this fix in 6.0.x, that would be very helpful.

Thanks
Ramayan



On Wed, May 24, 2017 at 5:53 AM, Keith W <keith.w...@gmail.com> wrote:

> Hi Ramayan
>
> Further changes have been made on the 6.0.x branch that prevent the
> noisy flow to disk messages in the log.   The flow to disk on/off
> messages we had before didn't really fit and this resulted in the logs
> being spammed when either a queue was considered full or memory was
> under pressure.  Now you'll see a VHT-1008 periodically written to the
> logs when flow to disk has been working.  It reports the number of
> bytes that have been evacuated from memory.
>
> The actual mechanism used to prevent the OOM is the same as code you
> tested on May 5th: the flow to disk mechanism is triggered when the
> allocated direct memory size exceeds the flow to disk threshold.
> This evacuates messages from memory, flowing them to disk if
> necessary, until the allocated direct memory falls below the threshold
> again.  In doing so, some buffers that were previously sparse will be
> released.  Expect to see the direct memory graph level off, but it
> won't necessarily fall. This is expected.   In other words, the direct
> graph you shared on 5th does not surprise me.
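>
> To illustrate the mechanism (a rough sketch only -- the names and
> structure below are made up for this mail, not the Broker's actual
> code):
>
>     // Sketch of the direct-memory-triggered evacuation described above.
>     interface StoredMessage {
>         long directMemorySize();
>         void flowToDisk();  // write content to the store, release buffers
>     }
>
>     class DirectMemoryGuard {
>         private long allocated;  // allocated direct memory, in bytes
>         private final java.util.Deque<StoredMessage> inMemory =
>                 new java.util.ArrayDeque<>();
>
>         // Called periodically by housekeeping.
>         void check(long flowToDiskThreshold) {
>             while (allocated > flowToDiskThreshold && !inMemory.isEmpty()) {
>                 StoredMessage m = inMemory.poll();
>                 allocated -= m.directMemorySize();
>                 m.flowToDisk();
>             }
>         }
>     }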
>
> We plan to put out 6.0.7/6.1.3 RCs in the next few days which will
> include these changes.  The patch attached to QPID-7462 has been
> updated so that it will apply.
>
> In the longer term, for 7.0.0 we have already improved the Broker so
> that it actively manages the sparsity of buffers without falling back
> on flow to disk. Memory requirements for use cases such as yours
> should be much more reasonable.
>
> I know you currently have a dependency on the old JMX management
> interface.   I'd suggest you look at eliminating the dependency soon,
> so you are free to upgrade when the time comes.
>
> Kind regards, Keith Wall.
>
> > On 16 May 2017 at 19:58, Ramayan Tiwari <ramayan.tiw...@gmail.com>
> wrote:
> >> Thanks Keith for the update.
> >>
> >> - Ramayan
> >>
> >> On Mon, May 15, 2017 at 2:35 AM, Keith W <keith.w...@gmail.com> wrote:
> >>>
> >>> Hi Ramayan
> >>>
> >>> We are still looking at our approach to the Broker's flow to disk
> >>> feature in light of the defect you highlighted.  We have some work in
> >>> flight this week investigating alternative approaches, which I am
> >>> hoping will conclude by the end of the week.  I should be able to
> >>> update you then.
> >>>
> >>> Thanks Keith
> >>>
> >>> On 12 May 2017 at 20:58, Ramayan Tiwari <ramayan.tiw...@gmail.com>
> wrote:
> >>> > Hi Alex,
> >>> >
> >>> > Any update on the fix for this?
> >>> > QPID-7753 is assigned a fix version for 7.0.0; I am hoping that the
> >>> > fix will also be back-ported to 6.0.x.
> >>> >
> >>> > Thanks
> >>> > Ramayan
> >>> >
> >>> > On Mon, May 8, 2017 at 2:14 AM, Oleksandr Rudyy <oru...@gmail.com>
> >>> > wrote:
> >>> >
> >>> >> Hi Ramayan,
> >>> >>
> >>> >> Thanks for testing the patch and providing feedback.
> >>> >>
> >>> >> Regarding direct memory utilization, the Qpid Broker caches up to
> >>> >> 256MB of direct memory internally in QpidByteBuffers. Thus, when
> >>> >> testing the Broker with only 256MB of direct memory, the entire
> >>> >> direct memory could be cached and it would look as if direct memory
> >>> >> is never released. Potentially, you can reduce the number of buffers
> >>> >> cached on the broker by changing the context variable
> >>> >> 'broker.directByteBufferPoolSize'. By default, it is set to 1000.
> >>> >> With a buffer size of 256K, that gives ~256M of cache.
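> >>> >>
> >>> >> To make that arithmetic concrete (a worked sketch; the names just
> >>> >> mirror the context variables quoted above):
> >>> >>
> >>> >>     int poolSize = 1000;          // broker.directByteBufferPoolSize default
> >>> >>     int bufferSize = 256 * 1024;  // 256K buffers
> >>> >>     long maxCached = (long) poolSize * bufferSize;
> >>> >>     // 262,144,000 bytes, i.e. the ~256M of cache quoted above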
> >>> >>
> >>> >> Regarding introducing lower and upper thresholds for 'flow to
> >>> >> disk': it seems like a good idea and we will try to implement it
> >>> >> early this week on trunk first.
> >>> >>
> >>> >> Kind Regards,
> >>> >> Alex
> >>> >>
> >>> >>
> >>> >> On 5 May 2017 at 23:49, Ramayan Tiwari <ramayan.tiw...@gmail.com>
> >>> >> wrote:
> >>> >>
> >>> >> > Hi Alex,
> >>> >> >
> >>> >> > Thanks for providing the patch. I verified the fix with the same
> >>> >> > perf test, and it does prevent the broker from going OOM; however,
> >>> >> > DM utilization doesn't get any better after hitting the threshold
> >>> >> > (where flow to disk is activated based on total used % across the
> >>> >> > broker - graph in the link below).
> >>> >> >
> >>> >> > After hitting the final threshold, flow to disk activates and
> >>> >> > deactivates pretty frequently across all the queues. The reason
> >>> >> > seems to be that there is currently only one threshold to trigger
> >>> >> > flow to disk. Would it make sense to break this down into high and
> >>> >> > low thresholds - so that once flow to disk is active after hitting
> >>> >> > the high threshold, it stays active until the queue utilization (or
> >>> >> > broker DM allocation) reaches the low threshold? (Sketched below.)
> >>> >> >
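> >>> >> > A minimal sketch of that high/low watermark (hysteresis) idea --
> >>> >> > illustrative names only, not existing broker settings:
> >>> >> >
> >>> >> >     class FlowToDiskHysteresis {
> >>> >> >         private boolean active;
> >>> >> >
> >>> >> >         boolean evaluate(long used, long high, long low) {
> >>> >> >             if (!active && used >= high) {
> >>> >> >                 active = true;   // start flowing to disk
> >>> >> >             } else if (active && used <= low) {
> >>> >> >                 active = false;  // stop once usage has really dropped
> >>> >> >             }
> >>> >> >             return active;
> >>> >> >         }
> >>> >> >     }
> >>> >> >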
> >>> >> > Graph and flow to disk logs are here:
> >>> >> > https://docs.google.com/document/d/1Wc1e-id-WlpI7FGU1Lx8XcKaV8sauRp82T5XZVU-RiM/edit#heading=h.6400pltvjhy7
> >>> >> >
> >>> >> > Thanks
> >>> >> > Ramayan
> >>> >> >
> >>> >> > On Thu, May 4, 2017 at 2:44 AM, Oleksandr Rudyy <oru...@gmail.com
> >
> >>> >> wrote:
> >>> >> >
> >>> >> > > Hi Ramayan,
> >>> >> > >
> >>> >> > > We attached a patch to QPID-7753 with a workaround for the 6.0.x
> >>> >> > > branch. It triggers flow to disk based on direct memory
> >>> >> > > consumption rather than an estimation of the space occupied by
> >>> >> > > the message content. The flow to disk should evacuate message
> >>> >> > > content, preventing running out of direct memory. We have already
> >>> >> > > committed the changes into the 6.0.x and 6.1.x branches. They
> >>> >> > > will be included in the upcoming 6.0.7 and 6.1.3 releases.
> >>> >> > >
> >>> >> > > Please try and test the patch in your environment.
> >>> >> > >
> >>> >> > > We are still working on finishing the fix for trunk.
> >>> >> > >
> >>> >> > > Kind Regards,
> >>> >> > > Alex
> >>> >> > >
> >>> >> > > On 30 April 2017 at 15:45, Lorenz Quack <quack.lor...@gmail.com
> >
> >>> >> wrote:
> >>> >> > >
> >>> >> > > > Hi Ramayan,
> >>> >> > > >
> >>> >> > > > The high-level plan is currently as follows:
> >>> >> > > >  1) Periodically try to compact sparse direct memory buffers.
> >>> >> > > >  2) Increase accuracy of messages' direct memory usage
> >>> >> > > >     estimation to more reliably trigger flow to disk.
> >>> >> > > >  3) Add an additional flow to disk trigger based on the amount
> >>> >> > > >     of allocated direct memory.
> >>> >> > > >
> >>> >> > > > A little bit more detail:
> >>> >> > > >  1) We plan on periodically checking the amount of direct memory
> >>> >> > > >     usage and, if it is above a threshold (50%), we compare the
> >>> >> > > >     sum of all queue sizes with the amount of allocated direct
> >>> >> > > >     memory. If the ratio falls below a certain threshold we
> >>> >> > > >     trigger a compaction task which goes through all queues and
> >>> >> > > >     copies a certain amount of old message buffers into new
> >>> >> > > >     ones, thereby freeing the old buffers so that they can be
> >>> >> > > >     returned to the buffer pool and be reused (sketched after
> >>> >> > > >     this list).
> >>> >> > > >
> >>> >> > > >  2) Currently we trigger flow to disk based on an estimate of
> >>> >> > > >     how much memory the messages on the queues consume. We had
> >>> >> > > >     to use estimates because we did not have accurate size
> >>> >> > > >     numbers for message headers. By having accurate size
> >>> >> > > >     information for message headers we can more reliably
> >>> >> > > >     enforce queue memory limits.
> >>> >> > > >
> >>> >> > > >  3) The flow to disk trigger based on message size had another
> >>> >> > > >     problem which is more pertinent to the current issue. We
> >>> >> > > >     only considered the size of the messages and not how much
> >>> >> > > >     memory we allocate to store those messages. In the FIFO use
> >>> >> > > >     case those numbers will be very close to each other, but in
> >>> >> > > >     use cases like yours we can end up with sparse buffers and
> >>> >> > > >     the numbers will diverge. Because of this divergence we do
> >>> >> > > >     not trigger flow to disk in time and the broker can go OOM.
> >>> >> > > >     To fix the issue we want to add an additional flow to disk
> >>> >> > > >     trigger based on the amount of allocated direct memory.
> >>> >> > > >     This should prevent the broker from going OOM even if the
> >>> >> > > >     compaction strategy outlined above should fail for some
> >>> >> > > >     reason (e.g., the compaction task cannot keep up with the
> >>> >> > > >     arrival of new messages).
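> >>> >> > > >
> >>> >> > > > For step 1, a rough illustration of the compaction idea
> >>> >> > > > (assumed shapes only; the real patch will differ):
> >>> >> > > >
> >>> >> > > >     // Copy a sparse buffer's live bytes into a fresh buffer so
> >>> >> > > >     // the old, mostly-empty one can go back to the pool.
> >>> >> > > >     java.nio.ByteBuffer compact(java.nio.ByteBuffer sparse) {
> >>> >> > > >         java.nio.ByteBuffer fresh =
> >>> >> > > >             java.nio.ByteBuffer.allocateDirect(sparse.remaining());
> >>> >> > > >         fresh.put(sparse.duplicate()); // copy only referenced bytes
> >>> >> > > >         fresh.flip();
> >>> >> > > >         return fresh;  // caller re-points messages here and
> >>> >> > > >                        // releases the old buffer to the pool
> >>> >> > > >     }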
> >>> >> > > >
> >>> >> > > > Currently, there are patches for the above points but they
> >>> >> > > > suffer from some thread-safety issues that need to be addressed.
> >>> >> > > >
> >>> >> > > > I hope this description helps. Any feedback is, as always,
> >>> >> > > > welcome.
> >>> >> > > >
> >>> >> > > > Kind regards,
> >>> >> > > > Lorenz
> >>> >> > > >
> >>> >> > > >
> >>> >> > > >
> >>> >> > > > On Sat, Apr 29, 2017 at 12:00 AM, Ramayan Tiwari <
> >>> >> > > ramayan.tiw...@gmail.com
> >>> >> > > > >
> >>> >> > > > wrote:
> >>> >> > > >
> >>> >> > > > > Hi Lorenz,
> >>> >> > > > >
> >>> >> > > > > Thanks so much for the patch. We have a perf test now to
> >>> >> > > > > reproduce this issue, so we did test with 256KB, 64KB and 4KB
> >>> >> > > > > network byte buffers. None of these configurations help with
> >>> >> > > > > the issue (or give any more breathing room) for our use case.
> >>> >> > > > > We would like to share the perf analysis with the community:
> >>> >> > > > >
> >>> >> > > > > https://docs.google.com/document/d/1Wc1e-id-WlpI7FGU1Lx8XcKaV8sauRp82T5XZVU-RiM/edit?usp=sharing
> >>> >> > > > >
> >>> >> > > > > Feel free to comment on the doc if certain details are
> >>> >> > > > > incorrect or if there are questions.
> >>> >> > > > >
> >>> >> > > > > Since the short term solution doesn't help us, we are very
> >>> >> > > > > interested in getting some details on how the community plans
> >>> >> > > > > to address this; a high level description of the approach will
> >>> >> > > > > be very helpful for us in order to brainstorm our use cases
> >>> >> > > > > along with this solution.
> >>> >> > > > >
> >>> >> > > > > - Ramayan
> >>> >> > > > >
> >>> >> > > > > On Fri, Apr 28, 2017 at 9:34 AM, Lorenz Quack <
> >>> >> > quack.lor...@gmail.com>
> >>> >> > > > > wrote:
> >>> >> > > > >
> >>> >> > > > > > Hello Ramayan,
> >>> >> > > > > >
> >>> >> > > > > > We are still working on a fix for this issue.
> >>> >> > > > > > In the meantime we had an idea to potentially work around
> >>> >> > > > > > the issue until a proper fix is released.
> >>> >> > > > > >
> >>> >> > > > > > The idea is to decrease the qpid network buffer size the
> >>> >> > > > > > broker uses. While this still allows for sparsely populated
> >>> >> > > > > > buffers, it would improve the overall occupancy ratio.
> >>> >> > > > > >
> >>> >> > > > > > Here are the steps to follow (a sizing sketch follows this
> >>> >> > > > > > list):
> >>> >> > > > > >  * ensure you are not using TLS
> >>> >> > > > > >  * apply the attached patch
> >>> >> > > > > >  * figure out the size of the largest messages you are
> >>> >> > > > > >    sending (including header and some overhead)
> >>> >> > > > > >  * set the context variable "qpid.broker.networkBufferSize"
> >>> >> > > > > >    to that value but not smaller than 4096
> >>> >> > > > > >  * test
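> >>> >> > > > > >
> >>> >> > > > > > For illustration only (the overhead constant is a guess you
> >>> >> > > > > > should adjust for your own headers):
> >>> >> > > > > >
> >>> >> > > > > >     class BufferSizing {
> >>> >> > > > > >         static final int AMQP_MIN_FRAME_SIZE = 4096;
> >>> >> > > > > >
> >>> >> > > > > >         static int networkBufferSize(int largestMessageBytes,
> >>> >> > > > > >                                      int headerOverheadBytes) {
> >>> >> > > > > >             return Math.max(AMQP_MIN_FRAME_SIZE,
> >>> >> > > > > >                     largestMessageBytes + headerOverheadBytes);
> >>> >> > > > > >         }
> >>> >> > > > > >     }
> >>> >> > > > > >     // then e.g. start the broker JVM with
> >>> >> > > > > >     //   -Dqpid.broker.networkBufferSize=<value>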
> >>> >> > > > > >
> >>> >> > > > > > Decreasing the qpid network buffer size automatically limits
> >>> >> > > > > > the maximum AMQP frame size. Since you are using a very old
> >>> >> > > > > > client, we are not sure how well it copes with small frame
> >>> >> > > > > > sizes where it has to split a message across multiple frames.
> >>> >> > > > > > Therefore, to play it safe you should not set it smaller than
> >>> >> > > > > > the largest messages (+ header + overhead) you are sending.
> >>> >> > > > > > I do not know what message sizes you are sending, but AMQP
> >>> >> > > > > > imposes the restriction that the frame size cannot be smaller
> >>> >> > > > > > than 4096 bytes. In the qpid broker the default currently is
> >>> >> > > > > > 256 kB.
> >>> >> > > > > >
> >>> >> > > > > > In the current state the broker does not allow setting the
> >>> >> > > > > > network buffer to values smaller than 64 kB, to allow TLS
> >>> >> > > > > > frames to fit into one network buffer. I attached a patch to
> >>> >> > > > > > this mail that lowers that restriction to the limit imposed
> >>> >> > > > > > by AMQP (4096 bytes). Obviously, you should not use this when
> >>> >> > > > > > using TLS.
> >>> >> > > > > >
> >>> >> > > > > >
> >>> >> > > > > > I hope this reduces the problems you are currently facing
> >>> >> > > > > > until we can complete the proper fix.
> >>> >> > > > > >
> >>> >> > > > > > Kind regards,
> >>> >> > > > > > Lorenz
> >>> >> > > > > >
> >>> >> > > > > >
> >>> >> > > > > > On Fri, 2017-04-21 at 09:17 -0700, Ramayan Tiwari wrote:
> >>> >> > > > > > > Thanks so much, Keith and the team, for finding the root
> >>> >> > > > > > > cause. We are so relieved that the root cause will be
> >>> >> > > > > > > fixed shortly.
> >>> >> > > > > > >
> >>> >> > > > > > > Couple of things that I forgot to mention on the
> >>> >> > > > > > > mitigation steps we took in the last incident:
> >>> >> > > > > > > 1) We triggered GC from the JMX bean multiple times; it
> >>> >> > > > > > > did not help in reducing DM allocated.
> >>> >> > > > > > > 2) We also killed all the AMQP connections to the broker
> >>> >> > > > > > > when DM was at 80%. This did not help either. The way we
> >>> >> > > > > > > killed connections: using JMX, we got the list of all the
> >>> >> > > > > > > open AMQP connections and called close from the JMX mbean.
> >>> >> > > > > > >
> >>> >> > > > > > > I am hoping the above two are not related to the root
> >>> >> > > > > > > cause, but wanted to bring it up in case it is relevant.
> >>> >> > > > > > >
> >>> >> > > > > > > Thanks
> >>> >> > > > > > > Ramayan
> >>> >> > > > > > >
> >>> >> > > > > > > On Fri, Apr 21, 2017 at 8:29 AM, Keith W
> >>> >> > > > > > > <keith.w...@gmail.com
> >>> >> >
> >>> >> > > > wrote:
> >>> >> > > > > > >
> >>> >> > > > > > > >
> >>> >> > > > > > > > Hello Ramayan
> >>> >> > > > > > > >
> >>> >> > > > > > > > I believe I understand the root cause of the problem.
> >>> >> > > > > > > > We have identified a flaw in the direct memory buffer
> >>> >> > > > > > > > management employed by Qpid Broker J which for some
> >>> >> > > > > > > > messaging use-cases can lead to the direct memory OOM
> >>> >> > > > > > > > you describe.  For the issue to manifest, the producing
> >>> >> > > > > > > > application needs to use a single connection for the
> >>> >> > > > > > > > production of messages, some of which are short-lived
> >>> >> > > > > > > > (i.e. are consumed quickly) whilst others remain on the
> >>> >> > > > > > > > queue for some time.  Priority queues, sorted queues and
> >>> >> > > > > > > > consumers utilising selectors that result in some
> >>> >> > > > > > > > messages being left on the queue could all produce this
> >>> >> > > > > > > > pattern.  The pattern leads to sparsely occupied 256K
> >>> >> > > > > > > > net buffers which cannot be released or reused until
> >>> >> > > > > > > > every message that references a 'chunk' of them is
> >>> >> > > > > > > > either consumed or flown to disk.  The problem was
> >>> >> > > > > > > > introduced with Qpid v6.0 and exists in v6.1 and trunk
> >>> >> > > > > > > > too.
> >>> >> > > > > > > >
> >>> >> > > > > > > > The flow to disk feature is not helping us here because
> >>> >> > > > > > > > its algorithm considers only the size of live messages
> >>> >> > > > > > > > on the queues.  If the accumulative live size does not
> >>> >> > > > > > > > exceed the threshold, the messages aren't flown to disk.
> >>> >> > > > > > > > I speculate that when you observed that moving messages
> >>> >> > > > > > > > caused direct memory usage to drop earlier today, your
> >>> >> > > > > > > > message movement caused a queue to go over threshold,
> >>> >> > > > > > > > causing messages to be flown to disk and their direct
> >>> >> > > > > > > > memory references to be released.  The logs will confirm
> >>> >> > > > > > > > this is so.
> >>> >> > > > > > > >
> >>> >> > > > > > > > I have not identified an easy workaround at the moment.
> >>> >> > > > > > > > Decreasing the flow to disk threshold and/or increasing
> >>> >> > > > > > > > available direct memory should alleviate it and may be
> >>> >> > > > > > > > an acceptable short term workaround.  If it were
> >>> >> > > > > > > > possible for the publishing application to publish short
> >>> >> > > > > > > > lived and long lived messages on two separate JMS
> >>> >> > > > > > > > connections, this would avoid this defect.
> >>> >> > > > > > > >
> >>> >> > > > > > > > QPID-7753 tracks this issue and QPID-7754 is related to
> >>> >> > > > > > > > this problem.  We intend to be working on these early
> >>> >> > > > > > > > next week and will be aiming for a fix that is
> >>> >> > > > > > > > back-portable to 6.0.
> >>> >> > > > > > > >
> >>> >> > > > > > > > Apologies that you have run into this defect and thanks
> >>> >> > > > > > > > for reporting.
> >>> >> > > > > > > >
> >>> >> > > > > > > > Thanks, Keith
> >>> >> > > > > > > >
> >>> >> > > > > > > >
> >>> >> > > > > > > >
> >>> >> > > > > > > >
> >>> >> > > > > > > >
> >>> >> > > > > > > >
> >>> >> > > > > > > >
> >>> >> > > > > > > > On 21 April 2017 at 10:21, Ramayan Tiwari <
> >>> >> > > > ramayan.tiw...@gmail.com>
> >>> >> > > > > > > > wrote:
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > Hi All,
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > We have been monitoring the brokers every day, and
> >>> >> > > > > > > > > today we found one instance where the broker's DM was
> >>> >> > > > > > > > > constantly going up and it was about to crash, so we
> >>> >> > > > > > > > > experimented with some mitigations, one of which
> >>> >> > > > > > > > > caused the DM to come down. Following are the details,
> >>> >> > > > > > > > > which might help us in understanding the issue:
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > Traffic scenario:
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > DM allocation had been constantly going up and was at
> >>> >> > > > > > > > > 90%. There were two queues which seemed to align with
> >>> >> > > > > > > > > the theories that we had. Q1's size had been large
> >>> >> > > > > > > > > right after the broker start and had slow consumption
> >>> >> > > > > > > > > of messages; queue size only reduced from 76MB to 75MB
> >>> >> > > > > > > > > over a period of 6hrs.
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > Q2, on the other hand, started small and was gradually
> >>> >> > > > > > > > > growing; queue size went from 7MB to 10MB in 6hrs.
> >>> >> > > > > > > > > There were other queues with traffic during this time.
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > Action taken:
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > Moved all the messages from Q2 (since this was our
> >>> >> > > > > > > > > original theory) to Q3 (already created but with no
> >>> >> > > > > > > > > messages in it). This did not help with the DM growing.
> >>> >> > > > > > > > > Moved all the messages from Q1 to Q4 (already created
> >>> >> > > > > > > > > but with no messages in it). This reduced DM allocation
> >>> >> > > > > > > > > from 93% to 31%.
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > We have the heap dump and thread dump from when the
> >>> >> > > > > > > > > broker was at 90% DM allocation. We are going to
> >>> >> > > > > > > > > analyze those to see if we can get some clue. We
> >>> >> > > > > > > > > wanted to share this new information, which might help
> >>> >> > > > > > > > > in reasoning about the memory issue.
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > - Ramayan
> >>> >> > > > > > > > >
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > On Thu, Apr 20, 2017 at 11:20 AM, Ramayan Tiwari <
> >>> >> > > > > > > > ramayan.tiw...@gmail.com>
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > wrote:
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > Hi Keith,
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > Thanks so much for your response and for digging
> >>> >> > > > > > > > > > into the issue. Below are the answers to your
> >>> >> > > > > > > > > > questions:
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > 1) Yeah, we are using QPID-7462 with 6.0.5. We
> >>> >> > > > > > > > > > couldn't use 6.1, where it was released, because we
> >>> >> > > > > > > > > > need JMX support. Here is the destination format:
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > "%s ; {node : { type : queue }, link : { x-subscribes : { arguments : { x-multiqueue : [%s], x-pull-only : true }}}}"
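> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > (For illustration, filling that template in from
> >>> >> > > > > > > > > > Java might look like the following; the names are
> >>> >> > > > > > > > > > made up:)
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > >     String template = "%s ; {node : { type : queue }, "
> >>> >> > > > > > > > > >             + "link : { x-subscribes : { arguments : "
> >>> >> > > > > > > > > >             + "{ x-multiqueue : [%s], x-pull-only : true }}}}";
> >>> >> > > > > > > > > >     // e.g. subscribe to two queues via one consumer
> >>> >> > > > > > > > > >     String address = String.format(template,
> >>> >> > > > > > > > > >             "myDest", "'queue1', 'queue2'");
> >>> >> > > > > > > > > >     // the address string is then used to create the
> >>> >> > > > > > > > > >     // JMS destination with the 0.16 client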
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > 2) Our machines have 40 cores, which will make the
> >>> >> > > > > > > > > > number of threads 80. This might not be an issue,
> >>> >> > > > > > > > > > because this will show up in the baseline DM
> >>> >> > > > > > > > > > allocated, which is only 6% (of 4GB) when we just
> >>> >> > > > > > > > > > bring up the broker.
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > 3) The only setting that we tuned WRT DM is
> >>> >> > > > > > > > > > flowToDiskThreshold, which is set at 80% now.
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > 4) Only one virtual host in the broker.
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > 5) Most of our queues (99%) are priority; we also
> >>> >> > > > > > > > > > have 8-10 sorted queues.
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > 6) Yeah, we are using the standard 0.16 client and
> >>> >> > > > > > > > > > not AMQP 1.0 clients. The connection log line looks
> >>> >> > > > > > > > > > like:
> >>> >> > > > > > > > > > CON-1001 : Open : Destination : AMQP(IP:5672) :
> >>> >> > > > > > > > > > Protocol Version : 0-10 : Client ID : test : Client
> >>> >> > > > > > > > > > Version : 0.16 : Client Product : qpid
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > We had another broker crash about an hour back; we
> >>> >> > > > > > > > > > do see the same patterns:
> >>> >> > > > > > > > > > 1) There is a queue which is constantly growing;
> >>> >> > > > > > > > > > enqueue is faster than dequeue on that queue for a
> >>> >> > > > > > > > > > long period of time.
> >>> >> > > > > > > > > > 2) Flow to disk didn't kick in at all.
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > This graph shows memory growth (red line - heap,
> >>> >> > > > > > > > > > blue - DM allocated, yellow - DM used):
> >>> >> > > > > > > > > > https://drive.google.com/file/d/0Bwi0MEV3srPRdVhXdTBncHJLY2c/view?usp=sharing
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > The below graph shows growth on a single queue
> >>> >> > > > > > > > > > (there are 10-12 other queues with traffic as well,
> >>> >> > > > > > > > > > some larger in size than this queue):
> >>> >> > > > > > > > > > https://drive.google.com/file/d/0Bwi0MEV3srPRWmNGbDNGUkJhQ0U/view?usp=sharing
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > Couple of questions:
> >>> >> > > > > > > > > > 1) Is there any developer-level doc/design spec on
> >>> >> > > > > > > > > > how Qpid uses DM?
> >>> >> > > > > > > > > > 2) We are not getting heap dumps automatically when
> >>> >> > > > > > > > > > the broker crashes due to DM
> >>> >> > > > > > > > > > (HeapDumpOnOutOfMemoryError not respected). Has
> >>> >> > > > > > > > > > anyone found a way to get around this problem?
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > Thanks
> >>> >> > > > > > > > > > Ramayan
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > On Thu, Apr 20, 2017 at 9:08 AM, Keith W <
> >>> >> > > keith.w...@gmail.com
> >>> >> > > > >
> >>> >> > > > > > wrote:
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > Hi Ramayan
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > We have been discussing your problem here and have
> >>> >> > > > > > > > > > > a couple of questions.
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > I have been experimenting with use-cases based on
> >>> >> > > > > > > > > > > your descriptions above but, so far, have been
> >>> >> > > > > > > > > > > unsuccessful in reproducing a
> >>> >> > > > > > > > > > > "java.lang.OutOfMemoryError: Direct buffer memory"
> >>> >> > > > > > > > > > > condition.  The direct memory usage reflects the
> >>> >> > > > > > > > > > > expected model: it levels off when the flow to disk
> >>> >> > > > > > > > > > > threshold is reached, and direct memory is released
> >>> >> > > > > > > > > > > as messages are consumed until the minimum size for
> >>> >> > > > > > > > > > > caching of direct memory is reached.
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > 1] For clarity let me check: we believe when you
> >>> >> > > > > > > > > > > say "patch to use MultiQueueConsumer" you are
> >>> >> > > > > > > > > > > referring to the patch attached to QPID-7462 "Add
> >>> >> > > > > > > > > > > experimental "pull" consumers to the broker" and
> >>> >> > > > > > > > > > > you are using a combination of this "x-pull-only"
> >>> >> > > > > > > > > > > with the standard "x-multiqueue" feature.  Is this
> >>> >> > > > > > > > > > > correct?
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > 2] One idea we had here relates to the size of the
> >>> >> > > > > > > > > > > virtualhost IO pool.  As you know from the
> >>> >> > > > > > > > > > > documentation, the Broker caches/reuses direct
> >>> >> > > > > > > > > > > memory internally, but the documentation fails to
> >>> >> > > > > > > > > > > mention that each pooled virtualhost IO thread also
> >>> >> > > > > > > > > > > grabs a chunk (256K) of direct memory from this
> >>> >> > > > > > > > > > > cache.  By default the virtual host IO pool is sized
> >>> >> > > > > > > > > > > Math.max(Runtime.getRuntime().availableProcessors() * 2, 64),
> >>> >> > > > > > > > > > > so if you have a machine with a very large number
> >>> >> > > > > > > > > > > of cores, you may have a surprisingly large amount
> >>> >> > > > > > > > > > > of direct memory assigned to virtualhost IO threads
> >>> >> > > > > > > > > > > (see the rough calculation below).  Check the value
> >>> >> > > > > > > > > > > of connectionThreadPoolSize on the virtualhost
> >>> >> > > > > > > > > > > (http://<server>:<port>/api/latest/virtualhost/<virtualhostnodename>/<virtualhostname>)
> >>> >> > > > > > > > > > > to see what value is in force.  What is it?  It is
> >>> >> > > > > > > > > > > possible to tune the pool size using context
> >>> >> > > > > > > > > > > variable virtualhost.connectionThreadPool.size.
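> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > The rough calculation (illustrative only, using
> >>> >> > > > > > > > > > > the 256K chunk size described above):
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > >     int cores = Runtime.getRuntime().availableProcessors();
> >>> >> > > > > > > > > > >     int poolSize = Math.max(cores * 2, 64);  // default sizing
> >>> >> > > > > > > > > > >     long pinned = (long) poolSize * 256 * 1024;
> >>> >> > > > > > > > > > >     // e.g. 40 cores -> 80 threads * 256K = 20MB of
> >>> >> > > > > > > > > > >     // direct memory held by IO threads alone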
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > 3] Tell me if you are tuning the Broker in any way
> >>> >> > > > > > > > > > > beyond the direct/heap memory settings you have
> >>> >> > > > > > > > > > > told us about already.  For instance, are you
> >>> >> > > > > > > > > > > changing any of the direct memory pooling settings
> >>> >> > > > > > > > > > > (broker.directByteBufferPoolSize), the default
> >>> >> > > > > > > > > > > network buffer size (qpid.broker.networkBufferSize)
> >>> >> > > > > > > > > > > or applying any other non-standard settings?
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > 4] How many virtual hosts do you have on the
> >>> >> > > > > > > > > > > Broker?
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > 5] What is the consumption pattern of the messages?
> >>> >> > > > > > > > > > > Do you consume in a strictly FIFO fashion or are
> >>> >> > > > > > > > > > > you making use of message selectors or/and any of
> >>> >> > > > > > > > > > > the out-of-order queue types (LVQs, priority queues
> >>> >> > > > > > > > > > > or sorted queues)?
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > 6] Is it just the 0.16 client involved in the
> >>> >> > > > > > > > > > > application?  Can I check that you are not using
> >>> >> > > > > > > > > > > any of the AMQP 1.0 clients
> >>> >> > > > > > > > > > > (org.apache.qpid:qpid-jms-client or
> >>> >> > > > > > > > > > > org.apache.qpid:qpid-amqp-1-0-client) in the
> >>> >> > > > > > > > > > > software stack (as either consumers or producers)?
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > Hopefully the answers to these questions will get
> >>> >> > > > > > > > > > > us closer to a reproduction.  If you are able to
> >>> >> > > > > > > > > > > reliably reproduce it, please share the steps with
> >>> >> > > > > > > > > > > us.
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > Kind regards, Keith.
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > On 20 April 2017 at 10:21, Ramayan Tiwari <
> >>> >> > > > > > ramayan.tiw...@gmail.com>
> >>> >> > > > > > > > > > > wrote:
> >>> >> > > > > > > > > > > >
> >>> >> > > > > > > > > > > > After a lot of log mining, we might have a way
> >>> >> > > > > > > > > > > > to explain the sustained increase in DirectMemory
> >>> >> > > > > > > > > > > > allocation; the correlation seems to be with the
> >>> >> > > > > > > > > > > > growth in the size of a queue that is getting
> >>> >> > > > > > > > > > > > consumed, but at a much slower rate than
> >>> >> > > > > > > > > > > > producers putting messages on this queue.
> >>> >> > > > > > > > > > > >
> >>> >> > > > > > > > > > > > The pattern we see is that in each instance of
> >>> >> > > > > > > > > > > > broker crash, there is at least one queue
> >>> >> > > > > > > > > > > > (usually 1 queue) whose size kept growing
> >>> >> > > > > > > > > > > > steadily. It'd be of significant size but not the
> >>> >> > > > > > > > > > > > largest queue -- usually there are multiple
> >>> >> > > > > > > > > > > > larger queues -- but it was different from other
> >>> >> > > > > > > > > > > > queues in that its size was growing steadily. The
> >>> >> > > > > > > > > > > > queue would also be moving, but its processing
> >>> >> > > > > > > > > > > > rate was not keeping up with the enqueue rate.
> >>> >> > > > > > > > > > > >
> >>> >> > > > > > > > > > > > Our theory, which might be totally wrong: if a
> >>> >> > > > > > > > > > > > queue is moving the entire time, maybe the broker
> >>> >> > > > > > > > > > > > would keep reusing the same buffer in direct
> >>> >> > > > > > > > > > > > memory for the queue, and keep adding onto it at
> >>> >> > > > > > > > > > > > the end to accommodate new messages. But because
> >>> >> > > > > > > > > > > > it's active all the time and we're pointing to
> >>> >> > > > > > > > > > > > the same buffer, space allocated for messages at
> >>> >> > > > > > > > > > > > the head of the queue/buffer doesn't get
> >>> >> > > > > > > > > > > > reclaimed, even long after those messages have
> >>> >> > > > > > > > > > > > been processed. Just a theory.
> >>> >> > > > > > > > > > > >
> >>> >> > > > > > > > > > > > We are also trying to reproduce this using some
> >>> >> > > > > > > > > > > > perf tests to enqueue with the same pattern; will
> >>> >> > > > > > > > > > > > update with the findings.
> >>> >> > > > > > > > > > > >
> >>> >> > > > > > > > > > > > Thanks
> >>> >> > > > > > > > > > > > Ramayan
> >>> >> > > > > > > > > > > >
> >>> >> > > > > > > > > > > > On Wed, Apr 19, 2017 at 6:52 PM, Ramayan
> Tiwari
> >>> >> > > > > > > > > > > > <ramayan.tiw...@gmail.com>
> >>> >> > > > > > > > > > > > wrote:
> >>> >> > > > > > > > > > > >
> >>> >> > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > Another issue that we noticed is that when the
> >>> >> > > > > > > > > > > > > broker goes OOM due to direct memory, it
> >>> >> > > > > > > > > > > > > doesn't create a heap dump (specified by
> >>> >> > > > > > > > > > > > > "-XX:+HeapDumpOnOutOfMemoryError"), even when
> >>> >> > > > > > > > > > > > > the OOM error is the same as what is mentioned
> >>> >> > > > > > > > > > > > > in the Oracle JVM docs
> >>> >> > > > > > > > > > > > > ("java.lang.OutOfMemoryError").
> >>> >> > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > Has anyone been able to find a way to get a
> >>> >> > > > > > > > > > > > > heap dump for a DM OOM? (One untested idea we
> >>> >> > > > > > > > > > > > > could try is sketched below.)
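> >>> >> > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > >     // Untested idea: request the dump ourselves
> >>> >> > > > > > > > > > > > >     // when we catch the direct buffer OOM, via
> >>> >> > > > > > > > > > > > >     // the HotSpot diagnostic MBean.
> >>> >> > > > > > > > > > > > >     import com.sun.management.HotSpotDiagnosticMXBean;
> >>> >> > > > > > > > > > > > >     import java.lang.management.ManagementFactory;
> >>> >> > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > >     class HeapDumper {
> >>> >> > > > > > > > > > > > >         static void dump(String path) throws java.io.IOException {
> >>> >> > > > > > > > > > > > >             HotSpotDiagnosticMXBean bean =
> >>> >> > > > > > > > > > > > >                 ManagementFactory.newPlatformMXBeanProxy(
> >>> >> > > > > > > > > > > > >                     ManagementFactory.getPlatformMBeanServer(),
> >>> >> > > > > > > > > > > > >                     "com.sun.management:type=HotSpotDiagnostic",
> >>> >> > > > > > > > > > > > >                     HotSpotDiagnosticMXBean.class);
> >>> >> > > > > > > > > > > > >             bean.dumpHeap(path, true);  // true = live objects only
> >>> >> > > > > > > > > > > > >         }
> >>> >> > > > > > > > > > > > >     }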
> >>> >> > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > - Ramayan
> >>> >> > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > On Wed, Apr 19, 2017 at 11:21 AM, Ramayan
> >>> >> > > > > > > > > > > > > Tiwari
> >>> >> > > > > > > > > > > > > <ramayan.tiw...@gmail.com
> >>> >> > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > wrote:
> >>> >> > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > Alex,
> >>> >> > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > Below are the flow to disk logs from the
> >>> >> > > > > > > > > > > > > > broker, which has 3 million+ messages at this
> >>> >> > > > > > > > > > > > > > time. We only have one virtual host. Time is
> >>> >> > > > > > > > > > > > > > in GMT. Looks like flow to disk is active on
> >>> >> > > > > > > > > > > > > > the whole virtual host and not at a queue
> >>> >> > > > > > > > > > > > > > level.
> >>> >> > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > When the same broker went OOM yesterday, I
> >>> >> > > > > > > > > > > > > > did not see any flow to disk logs from when
> >>> >> > > > > > > > > > > > > > it was started until it crashed (crashed
> >>> >> > > > > > > > > > > > > > twice within 4hrs).
> >>> >> > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > 4/19/17 4:17:43.509 AM INFO [Housekeeping[test]] - [Housekeeping[test]] BRK-1014 : Message flow to disk active : Message memory use 3356539KB exceeds threshold 3355443KB
> >>> >> > > > > > > > > > > > > > 4/19/17 2:31:13.502 AM INFO [Housekeeping[test]] - [Housekeeping[test]] BRK-1015 : Message flow to disk inactive : Message memory use 3354866KB within threshold 3355443KB
> >>> >> > > > > > > > > > > > > > 4/19/17 2:28:43.511 AM INFO [Housekeeping[test]] - [Housekeeping[test]] BRK-1014 : Message flow to disk active : Message memory use 3358509KB exceeds threshold 3355443KB
> >>> >> > > > > > > > > > > > > > 4/19/17 2:20:13.500 AM INFO [Housekeeping[test]] - [Housekeeping[test]] BRK-1015 : Message flow to disk inactive : Message memory use 3353501KB within threshold 3355443KB
> >>> >> > > > > > > > > > > > > > 4/19/17 2:18:13.500 AM INFO [Housekeeping[test]] - [Housekeeping[test]] BRK-1014 : Message flow to disk active : Message memory use 3357544KB exceeds threshold 3355443KB
> >>> >> > > > > > > > > > > > > > 4/19/17 2:08:43.501 AM INFO [Housekeeping[test]] - [Housekeeping[test]] BRK-1015 : Message flow to disk inactive : Message memory use 3353236KB within threshold 3355443KB
> >>> >> > > > > > > > > > > > > > 4/19/17 2:08:13.501 AM INFO [Housekeeping[test]] - [Housekeeping[test]] BRK-1014 : Message flow to disk active : Message memory use 3356704KB exceeds threshold 3355443KB
> >>> >> > > > > > > > > > > > > > 4/19/17 2:00:43.500 AM INFO [Housekeeping[test]] - [Housekeeping[test]] BRK-1015 : Message flow to disk inactive : Message memory use 3353511KB within threshold 3355443KB
> >>> >> > > > > > > > > > > > > > 4/19/17 2:00:13.504 AM INFO [Housekeeping[test]] - [Housekeeping[test]] BRK-1014 : Message flow to disk active : Message memory use 3357948KB exceeds threshold 3355443KB
> >>> >> > > > > > > > > > > > > > 4/19/17 1:50:43.501 AM INFO [Housekeeping[test]] - [Housekeeping[test]] BRK-1015 : Message flow to disk inactive : Message memory use 3355310KB within threshold 3355443KB
> >>> >> > > > > > > > > > > > > > 4/19/17 1:47:43.501 AM INFO [Housekeeping[test]] - [Housekeeping[test]] BRK-1014 : Message flow to disk active : Message memory use 3365624KB exceeds threshold 3355443KB
> >>> >> > > > > > > > > > > > > > 4/19/17 1:43:43.501 AM INFO [Housekeeping[test]] - [Housekeeping[test]] BRK-1015 : Message flow to disk inactive : Message memory use 3355136KB within threshold 3355443KB
> >>> >> > > > > > > > > > > > > > 4/19/17 1:31:43.509 AM INFO [Housekeeping[test]] - [Housekeeping[test]] BRK-1014 : Message flow to disk active : Message memory use 3358683KB exceeds threshold 3355443KB
> >>> >> > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > After the production release (2 days back),
> >>> >> > > > > > > > > > > > > > we have seen 4 crashes in 3 different
> >>> >> > > > > > > > > > > > > > brokers. This is the most pressing concern
> >>> >> > > > > > > > > > > > > > for us in deciding if we should roll back to
> >>> >> > > > > > > > > > > > > > 0.32. Any help is greatly appreciated.
> >>> >> > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > Thanks
> >>> >> > > > > > > > > > > > > > Ramayan
> >>> >> > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > On Wed, Apr 19, 2017 at 9:36 AM, Oleksandr
> >>> >> > > > > > > > > > > > > > Rudyy
> >>> >> <
> >>> >> > > > > > oru...@gmail.com
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > >
> >>> >> > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > wrote:
> >>> >> > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > > Ramayan,
> >>> >> > > > > > > > > > > > > > > Thanks for the details. I would like to
> >>> >> > > > > > > > > > > > > > > clarify whether flow to disk was triggered
> >>> >> > > > > > > > > > > > > > > today for 3 million messages?
> >>> >> > > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > > The following logs are issued for flow to disk:
> >>> >> > > > > > > > > > > > > > > BRK-1014 : Message flow to disk active : Message memory use {0,number,#}KB exceeds threshold {1,number,#.##}KB
> >>> >> > > > > > > > > > > > > > > BRK-1015 : Message flow to disk inactive : Message memory use {0,number,#}KB within threshold {1,number,#.##}KB
> >>> >> > > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > > Kind Regards,
> >>> >> > > > > > > > > > > > > > > Alex
> >>> >> > > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > > On 19 April 2017 at 17:10, Ramayan
> Tiwari <
> >>> >> > > > > > > > ramayan.tiw...@gmail.com>
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > >
> >>> >> > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > > wrote:
> >>> >> > > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > > > Hi Alex,
> >>> >> > > > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > > > Thanks for your response, here are the
> >>> >> > > > > > > > > > > > > > > > details:
> >>> >> > > > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > > > We use a "direct" exchange, without
> >>> >> > > > > > > > > > > > > > > > persistence (we specify NON_PERSISTENT
> >>> >> > > > > > > > > > > > > > > > while sending from the client) and use a
> >>> >> > > > > > > > > > > > > > > > BDB store. We use the JSON virtual host
> >>> >> > > > > > > > > > > > > > > > type. We are not using SSL.
> >>> >> > > > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > > > When the broker went OOM, we had around
> >>> >> > > > > > > > > > > > > > > > 1.3 million messages with 100 bytes
> >>> >> > > > > > > > > > > > > > > > average message size. Direct memory
> >>> >> > > > > > > > > > > > > > > > allocation (value read from MBean) kept
> >>> >> > > > > > > > > > > > > > > > going up, even though it wouldn't need
> >>> >> > > > > > > > > > > > > > > > more DM to store these many messages. DM
> >>> >> > > > > > > > > > > > > > > > allocated persisted at 99% for about 3
> >>> >> > > > > > > > > > > > > > > > and a half hours before crashing.
> >>> >> > > > > > > > > > > > > > > >
> >>> >> > > > > > > > > > > > > > > > Today, on the same broker we have 3
> >>> >> > > > > > > > > > > > > > > > million messages (same message size) and
> >>> >> > > > > > > > > > > > > > > > DM allocated is only at 8%. This seems
> >>> >> > > > > > > > > > > > > > > > like there is some issue with
> >>> >> > > > > > > > > > > > > > > > de-allocation or a leak.
> >>> >> > > > > > > > > > > > > > > >
> I have uploaded the memory utilization graph here:
> https://drive.google.com/file/d/0Bwi0MEV3srPRVHFEbDlIYUpLaUE/view?usp=sharing
> Blue line is DM allocated, yellow is DM used (sum of queue payload) and
> red is heap usage.
>
> Thanks
> Ramayan
>
> On Wed, Apr 19, 2017 at 4:10 AM, Oleksandr Rudyy <oru...@gmail.com>
> wrote:
>
> > Hi Ramayan,
> >
> > Could you please share with us the details of the messaging use
> > case(s) which ended up in OOM on the broker side? I would like to
> > reproduce the issue on my local broker in order to fix it.
> > I would appreciate it if you could provide as much detail as
> > possible, including messaging topology, message persistence type,
> > message sizes, volumes, etc.
> >
> > Qpid Broker 6.0.x uses direct memory for keeping message content and
> > receiving/sending data. Each plain connection utilizes 512K of direct
> > memory. Each SSL connection uses 1M of direct memory. Your memory
> > settings look Ok to me.
> >
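> > As a rough sanity check of those per-connection numbers (illustrative
> > arithmetic only, not measured data):
> >
> >     410 plain connections x 512 KB/connection ~= 205 MB
> >
> > which is consistent with the ~230 MB baseline direct memory
> > utilization reported in the message below, before any messages are
> > enqueued.
> >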
> > Kind Regards,
> > Alex
> >
> > On 18 April 2017 at 23:39, Ramayan Tiwari <ramayan.tiw...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > We are using Java broker 6.0.5, with a patch to use the
> > > MultiQueueConsumer feature. We just finished deploying to production
> > > and saw a couple of instances of broker OOM due to running out of
> > > DirectMemory buffer (exceptions at the end of this email).
> > >
> > > Here is our setup:
> > > 1. Max heap 12g, max direct memory 4g (this is the opposite of the
> > >    recommendation; however, for our use case the message payload is
> > >    really small, ~400 bytes, and is way less than the per-message
> > >    overhead of 1KB). In perf testing, we were able to put 2 million
> > >    messages without any issues. (Illustrative JVM flags for this
> > >    layout are sketched after this list.)
> > > 2. ~400 connections to the broker.
> > > 3. Each connection has 20 sessions, and there is one multi-queue
> > >    consumer attached to each session, listening to around 1000
> > >    queues.
> > > 4. We are still using the 0.16 client (I know).
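> > >
> > > For reference, a JVM configuration producing the layout in item 1
> > > would look something like this (a sketch only; how the options are
> > > passed to the broker depends on the startup scripts):
> > >
> > >     # 12g heap, 4g cap on direct memory, per the setup above
> > >     -Xmx12g -XX:MaxDirectMemorySize=4g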
> > >
> > > With the above setup, the baseline utilization (without any
> > > messages) for direct memory was around 230mb (with 410 connections
> > > each taking 500KB).
> > >
> > > Based on our understanding of broker memory allocation, message
> > > payload should be the only thing adding to direct memory utilization
> > > (on top of the baseline); however, we are experiencing something
> > > completely different. In our last broker crash, we saw the broker
> > > constantly running with 90%+ direct memory allocated, even though
> > > the message payload sum across all queues was only 6-8% (these
> > > percentages are against the available DM of 4gb). During these
> > > periods of high DM usage, heap usage was around 60% (of 12gb).
> > >
> > > We would like some help in understanding what could be the reason
> > > for these high DM allocations. Are there things other than message
> > > payload and AMQP connections which use DM and could be contributing
> > > to this high usage?
> > >
> > > Another thing that puzzles us is the de-allocation of DM byte
> > > buffers. From log mining of heap and DM utilization, de-allocation
> > > of DM doesn't correlate with heap GC. If anyone has seen any
> > > documentation related to this, it would be very helpful if you could
> > > share it.
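> > >
> > > A note for anyone else digging into this: in standard Java, the
> > > native memory behind a direct buffer is released only when the
> > > ByteBuffer object itself is garbage collected, so some lag behind
> > > heap GC cycles is expected, and the broker's own buffer pooling adds
> > > a further layer on top. A minimal sketch, using only the standard
> > > library, that makes the JVM-level behaviour visible (System.gc() is
> > > only a hint, so results may vary):
> > >
> > >     import java.lang.management.BufferPoolMXBean;
> > >     import java.lang.management.ManagementFactory;
> > >     import java.nio.ByteBuffer;
> > >     import java.util.ArrayList;
> > >     import java.util.List;
> > >
> > >     public class DirectGcDemo
> > >     {
> > >         private static long directUsed()
> > >         {
> > >             // The JVM exposes one pool per buffer type; "direct"
> > >             // covers ByteBuffer.allocateDirect allocations.
> > >             for (BufferPoolMXBean pool :
> > >                  ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class))
> > >             {
> > >                 if ("direct".equals(pool.getName()))
> > >                 {
> > >                     return pool.getMemoryUsed();
> > >                 }
> > >             }
> > >             return -1;
> > >         }
> > >
> > >         public static void main(String[] args) throws Exception
> > >         {
> > >             List<ByteBuffer> buffers = new ArrayList<>();
> > >             for (int i = 0; i < 100; i++)
> > >             {
> > >                 buffers.add(ByteBuffer.allocateDirect(1024 * 1024)); // 1 MB each
> > >             }
> > >             System.out.println("after allocation: " + directUsed());
> > >
> > >             buffers.clear();    // buffers become unreachable here,
> > >             System.gc();        // but the native memory is freed only
> > >             Thread.sleep(1000); // after GC runs the buffers' cleaners
> > >             System.out.println("after gc:         " + directUsed());
> > >         }
> > >     }
> > >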
> > > Thanks
> > > Ramayan
> > >
> > > *Exceptions*
> > >
> > > java.lang.OutOfMemoryError: Direct buffer memory
> > > at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.8.0_40]
> > > at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[na:1.8.0_40]
> > > at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311) ~[na:1.8.0_40]
> > > at org.apache.qpid.bytebuffer.QpidByteBuffer.allocateDirect(QpidByteBuffer.java:474) ~[qpid-common-6.0.5.jar:6.0.5]
> > > at org.apache.qpid.server.transport.NonBlockingConnectionPlainDelegate.restoreApplicationBufferForWrite(NonBlockingConnectionPlainDelegate.java:93) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > at org.apache.qpid.server.transport.NonBlockingConnectionPlainDelegate.processData(NonBlockingConnectionPlainDelegate.java:60) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > at org.apache.qpid.server.transport.NonBlockingConnection.doRead(NonBlockingConnection.java:506) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > at org.apache.qpid.server.transport.NonBlockingConnection.doWork(NonBlockingConnection.java:285) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > at org.apache.qpid.server.transport.NetworkConnectionScheduler.processConnection(NetworkConnectionScheduler.java:124) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > at org.apache.qpid.server.transport.SelectorThread$ConnectionProcessor.processConnection(SelectorThread.java:504) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > at org.apache.qpid.server.transport.SelectorThread$SelectionTask.performSelect(SelectorThread.java:337) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > at org.apache.qpid.server.transport.SelectorThread$SelectionTask.run(SelectorThread.java:87) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > at org.apache.qpid.server.transport.SelectorThread.run(SelectorThread.java:462) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_40]
> > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_40]
> > > at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_40]
> > >
> > > *Second exception*
> > > java.lang.OutOfMemoryError: Direct buffer memory
> > > at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.8.0_40]
> > > at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[na:1.8.0_40]
> > > at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311) ~[na:1.8.0_40]
> > > at org.apache.qpid.bytebuffer.QpidByteBuffer.allocateDirect(QpidByteBuffer.java:474) ~[qpid-common-6.0.5.jar:6.0.5]
> > > at org.apache.qpid.server.transport.NonBlockingConnectionPlainDelegate.<init>(NonBlockingConnectionPlainDelegate.java:45) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > at org.apache.qpid.server.transport.NonBlockingConnection.setTransportEncryption(NonBlockingConnection.java:625) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > at org.apache.qpid.server.transport.NonBlockingConnection.<init>(NonBlockingConnection.java:117) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > at org.apache.qpid.server.transport.NonBlockingNetworkTransport.acceptSocketChannel(NonBlockingNetworkTransport.java:158) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > at org.apache.qpid.server.transport.SelectorThread$SelectionTask$1.run(SelectorThread.java:191) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > at org.apache.qpid.server.transport.SelectorThread.run(SelectorThread.java:462) ~[qpid-broker-core-6.0.5.jar:6.0.5]
> > > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_40]
> > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_40]
> > > at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_40]
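> > >
> > > For completeness: the generic failure mode above is easy to
> > > reproduce in isolation. A hypothetical snippet, run with a small cap
> > > such as -XX:MaxDirectMemorySize=64m, ends in the same
> > > java.lang.OutOfMemoryError:
> > >
> > >     import java.nio.ByteBuffer;
> > >     import java.util.ArrayList;
> > >     import java.util.List;
> > >
> > >     public class DirectOomDemo
> > >     {
> > >         public static void main(String[] args)
> > >         {
> > >             List<ByteBuffer> pinned = new ArrayList<>();
> > >             while (true)
> > >             {
> > >                 // Each allocation reserves native memory against the
> > >                 // -XX:MaxDirectMemorySize cap; keeping every buffer
> > >                 // reachable prevents release, so this eventually throws
> > >                 // java.lang.OutOfMemoryError: Direct buffer memory.
> > >                 pinned.add(ByteBuffer.allocateDirect(1024 * 1024));
> > >             }
> > >         }
> > >     }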
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@qpid.apache.org
> For additional commands, e-mail: users-h...@qpid.apache.org
>
