This is only a partial answer (I'll try to get time this weekend to answer
the parts I don't have time for now), but I want to get you something to
start with.

On Feb 15, 2018 5:03 AM, "Thiago Veronezi" <thi...@veronezi.org> wrote:

Hi, ActiveMQ community,

I'm actively working on documentation for "out of memory" protection on
ActiveMQ. Recently I was working on a POC project where I stressed a
default broker configuration with 1.000.000 messages with a 20KB payload
each, where each message took 1 second to be consumed. This caused the
"Pending Messages" number to go up pretty fast.


Are these persistent or non-persistent messages? How large (capacity) is
your persistent store and your temp store?

My understanding is that AMQ, out of the box, has the "Producer Flow
Control" feature activated for all Topics and Queues, and that it has the
"usedMemory" threshold set to 70% of 512MB.


Did PFC kick in? You'd see it in the broker's logs.
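
For a sense of scale, here is a back-of-envelope check that takes the
numbers you quoted at face value (70% of 512MB as the limit, 20KB per
message, 1.000.000 messages). It counts payload bytes only and ignores
headers and per-message object overhead, so the real figures will be
less favorable. The class and method names are made up for illustration;
this is not ActiveMQ code.

```java
// Back-of-envelope arithmetic only; names here are hypothetical.
final class MemoryBudget {
    static final long KB = 1024L;
    static final long MB = 1024L * KB;

    /** Bytes allowed by a limit expressed as a percentage of a base size. */
    static long limitBytes(long baseBytes, int percent) {
        return baseBytes * percent / 100;
    }

    /** How many payloads of the given size fit under the limit (payload bytes only). */
    static long messagesThatFit(long limitBytes, long payloadBytes) {
        return limitBytes / payloadBytes;
    }

    public static void main(String[] args) {
        long limit = limitBytes(512 * MB, 70);   // assumed limit from the question
        long payload = 20 * KB;                  // assumed payload size from the question
        long total = 1_000_000L * payload;       // whole backlog, payload bytes only

        System.out.println("limit bytes:         " + limit);
        System.out.println("messages that fit:   " + messagesThatFit(limit, payload));
        System.out.println("total payload bytes: " + total);
    }
}
```

Under those assumptions only about 18 thousand of the million messages
fit under the limit at once, against roughly 20 GB of total payload. So
nearly all of the backlog has to live in a store on disk, wait on the
producer side, or be rejected, which is why the questions above matter.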

Still, with the load I used, I
saw OOM issues. The 1.000.000 messages actually killed the server.

In my tests, I use several threads and nodes to send all the 1.000.000
messages in parallel. That means I have several connections to the broker.
Once I used the sendFailIfNoSpace="true" option, the OOM issues ceased; the
consumers were able to catch up, and the broker survived. One thing that I
noticed is that even when the "Pending messages" number reached 0, it took
some time for the server to allow new producer connections again.


When it didn't allow new producer connections, what was the symptom?

Questions:

* Is it possible that AMQ doesn't count the memory used by each active
connection as a variable in the final used-memory calculation?


Yes. Those limits apply solely to the in-memory message store (used for
non-persistent messages and for paging in persistent messages from the
persistent store), so it's possible to OOM even though you don't exceed
those limits.

* Is there any configuration where we set a refresh rate so the server
notices faster when the memory is below the maximum threshold again?


To the best of my knowledge, the metrics are captured instantaneously by
modifying an object in memory, not via a periodic poll, so I think
something else is going on. I'll come back to this.

* Is the use of sendFailIfNoSpace="true" the ultimate solution for OOM
issues? Is this something I can advise a customer to use so he is 99.9%
guaranteed to not have OOM crashes?


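No. sendFailIfNoSpace just means that the client gets an exception when
the broker is out of space instead of waiting forever on the send. The
only reason you stopped seeing OOMs once you enabled it is that you're
not retrying the send when you catch that exception, so those messages
never reach the broker.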
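
To make the retry point concrete, here is a minimal sketch of what a
producer that does not want to lose messages would have to do around each
send. It is plain Java with no JMS dependency: SendAttempt is a
hypothetical stand-in for a single producer.send(message) call, and the
attempt count and delays are arbitrary. With the real client I'd expect
the exception to be a javax.jms.ResourceAllocationException, but treat
that as an assumption and check what your client actually throws.

```java
// Sketch only: SendRetry and SendAttempt are hypothetical names, not ActiveMQ API.
final class SendRetry {
    /** Stand-in for one producer.send(message) call that may fail. */
    interface SendAttempt {
        void send() throws Exception;
    }

    /**
     * Tries the send up to maxAttempts times, sleeping between failures and
     * doubling the delay each time. Returns the number of attempts used, or
     * rethrows the last failure if every attempt failed.
     */
    static int sendWithBackoff(SendAttempt attempt, int maxAttempts, long initialDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        for (int i = 1; ; i++) {
            try {
                attempt.send();
                return i;
            } catch (Exception e) {
                if (i >= maxAttempts) {
                    throw e;            // out of attempts: the caller decides what to do
                }
                Thread.sleep(delay);    // back off before retrying
                delay *= 2;
            }
        }
    }
}
```

Note what this does to your test: once producers retry, the messages
that were being dropped are held on the producer side and offered to the
broker again, so the load you shed by failing fast comes back. That is
why the setting on its own isn't a guarantee against OOM.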

Thanks,
Thiago.

Ps.: I think this is my first message here. :)
