Just want to add my experience with issues like this, though I'm still at the learning level with Artemis. Watch out for a delivering count on an address that never gets ACKs, as in our experience this has always meant a problem with the consumer or a poison message. Also keep an eye on redeliveries and/or duplicates, since these can degrade performance over time if the redelivery settings effectively allow infinite redelivery.
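For reference, the kind of address-settings I mean looks roughly like this in broker.xml (element names should be right, but the values below are only illustrative, not a recommendation):

    <address-settings>
       <address-setting match="#">
          <!-- cap redeliveries so a poison message cannot loop forever -->
          <max-delivery-attempts>5</max-delivery-attempts>
          <!-- brief pause between redelivery attempts -->
          <redelivery-delay>1000</redelivery-delay>
          <!-- once attempts are exhausted the message is moved here -->
          <dead-letter-address>DLQ</dead-letter-address>
       </address-setting>
    </address-settings>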
On Tue, Jan 3, 2023 at 8:06 AM Cezary Majchrzak <cezary.majchrza...@gmail.com> wrote:

> John,
> It seems to me that this is not the reason. If it were an issue of slow or hung consumers, we would see it in the thread dumps.
>
> Justin,
> Answering your questions:
>
> - We are aware of this version difference and have prepared a new version of the application that upgrades spring-boot-starter-artemis to match the broker version, although we have not yet deployed these changes to the environment.
>
> - We haven't tried this yet, mainly because of concerns about high memory consumption. One of the consumers of large messages pulls messages from the queue at roughly one third of the rate at which they are produced.
>
> - We only use CORE clients, and we set this parameter because we overlooked the fact that it only applies to AMQP clients. Thanks for pointing this out.
>
> - Yes, we collected thread dumps from the broker (back when it was still on version 2.22.0) when this problem occurred. I am not sure whether these dumps indicate that the broker is working correctly; please help me analyze them. I attach the dumps to this message.
>
> - I was not very precise, sorry about that. All services publish/consume to/from a single address that has multiple multicast queues. Some of these queues (the ones that large messages fall into after filtering) have the problems described, while others work just fine.
>
> - The services in our system consume messages from the queue, execute business logic and finally publish the message to an address. We want to make sure that any errors that may occur along the way cause the message to be rolled back and possibly re-processed.
>
> Thanks,
> Cezary
>
> On Tue, Jan 3, 2023 at 03:19 Justin Bertram <jbert...@apache.org> wrote:
>
>> Couple of questions:
>>
>> - Version 2.6.13 of the spring-boot-starter-artemis Maven component uses artemis-jms-client 2.19.1. Have you tried upgrading this to a later version?
>> - Have you tried adjusting the minLargeMessageSize URL parameter on your clients so that *no* message is actually considered "large"? This would use more memory on the broker and therefore wouldn't necessarily be recommended, but it would be worth testing to conclusively isolate the problem to "large" messages.
>> - I see that you tried adjusting amqpMinLargeMessageSize, but that only applies to clients using AMQP. Are you using any AMQP clients? I'm guessing you aren't since you didn't see any change in behavior after adjusting that parameter.
>> - Have you collected any thread dumps from the broker once a consumer stops receiving messages? If so, what did they show? If not, could you?
>> - Can you elaborate on what kind of and how many destinations you're using? You talk about some queues operating normally while other queues are having problems, but you also say that you're only using "one topic."
>> - Is there a specific reason you're using transacted sessions?
>>
>> Justin
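A quick side note on the minLargeMessageSize suggestion above: for CORE clients it is a URL parameter, so in a Spring Boot setup the test Justin describes would look roughly like this (host, port and the 1 MiB threshold are only placeholders; the idea is to pick a threshold larger than any message you send):

    # example only: the CORE URL parameter appended to the broker URL
    spring.artemis.broker-url=tcp://broker-host:61616?minLargeMessageSize=1048576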
>> On Mon, Jan 2, 2023 at 12:17 PM Cezary Majchrzak <cezary.majchrza...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> We are observing strange communication problems with the ActiveMQ Artemis broker in our system. When the problem occurs, the JmsListener stops receiving further messages even though consuming previously worked perfectly. The problem can occur on several queues while others keep working properly at the same time. The Artemis management panel then shows deliveringCount > 0 on the problematic queues, and this value does not change. The consumer count at this time is non-zero. Restarting the broker or the message-consuming services does not always help. Sometimes messages are consumed for a short time, after which the problem reappears. We noticed that this happens only when sending large messages (about 250 KB in size; Artemis stores them at roughly twice that size due to encoding). Problematic queues process large and small messages, or only large messages. Queues that work properly process only small messages. At the same time, the problem does not occur with every send of large messages. We use message grouping, assigning each message a UUID at the beginning of processing, which is then used as the group identifier. We wonder whether the large number of such groups (sometimes even several million new messages per day) can have a significant impact on memory consumption.
>>>
>>> *Artemis configuration*
>>> - Single instance of the ActiveMQ Artemis broker (configured for master-slave operation, but only one instance is enabled).
>>> - The broker is running on AlmaLinux 8.4.
>>> - Artemis version is 2.27.1 (updated from version 2.22.0, where the problem also occurred).
>>> - The broker.xml configuration file is attached.
>>> - One topic (omitting DLQ and ExpiryQueue) for which queues are created with appropriate filters.
>>>
>>> *Application side configuration*
>>> - Spring Boot version 2.6.13 with spring-boot-starter-artemis.
>>> - Subscriptions configured as durable and shared.
>>> - Sessions are transacted.
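To make sure I follow the setup described above, a minimal sketch of that application-side configuration in Spring might look something like this (class, bean and destination names are made up for illustration; this is my reading of the setup, not the actual code):

    import java.util.UUID;
    import javax.jms.ConnectionFactory;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.jms.config.DefaultJmsListenerContainerFactory;
    import org.springframework.jms.core.JmsTemplate;

    @Configuration
    public class JmsConfig {

        // Durable, shared subscriptions on a topic with transacted sessions,
        // matching the "Application side configuration" listed above.
        @Bean
        public DefaultJmsListenerContainerFactory topicListenerFactory(ConnectionFactory cf) {
            DefaultJmsListenerContainerFactory factory = new DefaultJmsListenerContainerFactory();
            factory.setConnectionFactory(cf);
            factory.setPubSubDomain(true);         // multicast address / topic
            factory.setSubscriptionDurable(true);
            factory.setSubscriptionShared(true);
            factory.setSessionTransacted(true);    // rollback triggers redelivery
            factory.setConcurrency("1-5");         // dynamic consumer scaling
            return factory;
        }

        // Producer side: a fresh UUID per message as the group id, as described
        // in the problem report above.
        public void send(JmsTemplate jmsTemplate, Object payload) {
            jmsTemplate.convertAndSend("example.address", payload, message -> {
                message.setStringProperty("JMSXGroupID", UUID.randomUUID().toString());
                return message;
            });
        }
    }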
>>> *What have we tried to solve the issue*
>>> - The JmsListener used a container with dynamic scaling of the number of consumers, and caching of consumers was enabled. We thought this might pose a problem for a broker trying to deliver messages to consumers that no longer existed. We disabled caching of consumers and set the maxMessagePerTask property; unfortunately this did not solve the problem.
>>> - We tried changing Spring Boot's CachingConnectionFactory to the JmsPoolConnectionFactory from the library https://github.com/messaginghub/pooled-jms, but again the problem was not solved.
>>> - We took thread dumps in the services to make sure that processing does not get stuck while executing business logic and interacting with external services. All threads of type JmsListenerEndpointContainer are in the TIMED_WAITING state, and the stack trace indicates that they are waiting for messages from the broker in the receive method of the class org.apache.activemq.artemis.core.client.impl.ClientConsumerImpl.
>>> - We updated the broker to the latest version, 2.27.1, but the same problem still occurs.
>>> - We tried changing the parameters of the acceptor in the broker.xml file, such as amqpMinLargeMessageSize (despite changing this parameter, messages smaller than the declared size continue to be treated as large by the broker), remotingThreads and directDeliver. There was no apparent effect on broker performance.
>>> - TCP dumps of the network traffic between the broker and the services consuming the messages show that the network communication is established and some data is sent from the broker.
>>> - We have changed the broker settings related to memory. Previously, the host had 32GB of RAM, the Artemis process was configured with the JVM -Xms and -Xmx parameters equal to 26GB, and the global-max-size parameter was left at its default. We noticed that during a heavy load of large messages, in addition to the problem of messages not being consumed, the host would sometimes restart due to out-of-memory errors. For this reason, we increased the amount of RAM available to the host to 64GB, set the -Xms and -Xmx parameters to 50G, and changed global-max-size to 10G as recommended by https://activemq.apache.org/components/artemis/documentation/latest/perf-tuning.html. The broker seemed to work more stably (one day it processed about 3 million large messages without any problems); unfortunately, after about a week of operation the problem of messages not being consumed returned. I have attached graphs of memory consumption during one such incident below and have numbered on them the consecutive times when we restarted the broker (coinciding with high GC time and a high committed-memory value). During the first three restarts, consuming resumed only for a moment and then stopped again. After the fourth restart, consuming started working properly and all the messages came off the queues.
>>>
>>> [image: memory_dump_1.png]
>>>
>>> [image: memory_dump_2.png]
>>>
>>> Similar symptoms have been described here <https://stackoverflow.com/questions/74792977/no-data-being-sent-to-consumers-even-though-connection-and-session-are-created>, but the proposed solutions do not seem to apply to us. Please share any ideas on how to solve the problem.
>>>
>>> Many thanks,
>>> Cezary Majchrzak
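One more note on the memory settings described above, mostly for other readers: the two values live in different places in the broker instance, roughly like this (the paths assume a standard "artemis create" instance layout, and the numbers are just the ones from this thread, not a recommendation):

    # etc/artemis.profile: heap of the broker JVM (other existing flags omitted here)
    JAVA_ARGS="-Xms50G -Xmx50G"

    <!-- etc/broker.xml, inside <core>: overall memory the broker may use for
         messages before the address-full-policy (PAGE by default) kicks in;
         shown in bytes (10 GiB) to avoid any doubt about size-suffix support -->
    <global-max-size>10737418240</global-max-size>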