[
https://issues.apache.org/jira/browse/AMQCPP-760?focusedWorklogId=990624&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-990624
]
ASF GitHub Bot logged work on AMQCPP-760:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 06/Nov/25 19:34
Start Date: 06/Nov/25 19:34
Worklog Time Spent: 10m
Work Description: Nautilus009 opened a new pull request, #22:
URL: https://github.com/apache/activemq-cpp/pull/22
In ActiveMQ-CPP 3.9.5, several methods in
activemq/core/ActiveMQMessageAudit.cpp subtract against Integer::MAX_VALUE
when normalizing the ProducerSequenceId. Once the sequence exceeds
2,147,483,647, the subtraction yields a negative scaled index, which causes
the bit array (BitSet) access to throw or behave incorrectly. As a result,
messages, especially Advisory messages, are incorrectly marked as duplicates
and discarded under failover conditions after long broker uptime.
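The arithmetic behind this can be illustrated with a small sketch. It does not
reproduce the code in ActiveMQMessageAudit.cpp or the change in pull request
#22; the function names naiveScaledIndex and windowedIndex, the "maximum minus
sequence" form of the subtraction, and the window size are all assumptions
made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the pattern described above: a 64-bit sequence ID
// is normalized by subtracting it from the 32-bit signed maximum.
// This is NOT the actual ActiveMQ-CPP code.
inline std::int64_t naiveScaledIndex(std::int64_t sequenceId) {
    const std::int64_t maxInt = std::numeric_limits<std::int32_t>::max();
    return maxInt - sequenceId;  // negative once sequenceId > 2,147,483,647
}

// One possible safe alternative (an assumption, not the fix in the pull
// request): fold the sequence ID into a fixed window with a modulo that is
// adjusted to stay non-negative.
inline std::int64_t windowedIndex(std::int64_t sequenceId, std::int64_t windowSize) {
    const std::int64_t r = sequenceId % windowSize;
    return r < 0 ? r + windowSize : r;
}
```

With these definitions, naiveScaledIndex(2147483648LL) is negative and could
not be used as a bit-array position, while windowedIndex(2147483648LL, 2048)
stays within the range 0 to 2047.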
Issue Time Tracking
-------------------
Worklog Id: (was: 990624)
Remaining Estimate: 0h
Time Spent: 10m
> ActiveMQ-CPP 3.9.5 Failover Timeout Issue with Advisory Topic in PROD
> ---------------------------------------------------------------------
>
> Key: AMQCPP-760
> URL: https://issues.apache.org/jira/browse/AMQCPP-760
> Project: ActiveMQ C++ Client
> Issue Type: Bug
> Components: Transports
> Environment: * {*}ActiveMQ-CPP Version{*}: 3.9.5
> * {*}ActiveMQ Broker Version{*}: 5.17.3
> * {*}Broker Hosts{*}: cca-prdappqua01.icc.corp:20801,
> cca-prdappqua02.icc.corp:20801
> * {*}Client Host{*}: cca-prdappbta05.icc.corp
> * {*}OS{*}: RHEL 7.9
> * {*}AMQ_URL{*}:
> "failover:(tcp://cca-prdappqua01.icc.corp:20801,tcp://cca-prdappqua02.icc.corp:20801)?maxReconnectAttempts=10&initialReconnectDelay=1000&timeout=30000&randomize=false"
> * {*}Demo Application{*}: C++ program (demo_sys_amq_modules.exe) subscribing
> to ActiveMQ.Advisory.Consumer.Queue.101 with a 30-second receive timeout.
> (attached)
> Reporter: Oren
> Priority: Major
> Fix For: 3.9.6
>
> Attachments: activemq.log, activemq.xml, demo_sys_amq_modules.cpp
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> h2. Summary
> We are experiencing a timeout issue in our PROD environment when using the
> ActiveMQ-CPP 3.9.5 client with a {{failover:}} URI, specifically when
> subscribing to the advisory topic
> {{{}ActiveMQ.Advisory.Consumer.Queue.101{}}}. The issue persists despite
> upgrading from 3.9.3 (where AMQCPP-610 was suspected) and correcting a typo
> in the {{{}AMQ_URL{}}}. The same code works in non-PROD environments
> (TEST/BETA) and in PROD with a direct {{tcp://}} URI.
> h2. Issue Description
> The demo times out in PROD when using the failover: protocol to receive
> advisory messages for Queue.101. Key observations:
> * Works in TEST/BETA with identical broker settings and failover: URI
> (maxReconnectAttempts=10).
> {noformat}
> [BETA l_onissa@cca-betappbta01 cca_domain]$
> AMQ_URL="failover:([tcp://cca-betappqua01.icc.corp:20801])"
> [BETA l_onissa@cca-betappbta01 cca_domain]$ demo_sys_amq_modules.exe 101
> ActiveMQ-CPP initialized. Connecting to
> failover:([tcp://cca-betappqua01.icc.corp:20801])
> Waiting for advisory messages on: ActiveMQ.Advisory.Consumer.Queue.101 ...
> Received advisory message of type: Advisory{noformat}
> * Works in PROD with direct [tcp://cca-prdappqua01.icc.corp:20801].
> {noformat}
> [PROD l_onissa@cca-prdappbta05 cca_domain]$
> AMQ_URL="[tcp://cca-prdappqua01.icc.corp:20801]"
> [PROD l_onissa@cca-prdappbta05 cca_domain]$ demo_sys_amq_modules.exe 101
> ActiveMQ-CPP initialized. Connecting to [tcp://cca-prdappqua01.icc.corp:20801]
> Waiting for advisory messages on: ActiveMQ.Advisory.Consumer.Queue.101 ...
> Received advisory message of type: Advisory{noformat}
> * Fails in PROD with failover: even after upgrading to 3.9.5 (to address
> [AMQCPP-610]) and correcting a typo in the original AMQ_URL (the misspelled
> maxReconnectAttemps=0 was not recognized, so maxReconnectAttempts fell back
> to its default of -1).
> {noformat}
> [PROD l_onissa@cca-prdappbta05 cca_domain]$
> AMQ_URL="failover:([tcp://cca-prdappqua01.icc.corp:20801])"
> [PROD l_onissa@cca-prdappbta05 cca_domain]$ demo_sys_amq_modules.exe 101
> ActiveMQ-CPP initialized. Connecting to
> failover:([tcp://cca-prdappqua01.icc.corp:20801])
> Waiting for advisory messages on: ActiveMQ.Advisory.Consumer.Queue.101 ...
> No advisory message received (timeout).{noformat}
> * Java-based clients work reliably in PROD with failover:.
> h2. Relevant Broker Log Entries
> From activemq.log on cca-prdappqua01 (ActiveMQ 5.17.3); the full file is
> attached:
> {noformat}
> 2025-05-19 21:21:23,839 | WARN | TopicSubscription:
> consumer=ID:cca-prdappbta07.icc.corp-14259-1747678799256-0:0:-1:1, ...
> dispatched=1000, delivered=0, matched=1001, ... has twice its prefetch limit
> pending, without an ack; it appears to be slow: [tcp://10.222.12.83:31247]
> 2025-05-19 17:31:22,146 | WARN | Transport Connection to:
> [tcp://10.222.12.74:25787] failed: Broken pipe (Write failed)
> 2025-05-24 20:25:15,567 | WARN | Transport Connection to:
> [tcp://10.222.12.84:33681] failed: Cannot send, channel has already
> failed{noformat}
> * Slow topic consumers on cca-prdappbta07, cca-prdappbta02, cca-prdappbta08
> suggest broker resource contention.
> * Failed connections indicate potential network instability affecting
> failover clients.
> h2. Suspected Causes
> # {*}Broker Overload{*}: Slow consumers may delay advisory message delivery,
> impacting failover clients.
> # {*}Network Issues{*}: Failed connections suggest network instability,
> disrupting failover retries or subscriptions.
> # {*}Advisory Message Absence{*}: Possible lack of consumer activity on
> Queue.101 in PROD.
> # {*}Residual Effects{*}: Earlier 3.9.3 clients with maxReconnectAttemps=0
> (defaulting to -1) may have stressed the broker, affecting current 3.9.5
> clients.
> # {*}Failover Transport{*}: Possible edge case in 3.9.5 failover handling
> for advisory topics.
> h2. Steps Taken
> * Upgraded from ActiveMQ-CPP 3.9.3 to 3.9.5 to address AMQCPP-610.
> * Corrected {{AMQ_URL}} typo (maxReconnectAttemps to
> maxReconnectAttempts=10).
> * Tested with direct {{tcp://}} URI (works) and {{failover:}} (times out).
> * Verified broker connectivity via telnet (successful).
> * Added consumer for {{Queue.101}} to trigger advisory messages (no success
> yet).
> * Enabled debug logging in the demo (logs pending).
> h2. Request for Assistance
> * Is there a known issue in ActiveMQ-CPP 3.9.5 with failover and advisory
> topic subscriptions under high broker load or network instability?
> * Could slow consumers (as seen in the log) affect failover client
> subscriptions? Any recommended configurations to mitigate this?
> * Are there additional failover parameters or patches in 3.9.5 to improve
> stability for advisory topics?
> * Suggestions for diagnosing broker-side issues (e.g., clearing stale
> connections without restart)?
> h2. Additional Information
> * {*}Broker Config{*}: advisorySupport is true (the default for 5.17.3; see
> attached activemq.xml).
> * {*}Compile Command{*}:
> {noformat}
> g++ -O2 -finline-functions -D_XOPEN_SOURCE=600 -m64 -fPIC -Wall \
>   -L/usr/lib64 -lactivemq-cpp \
>   -isystem /usr/include/activemq-cpp-3.9.5 \
>   -std=c++11 \
>   src/demo_sys_amq_modules.cpp -o target/bin/demo_sys_amq_modules.exe{noformat}
> * {*}Next Steps Planned{*}: Test with non-advisory topic, increase receive
> timeout to 60s, check activemq.xml for advisory settings, and consider
> upgrading to 3.10.0.
> Please advise on next steps or known issues. Thank you for your support!
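The quoted report notes that a misspelled URI option (maxReconnectAttemps
instead of maxReconnectAttempts) went unnoticed and the default was used
instead. A pre-flight check along the following lines could catch such typos
before the URI reaches the client. This is a sketch only: unknownOptions and
reportedFailoverOptions are hypothetical helpers, not part of ActiveMQ-CPP,
and the option list contains only the names that appear in this report, not
the full set the failover transport accepts.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Returns the option names in the outer query part of `uri` (the part after
// the closing parenthesis of a failover URI) that are not listed in `known`.
inline std::vector<std::string> unknownOptions(const std::string& uri,
                                               const std::set<std::string>& known) {
    std::vector<std::string> unknown;
    // Skip past the broker list so nested per-broker options are not scanned.
    const std::string::size_type close = uri.rfind(')');
    std::string::size_type pos =
        uri.find('?', close == std::string::npos ? 0 : close);
    if (pos == std::string::npos) return unknown;
    ++pos;
    while (pos < uri.size()) {
        std::string::size_type end = uri.find('&', pos);
        if (end == std::string::npos) end = uri.size();
        const std::string pair = uri.substr(pos, end - pos);
        const std::string key = pair.substr(0, pair.find('='));
        if (!key.empty() && known.count(key) == 0) unknown.push_back(key);
        pos = end + 1;
    }
    return unknown;
}

// Illustrative subset only: the option names used in the AMQ_URL from this
// report. Extend it from the client documentation before relying on it.
inline std::set<std::string> reportedFailoverOptions() {
    return {"maxReconnectAttempts", "initialReconnectDelay", "timeout", "randomize"};
}
```

Calling unknownOptions on a URI that ends in "?maxReconnectAttemps=0" with
this list returns the misspelled name, which the application can log or
reject at startup.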