[ 
https://issues.apache.org/jira/browse/ARTEMIS-4579?focusedWorklogId=901242&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-901242
 ]

ASF GitHub Bot logged work on ARTEMIS-4579:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/Jan/24 17:14
            Start Date: 23/Jan/24 17:14
    Worklog Time Spent: 10m 
      Work Description: jbertram commented on PR #4752:
URL: 
https://github.com/apache/activemq-artemis/pull/4752#issuecomment-1906543152

   > I need to find out if there's an old message in a queue that's been there 
for longer than a specified interval, and I'm not allowed to use DLQs for that, 
because it means a manual intervention on the broker if processing fails too 
many times.
   
   I'd really like to focus on exactly why you need to know if there's an old 
message in a queue that's been there for longer than a specified interval.
   
   As noted in my previous comment, in order to have a robust detection of 
stalled consumers (a.k.a. stuck messages) you _already_ need to look at 
multiple metrics. If you're relying solely on a metric like `firstMessageAge` 
you're liable to get false positives. It's just not a good solution. I've long 
considered deprecating these metrics because folks tend to misunderstand and 
misuse them.
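
   To make the multi-metric point concrete, here is a minimal Java sketch. Which 
metrics to combine is an assumption made for illustration (the comment above does 
not list them): the `Sample` class, its fields, and the `looksStalled` rule are 
hypothetical and not part of the Artemis API. Two samples taken some time apart 
are compared, and the queue is flagged only when several signals agree rather 
than on the head message's age alone.

```java
final class StalledQueueCheck {

    /** One sample of a queue's metrics. Field names are illustrative only. */
    static final class Sample {
        final long messageCount;
        final int consumerCount;
        final long messagesAcknowledged;
        final long firstMessageAgeMillis;

        Sample(long messageCount, int consumerCount,
               long messagesAcknowledged, long firstMessageAgeMillis) {
            this.messageCount = messageCount;
            this.consumerCount = consumerCount;
            this.messagesAcknowledged = messagesAcknowledged;
            this.firstMessageAgeMillis = firstMessageAgeMillis;
        }
    }

    /** Flags a queue only when all four signals point the same way. */
    static boolean looksStalled(Sample previous, Sample current, long maxAgeMillis) {
        boolean backlog = current.messageCount > 0;
        boolean hasConsumers = current.consumerCount > 0;
        // No acknowledgements between the two samples means no progress.
        boolean noProgress = current.messagesAcknowledged == previous.messagesAcknowledged;
        boolean oldHead = current.firstMessageAgeMillis > maxAgeMillis;
        return backlog && hasConsumers && noProgress && oldHead;
    }

    public static void main(String[] args) {
        Sample before = new Sample(10, 1, 100, 50_000);
        Sample stuck = new Sample(10, 1, 100, 120_000);  // old head, no acks since 'before'
        Sample busy = new Sample(10, 1, 250, 120_000);   // old head, but acks are advancing

        System.out.println(looksStalled(before, stuck, 60_000)); // true
        System.out.println(looksStalled(before, busy, 60_000));  // false
    }
}
```

   The `busy` sample is the case where an age-only alert would fire even though 
consumers are still acknowledging messages.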
   
   Keep in mind that we have [slow consumer 
detection](https://activemq.apache.org/components/artemis/documentation/latest/slow-consumers.html#detecting-slow-consumers)
 built into the broker.
   
   > I know this method may be heavy but so is our workflow here, which is 
pretty much based on wasting resources in exchange for usability and easy 
maintenance.
   
   The problem, as I see it, is that by implementing this method you're going 
to force this wasting of resources on other users who likely don't want it. As 
noted previously, it's very common for JMX monitoring tools to fetch the values 
of every attribute on a given MBean. By adding this method you're implicitly 
impacting that use-case. Lots of folks use JMX monitoring tools that scan 
MBeans more than once per minute.
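
   The effect can be reproduced with plain `javax.management`. The sketch below 
is an illustration under stated assumptions, not Artemis code: `DemoQueueMBean` 
and `DemoQueue` are hypothetical stand-ins for a queue control, and the 
"expensive" getter only counts its own invocations. The scan loop mimics a 
monitoring tool that reads every readable attribute it finds in the MBean's 
metadata.

```java
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServer;
import javax.management.MBeanServerFactory;
import javax.management.ObjectName;

final class AttributeScanDemo {

    // Hypothetical management interface; it is not Artemis' QueueControl.
    public interface DemoQueueMBean {
        long getMessageCount();
        long getFirstScheduledMessageAge();
    }

    public static final class DemoQueue implements DemoQueueMBean {
        final AtomicInteger expensiveCalls = new AtomicInteger();

        @Override
        public long getMessageCount() {
            return 42L; // cheap: a stored counter
        }

        @Override
        public long getFirstScheduledMessageAge() {
            // Stand-in for a costly lookup; here we only count the calls.
            expensiveCalls.incrementAndGet();
            return 0L;
        }
    }

    /** Reads every readable attribute 'scans' times, as a generic JMX tool might. */
    static int expensiveCallsAfterScans(int scans) throws Exception {
        MBeanServer server = MBeanServerFactory.newMBeanServer();
        ObjectName name = new ObjectName("demo:type=Queue,name=example");
        DemoQueue queue = new DemoQueue();
        server.registerMBean(queue, name);

        for (int i = 0; i < scans; i++) {
            for (MBeanAttributeInfo attr : server.getMBeanInfo(name).getAttributes()) {
                if (attr.isReadable()) {
                    server.getAttribute(name, attr.getName());
                }
            }
        }
        return queue.expensiveCalls.get();
    }

    public static void main(String[] args) throws Exception {
        // The expensive getter runs once per scan, although nobody asked for it by name.
        System.out.println(expensiveCallsAfterScans(3));
    }
}
```

   Because the getter is exposed as an attribute, the tool pays its cost on every 
scan whether or not anyone alerts on the value.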




Issue Time Tracking
-------------------

    Worklog Id:     (was: 901242)
    Time Spent: 1h 50m  (was: 1h 40m)

> Add the *FirstMessage* API for scheduled messages
> -------------------------------------------------
>
>                 Key: ARTEMIS-4579
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-4579
>             Project: ActiveMQ Artemis
>          Issue Type: Improvement
>          Components: API
>    Affects Versions: 2.31.2
>            Reporter: Jan Šmucr
>            Priority: Major
>             Fix For: 2.32.0
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Alerting on issues with messages not being received properly for a period of 
> time is not an easy task. We use the {{getFirstMessageAge()}} method to 
> trigger alerts in Zabbix, and it works as long as there are no consumers.
> But this approach fails when there are consumers repeatedly failing to 
> receive a message. That message keeps getting scheduled for redelivery, 
> and even though there is still an old message in the queue to be 
> reported, it's no longer visible via the {{getFirstMessage*()}} API.
> The goal here is to add a set of functions working with messages scheduled 
> for delivery:
> {noformat}
> getFirstScheduledMessageAsJSON()
> getFirstScheduledMessageTimestamp()
> getFirstScheduledMessageAge()
> {noformat}
> It may not be the most efficient approach, but it's quite a convenient one, 
> especially when monitoring a wide set of queues, each with its own set of 
> alerts.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
