> On Jan 25, 2022, at 4:36 PM, Endre Stølsvik <en...@stolsvik.com> wrote:
>
> Wrt. the "next": I would argue that if one does not see value of aligning
> with Artemis, i.e. firstMessageTimestamp, then *head* is much better.
> "Head" is, AFAIK, a quite common terminology of the "next to be dequeued"
> in queue contexts (check JavaDoc of Deque), and in functional contexts, a
> (linked) list is often divided in a head (first element) and tail (the
> rest). I would argue that this name is pretty self-explanatory and
> intuitive. (It also happens to align with RabbitMQ, which if not very
> weighty in itself, shouldn't be particularly negative! - I will argue that
> trying to find common terminology wrt. such concepts in a MQ context is a
> net positive.)
Naming is fun, let’s get the core constructs down and then we can see how the
names look and if it makes sense. ‘Head’ makes sense to me.
> With information about the timestamp of the *last* message in the queue as
> well as the timestamp of the head message, you can calculate the "span of
> time" currently on the queue. With the existing information about the depth
> (length/size), and the new ability to calculate age of head, you have a
> good view of the queue situation.
Yep.
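
To make the "window view" arithmetic concrete, here is a minimal sketch in Java. The class and method names (QueueWindow, headAgeMillis, spanMillis) are hypothetical and not part of ActiveMQ. It assumes timestamps are epoch milliseconds and that null means "no message on the queue".

```java
import java.util.OptionalLong;

/** Hypothetical helper (not an ActiveMQ class): the "active window" of a queue. */
final class QueueWindow {
    private final Long headTimestamp; // brokerInTime of the head message, null if the queue is empty
    private final Long tailTimestamp; // brokerInTime of the last message, null if the queue is empty

    QueueWindow(Long headTimestamp, Long tailTimestamp) {
        this.headTimestamp = headTimestamp;
        this.tailTimestamp = tailTimestamp;
    }

    /** Age of the head message relative to the broker's clock at capture time. */
    OptionalLong headAgeMillis(long brokerTimestamp) {
        return headTimestamp == null
                ? OptionalLong.empty()
                : OptionalLong.of(brokerTimestamp - headTimestamp);
    }

    /** Span of time currently on the queue; 0 for an empty queue or a single message. */
    long spanMillis() {
        if (headTimestamp == null || tailTimestamp == null) {
            return 0L;
        }
        return tailTimestamp - headTimestamp;
    }

    public static void main(String[] args) {
        QueueWindow window = new QueueWindow(1_000L, 4_000L);
        System.out.println("head age: " + window.headAgeMillis(10_000L));
        System.out.println("span: " + window.spanMillis());
    }
}
```

Using the broker's own timestamp for the age calculation, rather than the monitoring client's clock, avoids mixing in clock skew between the two machines.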
> I still do not see the point of the "first message on destination since
> boot of ActiveMQ": What is the use of this? Why would I in an operational
> context care? But of course, if it has zero cost, then why not! ;-)
The first-ever message tells you how long the destination has been processing
messages since the broker booted. Head and Tail only describe the active
window.
> I also still don't quite understand the value of knowing any timestamps of
> messages that *have been* on the queue, i.e. "last to be dispatched" or "..
> dequeued", or "last to be enqueued". If there are no messages on the queue,
> then there is no timestamp for head or last. That indeed also means that
> there is no "timespan" on the queue, and the *current* latency is 0. If the
> head == last, i.e. 1 message, then the span is also 0, which also makes
> sense - but that one message might be old.
This tells you the last time a message was consumed. This is useful when the
queue is empty. Again, this is about total life of the queue, not just the
active window.
> Evaluating the time it took to process the last message that was processed
> often bears little information about the throughput of messages: One
> message might be heavy to process, while the next is light and fast. In
> addition, we have GC pauses and similar that can make single message
> processing times differ wildly.
Currently, ActiveMQ has min, max and average processing times. The ability to
view how the last message performed gives you snapshot info on the current
consumer(s). Are you inching towards a slow consumer situation that could
trigger a slow consumer policy, i.e. abort the connection?
This also aligns w/ IBM MQ, which has LAST_GET_DATE/TIME, and would ease the
transition for users coming from it.
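
As a rough illustration of the "last message" snapshot, here is a sketch that assumes the draft field names from the metrics list quoted further down (lastDequeuedTimestamp, lastDequeuedMessageTimestamp). The class, the method names and the warning threshold are hypothetical, and this is not how ActiveMQ's slow consumer policies are actually implemented.

```java
/** Hypothetical helper: latency of the most recently dequeued message. */
final class LastMessageLatency {
    /**
     * @param lastDequeuedTimestamp        wall-clock time when the message was dequeued
     * @param lastDequeuedMessageTimestamp brokerInTime of that same message
     * @return time the message spent on the broker, in milliseconds
     */
    static long latencyMillis(long lastDequeuedTimestamp, long lastDequeuedMessageTimestamp) {
        return lastDequeuedTimestamp - lastDequeuedMessageTimestamp;
    }

    /** True if the last message took longer than the (assumed) warning threshold. */
    static boolean exceedsWarnThreshold(long latencyMillis, long warnThresholdMillis) {
        return latencyMillis > warnThresholdMillis;
    }

    public static void main(String[] args) {
        long latency = latencyMillis(12_500L, 10_000L);
        System.out.println(latency + " ms, warn=" + exceedsWarnThreshold(latency, 2_000L));
    }
}
```

A single sample like this is noisy, so it would presumably be read alongside the existing min, max and average values rather than instead of them.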
> If you keep the lastEnqueuedMessageTimestamp on the destination, then it
> will be very fast to find the value for lastMessageTimestamp: If there are
> more than 0 messages on the queue, then return that directly; if there are 0
> messages on the queue, then return "null".
This metric is different from the current ‘head’ and ‘tail’ metrics. It is
about the entire life of the queue. If there haven’t been messages on a queue
for hours, you’d want to know that so you can track down why producers aren’t
sending.
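
A sketch of how a monitor might use such a lifetime metric to spot quiet producers. The class name, the method name and the idle tolerance are hypothetical. Treating "nothing enqueued since boot" as idle is an assumption of this sketch, not something settled in this thread.

```java
import java.time.Duration;

/** Hypothetical check on the "lifetime view": has anything been enqueued recently? */
final class ProducerIdleCheck {
    /**
     * @param lastEnqueuedMessageTimestamp brokerInTime of the most recently enqueued message,
     *                                     or null if nothing has been enqueued since broker start
     * @param now                          current time in epoch milliseconds
     * @param maxIdle                      how long a quiet queue is tolerated (an assumption)
     */
    static boolean producersLookIdle(Long lastEnqueuedMessageTimestamp, long now, Duration maxIdle) {
        if (lastEnqueuedMessageTimestamp == null) {
            return true; // nothing seen since boot; treated as idle in this sketch
        }
        return now - lastEnqueuedMessageTimestamp > maxIdle.toMillis();
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long threeHoursAgo = now - Duration.ofHours(3).toMillis();
        System.out.println(producersLookIdle(threeHoursAgo, now, Duration.ofHours(2)));
    }
}
```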
> There is a bit of a point to not include *too* much information in those
> statistics reply messages - the set of datapoints should have an
> operational value. I have several thousand queues, and query for ">", so I
> will get several thousand reply messages. I'd rather query more often, than
> get dozens of datapoints that are of little value.
The idea is to re-use the metrics from the destination for the plugin. It
should do very little actual ‘work’.
> My current ideal set of properties would be, in order of importance:
> 1. headMessageTimestamp (with the specification that it is "null" if there
> are no messages)
> 1b. brokerTimestamp (already present, as JMSTimestamp) - to calculate age
> at time of capture, but both timestamp and age are relevant. (You may also
> get an understanding of time skew between you and the broker)
> 2. size of queue (already present)
> 3. lastMessageTimestamp (with the specification that it is "null" if there
> are no messages)
I think we could add ‘head’ and ‘tail’ concepts to distinguish the ‘window
view’ from the ‘lifetime view’.
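
To show how the ‘window view’ values could be consumed, here is a sketch of a monitor-side classification using only the head timestamp, the broker timestamp and the queue size. The class, enum and threshold parameters are hypothetical. Because the values may be gathered asynchronously, the sketch treats a missing head timestamp as "empty" regardless of the reported size.

```java
/** Hypothetical monitor-side classification using only the "window view" values. */
final class QueueHealth {
    enum State { CLEAR, CONSUMER_STUCK_OR_SLOW, BACKLOG }

    /**
     * @param headMessageTimestamp brokerInTime of the head message, null if the queue is empty
     * @param brokerTimestamp      broker clock at capture time
     * @param size                 current queue depth
     * @param maxHeadAgeMillis     assumed alert threshold for the head message's age
     * @param backlogSize          assumed depth above which an old head is attributed to a backlog
     */
    static State classify(Long headMessageTimestamp, long brokerTimestamp, long size,
                          long maxHeadAgeMillis, long backlogSize) {
        if (headMessageTimestamp == null) {
            return State.CLEAR; // no head message: consumers are keeping up
        }
        long headAge = brokerTimestamp - headMessageTimestamp;
        if (headAge <= maxHeadAgeMillis) {
            return State.CLEAR;
        }
        // An old head with a deep queue suggests messages age while waiting;
        // an old head with a shallow queue suggests a stuck or slow consumer.
        return size >= backlogSize ? State.BACKLOG : State.CONSUMER_STUCK_OR_SLOW;
    }

    public static void main(String[] args) {
        System.out.println(classify(1_000L, 10_000L, 3L, 5_000L, 100L));
    }
}
```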
> I could change the PR to reflect the wanted name change for
> firstMessageTimestamp (to headMessageTimestamp or whatever you would want),
> but it will be a bit heavier for me to dive deeper down into the ActiveMQ
> internals to include those other datapoints.
I’ve got some other Destination enhancements planned for 5.16.x.. I’m happy to
incorporate your concepts in a comprehensive ’statistics update’ PR.
> At any rate, I will be happy no matter what things are named, if I only get
> the "BrokerInTime" timestamp of the current first message on the
> destination, and that this is "null" (not set/0) if there are no messages on
> the destination!
Let me update the requirements draft w/ a new set to include ‘head’ and ’tail’
values.
> PS: If you do not merge my PR, then I hope that you could include the other
> small change in the PR, i.e. the improvement of how to request
> "null-termination" of the replies, as I found the existing solution to be
> quite hard to decipher how to enable! Setting a query message property to
> enable features of the query was much simpler than making a long and
> somewhat convoluted query destination.
Let’s revisit this once we’ve settled on the design for destination metrics.
Perhaps you could submit a new PR once those changes are in.
> PPS: Note that in the current implementation in the PR, the gathering of
> the information is async, and it is thus possible that size=1, while
> firstMessageTimestamp is "null" - and also the opposite, that size=0, while
> firstMessageTimestamp is set. It will both be hard and probably pretty
> costly to make this consistent (due to synchronization/blocking), and in an
> operational monitoring setting, it doesn't matter at all: Those values are
> evaluated independently, and any transient skew is of no importance.
Not sure about null.. maybe we’ll need to use 0 or -1 to indicate some
“non-value” concept.
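
One way a client could stay indifferent to that choice is sketched below. It assumes the reply has been copied into a plain Map and treats a missing key, 0 and negative values alike as "no value". The class name and the field name used in main are illustrative only, not a settled part of the statistics reply.

```java
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical client-side decoding of a timestamp field that may carry a sentinel. */
final class TimestampField {
    /** Returns the timestamp, or empty if the field is absent, non-numeric, 0 or negative. */
    static OptionalLong read(Map<String, ?> reply, String key) {
        Object value = reply.get(key);
        if (!(value instanceof Number)) {
            return OptionalLong.empty();
        }
        long timestamp = ((Number) value).longValue();
        return timestamp > 0 ? OptionalLong.of(timestamp) : OptionalLong.empty();
    }

    public static void main(String[] args) {
        Map<String, Object> reply = Map.of("size", 0L, "headMessageTimestamp", -1L);
        System.out.println(read(reply, "headMessageTimestamp").isPresent());
    }
}
```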
> PPPS: Another thing to consider at a later point, is to include good
> metrics, as in Prometheus/Grafana-style. The current solution for ActiveMQ
> seems to be to hook on a JMX exporter, but adding this natively to ActiveMQ
> would probably give value. Spring's Micrometer is pretty nice, selling
> itself as "slf4j for metrics" / "Vendor-neutral application metrics
> facade": If you instrument using Micrometer, you can get metrics gathering
> by Prometheus, but also 18 other such solutions. https://micrometer.io/. By
> exposing all the information we've discussed as meters, you could really
> get a good view of throughput, queue sizes, latencies etc.
Metrics collection systems are like popular music acts: there is a new ‘best’
every few months ;-)
I think as long as we provide standard ways to access the data, incorporating
them into any system is straight-forward.
Thanks!
Matt Pavlovich
> On Tue, Jan 25, 2022 at 9:47 PM Matt Pavlovich <mattr...@gmail.com> wrote:
>
>> Good point on the ‘most recent dequeued’ message being a good marker. I
>> think ideally we wouldn’t have to browse the message or have a sliding
>> window of updated metrics for all messages, but avoiding both of those is
>> probably not possible.
>>
>> I think first “all time” and “next to dispatch” are useful enough to paint
>> the picture of flow health.
>>
>> Next draft of metrics:
>>
>> (‘next’ aka ‘head’: the next message to be dequeued; requires a browse)
>> nextMessageEnqueuedTimestamp (from the message’s brokerInTime)
>> nextMessageID
>>
>> (‘last’ aka the most recent message to be ack’d — processed in-flight)
>> lastEnqueuedMessageTimestamp (from brokerInTime on the message)
>> lastDequeuedTimestamp (clock timestamp of when the message was dequeued)
>> lastDequeuedMessageTimestamp (from brokerInTime on the message)
>> lastDequeuedMessageID
>>
>> (‘First’ aka first all time in the life of the destination since boot —
>> processed in-flight)
>> firstEnqueuedMessageTimestamp (from brokerInTime)
>> firstDequeuedTimestamp (wall clock time when the message was dequeued)
>> firstDequeuedMessageTimestamp (from brokerInTime)
>>
>> Calculations:
>> Queue processing latency can then be calculated as =
>> lastDequeuedTimestamp - lastDequeuedMessageTimestamp
>>
>> Oldest message age on the queue (when queueSize > 0):
>> current time - nextMessageEnqueuedTimestamp
>>
>> Thoughts?
>>
>> -Matt Pavlovich
>>
>>> On Jan 25, 2022, at 1:11 PM, Endre Stølsvik <en...@stolsvik.com> wrote:
>>>
>>> Hi!
>>>
>>> Thanks for the positive feedback!
>>>
>>> From the issue AMQ-8463, I do not quite understand the terminology.
>>>
>>> The original name I came up with for this property was
>>> "headMessageBrokerInTime", to indicate that it was the "message at head of
>>> queue", and the *amqMessage.getBrokerInTime()* of that message.
>>> However, when looking at corresponding functionality in QueueControl(Impl)
>>> in Artemis, I found that the existing functionality there was called
>>> "firstMessageTimestamp", and thought that it would make sense to align the
>>> names when it is a new feature anyway.
>>> For comparison, in RabbitMQ, the corresponding feature's name is a mix of
>>> those two, "headMessageTimestamp" -
>>> https://www.rabbitmq.com/rabbitmqctl.8.html#head_message_timestamp, but you
>>> evidently want to install a small plugin to get it on all messages:
>>> https://github.com/rabbitmq/rabbitmq-message-timestamp, as there is
>>> something with small messages not hitting the message store and thus not
>>> actually getting the timestamp set in default config:
>>> https://github.com/deadtrickster/prometheus_rabbitmq_exporter/issues/15#issuecomment-376684478
>>>
>>> (From that last issue comment, this sentence resonates: *"The timestamp of
>>> the head message is the best metric available to evaluate the entire
>>> system's ability to keep up with demand."*
>>> .. but I would also add the queue length, which already is present in the
>>> Destination info from StatisticsBroker(Plugin) as 'size'. Thus, with these
>>> two properties available, it is possible to make a remote monitoring system
>>> for all destinations on ActiveMQ, which is ideal for what I want to
>>> accomplish!)
>>>
>>> So, back to the AMQ-8463, I am not sure what these two lines mean:
>>>
>>> > "firstMessageTimestamp - the timestamp of the first message to traverse
>>> > the destination"
>>> > "lastMessageTimestamp - the timestamp of the most recent message to
>>> > traverse the destination (aka 'message age' for messages still on the
>>> > queue)"
>>>
>>> If you by the first line mean "the timestamp of the first message to *ever
>>> traverse the destination in the lifetime of this ActiveMQ instance*", I
>>> believe one ends up with a different concept here than those other two
>>> brokers, and it does not accomplish any of my goals of knowing the
>>> liveliness of the queues.
>>>
>>> The second concept I would rather give an even more specific name, like
>>> "mostRecentDequeuedMessageTimestamp" or something like that, if I
>>> understood you correctly.
>>>
>>> HOWEVER: The big point of the property that I need and have implemented, is
>>> that you basically get two monitoring features in one go: If a message at
>>> the head of the queue is *old*, then it *either* means that the consumer(s)
>>> is stuck or slow, OR it means that there is a substantial queue, where the
>>> messages get old *while waiting on the queue*. By also getting the queue
>>> length, you can distinguish between those two cases.
>>>
>>> But crucially: If there *is no message at the head of the queue*, then from
>>> a monitoring perspective wrt. asserting that there is progress/liveliness
>>> on the queue (wrt. consumers), you are actually in the clear! (If you want
>>> to monitor that there are actually messages flowing at all, one may consider
>>> the already existing properties enqueueCount, dequeueCount, and/or
>>> dispatchCount.)
>>>
>>>
>>> *> "One side benefit of this approach is that you’d have the timestamp
>>> values to work with even when the queue is empty."*
>>>
>>> This means that - assuming that I understood the concept of your two
>>> suggested timestamps and that comment above - neither of those two values
>>> will actually accomplish what I need: I *do not want* a timestamp if there
>>> are no messages on the queue, as the queue is then empty, and evidently
>>> consumers are keeping up! Getting a timestamp of the last message to
>>> traverse would undermine the functionality, as I could then not use this
>>> value alone. Just consider a situation where a message traversed and was
>>> consumed properly, but then there was no new message for 2 hours.
>>>
>>> Kind regards,
>>> Endre Stølsvik
>>>
>>>
>>> On Tue, Jan 25, 2022 at 4:42 PM Matt Pavlovich <mattr...@gmail.com> wrote:
>>>
>>>> Endre-
>>>>
>>>> Thank you for the contribution! I think this is a good metric to track on
>>>> the _destination_ itself.. along w/ the lastMessageTimestamp. Other
>>>> products have it and it provides good observability feedback as to the
>>>> traffic progressing over the destinations.
>>>>
>>>> One side benefit of this approach is that you’d have the timestamp values
>>>> to work with even when the queue is empty.
>>>>
>>>> I’ve created JIRA AMQ-8463 <https://issues.apache.org/jira/browse/AMQ-8463>
>>>> to track this and will target it for 5.17.0 and 5.16.5. If you have other
>>>> use cases or scenarios, please add your notes to the JIRA.
>>>>
>>>> Thanks!
>>>> Matt Pavlovich
>>>>
>>>>> On Jan 24, 2022, at 6:13 PM, Endre Stølsvik <en...@stolsvik.com> wrote:
>>>>>
>>>>> Hi!
>>>>>
>>>>> I have a small contribution I was hoping that you would consider.
>>>>>
>>>>> The feature is in the StatisticsBroker(Plugin), where I have added an
>>>>> optional request property, which if present will on the reply message
>>>>> include the "BrokerInTime" for the first message in the destination, as a
>>>>> field "firstMessageTimestamp".
>>>>>
>>>>> How do you want me to do such a suggestion? Should I first open an issue /
>>>>> feature request, and then do a PR for it, or just submit a PR directly?
>>>>>
>>>>> Kind regards,
>>>>> Endre Stølsvik