[ 
https://issues.apache.org/jira/browse/AMQ-7340?focusedWorklogId=686195&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-686195
 ]

ASF GitHub Bot logged work on AMQ-7340:
---------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Nov/21 03:14
            Start Date: 25/Nov/21 03:14
    Worklog Time Spent: 10m 
      Work Description: lucastetreault opened a new pull request #728:
URL: https://github.com/apache/activemq/pull/728


   We recently encountered a performance degradation similar to the one 
described here: https://issues.apache.org/jira/browse/AMQ-7340 
   
   A specific example involved around 20k messages scheduled in quick 
succession, where each message's delay is 100ms longer than the previous 
message's.
   Example:
   message 1 > delay time 100ms
   message 2 > delay time 200ms
   message 3 > delay time 300ms
   ...
   message 10,000 > delay time 1,000,000ms (1000s)
   ...
   message 20,000 > delay time 2,000,000ms (2000s)
   
   If delivered on schedule, all of these messages should be moved to the 
queue within ~33 minutes, but we observed that it took nearly 3 hours for all 
the messages to be moved to the queue. 
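
   The ~33 minute figure follows directly from the delay pattern in the 
example. A minimal sketch of that arithmetic (the class and method names are 
hypothetical and only illustrate the calculation):

```java
import java.time.Duration;

public class ScheduleMath {

    /** Delay of the n-th message (1-based) when each delay is stepMillis longer than the previous one. */
    static long delayMillis(int n, long stepMillis) {
        return n * stepMillis;
    }

    public static void main(String[] args) {
        // 20,000 messages, 100ms step, as in the example above.
        long lastDelay = delayMillis(20_000, 100);
        Duration d = Duration.ofMillis(lastDelay);
        System.out.println("last message due after " + lastDelay + " ms, about "
                + d.toMinutes() + " min " + (d.getSeconds() % 60) + " s");
    }
}
```

   The last message is due 2,000,000ms after scheduling, which is 33 minutes 
and 20 seconds.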
   
   After digging into the issue, it seems that the main loop that processes 
scheduled messages does so by traversing the B+ Tree index from the root to 
find the leftmost leaf node, which contains the messages with the earliest 
executionTime. This traversal does a disk read at each branch and unmarshals 
the raw data before moving to the next branch, and as far as I can tell the 
result is not cached for future reads. The scheduled jobs in that leaf are 
processed, then we repeat the traversal to find the execution time of the next 
batch of jobs in order to calculate how long the loop should sleep. So on 
every iteration we find the leftmost node twice, at the start and at the end 
of the loop. Each iteration can therefore take a long time, and because we 
process only one node per iteration we fall behind and are unable to catch up. 
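
   To make the cost of this pattern concrete, here is a simplified in-memory 
model of the loop as described. It is not the ActiveMQ code: `LeafIndex`, 
`findLeftmostLeaf`, `runOnce` and `drain` are hypothetical names, the leaves 
are plain lists instead of on-disk pages, and the counter only stands in for 
the disk reads a real root-to-leaf walk would perform.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class PerLeafLoopSketch {

    /** Hypothetical stand-in for the index: leaves in key order, each a sorted list of execution times. */
    static final class LeafIndex {
        final Deque<List<Long>> leaves = new ArrayDeque<>();
        int rootTraversals = 0;

        /** Models the root-to-leftmost-leaf walk; counted because each walk would cost disk reads. */
        List<Long> findLeftmostLeaf() {
            rootTraversals++;
            return leaves.peekFirst();
        }
    }

    /** Builds execution times step, 2*step, ... split into leaves of leafSize entries. */
    static LeafIndex build(int jobs, int leafSize, long stepMillis) {
        LeafIndex index = new LeafIndex();
        List<Long> leaf = new ArrayList<>();
        for (int i = 1; i <= jobs; i++) {
            leaf.add(i * stepMillis);
            if (leaf.size() == leafSize) {
                index.leaves.addLast(leaf);
                leaf = new ArrayList<>();
            }
        }
        if (!leaf.isEmpty()) index.leaves.addLast(leaf);
        return index;
    }

    /** One pass of the loop as described: handles at most one leaf and returns the jobs fired. */
    static int runOnce(LeafIndex index, long now) {
        List<Long> leaf = index.findLeftmostLeaf();   // traversal 1: find the earliest jobs
        int fired = 0;
        if (leaf != null) {
            while (!leaf.isEmpty() && leaf.get(0) <= now) {
                leaf.remove(0);
                fired++;
            }
            if (leaf.isEmpty()) index.leaves.pollFirst();
        }
        index.findLeftmostLeaf();                     // traversal 2: next execution time, for the sleep
        return fired;
    }

    /** Repeats passes until one fires nothing; returns the number of passes that fired jobs. */
    static int drain(LeafIndex index, long now) {
        int passes = 0;
        while (runOnce(index, now) > 0) passes++;
        return passes;
    }

    public static void main(String[] args) {
        LeafIndex index = build(1000, 10, 100);
        int passes = drain(index, Long.MAX_VALUE);
        System.out.println(passes + " passes, " + index.rootTraversals + " root traversals");
    }
}
```

   In this model the number of root traversals grows with the number of 
leaves that are due, not with the number of loop wake-ups, which is the 
behaviour the paragraph above attributes to the current loop.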
   
   This change still does the traversal from the root node to the leftmost 
node at the start of the loop, but once it finds that leftmost node it 
iterates over the leaf nodes sequentially until it finds a job with an 
execution time greater than the current time. This means that it will always 
"catch up" on every iteration of the loop. The index is locked for the 
duration of the iteration, so there is some risk that new scheduled messages 
cannot be added while it runs, but in practice this does not seem to be an 
issue. I ran 100 connections against a broker, scheduling messages as fast as 
they could with 100ms delays as in the example above, and I was able to 
schedule 120k messages within a couple of seconds without any issues. 
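
   A simplified in-memory sketch of the sequential leaf iteration described 
in this paragraph. It is not the code in the pull request: `LeafIndex`, 
`findLeftmostLeaf` and `runOnce` are hypothetical names, the leaves are plain 
lists instead of on-disk pages, and locking of the index is omitted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

public class CatchUpLoopSketch {

    /** Hypothetical stand-in for the index: leaves in key order, each a sorted list of execution times. */
    static final class LeafIndex {
        final Deque<List<Long>> leaves = new ArrayDeque<>();
        int rootTraversals = 0;

        /** Models the root-to-leftmost-leaf walk; counted because each walk would cost disk reads. */
        List<Long> findLeftmostLeaf() {
            rootTraversals++;
            return leaves.peekFirst();
        }
    }

    /** Builds execution times step, 2*step, ... split into leaves of leafSize entries. */
    static LeafIndex build(int jobs, int leafSize, long stepMillis) {
        LeafIndex index = new LeafIndex();
        List<Long> leaf = new ArrayList<>();
        for (int i = 1; i <= jobs; i++) {
            leaf.add(i * stepMillis);
            if (leaf.size() == leafSize) {
                index.leaves.addLast(leaf);
                leaf = new ArrayList<>();
            }
        }
        if (!leaf.isEmpty()) index.leaves.addLast(leaf);
        return index;
    }

    /** One pass: a single root traversal, then leaf-by-leaf until a job is not yet due. */
    static int runOnce(LeafIndex index, long now) {
        index.findLeftmostLeaf();                     // the only root traversal in this pass
        int fired = 0;
        Iterator<List<Long>> it = index.leaves.iterator();
        while (it.hasNext()) {
            List<Long> leaf = it.next();              // step to the next leaf in key order
            while (!leaf.isEmpty() && leaf.get(0) <= now) {
                leaf.remove(0);
                fired++;
            }
            if (!leaf.isEmpty()) break;               // first job with execution time > now: stop
            it.remove();                              // leaf fully processed
        }
        return fired;
    }

    public static void main(String[] args) {
        LeafIndex index = build(1000, 10, 100);
        int fired = runOnce(index, 50_000);
        System.out.println(fired + " jobs fired, " + index.rootTraversals
                + " root traversal(s), " + index.leaves.size() + " leaves left");
    }
}
```

   In this model a single pass fires every job that is already due, however 
many leaves that spans, so the loop cannot fall further behind between 
wake-ups. The trade-off noted above still applies: the longer the pass, the 
longer the index would stay locked.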


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@activemq.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 686195)
    Time Spent: 20m  (was: 10m)

> Scheduled messages performance degrade
> --------------------------------------
>
>                 Key: AMQ-7340
>                 URL: https://issues.apache.org/jira/browse/AMQ-7340
>             Project: ActiveMQ
>          Issue Type: Bug
>         Environment: ActiveMQ broker has been started in a docker container, 
> with (most likely) sufficient allocation of resources.
>            Reporter: Daynews
>            Assignee: Matt Pavlovich
>            Priority: Minor
>         Attachments: ScheduleActiveMQ.zip
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> I have sent a lot of scheduled messages with a 10ms delay between each to 
> see if the broker can cope with a high load of scheduled messages. Sending 
> delayed messages to the queue works fine; however, I get a problem when those 
> messages need to be put on the main queue once their scheduled time is 
> reached. The rate of putting scheduled messages on the main queue drops 
> drastically at around 1500-3000 messages. I tried to search for a potential 
> cause of this, but was not able to identify anything. Even after restarting 
> the broker or cleaning the main queue, the rate of putting scheduled messages 
> stays at ~0.5s, leaving many scheduled messages behind. 
> Does anyone know a potential cause for this problem? Is this a performance 
> bottleneck, insufficient resources, or a badly configured RabbitMQ (I've used 
> default settings)?
> Thanks for the support.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
