[ 
https://issues.apache.org/jira/browse/ARTEMIS-2926?focusedWorklogId=499313&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-499313
 ]

ASF GitHub Bot logged work on ARTEMIS-2926:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/Oct/20 10:16
            Start Date: 12/Oct/20 10:16
    Worklog Time Spent: 10m 
      Work Description: gemmellr commented on pull request #3287:
URL: https://github.com/apache/activemq-artemis/pull/3287#issuecomment-707028383


   I think the changes seem ok, but I think perhaps the PR overlooks another 
simpler and more important behaviour that may be leading to the observed issue?
   
   The given period at construction of the scheduled tasks is documented as 
"the delay between the termination of one execution and the start of the next". 
That's unsurprisingly consistent with the behaviour of 
scheduledExecutorService.scheduleWithFixedDelay(), which is what's used by the 
'not onDemand' instances of the scheduled tasks. However, the tasks don't 
actually run in the scheduledExecutorService thread if the additional executor 
is given during construction. If the second executor is given, the scheduled 
task is just offloaded by the scheduledExecutorService for execution on the 
provided executor and entirely forgotten about. That seems like it could be the 
core of the observed issue to me?
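
   To illustrate the hand-off described above, here is a minimal sketch. It is 
not the Artemis code: the class and method names are hypothetical, and it uses 
a one-shot schedule() rather than scheduleWithFixedDelay() so the result is 
deterministic. It only shows that, from the scheduler's point of view, the 
"execution" is finished as soon as the task has been handed to the second 
executor, regardless of whether the task itself has run yet.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of ActiveMQ Artemis.
class HandOffDemo {

    // Returns true if the scheduler-side hand-off finished while the
    // offloaded task was still pending on the second executor.
    static boolean handOffCompletesWhileTaskPending() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        ExecutorService worker = Executors.newSingleThreadExecutor();
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch taskDone = new CountDownLatch(1);
        try {
            // Stand-in for the real task: it cannot finish until we release it.
            Runnable task = () -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                taskDone.countDown();
            };

            // All the scheduler ever runs is the hand-off to the worker.
            ScheduledFuture<?> handOff =
                scheduler.schedule(() -> worker.execute(task), 10, TimeUnit.MILLISECONDS);

            // The scheduler considers its work done here...
            handOff.get(5, TimeUnit.SECONDS);
            // ...even though the task has not completed yet.
            boolean taskStillPending = taskDone.getCount() == 1;

            release.countDown();
            taskDone.await(5, TimeUnit.SECONDS);
            return taskStillPending;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdownNow();
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(handOffCompletesWhileTaskPending());
    }
}
```

   With scheduleWithFixedDelay() the same thing happens on every cycle: the 
fixed delay is measured from the end of the hand-off, not from the end of the 
task on the second executor.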
   
   The above means there is no further tracking by the scheduler of when the 
task actually runs or how long the given task takes, meaning the periodic 
contract is somewhat lost from that point forward. Consider a situation:
   1. Say there is a backlog of existing (related or unrelated) things for the 
executor still to run, so a new 'scheduled offloaded' task may not run for a 
little while until that is processed. Or instead say that thread scheduling 
means the second executor doesn't immediately get to executing the task. 
Whatever the reason, something means there is a small delay, but eventually the 
task does run.
   2. A second task instance comes along from the scheduledExecutorService at 
some point, very closely after the configured period since it isn't affected by 
actual execution of the task, which gets offloaded. Maybe now there isn't any or 
as much backlog on the second executor, or there's a better thread scheduling 
environment, and this second task may get run relatively quicker than the prior 
instance actually did.
   3. Due to the 'lastTime' tracking occurring within the task itself, on the 
second executor, this second task instance, which was offloaded by the 
scheduledExecutorService at its precise period, will now be observed to have 
occurred within the configured period of the previous task's 'lastTime' and so 
gets skipped.
   4. This means nothing happened, and won't until the scheduledExecutorService 
comes along after a 3rd period and offloads the task another time, by which 
point approximately double the expected period has elapsed and the task actually 
executes. Rinse and repeat this process over and over.
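
   The four steps above can be walked through with a small deterministic 
simulation. This is a simplified model and not the Artemis implementation: the 
class name, the simulate() helper and the delay values are all hypothetical, 
and the only thing borrowed from the issue is the idea that the task compares 
the current time against 'lastTime' and skips itself if less than the period 
has elapsed. It assumes the scheduler offloads exactly once per period and 
that each offload then waits some given time on the second executor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the skip scenario, not part of ActiveMQ Artemis.
class SkipSimulation {

    // For each offload i (handed off at i * periodMs), executorDelaysMs[i] is
    // how long it waits on the second executor before starting. Returns, per
    // offload, whether the task body ran (true) or was skipped (false).
    static List<Boolean> simulate(long periodMs, long[] executorDelaysMs) {
        List<Boolean> ran = new ArrayList<>();
        long lastTime = -1; // -1 means the task has never run
        for (int i = 0; i < executorDelaysMs.length; i++) {
            long offloadedAt = i * periodMs;
            long startsAt = offloadedAt + executorDelaysMs[i];
            // Assumed check, made inside the task on the second executor.
            if (lastTime >= 0 && startsAt - lastTime < periodMs) {
                ran.add(false); // too soon after the previous run: skipped
                continue;
            }
            lastTime = startsAt;
            ran.add(true);
        }
        return ran;
    }

    public static void main(String[] args) {
        // First run is delayed by 50 ms on the executor, later ones are not.
        System.out.println(simulate(1000, new long[] {50, 0, 0, 0}));
        // prints [true, false, true, true]
    }
}
```

   In this model the second offload starts 950 ms after the delayed first run, 
so it is skipped, and the next real execution happens 1950 ms after the first 
one, which is roughly double the configured period.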
   
   If the second executor is provided, its actual execution + 'lastTime' period 
checks are essentially happening independently of the scheduling, and it seems 
like the scheduledExecutorService is trying to somewhat blindly throw tasks 
over a wall such that they land at the right time and actually get to run as 
opposed to skipping and waiting for next time.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 499313)
    Time Spent: 20m  (was: 10m)

> Scheduled task executions are skipped randomly
> ----------------------------------------------
>
>                 Key: ARTEMIS-2926
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2926
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.13.0
>            Reporter: Apache Dev
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Scheduled tasks extending {{ActiveMQScheduledComponent}} could randomly skip 
> an execution, logging:
> {code}
> Execution ignored due to too many simultaneous executions, probably a 
> previous delayed execution
> {code}
> The problem is in the "ActiveMQScheduledComponent#runForExecutor" Runnable.
> Times to be compared ({{currentTimeMillis()}} and {{lastTime}}) are taken 
> inside the runnable execution itself. So, depending on relative execution 
> times, it could happen that the difference is less than the given period 
> (e.g. 1 ms), resulting in a skipped execution.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
