[ 
https://issues.apache.org/jira/browse/ARTEMIS-2926?focusedWorklogId=505213&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505213
 ]

ASF GitHub Bot logged work on ARTEMIS-2926:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Oct/20 13:17
            Start Date: 27/Oct/20 13:17
    Worklog Time Spent: 10m 
      Work Description: gemmellr commented on a change in pull request #3287:
URL: https://github.com/apache/activemq-artemis/pull/3287#discussion_r512555293



##########
File path: 
artemis-commons/src/main/java/org/apache/activemq/artemis/core/server/ActiveMQScheduledComponent.java
##########
@@ -188,41 +202,53 @@ public ClassLoader run() {
 
    }
 
+   /**
+    * If {@code overBooking} is {@code true} a delay request can succeed only 
if:
+    * <ul>
+    *    <li>there is no other pending delay request
+    *    <li>there is no pending execution request
+    * </ul>
+    * <p>
+    * If {@code overBooking} is {@code false} a delay request must always 
succeed.<br>
+    * When a delay request succeed it schedule a new execution to happen in 
{@link #getPeriod()}.<br>
+    */
    public void delay() {
-      int value = delayed.incrementAndGet();
-      if (value > 10) {
-         delayed.decrementAndGet();
-      } else {
-         // We only schedule up to 10 periods upfront.
-         // this is to avoid a window where a current one would be running and 
a next one is coming.
-         // in theory just 2 would be enough. I'm using 10 as a precaution 
here.
-         scheduledExecutorService.schedule(runForScheduler, Math.min(period, 
period * value), timeUnit);
+      final AtomicBoolean bookedForRunning = this.bookedForRunning;
+      if (bookedForRunning == null || bookedForRunning.compareAndSet(false, 
true)) {
+         try {
+            scheduledExecutorService.schedule(() -> 
delayedRunForScheduler(bookedForRunning), nsPeriod, TimeUnit.NANOSECONDS);
+         } catch (RejectedExecutionException e) {
+            if (bookedForRunning == null) {
+               bookedForRunning.set(false);
+            }
+            throw e;
+         }
       }
    }
 
    public long getPeriod() {
-      return period;
+      return timeUnit.convert(nsPeriod, TimeUnit.NANOSECONDS);

Review comment:
       May not matter, but did you check if this behaviour change has any 
impact?
   (The units were not taken into account originally, whatever checkPeriod 
value was passed in at construction was passed back out unchanged without 
considering the timeUnit...while now it will be converted from nanos into the 
given timeUnit)

##########
File path: 
artemis-commons/src/main/java/org/apache/activemq/artemis/core/server/ActiveMQScheduledComponent.java
##########
@@ -188,41 +202,53 @@ public ClassLoader run() {
 
    }
 
+   /**
+    * If {@code overBooking} is {@code true} a delay request can succeed only 
if:
+    * <ul>
+    *    <li>there is no other pending delay request
+    *    <li>there is no pending execution request
+    * </ul>
+    * <p>
+    * If {@code overBooking} is {@code false} a delay request must always 
succeed.<br>
+    * When a delay request succeed it schedule a new execution to happen in 
{@link #getPeriod()}.<br>
+    */
    public void delay() {
-      int value = delayed.incrementAndGet();
-      if (value > 10) {
-         delayed.decrementAndGet();
-      } else {
-         // We only schedule up to 10 periods upfront.
-         // this is to avoid a window where a current one would be running and 
a next one is coming.
-         // in theory just 2 would be enough. I'm using 10 as a precaution 
here.
-         scheduledExecutorService.schedule(runForScheduler, Math.min(period, 
period * value), timeUnit);
+      final AtomicBoolean bookedForRunning = this.bookedForRunning;
+      if (bookedForRunning == null || bookedForRunning.compareAndSet(false, 
true)) {
+         try {
+            scheduledExecutorService.schedule(() -> 
delayedRunForScheduler(bookedForRunning), nsPeriod, TimeUnit.NANOSECONDS);
+         } catch (RejectedExecutionException e) {
+            if (bookedForRunning == null) {
+               bookedForRunning.set(false);
+            }
+            throw e;
+         }
       }
    }
 
    public long getPeriod() {
-      return period;
+      return timeUnit.convert(nsPeriod, TimeUnit.NANOSECONDS);
    }
 
    public synchronized ActiveMQScheduledComponent setPeriod(long period) {
-      this.period = period;
+      this.nsPeriod = TimeUnit.NANOSECONDS.convert(period, timeUnit);
       restartIfNeeded();
       return this;
    }
 
    public synchronized ActiveMQScheduledComponent setPeriod(long period, 
TimeUnit unit) {
-      this.period = period;
       this.timeUnit = unit;
+      this.nsPeriod = TimeUnit.NANOSECONDS.convert(period, unit);
       restartIfNeeded();
       return this;
    }
 
    public long getInitialDelay() {
-      return initialDelay;
+      return timeUnit.convert(initialNsDelay, TimeUnit.NANOSECONDS);

Review comment:
       Same question about unit conversion as earlier.

##########
File path: 
artemis-commons/src/main/java/org/apache/activemq/artemis/core/server/ActiveMQScheduledComponent.java
##########
@@ -287,35 +321,63 @@ private void restartIfNeeded() {
       }
    }
 
-   final Runnable runForExecutor = new Runnable() {
-      @Override
-      public void run() {
-         if (onDemand && delayed.get() > 0) {
-            delayed.decrementAndGet();
+   private void runForExecutor(AtomicBoolean bookedForRunning) {
+      if (bookedForRunning != null) {
+         if (!bookedForRunning.compareAndSet(true, false)) {
+            // this can happen only if there is a racing stop
+            return;
          }
+      }
+      ActiveMQScheduledComponent.this.run();
+   }
 
-         if (!onDemand && lastTime > 0) {
-            if (System.currentTimeMillis() - lastTime < millisecondsPeriod) {
-               logger.trace("Execution ignored due to too many simultaneous 
executions, probably a previous delayed execution");
+   private void delayedRunForScheduler(AtomicBoolean bookedForRunning) {
+      if (executor != null) {
+         if (bookedForRunning != null) {
+            if (!bookedForRunning.get()) {
+               // this can happen only if there is a racing stop
                return;
             }
          }
-
-         lastTime = System.currentTimeMillis();
-
-         ActiveMQScheduledComponent.this.run();
+         try {
+            executor.execute(() -> runForExecutor(bookedForRunning));
+         } catch (RejectedExecutionException e) {
+            if (bookedForRunning != null) {
+               bookedForRunning.set(false);
+            }
+            throw e;
+         }
+      } else {
+         runForExecutor(bookedForRunning);
       }
-   };
-
-   final Runnable runForScheduler = new Runnable() {
-      @Override
-      public void run() {
-         if (executor != null) {
-            executor.execute(runForExecutor);
-         } else {
-            runForExecutor.run();
+   }
+
+   private void runForScheduler(AtomicBoolean bookedForRunning) {
+      if (executor != null) {
+         if (bookedForRunning != null && 
!bookedForRunning.compareAndSet(false, true)) {

Review comment:
       Both of the 'if executor != null' legs do this check first, I think it 
would be simpler to move it outside the if and do it once atthe start. The 
descriptions dont differ enough to justify both IMO, so its more rather than 
less complicated/confusing having both.

##########
File path: 
artemis-commons/src/main/java/org/apache/activemq/artemis/core/server/ActiveMQScheduledComponent.java
##########
@@ -188,41 +202,53 @@ public ClassLoader run() {
 
    }
 
+   /**
+    * If {@code overBooking} is {@code true} a delay request can succeed only 
if:
+    * <ul>
+    *    <li>there is no other pending delay request
+    *    <li>there is no pending execution request
+    * </ul>
+    * <p>
+    * If {@code overBooking} is {@code false} a delay request must always 
succeed.<br>
+    * When a delay request succeed it schedule a new execution to happen in 
{@link #getPeriod()}.<br>
+    */
    public void delay() {
-      int value = delayed.incrementAndGet();
-      if (value > 10) {
-         delayed.decrementAndGet();
-      } else {
-         // We only schedule up to 10 periods upfront.
-         // this is to avoid a window where a current one would be running and 
a next one is coming.
-         // in theory just 2 would be enough. I'm using 10 as a precaution 
here.
-         scheduledExecutorService.schedule(runForScheduler, Math.min(period, 
period * value), timeUnit);
+      final AtomicBoolean bookedForRunning = this.bookedForRunning;

Review comment:
       Might be simpler to follow without the variable shadowing (here and in 
the called methods).

##########
File path: 
artemis-commons/src/main/java/org/apache/activemq/artemis/core/server/ActiveMQScheduledComponent.java
##########
@@ -60,41 +58,58 @@
     * @param checkPeriod              the delay between the termination of one 
execution and the start of the next
     * @param timeUnit                 the time unit of the {@code 
initialDelay} and {@code checkPeriod} parameters
     * @param onDemand                 if {@code true} the task won't be 
scheduled on {@link #start()}, {@code false} otherwise
+    * @param overBooking              if {@code true} {@link #delay()} 
executions and period ones are allowed to accumulate, {@code false} otherwise. 
Default is {@code false}.
     */
    public ActiveMQScheduledComponent(ScheduledExecutorService 
scheduledExecutorService,
                                      Executor executor,
                                      long initialDelay,
                                      long checkPeriod,
                                      TimeUnit timeUnit,
-                                     boolean onDemand) {
+                                     boolean onDemand,
+                                     boolean overBooking) {

Review comment:
       If this isnt used I would just remove it. It makes the actually-used 
code more complicated to understand than it needs to be. Ignoring things that 
take longer than 'period' to run, the previous/existing stuff couldnt 
overbook-without-end (was capped at ~10 outstanding 'on demands'), while this 
seemingly could, so its in fact worse in that regard.
   
   I would remove the overbooking, eliminate the possibility of null for the 
atomic, and simplify the rest of the code accordingly. It will still be plenty 
complicated :)

##########
File path: 
artemis-commons/src/main/java/org/apache/activemq/artemis/core/server/ActiveMQScheduledComponent.java
##########
@@ -188,41 +202,53 @@ public ClassLoader run() {
 
    }
 
+   /**
+    * If {@code overBooking} is {@code true} a delay request can succeed only 
if:
+    * <ul>
+    *    <li>there is no other pending delay request
+    *    <li>there is no pending execution request
+    * </ul>
+    * <p>
+    * If {@code overBooking} is {@code false} a delay request must always 
succeed.<br>
+    * When a delay request succeed it schedule a new execution to happen in 
{@link #getPeriod()}.<br>
+    */
    public void delay() {
-      int value = delayed.incrementAndGet();
-      if (value > 10) {
-         delayed.decrementAndGet();
-      } else {
-         // We only schedule up to 10 periods upfront.
-         // this is to avoid a window where a current one would be running and 
a next one is coming.
-         // in theory just 2 would be enough. I'm using 10 as a precaution 
here.
-         scheduledExecutorService.schedule(runForScheduler, Math.min(period, 
period * value), timeUnit);
+      final AtomicBoolean bookedForRunning = this.bookedForRunning;
+      if (bookedForRunning == null || bookedForRunning.compareAndSet(false, 
true)) {
+         try {
+            scheduledExecutorService.schedule(() -> 
delayedRunForScheduler(bookedForRunning), nsPeriod, TimeUnit.NANOSECONDS);
+         } catch (RejectedExecutionException e) {
+            if (bookedForRunning == null) {

Review comment:
       Guessing that was meant to be != null? EDIT: as later commenting, think 
the null check should be eliminated through simplification.

##########
File path: 
artemis-commons/src/main/java/org/apache/activemq/artemis/core/server/ActiveMQScheduledComponent.java
##########
@@ -265,6 +291,14 @@ public void stop() {
       if (future != null) {
          future.cancel(false);
          future = null;
+         final AtomicBoolean bookedForRunning = this.bookedForRunning;
+         if (bookedForRunning != null) {
+            // this is trying to fast-fail any running task or pending one:
+            bookedForRunning.compareAndSet(false, true);

Review comment:
       This goes at odds with the prior behaviour, where no attempt was made to 
stop the tasks already handed to the executor or the final scheduling already 
being in the process of either handing off another to the executor or running 
it directly on the scheduledExecutorService. The latter is still true to an 
extent, the future.cancel(false) remains above, but this new bit will ensure it 
doesnt actually do anything unless it already passed the gate and starts the 
'ActiveMQScheduledComponent.this.run();'.
   
   Have you checked if there is there any impact of this behaviour change? If 
not, might it be worth omitting this bit, and just leaving the reassignment 
below for now?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 505213)
    Time Spent: 1h  (was: 50m)

> Scheduled task executions are skipped randomly
> ----------------------------------------------
>
>                 Key: ARTEMIS-2926
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-2926
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.13.0
>            Reporter: Apache Dev
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Scheduled tasks extending {{ActiveMQScheduledComponent}} could randomly skip 
> an execution, logging:
> {code}
> Execution ignored due to too many simultaneous executions, probably a 
> previous delayed execution
> {code}
> The problem is in the "ActiveMQScheduledComponent#runForExecutor" Runnable.
> Times to be compared ({{currentTimeMillis()}} and {{lastTime}}) are taken 
> inside the runnable execution itself. So, depending on relative execution 
> times, it could happen that the difference is less than the given period 
> (e.g. 1 ms), resulting in a skipped execution.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to