[ 
https://issues.apache.org/jira/browse/HBASE-13146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345164#comment-14345164
 ] 

Ted Yu commented on HBASE-13146:
--------------------------------

lgtm

> Race Condition in ScheduledChore and ChoreService
> -------------------------------------------------
>
>                 Key: HBASE-13146
>                 URL: https://issues.apache.org/jira/browse/HBASE-13146
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 2.0.0, 1.1.0
>            Reporter: zhangduo
>            Assignee: zhangduo
>             Fix For: 2.0.0, 1.1.0
>
>         Attachments: HBASE-13146.patch
>
>
> Here are my findings from addressing HBASE-13145.
> {code:title=ChoreService.java}
>   public synchronized boolean scheduleChore(ScheduledChore chore) {
>       ...
>       ScheduledFuture<?> future =
>           scheduler.scheduleAtFixedRate(chore, chore.getInitialDelay(), chore.getPeriod(),
>             chore.getTimeUnit());
>       chore.setChoreServicer(this);
>       ...
>   }
> {code}
> So we schedule the chore first and only then set its chore servicer. For 
> CompactionChecker the initialDelay is 0, so the chore may run before its 
> chore servicer has been set. Now look at this:
> {code:title=ScheduledChore.java}
>   public void run() {
>     ...
>     else if (stopper.isStopped() || !isScheduled()) {
>       cancel(false);
>       cleanup();
>       if (LOG.isInfoEnabled()) LOG.info("Chore: " + getName() + " was stopped");
>     }
>     ...
>   }
>     ...
>   public synchronized boolean isScheduled() {
>     return choreServicer != null && choreServicer.isChoreScheduled(this);
>   }
> {code}
> So isScheduled() may return false, and we then start to cancel the chore. If 
> you insert a sleep between scheduling the chore and setting its chore 
> servicer, you will always get the log 'Chore: CompactionChecker was stopped'. 
> But the chore is not always actually cancelled, because of how the cancel 
> method is implemented.
> {code:title=ScheduledChore.java}
>   public synchronized void cancel(boolean mayInterruptIfRunning) {
>     if (isScheduled()) choreServicer.cancelChore(this, mayInterruptIfRunning);
>     choreServicer = null;
>   }
> {code}
> So if you insert a sleep before cancel (remember to set a larger sleep time 
> here), then you can make the test fail every time.
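
The ordering problem in the quoted report (schedule first, set the chore servicer second, with an initial delay of 0) can be reproduced outside HBase. The sketch below is only an illustration: `ChoreRaceSketch`, `MiniChore`, and `firstRunSawServicer` are hypothetical names, not HBase classes, and the latch wait stands in for the sleep the report suggests inserting. Setting the servicer before scheduling is shown as one possible reordering, not necessarily what the attached patch does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class ChoreRaceSketch {

  /** Hypothetical stand-in for ScheduledChore: records what its first run observed. */
  static class MiniChore implements Runnable {
    private Object servicer;                 // stand-in for choreServicer
    private boolean firstRunSawServicer;
    private final CountDownLatch firstRunDone = new CountDownLatch(1);

    synchronized void setServicer(Object s) { servicer = s; }

    @Override
    public synchronized void run() {
      if (firstRunDone.getCount() > 0) {     // only the first run is recorded
        firstRunSawServicer = (servicer != null);
        firstRunDone.countDown();
      }
    }

    /** Blocks until the first run has finished, then reports what it saw. */
    boolean awaitFirstRun() {
      try {
        firstRunDone.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
      synchronized (this) { return firstRunSawServicer; }
    }
  }

  /** Returns whether the chore's first run saw a non-null servicer. */
  static boolean firstRunSawServicer(boolean setServicerFirst) {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    MiniChore chore = new MiniChore();
    try {
      if (setServicerFirst) {
        // Reordered: the servicer is visible before the chore can possibly run.
        chore.setServicer(scheduler);
        scheduler.scheduleAtFixedRate(chore, 0, 10, TimeUnit.MILLISECONDS);
      } else {
        // Order from the report: schedule with initial delay 0, set servicer afterwards.
        scheduler.scheduleAtFixedRate(chore, 0, 10, TimeUnit.MILLISECONDS);
        chore.awaitFirstRun();               // plays the role of the inserted sleep
        chore.setServicer(scheduler);
      }
      return chore.awaitFirstRun();
    } finally {
      scheduler.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("servicer set after scheduling:  " + firstRunSawServicer(false));
    System.out.println("servicer set before scheduling: " + firstRunSawServicer(true));
  }
}
```

Waiting on the latch instead of sleeping makes the bad interleaving deterministic: in the report's order, the first run always completes before the servicer is set.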

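Why the spurious cancel only sometimes takes effect can be seen by replaying the two interleavings on a single thread. This is a simplified model under stated assumptions: `MiniService` and `MiniChore` are hypothetical stand-ins that copy only the quoted `isScheduled()` and `cancel()` logic, and `register` collapses the service-side bookkeeping into one step.

```java
import java.util.HashSet;
import java.util.Set;

class CancelInterleavingSketch {

  /** Hypothetical stand-in for ChoreService: only tracks which chores are scheduled. */
  static class MiniService {
    private final Set<MiniChore> scheduled = new HashSet<>();

    void register(MiniChore c) { scheduled.add(c); c.servicer = this; }
    boolean isChoreScheduled(MiniChore c) { return scheduled.contains(c); }
    void cancelChore(MiniChore c) { scheduled.remove(c); }
  }

  /** Hypothetical stand-in for ScheduledChore, mirroring the quoted methods. */
  static class MiniChore {
    MiniService servicer;

    boolean isScheduled() {
      return servicer != null && servicer.isChoreScheduled(this);
    }

    void cancel() {
      if (isScheduled()) servicer.cancelChore(this);
      servicer = null;
    }
  }

  /** Replays one interleaving and reports whether the chore is still scheduled. */
  static boolean choreSurvives(boolean servicerSetBeforeCancel) {
    MiniService service = new MiniService();
    MiniChore chore = new MiniChore();

    // The first run() happens before the servicer is set, so it decides to cancel.
    boolean runDecidedToCancel = !chore.isScheduled();

    if (servicerSetBeforeCancel) {
      // A delay before cancel(): the servicer arrives first, so cancel() is real.
      service.register(chore);
      if (runDecidedToCancel) chore.cancel();
    } else {
      // cancel() runs while the servicer is still null: it cancels nothing.
      if (runDecidedToCancel) chore.cancel();
      service.register(chore);
    }
    return service.isChoreScheduled(chore);
  }

  public static void main(String[] args) {
    System.out.println("cancel() before servicer is set, survives: " + choreSurvives(false));
    System.out.println("cancel() after servicer is set, survives:  " + choreSurvives(true));
  }
}
```

In this model the chore survives only when `cancel()` wins the race against the servicer being set, which matches the report's observation that a longer sleep before `cancel()` makes the failure reproducible.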


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
