[ 
https://issues.apache.org/jira/browse/HDDS-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sarveksha Yeshavantha Raju updated HDDS-13694:
----------------------------------------------
    Description: 
If {{stopBalancer}} is called when balancer is already in the STOPPED state, it 
throws an exception instead of returning successfully.

Proposed change:

1. Split the current {{validateState(boolean expectedRunning)}} into two

methods:
 * {{validateEligibility()}} – checks leader-ready and safe-mode only.

 * {{validateState(expectedRunning)}} – delegates to {{validateEligibility()}} 
and then performs the running / stopped assertions.

2. Change {{stopBalancer()}} to call {{validateEligibility()}} instead of 
{{validateState(true)}}, persist {{shouldRun = false}} before looking at 
{{taskStatus}}, and then interrupt a running task if present.

Roughly how the changes look like in code:
{code:java}
private void validateEligibility() throws 
IllegalContainerBalancerStateException {
  if (!scmContext.isLeaderReady()) {
    LOG.warn("SCM is not leader ready");
    throw new IllegalContainerBalancerStateException("SCM is not leader " +
          "ready");
  }
  if (scmContext.isInSafeMode()) {
    LOG.warn("SCM is in safe mode");
    throw new IllegalContainerBalancerStateException("SCM is in safe mode");
  }
}

private void validateState(boolean expectedRunning) throws 
IllegalContainerBalancerStateException {
  validateEligibility();
  if (!expectedRunning && !canBalancerStart()) {
    ...
  }
  if (expectedRunning && !canBalancerStop()) {
    ...
  }
}

public void stopBalancer()
    throws IOException, IllegalContainerBalancerStateException {

  Thread balancingThread = null;
  lock.lock();
  try {
    validateEligibility();               // only leadership / safemode

    saveConfiguration(config, false, 0);
    
    if (isBalancerRunning()) {
      LOG.info("Trying to stop ContainerBalancer service.");
      task.stop();
      balancingThread = currentBalancingThread;
    }
  } finally {
    lock.unlock();
  }

  if (balancingThread != null) {
    blockTillTaskStop(balancingThread);
  }
}
{code}
The main change here is the {{saveConfiguration()}} call being made regardless 
of whether the in-memory state of balancer is running or stopped. It persists 
the fact that balancer should not run to rocksDB and replicates it via Ratis to 
the other SCMs. Then, if balancer is not running, we do nothing else. This 
makes the API idempotent but also fixes a race condition:

1. When a leader steps down it stops the balancer thread locally but does not 
flip the persisted flag ({{{}shouldRun{}}} stays true).

Result: {{task != null}}, {{taskStatus == STOPPED}}.

2. When that SCM later regains leadership its {{notifyStatusChanged()}} thread 
reads {{shouldRun = true}} and – because {{taskStatus == STOPPED}} – starts a 
new balancer thread.

3. If, in the same time-window, an administrator issues {{stopBalancer}} from 
the CLI, that method
 * acquires the same lock first,
 * calls {{validateState(true)}} which expects the balancer to be RUNNING,
 * finds it STOPPED and throws an exception before persisting {{shouldRun = 
false}}.

The command silently fails and the balancer continues to run, when it should 
have actually stopped. The changes proposed above fix this race condition as 
well.

  was:
The CLI attempted to stop the Container Balancer, but it was already in the 
STOPPED state, causing an exception instead of a success message.

Proposed change:

Split the current validateState(boolean expectedRunning) into two

methods:

validateEligibility() – checks leader-ready and safe-mode only.

validateState(expectedRunning) – delegates to validateEligibility() and then 
performs the running / stopped assertions.

Change stopBalancer() to call validateEligibility() instead of 
validateState(true), persist shouldRun = false before looking at taskStatus, 
and then interrupt a running task if present.

Roughly how the changes look like in code:



private void validateEligibility() throws 
IllegalContainerBalancerStateException {
  if (!scmContext.isLeaderReady()) {
    LOG.warn("SCM is not leader ready");
    throw new IllegalContainerBalancerStateException("SCM is not leader " +
          "ready");
  }
  if (scmContext.isInSafeMode()) {
    LOG.warn("SCM is in safe mode");
    throw new IllegalContainerBalancerStateException("SCM is in safe mode");
  }
}

private void validateState(boolean expectedRunning) throws 
IllegalContainerBalancerStateException {
  validateEligibility();
  if (!expectedRunning && !canBalancerStart()) {
    ...
  }
  if (expectedRunning && !canBalancerStop()) {
    ...
  }
}

public void stopBalancer()
    throws IOException, IllegalContainerBalancerStateException {

  Thread balancingThread = null;
  lock.lock();
  try {
    validateEligibility();               // only leadership / safemode

    saveConfiguration(config, false, 0);
    
    if (isBalancerRunning()) {
      LOG.info("Trying to stop ContainerBalancer service.");
      task.stop();
      balancingThread = currentBalancingThread;
    }
  } finally {
    lock.unlock();
  }

  if (balancingThread != null) {
    blockTillTaskStop(balancingThread);
  }
}
(@Sarveksha Yeshavantha Raju this is rough code, please do due diligence)

The main change here is the saveConfiguration() call being made regardless of 
whether the in-memory state of balancer is running or stopped. It persists the 
fact that balancer should not run to rocksDB and replicates it via Ratis to the 
other SCMs. Then, if balancer is not running, we do nothing else. This makes 
the API idempotent but also fixes a race condition:

When a leader steps down it stops the balancer thread locally but does not flip 
the persisted flag (shouldRun stays true).

Result: task != null, taskStatus == STOPPED.

When that SCM later regains leadership its notifyStatusChanged() thread reads 
shouldRun = true and – because taskStatus == STOPPED – starts a new balancer 
thread.

If, in the same time-window, an administrator issues stopBalancer from the CLI, 
that method

acquires the same lock first,

calls validateState(true) which expects the balancer to be RUNNING,

finds it STOPPED and throws an exception before persisting shouldRun = false.

The command silently fails and the balancer continues to run, when it should 
have actually stopped. The changes proposed above fix this race condition as 
well.


> Container Balancer Stop Command Fails with Error as Already Stopped
> -------------------------------------------------------------------
>
>                 Key: HDDS-13694
>                 URL: https://issues.apache.org/jira/browse/HDDS-13694
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Sarveksha Yeshavantha Raju
>            Assignee: Sarveksha Yeshavantha Raju
>            Priority: Major
>              Labels: pull-request-available
>
> If {{stopBalancer}} is called when balancer is already in the STOPPED state, 
> it throws an exception instead of returning successfully.
> Proposed change:
> 1. Split the current {{validateState(boolean expectedRunning)}} into two
> methods:
>  * {{validateEligibility()}} – checks leader-ready and safe-mode only.
>  * {{validateState(expectedRunning)}} – delegates to 
> {{validateEligibility()}} and then performs the running / stopped assertions.
> 2. Change {{stopBalancer()}} to call {{validateEligibility()}} instead of 
> {{validateState(true)}}, persist {{shouldRun = false}} before looking at 
> {{taskStatus}}, and then interrupt a running task if present.
> Roughly how the changes look like in code:
> {code:java}
> private void validateEligibility() throws 
> IllegalContainerBalancerStateException {
>   if (!scmContext.isLeaderReady()) {
>     LOG.warn("SCM is not leader ready");
>     throw new IllegalContainerBalancerStateException("SCM is not leader " +
>           "ready");
>   }
>   if (scmContext.isInSafeMode()) {
>     LOG.warn("SCM is in safe mode");
>     throw new IllegalContainerBalancerStateException("SCM is in safe mode");
>   }
> }
> private void validateState(boolean expectedRunning) throws 
> IllegalContainerBalancerStateException {
>   validateEligibility();
>   if (!expectedRunning && !canBalancerStart()) {
>     ...
>   }
>   if (expectedRunning && !canBalancerStop()) {
>     ...
>   }
> }
> public void stopBalancer()
>     throws IOException, IllegalContainerBalancerStateException {
>   Thread balancingThread = null;
>   lock.lock();
>   try {
>     validateEligibility();               // only leadership / safemode
>     saveConfiguration(config, false, 0);
>     
>     if (isBalancerRunning()) {
>       LOG.info("Trying to stop ContainerBalancer service.");
>       task.stop();
>       balancingThread = currentBalancingThread;
>     }
>   } finally {
>     lock.unlock();
>   }
>   if (balancingThread != null) {
>     blockTillTaskStop(balancingThread);
>   }
> }
> {code}
> The main change here is the {{saveConfiguration()}} call being made 
> regardless of whether the in-memory state of balancer is running or stopped. 
> It persists the fact that balancer should not run to rocksDB and replicates 
> it via Ratis to the other SCMs. Then, if balancer is not running, we do 
> nothing else. This makes the API idempotent but also fixes a race condition:
> 1. When a leader steps down it stops the balancer thread locally but does not 
> flip the persisted flag ({{{}shouldRun{}}} stays true).
> Result: {{task != null}}, {{taskStatus == STOPPED}}.
> 2. When that SCM later regains leadership its {{notifyStatusChanged()}} 
> thread reads {{shouldRun = true}} and – because {{taskStatus == STOPPED}} – 
> starts a new balancer thread.
> 3. If, in the same time-window, an administrator issues {{stopBalancer}} from 
> the CLI, that method
>  * acquires the same lock first,
>  * calls {{validateState(true)}} which expects the balancer to be RUNNING,
>  * finds it STOPPED and throws an exception before persisting {{shouldRun = 
> false}}.
> The command silently fails and the balancer continues to run, when it should 
> have actually stopped. The changes proposed above fix this race condition as 
> well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to