[
https://issues.apache.org/jira/browse/HDDS-15138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aryan Gupta updated HDDS-15138:
-------------------------------
Description:
Today SCM safemode pipeline exit checks are effectively hardcoded to
{{RATIS/THREE}}, which is incorrect when the cluster's default replication is
EC. In EC-default deployments, safemode should validate pipelines for the
configured default replication instead of only {{RATIS/THREE}}.
This patch generalizes safemode pipeline validation to use
{{ReplicationConfig.getDefault(conf)}}:
* {{HealthyPipelineSafeModeRule}} now evaluates pipelines matching the
configured default replication config and uses required node count from that
config.
* {{OneReplicaPipelineSafeModeRule}} now tracks, and validates from reports,
the pipelines matching the configured default replication config.
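A minimal sketch of the matching idea described in the two bullets above, not the actual patch. The {{Replication}} and {{Pipeline}} records and the {{healthyMatchingDefault}} helper are hypothetical stand-ins for the real SCM classes, and treating "healthy" as "open with at least the required node count" is an assumption made only for illustration:

```java
import java.util.List;

final class SafeModePipelineSketch {

  // Hypothetical stand-in for a replication config: type, scheme and the
  // number of datanodes a pipeline of this kind needs.
  record Replication(String type, String scheme, int requiredNodes) {}

  // Hypothetical stand-in for a pipeline as seen by a safemode rule.
  record Pipeline(String id, Replication replication, int healthyNodes, boolean open) {}

  // Counts pipelines that match the configured default replication and have
  // at least the node count that this default requires (assumed health check).
  static long healthyMatchingDefault(List<Pipeline> pipelines, Replication defaultReplication) {
    return pipelines.stream()
        .filter(p -> p.replication().equals(defaultReplication))
        .filter(Pipeline::open)
        .filter(p -> p.healthyNodes() >= defaultReplication.requiredNodes())
        .count();
  }

  public static void main(String[] args) {
    Replication ratisThree = new Replication("RATIS", "THREE", 3);
    Replication ecDefault = new Replication("EC", "RS-3-2-1024k", 5);
    List<Pipeline> pipelines = List.of(
        new Pipeline("p1", ratisThree, 3, true),
        new Pipeline("p2", ecDefault, 5, true),
        new Pipeline("p3", ecDefault, 4, true));

    // With an EC default only EC pipelines are considered; p3 is one node short.
    System.out.println(healthyMatchingDefault(pipelines, ecDefault));
    // With a RATIS default only the RATIS/THREE pipeline is considered.
    System.out.println(healthyMatchingDefault(pipelines, ratisThree));
  }
}
```

The point of the sketch is that both the filter and the node threshold come from the same default config object, so neither has to mention {{RATIS/THREE}} explicitly.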
To keep behavior consistent, {{BackgroundPipelineCreator}} is updated to
include EC pipeline creation when the default replication type is EC, while
preserving the existing RATIS behavior when EC is not configured.
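The selection logic described above could look roughly like the following. This is a sketch under stated assumptions: {{replicationsToCreate}} is a hypothetical helper, and the existing RATIS behavior is simplified here to a single {{RATIS/THREE}} target, which may not reflect everything the real creator does:

```java
import java.util.ArrayList;
import java.util.List;

final class BackgroundCreationSketch {

  // Returns the replication targets a background creator would try to build
  // pipelines for, given the configured default type and replication value.
  static List<String> replicationsToCreate(String defaultType, String defaultReplication) {
    List<String> targets = new ArrayList<>();
    // Assumed pre-existing behavior, kept regardless of the default type.
    targets.add("RATIS/THREE");
    // New behavior from the description: add EC only when EC is the default.
    if ("EC".equals(defaultType)) {
      targets.add("EC/" + defaultReplication);
    }
    return targets;
  }

  public static void main(String[] args) {
    System.out.println(replicationsToCreate("RATIS", "THREE"));
    System.out.println(replicationsToCreate("EC", "RS-3-2-1024k"));
  }
}
```

Because the EC target is only appended, a cluster that never sets an EC default sees the same target list as before.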
h3. Expected outcome
* RATIS default: behavior remains unchanged.
* EC default: safemode pipeline checks validate EC pipelines, and background
creation can create EC pipelines accordingly.
h3. Validation
Added/updated SCM tests for EC-default and RATIS-default paths. Focused suite
passes:
* {{TestHealthyPipelineSafeModeRule}}
* {{TestOneReplicaPipelineSafeModeRule}}
* {{TestSCMSafeModeManager}}
* {{TestBackgroundPipelineCreator}}
h3. Note
For EC-default setups in SCM safemode paths, ensure both properties are set:
* {{ozone.replication.type=EC}}
* {{ozone.replication=RS-3-2-1024k}}
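To illustrate what the {{RS-3-2-1024k}} value implies for the required node count, here is a small sketch that parses a codec-data-parity-chunk string. {{EcSpec}} and {{parse}} are hypothetical names for illustration, not Ozone's own parser; the one fact relied on is that an EC pipeline needs data plus parity nodes:

```java
final class EcSpecSketch {

  // Hypothetical holder for the parts of an EC replication string.
  record EcSpec(String codec, int data, int parity, int chunkKb) {
    // An EC pipeline needs one node per data block and one per parity block.
    int requiredNodes() {
      return data + parity;
    }
  }

  // Parses strings of the form CODEC-DATA-PARITY-CHUNKk, e.g. RS-3-2-1024k.
  static EcSpec parse(String value) {
    String[] parts = value.split("-");
    if (parts.length != 4 || !parts[3].endsWith("k")) {
      throw new IllegalArgumentException("Not an EC replication string: " + value);
    }
    int chunkKb = Integer.parseInt(parts[3].substring(0, parts[3].length() - 1));
    return new EcSpec(parts[0], Integer.parseInt(parts[1]), Integer.parseInt(parts[2]), chunkKb);
  }

  public static void main(String[] args) {
    EcSpec spec = parse("RS-3-2-1024k");
    System.out.println(spec.requiredNodes()); // prints 5
  }
}
```

In this sketch a RATIS-style value such as {{THREE}} is rejected, which hints at why the type and the replication value are expected to be set together.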
was:
Today SCM safemode pipeline exit checks are effectively hardcoded to
{{RATIS/THREE}}, which is incorrect when cluster default replication is EC.
In EC-default deployments, safemode should validate pipelines for the
configured default replication instead of only RATIS/THREE.
This change generalizes safemode pipeline validation to use
{{ReplicationConfig.getDefault(conf)}}:
* {{HealthyPipelineSafeModeRule}} now evaluates pipelines matching the
configured default replication config and uses required node count from that
config.
* {{OneReplicaPipelineSafeModeRule}} now tracks/report-validates pipelines for
the configured default replication config.
To keep behavior consistent, {{BackgroundPipelineCreator}} is updated to
include EC pipeline creation when default replication type is EC, while
preserving existing RATIS behavior when EC is not configured.
Expected outcome
* RATIS default: behavior remains unchanged.
* EC default: safemode pipeline checks validate EC pipelines and background
creation can create EC pipelines accordingly.
Validation
* Added/updated SCM tests for EC-default and RATIS-default paths.
* Focused test suite passes for:
** {{TestHealthyPipelineSafeModeRule}}
** {{TestOneReplicaPipelineSafeModeRule}}
** {{TestSCMSafeModeManager}}
** {{TestBackgroundPipelineCreator}}
> SCM safemode pipeline rules should honor default EC replication config
> ----------------------------------------------------------------------
>
> Key: HDDS-15138
> URL: https://issues.apache.org/jira/browse/HDDS-15138
> Project: Apache Ozone
> Issue Type: Improvement
> Reporter: Aryan Gupta
> Assignee: Aryan Gupta
> Priority: Major