[jira] [Updated] (HDFS-11412) Maintenance minimum replication config value allowable range should be [0, DefaultReplication]
[ https://issues.apache.org/jira/browse/HDFS-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-11412:
---------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 3.0.0-alpha3
                   2.9.0
           Status: Resolved  (was: Patch Available)

Committed to branch-2. Thanks [~manojg] for the contribution.

> Maintenance minimum replication config value allowable range should be [0, DefaultReplication]
> ----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11412
>                 URL: https://issues.apache.org/jira/browse/HDFS-11412
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>             Fix For: 2.9.0, 3.0.0-alpha3
>
>         Attachments: HDFS-11412.01.patch, HDFS-11412.02.patch, HDFS-11412-branch-2.01.patch
>
>
> Currently the allowed value range for Maintenance Min Replication
> {{dfs.namenode.maintenance.replication.min}} is 0 to
> {{dfs.namenode.replication.min}} (default=1). Users who do not want to affect
> the performance of the cluster would wish to set the Maintenance Min
> Replication number greater than 1, say 2. In the current design, this
> Maintenance Min Replication configuration is possible, but only after changing
> the NameNode level Block Min Replication to 2, which could increase the
> overall latency of client writes.
> Technically speaking, we should be allowing Maintenance Min Replication to be
> in the range 0 to dfs.replication.max.
> * There is always the config value of 0 for users not wanting any
> availability/performance during maintenance.
> * And performance-centric workloads can still get maintenance done without
> major disruptions by having a bigger Maintenance Min Replication.
> Setting the upper limit to dfs.replication.max could be overkill, as it could
> trigger the re-replication that Maintenance State is trying to avoid. So, we
> could allow {{dfs.namenode.maintenance.replication.min}} in the range
> {{0 to dfs.replication}}.
> {noformat}
>     if (minMaintenanceR < 0) {
>       throw new IOException("Unexpected configuration parameters: "
>           + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
>           + " = " + minMaintenanceR + " < 0");
>     }
>     if (minMaintenanceR > minR) {
>       throw new IOException("Unexpected configuration parameters: "
>           + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
>           + " = " + minMaintenanceR + " > "
>           + DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY
>           + " = " + minR);
>     }
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
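The issue description above proposes widening the upper bound of the check from {{dfs.namenode.replication.min}} to {{dfs.replication}}. A minimal Java sketch of what such a relaxed check could look like follows. It is not the committed patch: the class name, the validate() helper and its parameters are hypothetical, and only the two config key names and the message format are taken from the description.

```java
import java.io.IOException;

/** Hypothetical stand-alone sketch of the relaxed range check. */
final class MaintenanceMinReplicationCheck {
  // Config key names as quoted in the issue description.
  static final String MAINTENANCE_MIN_KEY =
      "dfs.namenode.maintenance.replication.min";
  static final String DEFAULT_REPLICATION_KEY = "dfs.replication";

  /**
   * Accepts minMaintenanceR only when it lies in [0, defaultReplication].
   * Both parameters are assumed to have been read from configuration already.
   */
  static void validate(int minMaintenanceR, int defaultReplication)
      throws IOException {
    if (minMaintenanceR < 0) {
      throw new IOException("Unexpected configuration parameters: "
          + MAINTENANCE_MIN_KEY + " = " + minMaintenanceR + " < 0");
    }
    // Upper bound is the default replication, not the block min replication.
    if (minMaintenanceR > defaultReplication) {
      throw new IOException("Unexpected configuration parameters: "
          + MAINTENANCE_MIN_KEY + " = " + minMaintenanceR + " > "
          + DEFAULT_REPLICATION_KEY + " = " + defaultReplication);
    }
  }

  public static void main(String[] args) throws IOException {
    validate(0, 3); // accepted: no availability guarantee during maintenance
    validate(2, 3); // accepted: the "say 2" case from the description
    try {
      validate(4, 3); // rejected: above the default replication
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With a default replication of 3, the values 0 through 3 pass and anything outside that range is rejected, so a maintenance minimum of 2 no longer requires raising the NameNode level Block Min Replication.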
[jira] [Updated] (HDFS-11412) Maintenance minimum replication config value allowable range should be [0, DefaultReplication]
[ https://issues.apache.org/jira/browse/HDFS-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-11412:
--------------------------------------
    Attachment: HDFS-11412-branch-2.01.patch

[~mingma], attached the branch-2 patch for the one committed to trunk. Kindly take a look. TestMaintenanceState and TestDecommission pass.
[jira] [Updated] (HDFS-11412) Maintenance minimum replication config value allowable range should be [0, DefaultReplication]
[ https://issues.apache.org/jira/browse/HDFS-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-11412:
---------------------------
    Summary: Maintenance minimum replication config value allowable range should be [0, DefaultReplication]  (was: Maintenance minimum replication config value allowable range should be {0 - DefaultReplication})
[jira] [Updated] (HDFS-11412) Maintenance minimum replication config value allowable range should be {0 - DefaultReplication}
[ https://issues.apache.org/jira/browse/HDFS-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-11412:
--------------------------------------
    Attachment: HDFS-11412.02.patch

[~mingma],

bq. Maybe we can modify getMinReplicationToBeInMaintenance to return the less of {file replication factor, minReplicationToBeInMaintenance}

This sounds good to me. It covers files whose block replication factor is less than the maintenance min, and it will not trigger unnecessary re-replication. {{BlockManager#getMinMaintenanceStorageNum()}} is modified to return the min value.

{{BlockManager#getExpectedLiveRedundancyNum()}} is a common routine used for reconstruction work, not only by DecommissionManager. The current implementation of this routine looks good to me.
-- (A) In the context of general reconstruction needed for a block, when there are no maintenance operations, the expected live redundancy for any block should be equal to its block replication factor.
-- (B) When the blocks are on maintenance nodes, the expected live redundancy for the block is the min of its block replication factor and the maintenance min, that is, BlockManager#getMinMaintenanceStorageNum().
-- And BlockManager#getExpectedLiveRedundancyNum() should be Max(A, B) to work for both non-maintenance and maintenance operations. If you set this to Min(A, B), getExpectedLiveRedundancyNum() will end up as Min(A, Min(block_repl, maint_min)), which can become 0 whenever the maintenance min is 0 and can cause adverse effects.

Can you please take a look at the latest patch and share your comments?
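The Max(A, B) reasoning in the comment above on the v02 patch can be sketched in a few lines of Java. This is an illustration only, not the BlockManager code: the class and method signatures are hypothetical, and computing (A) as the replication factor minus the replicas currently on maintenance nodes is an assumption of this sketch (the comment only states that (A) equals the replication factor when no maintenance is in progress).

```java
/** Hypothetical sketch of the Max(A, B) rule discussed in the comment. */
final class LiveRedundancySketch {

  /** (B): the lesser of the block replication factor and the maintenance min. */
  static int minMaintenanceStorageNum(int blockReplication, int maintenanceMin) {
    return Math.min(blockReplication, maintenanceMin);
  }

  /** Expected live redundancy as Max(A, B). */
  static int expectedLiveRedundancy(int blockReplication,
      int maintenanceReplicas, int maintenanceMin) {
    // (A): assumption of this sketch, see the lead-in above.
    int a = blockReplication - maintenanceReplicas;
    int b = minMaintenanceStorageNum(blockReplication, maintenanceMin);
    return Math.max(a, b);
  }

  public static void main(String[] args) {
    // No maintenance, maintenance min 0: the full replication factor is expected.
    System.out.println(expectedLiveRedundancy(3, 0, 0)); // prints 3
    // Two of three replicas in maintenance, maintenance min 2: max(1, 2).
    System.out.println(expectedLiveRedundancy(3, 2, 2)); // prints 2
    // Replication factor 1 with maintenance min 2: (B) is capped at 1.
    System.out.println(expectedLiveRedundancy(1, 1, 2)); // prints 1
  }
}
```

Replacing Math.max with Math.min in the first call would yield min(3, min(3, 0)) = 0, which is the case the comment warns about when the maintenance min is 0.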
[jira] [Updated] (HDFS-11412) Maintenance minimum replication config value allowable range should be {0 - DefaultReplication}
[ https://issues.apache.org/jira/browse/HDFS-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-11412:
--------------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HDFS-11412) Maintenance minimum replication config value allowable range should be {0 - DefaultReplication}
[ https://issues.apache.org/jira/browse/HDFS-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-11412:
--------------------------------------
    Attachment: HDFS-11412.01.patch

[~mingma], [~eddyxu],

Attached the v01 patch to address the following:
* Make the maintenance minimum replication config value range less restrictive.
* Add a couple of unit tests in TestMaintenanceState to trigger re-replication during maintenance state and to validate the config.

Please let me know your comments.
[jira] [Updated] (HDFS-11412) Maintenance minimum replication config value allowable range should be {0 - DefaultReplication}
[ https://issues.apache.org/jira/browse/HDFS-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-11412:
--------------------------------------
    Component/s:     (was: hdfs)
                 namenode
                 datanode