[jira] [Updated] (HDFS-17158) Show the rate of metrics in EC recovery task.
[ https://issues.apache.org/jira/browse/HDFS-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17158: --- Description: From !image2023-8-18_16-26-14.png|width=551,height=83! To !123124124.png|width=559,height=100! These metrics may show the network and CPU load of the machine. was: From !image2023-8-18_16-26-14.png|width=551,height=83! To !123124124.png|width=559,height=100! > Show the rate of metrics in EC recovery task. > - > > Key: HDFS-17158 > URL: https://issues.apache.org/jira/browse/HDFS-17158 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, metrics >Reporter: WangYuanben >Assignee: WangYuanben >Priority: Minor > Labels: pull-request-available > Attachments: 123124124.png, image2023-8-18_16-26-14.png > > > From > !image2023-8-18_16-26-14.png|width=551,height=83! > To > !123124124.png|width=559,height=100! > These metrics may show the network and CPU load of the machine. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-17158) Show the rate of metrics in EC recovery task.
[ https://issues.apache.org/jira/browse/HDFS-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17158: --- Description: From !image2023-8-18_16-26-14.png|width=551,height=83! To !123124124.png|width=559,height=100! > Show the rate of metrics in EC recovery task. > - > > Key: HDFS-17158 > URL: https://issues.apache.org/jira/browse/HDFS-17158 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, metrics >Reporter: WangYuanben >Assignee: WangYuanben >Priority: Minor > Labels: pull-request-available > Attachments: 123124124.png, image2023-8-18_16-26-14.png > > > From > !image2023-8-18_16-26-14.png|width=551,height=83! > To > !123124124.png|width=559,height=100!
[jira] [Updated] (HDFS-17158) Show the rate of metrics in EC recovery task.
[ https://issues.apache.org/jira/browse/HDFS-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17158: --- Attachment: image2023-8-18_16-26-14.png > Show the rate of metrics in EC recovery task. > - > > Key: HDFS-17158 > URL: https://issues.apache.org/jira/browse/HDFS-17158 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, metrics >Reporter: WangYuanben >Assignee: WangYuanben >Priority: Minor > Labels: pull-request-available > Attachments: 123124124.png, image2023-8-18_16-26-14.png
[jira] [Updated] (HDFS-17158) Show the rate of metrics in EC recovery task.
[ https://issues.apache.org/jira/browse/HDFS-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17158: --- Attachment: 123124124.png > Show the rate of metrics in EC recovery task. > - > > Key: HDFS-17158 > URL: https://issues.apache.org/jira/browse/HDFS-17158 > Project: Hadoop HDFS > Issue Type: Improvement > Components: erasure-coding, metrics >Reporter: WangYuanben >Assignee: WangYuanben >Priority: Minor > Labels: pull-request-available > Attachments: 123124124.png, image2023-8-18_16-26-14.png
[jira] [Created] (HDFS-17158) Show the rate of metrics in EC recovery task.
WangYuanben created HDFS-17158: -- Summary: Show the rate of metrics in EC recovery task. Key: HDFS-17158 URL: https://issues.apache.org/jira/browse/HDFS-17158 Project: Hadoop HDFS Issue Type: Improvement Components: erasure-coding, metrics Reporter: WangYuanben Assignee: WangYuanben
[jira] [Resolved] (HDFS-17016) Cleanup method calls to static Assert and Assume methods.
[ https://issues.apache.org/jira/browse/HDFS-17016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben resolved HDFS-17016. Resolution: Not A Problem > Cleanup method calls to static Assert and Assume methods. > - > > Key: HDFS-17016 > URL: https://issues.apache.org/jira/browse/HDFS-17016 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: WangYuanben >Assignee: WangYuanben >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > Cleanup method calls to static Assert and Assume methods.
[jira] [Assigned] (HDFS-17113) Reconfig transfer and write bandwidth for datanode.
[ https://issues.apache.org/jira/browse/HDFS-17113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben reassigned HDFS-17113: -- Assignee: WangYuanben > Reconfig transfer and write bandwidth for datanode. > --- > > Key: HDFS-17113 > URL: https://issues.apache.org/jira/browse/HDFS-17113 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode >Reporter: WangYuanben >Assignee: WangYuanben >Priority: Major > Labels: pull-request-available > > To avoid frequent rolling restarts of the DN, we should make > dfs.datanode.data.transfer.bandwidthPerSec and > dfs.datanode.data.write.bandwidthPerSec reconfigurable.
[jira] [Created] (HDFS-17113) Reconfig transfer and write bandwidth for datanode.
WangYuanben created HDFS-17113: -- Summary: Reconfig transfer and write bandwidth for datanode. Key: HDFS-17113 URL: https://issues.apache.org/jira/browse/HDFS-17113 Project: Hadoop HDFS Issue Type: New Feature Components: datanode Reporter: WangYuanben To avoid frequent rolling restarts of the DN, we should make dfs.datanode.data.transfer.bandwidthPerSec and dfs.datanode.data.write.bandwidthPerSec reconfigurable.
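As a rough illustration of what "reconfigurable" could mean for the two properties named in HDFS-17113, the following minimal Java sketch keeps both limits in fields that can be replaced at runtime, so no restart is needed to change them. The class `ReconfigurableBandwidth` and its methods are hypothetical and are not the DataNode's actual code; treating 0 as "no throttling" is also an assumption of this sketch.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; not the DataNode's actual classes.
class ReconfigurableBandwidth {
    // Property names as quoted in the issue description.
    static final String TRANSFER_KEY = "dfs.datanode.data.transfer.bandwidthPerSec";
    static final String WRITE_KEY = "dfs.datanode.data.write.bandwidthPerSec";

    // Assumption of this sketch: 0 means "no throttling".
    private final AtomicLong transferBytesPerSec = new AtomicLong(0);
    private final AtomicLong writeBytesPerSec = new AtomicLong(0);

    /** Validates and applies a new value at runtime; returns the value now in effect. */
    long reconfigure(String key, String newValue) {
        long parsed;
        try {
            parsed = Long.parseLong(newValue.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Not a number for " + key + ": " + newValue, e);
        }
        if (parsed < 0) {
            throw new IllegalArgumentException("Bandwidth must be >= 0 for " + key);
        }
        if (TRANSFER_KEY.equals(key)) {
            transferBytesPerSec.set(parsed);
        } else if (WRITE_KEY.equals(key)) {
            writeBytesPerSec.set(parsed);
        } else {
            throw new IllegalArgumentException("Property is not reconfigurable: " + key);
        }
        return parsed;
    }

    long getTransferBytesPerSec() {
        return transferBytesPerSec.get();
    }

    long getWriteBytesPerSec() {
        return writeBytesPerSec.get();
    }
}
```

A throttler that reads the current value on every check would pick up the change immediately, which is the property the issue is asking for.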
[jira] [Updated] (HDFS-17091) Blocks on DECOMMISSIONING DNs should be sorted properly in LocatedBlocks
[ https://issues.apache.org/jira/browse/HDFS-17091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17091: --- Description: Similar to [HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076], I think decommissioning DNs needs to be taken into consideration. After sorting the expected location list will be: live -> slow -> stale -> staleAndSlow -> entering_maintenance -> decommissioning -> decommissioned. (was: Being similar to [HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076], I think decommissioning DNs needs to be taken into consideration. After sorting the expected location list will be: live -> slow -> stale -> staleAndSlow -> entering_maintenance -> decommissioned -> decommissioning.) > Blocks on DECOMMISSIONING DNs should be sorted properly in LocatedBlocks > > > Key: HDFS-17091 > URL: https://issues.apache.org/jira/browse/HDFS-17091 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: WangYuanben >Assignee: WangYuanben >Priority: Major > Labels: pull-request-available > > Similar to [HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076], I > think decommissioning DNs needs to be taken into consideration. After sorting > the expected location list will be: live -> slow -> stale -> staleAndSlow -> > entering_maintenance -> decommissioning -> decommissioned.
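The updated ordering in HDFS-17091 (decommissioning before decommissioned) can be pictured with a small Java sketch. The enum `NodeState` and the class `LocationOrderSketch` are hypothetical names made up for illustration; they only encode the order stated in the description and are not how LocatedBlocks sorting is actually implemented.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not the NameNode's actual sorting code.
class LocationOrderSketch {
    // Declared in the order given by the updated description:
    // live -> slow -> stale -> staleAndSlow -> entering_maintenance
    // -> decommissioning -> decommissioned.
    enum NodeState {
        LIVE, SLOW, STALE, STALE_AND_SLOW, ENTERING_MAINTENANCE, DECOMMISSIONING, DECOMMISSIONED
    }

    /** Returns a sorted copy; List.sort is stable, so equal states keep their input order. */
    static List<NodeState> sortLocations(List<NodeState> locations) {
        List<NodeState> sorted = new ArrayList<>(locations);
        sorted.sort(Comparator.comparingInt(NodeState::ordinal));
        return sorted;
    }
}
```

Because the order lives in the enum declaration, swapping the last two states (as the description update did) is a one-line change in this sketch.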
[jira] [Created] (HDFS-17091) Blocks on DECOMMISSIONING DNs should be sorted properly in LocatedBlocks
WangYuanben created HDFS-17091: -- Summary: Blocks on DECOMMISSIONING DNs should be sorted properly in LocatedBlocks Key: HDFS-17091 URL: https://issues.apache.org/jira/browse/HDFS-17091 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs Reporter: WangYuanben Assignee: WangYuanben Being similar to [HDFS-16076|https://issues.apache.org/jira/browse/HDFS-16076], I think decommissioning DNs needs to be taken into consideration. After sorting the expected location list will be: live -> slow -> stale -> staleAndSlow -> entering_maintenance -> decommissioned -> decommissioning.
[jira] [Updated] (HDFS-17033) Update fsck to display stale state info of blocks accurately
[ https://issues.apache.org/jira/browse/HDFS-17033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17033: --- Description: When the DN is stale, Block replica on this DN should be "STALE" instead of "HEALTHY" in block check of fsck. (was: When the DN is stale, blocks on this DN should be "STALE" instead of "HEALTHY" in block check of fsck.) > Update fsck to display stale state info of blocks accurately > > > Key: HDFS-17033 > URL: https://issues.apache.org/jira/browse/HDFS-17033 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namanode >Reporter: WangYuanben >Assignee: WangYuanben >Priority: Minor > Labels: pull-request-available > > When the DN is stale, Block replica on this DN should be "STALE" instead of > "HEALTHY" in block check of fsck.
[jira] [Updated] (HDFS-17033) Update fsck to display stale state info of blocks accurately
[ https://issues.apache.org/jira/browse/HDFS-17033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17033: --- Description: When the DN is stale, blocks on this DN should be "STALE" instead of "HEALTHY" in block check of fsck. > Update fsck to display stale state info of blocks accurately > > > Key: HDFS-17033 > URL: https://issues.apache.org/jira/browse/HDFS-17033 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, namanode >Reporter: WangYuanben >Assignee: WangYuanben >Priority: Minor > Labels: pull-request-available > > When the DN is stale, blocks on this DN should be "STALE" instead of > "HEALTHY" in block check of fsck.
[jira] [Commented] (HDFS-17061) EC: Let data blocks and parity blocks on DNs more balanced
[ https://issues.apache.org/jira/browse/HDFS-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17738576#comment-17738576 ] WangYuanben commented on HDFS-17061: [~sodonnell] Thank you for the comment. I need some examples to validate this idea, but it seems there is currently no direct way to obtain the number of data blocks and parity blocks. Therefore, it is necessary to develop a functionality to retrieve the number of data blocks and parity blocks first and do some tests in the subtask. I will create it later. > EC: Let data blocks and parity blocks on DNs more balanced > -- > > Key: HDFS-17061 > URL: https://issues.apache.org/jira/browse/HDFS-17061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover, erasure-coding, hdfs >Reporter: WangYuanben >Priority: Minor > Attachments: figure1, unbalanced traffic load on DNs.png, figure2, > balanced traffic load on DNs.png > > > When choosing DN for placing data block or parity block, the existing number > of data block and parity block on datanode is not taken into consideration. > This may lead to *uneven traffic load*. > As shown in the figure 1, when reading block group A, B, C, D and E from five > different EC files without any missing block, datanodes like DN1 and DN2 will > have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low > or even no traffic load. > !figure1, unbalanced traffic load on DNs.png|width=600,height=333! > +If we can let data blocks and parity blocks on DNs more balanced, the > traffic load in cluster will be more balanced and the peak traffic load on DN > will be reduced+. Here "balance" refers to the matching of the number of data > blocks and parity blocks on DN with its EC policy. In the ideal state, each > DN has a balanced traffic load just like what figure 2 shows. > !figure2, balanced traffic load on DNs.png|width=600,height=333! > Then how to reduce this imbalance? 
I think it's related to EC policy and the > ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's > appropriate to let the ratio close to 3:2. > There are two solutions: > 1.Improve the block placement policy. > 2.Improve the Balancer.
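To make the 3:2 target for RS-3-2-1024k concrete, here is a minimal Java sketch that measures how far one DN's mix of data and parity blocks is from the ratio its EC policy implies. `EcBalanceSketch` and its methods are hypothetical, and the per-DN block counts are assumed to be available as inputs; as the comment above notes, there is currently no direct way to obtain them.

```java
// Hypothetical illustration only; the per-DN counts are assumed inputs.
class EcBalanceSketch {
    /** Fraction of data blocks among all EC blocks implied by the policy, e.g. 3/(3+2) = 0.6. */
    static double idealDataFraction(int dataUnits, int parityUnits) {
        return (double) dataUnits / (dataUnits + parityUnits);
    }

    /**
     * Signed deviation of a DN from the ideal data fraction.
     * Positive: more data blocks than the policy ratio suggests.
     * Negative: more parity blocks. A DN with no EC blocks counts as balanced.
     */
    static double deviation(long dataBlocks, long parityBlocks, int dataUnits, int parityUnits) {
        long total = dataBlocks + parityBlocks;
        if (total == 0) {
            return 0.0;
        }
        return (double) dataBlocks / total - idealDataFraction(dataUnits, parityUnits);
    }
}
```

For example, a DN holding 30 data blocks and 20 parity blocks under a 3+2 policy has a deviation of about 0, while a DN holding only data blocks has a deviation of about 0.4. Either a placement policy or the Balancer could, in principle, prefer moves that bring this value toward 0.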
[jira] [Updated] (HDFS-17061) EC: Let data blocks and parity blocks on DNs more balanced
[ https://issues.apache.org/jira/browse/HDFS-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17061: --- Description: When choosing DN for placing data block or parity block, the existing number of data block and parity block on datanode is not taken into consideration. This may lead to *uneven traffic load*. As shown in the figure 1, when reading block group A, B, C, D and E from five different EC files without any missing block, datanodes like DN1 and DN2 will have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low or even no traffic load. !figure1, unbalanced traffic load on DNs.png|width=600,height=333! +If we can let data blocks and parity blocks on DNs more balanced, the traffic load in cluster will be more balanced and the peak traffic load on DN will be reduced+. Here "balance" refers to the matching of the number of data blocks and parity blocks on DN with its EC policy. In the ideal state, each DN has a balanced traffic load just like what figure 2 shows. !figure2, balanced traffic load on DNs.png|width=600,height=333! Then how to reduce this imbalance? I think it's related to EC policy and the ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's appropriate to let the ratio close to 3:2. There are two solutions: 1.Improve the block placement policy. 2.Improve the Balancer. was: When choosing DN for placing data block or parity block, the existing number of data block and parity block on datanode is not taken into consideration. This may lead to *uneven traffic load*. As shown in the figure 1, when reading block group A, B, C, D and E from five different EC files without any missing block, datanodes like DN1 and DN2 will have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low or even no traffic load. !figure1, unbalanced traffic load on DNs.png|width=600,height=333! 
If we can let data blocks and parity blocks on DNs more balanced, +the traffic load in cluster will be more balanced and the peak traffic load on DN will be reduced+. Here "balance" refers to the matching of the number of data blocks and parity blocks on DN with its EC policy. In the ideal state, each DN has a balanced traffic load just like what figure 2 shows. !figure2, balanced traffic load on DNs.png|width=600,height=333! Then how to reduce this imbalance? I think it's related to EC policy and the ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's appropriate to let the ratio close to 3:2. There are two solutions: 1.Improve the block placement policy. 2.Improve the Balancer. > EC: Let data blocks and parity blocks on DNs more balanced > -- > > Key: HDFS-17061 > URL: https://issues.apache.org/jira/browse/HDFS-17061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover, erasure-coding, hdfs >Reporter: WangYuanben >Priority: Minor > Attachments: figure1, unbalanced traffic load on DNs.png, figure2, > balanced traffic load on DNs.png > > > When choosing DN for placing data block or parity block, the existing number > of data block and parity block on datanode is not taken into consideration. > This may lead to *uneven traffic load*. > As shown in the figure 1, when reading block group A, B, C, D and E from five > different EC files without any missing block, datanodes like DN1 and DN2 will > have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low > or even no traffic load. > !figure1, unbalanced traffic load on DNs.png|width=600,height=333! > +If we can let data blocks and parity blocks on DNs more balanced, the > traffic load in cluster will be more balanced and the peak traffic load on DN > will be reduced+. Here "balance" refers to the matching of the number of data > blocks and parity blocks on DN with its EC policy. 
In the ideal state, each > DN has a balanced traffic load just like what figure 2 shows. > !figure2, balanced traffic load on DNs.png|width=600,height=333! > Then how to reduce this imbalance? I think it's related to EC policy and the > ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's > appropriate to let the ratio close to 3:2. > There are two solutions: > 1.Improve the block placement policy. > 2.Improve the Balancer.
[jira] [Updated] (HDFS-17061) EC: Let data blocks and parity blocks on DNs more balanced
[ https://issues.apache.org/jira/browse/HDFS-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17061: --- Component/s: hdfs > EC: Let data blocks and parity blocks on DNs more balanced > -- > > Key: HDFS-17061 > URL: https://issues.apache.org/jira/browse/HDFS-17061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover, erasure-coding, hdfs >Reporter: WangYuanben >Priority: Minor > Attachments: figure1, unbalanced traffic load on DNs.png, figure2, > balanced traffic load on DNs.png > > > When choosing DN for placing data block or parity block, the existing number > of data block and parity block on datanode is not taken into consideration. > This may lead to *uneven traffic load*. > As shown in the figure 1, when reading block group A, B, C, D and E from five > different EC files without any missing block, datanodes like DN1 and DN2 will > have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low > or even no traffic load. > !figure1, unbalanced traffic load on DNs.png|width=600,height=333! > If we can let data blocks and parity blocks on DNs more balanced, +the > traffic load in cluster will be more balanced and the peak traffic load on DN > will be reduced+. Here "balance" refers to the matching of the number of data > blocks and parity blocks on DN with its EC policy. In the ideal state, each > DN has a balanced traffic load just like what figure 2 shows. > !figure2, balanced traffic load on DNs.png|width=600,height=333! > Then how to reduce this imbalance? I think it's related to EC policy and the > ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's > appropriate to let the ratio close to 3:2. > There are two solutions: > 1.Improve the block placement policy. > 2.Improve the Balancer. 
[jira] [Updated] (HDFS-17061) EC: Let data blocks and parity blocks on DNs more balanced
[ https://issues.apache.org/jira/browse/HDFS-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17061: --- Component/s: balancer & mover (was: balamcer) > EC: Let data blocks and parity blocks on DNs more balanced > -- > > Key: HDFS-17061 > URL: https://issues.apache.org/jira/browse/HDFS-17061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover, erasure-coding >Reporter: WangYuanben >Priority: Minor > Attachments: figure1, unbalanced traffic load on DNs.png, figure2, > balanced traffic load on DNs.png > > > When choosing DN for placing data block or parity block, the existing number > of data block and parity block on datanode is not taken into consideration. > This may lead to *uneven traffic load*. > As shown in the figure 1, when reading block group A, B, C, D and E from five > different EC files without any missing block, datanodes like DN1 and DN2 will > have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low > or even no traffic load. > !figure1, unbalanced traffic load on DNs.png|width=600,height=333! > If we can let data blocks and parity blocks on DNs more balanced, +the > traffic load in cluster will be more balanced and the peak traffic load on DN > will be reduced+. Here "balance" refers to the matching of the number of data > blocks and parity blocks on DN with its EC policy. In the ideal state, each > DN has a balanced traffic load just like what figure 2 shows. > !figure2, balanced traffic load on DNs.png|width=600,height=333! > Then how to reduce this imbalance? I think it's related to EC policy and the > ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's > appropriate to let the ratio close to 3:2. > There are two solutions: > 1.Improve the block placement policy. > 2.Improve the Balancer. 
[jira] [Updated] (HDFS-17061) EC: Let data blocks and parity blocks on DNs more balanced
[ https://issues.apache.org/jira/browse/HDFS-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17061: --- Description: When choosing DN for placing data block or parity block, the existing number of data block and parity block on datanode is not taken into consideration. This may lead to *uneven traffic load*. As shown in the figure 1, when reading block group A, B, C, D and E from five different EC files without any missing block, datanodes like DN1 and DN2 will have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low or even no traffic load. !figure1, unbalanced traffic load on DNs.png|width=600,height=333! If we can let data blocks and parity blocks on DNs more balanced, +the traffic load in cluster will be more balanced and the peak traffic load on DN will be reduced+. Here "balance" refers to the matching of the number of data blocks and parity blocks on DN with its EC policy. In the ideal state, each DN has a balanced traffic load just like what figure 2 shows. !figure2, balanced traffic load on DNs.png|width=600,height=333! Then how to reduce this imbalance? I think it's related to EC policy and the ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's appropriate to let the ratio close to 3:2. There are two solutions: 1.Improve the block placement policy. 2.Improve the Balancer. was: When choosing DN for placing data block or parity block, the existing number of data block and parity block on datanode is not taken into consideration. This may lead to *uneven traffic load*. As shown in the figure 1, when reading block group A, B, C, D and E from five different EC files without any missing block, datanodes like DN1 and DN2 will have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low or even no traffic load. !figure1, unbalanced traffic load on DNs.png|width=815,height=550! 
If we can let data blocks and parity blocks on DNs more balanced, +the traffic load in cluster will be more balanced and the peak traffic load on DN will be reduced+. Here "balance" refers to the matching of the number of data blocks and parity blocks on DN with its EC policy. In the ideal state, each DN has a balanced traffic load just like what figure 2 shows. !figure2, balanced traffic load on DNs.png|width=815,height=550! Then how to reduce this imbalance? I think it's related to EC policy and the ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's appropriate to let the ratio close to 3:2. There are two solutions: 1.Improve the block placement policy. 2.Improve the Balancer. > EC: Let data blocks and parity blocks on DNs more balanced > -- > > Key: HDFS-17061 > URL: https://issues.apache.org/jira/browse/HDFS-17061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balamcer, erasure-coding >Reporter: WangYuanben >Priority: Minor > Attachments: figure1, unbalanced traffic load on DNs.png, figure2, > balanced traffic load on DNs.png > > > When choosing DN for placing data block or parity block, the existing number > of data block and parity block on datanode is not taken into consideration. > This may lead to *uneven traffic load*. > As shown in the figure 1, when reading block group A, B, C, D and E from five > different EC files without any missing block, datanodes like DN1 and DN2 will > have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low > or even no traffic load. > !figure1, unbalanced traffic load on DNs.png|width=600,height=333! > If we can let data blocks and parity blocks on DNs more balanced, +the > traffic load in cluster will be more balanced and the peak traffic load on DN > will be reduced+. Here "balance" refers to the matching of the number of data > blocks and parity blocks on DN with its EC policy. In the ideal state, each > DN has a balanced traffic load just like what figure 2 shows. 
> !figure2, balanced traffic load on DNs.png|width=600,height=333! > Then how to reduce this imbalance? I think it's related to EC policy and the > ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's > appropriate to let the ratio close to 3:2. > There are two solutions: > 1.Improve the block placement policy. > 2.Improve the Balancer.
[jira] [Updated] (HDFS-17061) EC: Let data blocks and parity blocks on DNs more balanced
[ https://issues.apache.org/jira/browse/HDFS-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17061: --- Description: When choosing DN for placing data block or parity block, the existing number of data block and parity block on datanode is not taken into consideration. This may lead to *uneven traffic load*. As shown in the figure 1, when reading block group A, B, C, D and E from five different EC files without any missing block, datanodes like DN1 and DN2 will have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low or even no traffic load. !figure1, unbalanced traffic load on DNs.png|width=815,height=550! If we can let data blocks and parity blocks on DNs more balanced, +the traffic load in cluster will be more balanced and the peak traffic load on DN will be reduced+. Here "balance" refers to the matching of the number of data blocks and parity blocks on DN with its EC policy. In the ideal state, each DN has a balanced traffic load just like what figure 2 shows. !figure2, balanced traffic load on DNs.png|width=815,height=550! Then how to reduce this imbalance? I think it's related to EC policy and the ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's appropriate to let the ratio close to 3:2. There are two solutions: 1.Improve the block placement policy. 2.Improve the Balancer. was: When choosing DN for placing data block or parity block, the existing number of data block and parity block on datanode is not taken into consideration. This may lead to *uneven traffic load*. As shown in the figure 1, when reading block group A, B, C, D and E from five different EC files without any missing block, datanodes like DN1 and DN2 will have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low or even no traffic load. !figure1, unbalanced traffic load on DNs.png|width=700,height=550! 
If we can let data blocks and parity blocks on DNs more balanced, +the traffic load in cluster will be more balanced and the peak traffic load on DN will be reduced+. Here "balance" refers to the matching of the number of data blocks and parity blocks on DN with its EC policy. In the ideal state, each DN has a balanced traffic load just like what figure 2 shows. !figure2, balanced traffic load on DNs.png|width=700,height=550! Then how to reduce this imbalance? I think it's related to EC policy and the ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's appropriate to let the ratio close to 3:2. There are two solutions: 1.Improve the block placement policy. 2.Improve the Balancer. > EC: Let data blocks and parity blocks on DNs more balanced > -- > > Key: HDFS-17061 > URL: https://issues.apache.org/jira/browse/HDFS-17061 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balamcer, erasure-coding >Reporter: WangYuanben >Priority: Minor > Attachments: figure1, unbalanced traffic load on DNs.png, figure2, > balanced traffic load on DNs.png > > > When choosing DN for placing data block or parity block, the existing number > of data block and parity block on datanode is not taken into consideration. > This may lead to *uneven traffic load*. > As shown in the figure 1, when reading block group A, B, C, D and E from five > different EC files without any missing block, datanodes like DN1 and DN2 will > have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low > or even no traffic load. > !figure1, unbalanced traffic load on DNs.png|width=815,height=550! > If we can let data blocks and parity blocks on DNs more balanced, +the > traffic load in cluster will be more balanced and the peak traffic load on DN > will be reduced+. Here "balance" refers to the matching of the number of data > blocks and parity blocks on DN with its EC policy. In the ideal state, each > DN has a balanced traffic load just like what figure 2 shows. 
> !figure2, balanced traffic load on DNs.png|width=815,height=550! > Then how to reduce this imbalance? I think it's related to EC policy and the > ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's > appropriate to let the ratio close to 3:2. > There are two solutions: > 1.Improve the block placement policy. > 2.Improve the Balancer.
[jira] [Updated] (HDFS-17061) EC: Let data blocks and parity blocks on DNs more balanced
[ https://issues.apache.org/jira/browse/HDFS-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17061: --- Description: When choosing DN for placing data block or parity block, the existing number of data block and parity block on datanode is not taken into consideration. This may lead to *uneven traffic load*. As shown in the figure 1, when reading block group A, B, C, D and E from five different EC files without any missing block, datanodes like DN1 and DN2 will have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low or even no traffic load. !figure1, unbalanced traffic load on DNs.png|width=700,height=550! If we can let data blocks and parity blocks on DNs more balanced, +the traffic load in cluster will be more balanced and the peak traffic load on DN will be reduced+. Here "balance" refers to the matching of the number of data blocks and parity blocks on DN with its EC policy. In the ideal state, each DN has a balanced traffic load just like what figure 2 shows. !figure2, balanced traffic load on DNs.png|width=700,height=550! Then how to reduce this imbalance? I think it's related to EC policy and the ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's appropriate to let the ratio close to 3:2. There are two solutions: 1.Improve the block placement policy. 2.Improve the Balancer. was: When choosing DN for placing data block or parity block, the existing number of data block and parity block on datanode is not taken into consideration. This may lead to *uneven traffic load*. As shown in the figure 1, when reading block group A, B, C, D and E from five different EC files without any missing block, datanodes like DN1 and DN2 will have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low or even no traffic load. !figure1, unbalanced traffic load on DNs.png|width=850,height=550! 
If we can let data blocks and parity blocks on DNs more balanced, +the traffic load in cluster will be more balanced and the peak traffic load on DN will be reduced+. Here "balance" refers to the matching of the number of data blocks and parity blocks on DN with its EC policy. In the ideal state, each DN has a balanced traffic load just like what figure 2 shows. !figure2, balanced traffic load on DNs.png|width=850,height=550! Then how to reduce this imbalance? I think it's related to EC policy and the ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's appropriate to let the ratio close to 3:2. There are two solutions: 1.Improve the block placement policy. 2.Improve the Balancer.
[jira] [Updated] (HDFS-17061) EC: Let data blocks and parity blocks on DNs more balanced
[ https://issues.apache.org/jira/browse/HDFS-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17061: --- Description: When choosing DN for placing data block or parity block, the existing number of data block and parity block on datanode is not taken into consideration. This may lead to *uneven traffic load*. As shown in the figure 1, when reading block group A, B, C, D and E from five different EC files without any missing block, datanodes like DN1 and DN2 will have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low or even no traffic load. !figure1, unbalanced traffic load on DNs.png|width=850,height=550! If we can let data blocks and parity blocks on DNs more balanced, +the traffic load in cluster will be more balanced and the peak traffic load on DN will be reduced+. Here "balance" refers to the matching of the number of data blocks and parity blocks on DN with its EC policy. In the ideal state, each DN has a balanced traffic load just like what figure 2 shows. !figure2, balanced traffic load on DNs.png|width=850,height=550! Then how to reduce this imbalance? I think it's related to EC policy and the ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's appropriate to let the ratio close to 3:2. There are two solutions: 1.Improve the block placement policy. 2.Improve the Balancer. was: When choosing DN for placing data block or parity block, the existing number of data block and parity block on datanode is not taken into consideration. This may lead to *uneven traffic load*. As shown in the figure 1, when reading block group A, B, C, D and E from five different EC files without any missing block, datanodes like DN1 and DN2 will have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low or even no traffic load. !figure1, unbalanced traffic load on DNs.png! 
If we can let data blocks and parity blocks on DNs more balanced, +the traffic load in cluster will be more balanced and the peak traffic load on DN will be reduced+. Here "balance" refers to the matching of the number of data blocks and parity blocks on DN with its EC policy. In the ideal state, each DN has a balanced traffic load just like what figure 2 shows. !figure2, balanced traffic load on DNs.png! Then how to reduce this imbalance? I think it's related to EC policy and the ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's appropriate to let the ratio close to 3:2. There are two solutions: 1.Improve the block placement policy. 2.Improve the Balancer.
[jira] [Created] (HDFS-17061) EC: Let data blocks and parity blocks on DNs more balanced
WangYuanben created HDFS-17061:

Summary: EC: Let data blocks and parity blocks on DNs more balanced
Key: HDFS-17061
URL: https://issues.apache.org/jira/browse/HDFS-17061
Project: Hadoop HDFS
Issue Type: Improvement
Components: balamcer, erasure-coding
Reporter: WangYuanben
Attachments: figure1, unbalanced traffic load on DNs.png, figure2, balanced traffic load on DNs.png

When choosing a DN for placing a data block or parity block, the existing numbers of data blocks and parity blocks on the datanode are not taken into consideration. This may lead to *uneven traffic load*.

As shown in figure 1, when reading block groups A, B, C, D and E from five different EC files without any missing blocks, datanodes like DN1 and DN2 will have a high traffic load, while datanodes like DN3, DN4 and DN5 may have a low traffic load or none at all.

!figure1, unbalanced traffic load on DNs.png|width=650,height=650!

If we can make data blocks and parity blocks on DNs more balanced, +the traffic load in the cluster will be more balanced and the peak traffic load on a DN will be reduced+. Here "balance" means that the ratio of data blocks to parity blocks on a DN matches its EC policy. In the ideal state, each DN has a balanced traffic load, as figure 2 shows.

!figure2, balanced traffic load on DNs.png|width=650,height=650!

Then how can this imbalance be reduced? I think it depends on the EC policy and the ratio of data blocks to parity blocks on each datanode. For RS-3-2-1024k, it's appropriate to keep the ratio close to 3:2.

There are two solutions:
1. Improve the block placement policy.
2. Improve the Balancer.
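The 3:2 target in the description above can be made concrete with a small calculation. The following is only a sketch, not code from HDFS: the class `EcBalance` and its methods are hypothetical names, and it assumes that "balance" is measured as the gap between a datanode's share of data blocks and the share k / (k + m) implied by an RS(k, m) policy.

```java
/** Illustrative sketch only; EcBalance is a hypothetical name, not an HDFS class. */
final class EcBalance {
    private EcBalance() {}

    /** Ideal share of data blocks for an RS(k, m) policy: k / (k + m). */
    static double idealDataFraction(int dataUnits, int parityUnits) {
        return (double) dataUnits / (dataUnits + parityUnits);
    }

    /**
     * Signed deviation of a datanode's data-block share from the ideal share.
     * Positive: the node holds proportionally too many data blocks.
     * Negative: the node holds proportionally too many parity blocks.
     */
    static double deviation(long dataBlocks, long parityBlocks,
                            int dataUnits, int parityUnits) {
        long total = dataBlocks + parityBlocks;
        if (total == 0) {
            return 0.0; // an empty node is treated as balanced in this sketch
        }
        return (double) dataBlocks / total
            - idealDataFraction(dataUnits, parityUnits);
    }
}
```

Under this assumption, a node holding 3 data blocks and 2 parity blocks under RS-3-2-1024k has a deviation of zero, while a node holding only data blocks deviates by 0.4. A placement policy or Balancer could, in principle, prefer moves that shrink the absolute deviation.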
[jira] [Updated] (HDFS-17061) EC: Let data blocks and parity blocks on DNs more balanced
[ https://issues.apache.org/jira/browse/HDFS-17061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17061: --- Description: When choosing DN for placing data block or parity block, the existing number of data block and parity block on datanode is not taken into consideration. This may lead to *uneven traffic load*. As shown in the figure 1, when reading block group A, B, C, D and E from five different EC files without any missing block, datanodes like DN1 and DN2 will have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low or even no traffic load. !figure1, unbalanced traffic load on DNs.png! If we can let data blocks and parity blocks on DNs more balanced, +the traffic load in cluster will be more balanced and the peak traffic load on DN will be reduced+. Here "balance" refers to the matching of the number of data blocks and parity blocks on DN with its EC policy. In the ideal state, each DN has a balanced traffic load just like what figure 2 shows. !figure2, balanced traffic load on DNs.png! Then how to reduce this imbalance? I think it's related to EC policy and the ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's appropriate to let the ratio close to 3:2. There are two solutions: 1.Improve the block placement policy. 2.Improve the Balancer. was: When choosing DN for placing data block or parity block, the existing number of data block and parity block on datanode is not taken into consideration. This may lead to *uneven traffic load*. As shown in the figure 1, when reading block group A, B, C, D and E from five different EC files without any missing block, datanodes like DN1 and DN2 will have high traffic load. However, datanodes like DN3, DN4 and DN5 may have low or even no traffic load. !figure1, unbalanced traffic load on DNs.png|width=650,height=650! 
If we can let data blocks and parity blocks on DNs more balanced, +the traffic load in cluster will be more balanced and the peak traffic load on DN will be reduced+. Here "balance" refers to the matching of the number of data blocks and parity blocks on DN with its EC policy. In the ideal state, each DN has a balanced traffic load just like what figure 2 shows. !figure2, balanced traffic load on DNs.png|width=650,height=650! Then how to reduce this imbalance? I think it's related to EC policy and the ratio of data blocks to parity blocks on datanode. For RS-3-2-1024k, it's appropriate to let the ratio close to 3:2. There are two solutions: 1.Improve the block placement policy. 2.Improve the Balancer.
[jira] [Created] (HDFS-17033) Update fsck to display stale state info of blocks accurately
WangYuanben created HDFS-17033:

Summary: Update fsck to display stale state info of blocks accurately
Key: HDFS-17033
URL: https://issues.apache.org/jira/browse/HDFS-17033
Project: Hadoop HDFS
Issue Type: Improvement
Components: datanode, namanode
Reporter: WangYuanben
Assignee: WangYuanben
Fix For: 3.4.0
[jira] [Updated] (HDFS-17016) Cleanup method calls to static Assert and Assume methods.
[ https://issues.apache.org/jira/browse/HDFS-17016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17016: --- Summary: Cleanup method calls to static Assert and Assume methods. (was: Cleanup method calls to static Assert methods in TestCodecRawCoderMapping) > Cleanup method calls to static Assert and Assume methods. > - > > Key: HDFS-17016 > URL: https://issues.apache.org/jira/browse/HDFS-17016 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: WangYuanben >Assignee: WangYuanben >Priority: Minor > Fix For: 3.4.0 > > > Cleanup method calls to static Assert methods in TestCodecRawCoderMapping.
[jira] [Updated] (HDFS-17016) Cleanup method calls to static Assert and Assume methods.
[ https://issues.apache.org/jira/browse/HDFS-17016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17016: --- Description: Cleanup method calls to static Assert and Assume methods. (was: Cleanup method calls to static Assert methods in TestCodecRawCoderMapping.)
[jira] [Updated] (HDFS-17016) Cleanup method calls to static Assert methods in TestCodecRawCoderMapping
[ https://issues.apache.org/jira/browse/HDFS-17016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17016: --- Description: Cleanup method calls to static Assert methods in TestCodecRawCoderMapping. (was: Cleanup method calls to static Assert and Assume methods in TestCodecRawCoderMapping.)
[jira] [Updated] (HDFS-17016) Cleanup method calls to static Assert methods in TestCodecRawCoderMapping
[ https://issues.apache.org/jira/browse/HDFS-17016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-17016: --- Summary: Cleanup method calls to static Assert methods in TestCodecRawCoderMapping (was: Cleanup method calls to static Assert and Assume methods in TestCodecRawCoderMapping)
[jira] [Created] (HDFS-17016) Cleanup method calls to static Assert and Assume methods in TestCodecRawCoderMapping
WangYuanben created HDFS-17016:

Summary: Cleanup method calls to static Assert and Assume methods in TestCodecRawCoderMapping
Key: HDFS-17016
URL: https://issues.apache.org/jira/browse/HDFS-17016
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs
Reporter: WangYuanben
Assignee: WangYuanben
Fix For: 3.4.0

Clean up method calls to static Assert and Assume methods in TestCodecRawCoderMapping.
[jira] [Updated] (HDFS-16977) Forbid assigned characters in pathname.
[ https://issues.apache.org/jira/browse/HDFS-16977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-16977: --- Attachment: HDFS-16977__Forbid_assigned_characters_in_pathname_.patch > Forbid assigned characters in pathname. > --- > > Key: HDFS-16977 > URL: https://issues.apache.org/jira/browse/HDFS-16977 > Project: Hadoop HDFS > Issue Type: New Feature > Components: dfsclient, namenode >Affects Versions: 3.3.4 >Reporter: WangYuanben >Priority: Minor > Labels: pull-request-available > Attachments: HDFS-16977__Forbid_assigned_characters_in_pathname_.patch > > > Some pathnames which contains special character(s) may lead to unexpected > results. For example, there is a file named "/foo/file*" in my cluster, > created by "DistributedFileSystem.create(new Path("/foo/file*"))". When I > want to remove it, I type in "hadoop fs -rm /foo/file*" in shell. However, I > remove all the files with the prefix of "/foo/file*" unexpectedly. There are > also some other characters just like '*', such as ' ', '|', '&', etc. > > Therefore, it's necessary to restrict the occurrence of these characters in > pathname. A simple but effective way is to forbid assigned characters in > pathname when new file or directory is created. > > It is also important to add the same function on the Router model and WebHdfs > model. I will add them as two subtasks later.
[jira] [Resolved] (HDFS-16977) Forbid assigned characters in pathname.
[ https://issues.apache.org/jira/browse/HDFS-16977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben resolved HDFS-16977. Resolution: Works for Me
[jira] [Updated] (HDFS-16977) Forbid assigned characters in pathname.
[ https://issues.apache.org/jira/browse/HDFS-16977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-16977: --- Summary: Forbid assigned characters in pathname. (was: Forbid assigned characters in pathname when new file or directory is created.)
[jira] [Updated] (HDFS-16977) Forbid assigned characters in pathname when new file or directory is created.
[ https://issues.apache.org/jira/browse/HDFS-16977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-16977: --- Description: Some pathnames which contains special character(s) may lead to unexpected results. For example, there is a file named "/foo/file*" in my cluster, created by "DistributedFileSystem.create(new Path("/foo/file*"))". When I want to remove it, I type in "hadoop fs -rm /foo/file*" in shell. However, I remove all the files with the prefix of "/foo/file*" unexpectedly. There are also some other characters just like '*', such as ' ', '|', '&', etc. Therefore, it's necessary to restrict the occurrence of these characters in pathname. A simple but effective way is to forbid assigned characters in pathname when new file or directory is created. It is also important to add the same function on the Router model and WebHdfs model. I will add them as two subtasks later. was: Some pathnames which contains special character(s) may lead to unexpected results. For example, there is a file named "/foo/file*" in my cluster, created by "DistributedFileSystem.create(new Path("/foo/file*"))". When I want to remove it, I type in "hadoop fs -rm /foo/file*" in shell. However, I remove all the files with the prefix of "/foo/file*" unexpectedly. There are also some other characters just like '*', such as ' ', '|', '&', etc. Therefore, it's necessary to restrict the occurrence of these characters in pathname. A simple but effective way is to forbid assigned characters in pathname when new file or directory is created. It is also important to add the same function on the Router model and WebHdfs model. I will add them as two subtasks later.
[jira] [Updated] (HDFS-16977) Forbid assigned characters in pathname when new file or directory is created.
[ https://issues.apache.org/jira/browse/HDFS-16977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangYuanben updated HDFS-16977: --- Description: Some pathnames which contains special character(s) may lead to unexpected results. For example, there is a file named "/foo/file*" in my cluster, created by "DistributedFileSystem.create(new Path("/foo/file*"))". When I want to remove it, I type in "hadoop fs -rm /foo/file*" in shell. However, I remove all the files with the prefix of "/foo/file*" unexpectedly. There are also some other characters just like '*', such as ' ', '|', '&', etc. Therefore, it's necessary to restrict the occurrence of these characters in pathname. A simple but effective way is to forbid assigned characters in pathname when new file or directory is created. It is also important to add the same function on the Router model and WebHdfs model. I will add them as two subtasks later. was: Some pathnames which contains special character(s) may lead to unexpected results. For example, there is a file named "/foo/file*" in my cluster, created by "DistributedFileSystem.create(new Path("/foo/file*"))". When I want to remove it, I type in "hadoop fs -rm /foo/file*" in shell. However, I remove all the files with the prefix of "/foo/file*" unexpectedly. There are also some other characters just like '*', such as ' ', '|', '&', etc. Therefore, it's necessary to restrict the occurrence of these characters in pathname. A simple but effective way is to forbid assigned characters in pathname when new file or directory is created.
[jira] [Created] (HDFS-16977) Forbid assigned characters in pathname when new file or directory is created.
WangYuanben created HDFS-16977:

Summary: Forbid assigned characters in pathname when new file or directory is created.
Key: HDFS-16977
URL: https://issues.apache.org/jira/browse/HDFS-16977
Project: Hadoop HDFS
Issue Type: New Feature
Components: dfsclient, namenode
Affects Versions: 3.3.4
Reporter: WangYuanben

Pathnames that contain special characters may lead to unexpected results. For example, my cluster has a file named "/foo/file*", created by "DistributedFileSystem.create(new Path("/foo/file*"))". To remove it, I typed "hadoop fs -rm /foo/file*" in the shell. However, this unexpectedly removed every file whose path starts with "/foo/file", because '*' is interpreted as a glob wildcard. Other characters, such as ' ', '|' and '&', cause similar problems.

Therefore, it's necessary to restrict these characters in pathnames. A simple but effective way is to reject a specified set of characters in the pathname when a new file or directory is created.
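The check proposed in the description above might look roughly like the sketch below. It is illustrative only and is not taken from the attached patch: `PathnameCheck` is a hypothetical class, and the deny-list simply uses the four characters the issue names as examples ('*', ' ', '|', '&'). A real implementation would presumably make the set configurable.

```java
import java.util.Set;

/** Illustrative sketch only; PathnameCheck is a hypothetical name, not an HDFS class. */
final class PathnameCheck {
    // Assumed deny-list: the example characters named in the issue description.
    private static final Set<Character> FORBIDDEN = Set.of('*', ' ', '|', '&');

    private PathnameCheck() {}

    /** Returns the index of the first forbidden character, or -1 if there is none. */
    static int firstForbiddenIndex(String path) {
        for (int i = 0; i < path.length(); i++) {
            if (FORBIDDEN.contains(path.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    /** Rejects the path at creation time if it contains a forbidden character. */
    static void validate(String path) {
        int i = firstForbiddenIndex(path);
        if (i >= 0) {
            throw new IllegalArgumentException(
                "Forbidden character '" + path.charAt(i)
                    + "' at index " + i + " in path: " + path);
        }
    }
}
```

With such a check on the create and mkdir paths, `validate("/foo/file*")` would fail before the file ever exists, so the later `hadoop fs -rm /foo/file*` ambiguity could not arise for newly created files. Files that already exist with such names would be unaffected.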
[jira] [Created] (HDFS-16965) Add switch to decide whether to enable native codec.
WangYuanben created HDFS-16965:

Summary: Add switch to decide whether to enable native codec.
Key: HDFS-16965
URL: https://issues.apache.org/jira/browse/HDFS-16965
Project: Hadoop HDFS
Issue Type: New Feature
Components: erasure-coding
Affects Versions: 3.3.4
Reporter: WangYuanben

Sometimes we need to create a codec without ISA-L, but the native codec is given priority by default. It is therefore necessary to add a switch that decides whether the native codec is enabled.
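One way to picture the proposed switch is as a filter over an ordered coder preference list. The sketch below is an assumption, not the change made for this issue: `RawCoderSelection` and the `nativeEnabled` flag are hypothetical, and the convention that native coder names end in `_native` is assumed purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; RawCoderSelection is a hypothetical name. */
final class RawCoderSelection {
    private RawCoderSelection() {}

    /**
     * Returns the coders to try, in preference order.
     * When nativeEnabled is false, names ending in "_native" are skipped,
     * so a pure-Java coder is chosen even if the native library is installed.
     */
    static List<String> candidates(List<String> preferred, boolean nativeEnabled) {
        List<String> result = new ArrayList<>();
        for (String name : preferred) {
            if (nativeEnabled || !name.endsWith("_native")) {
                result.add(name);
            }
        }
        return result;
    }
}
```

For a preference list of `rs_native` followed by `rs_java`, disabling the switch leaves only `rs_java`, while enabling it keeps the default native-first order.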