[jira] [Updated] (HDFS-9657) Schedule EC tasks at proper time to reduce the impact of recovery traffic
[ https://issues.apache.org/jira/browse/HDFS-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-9657:
------------------------------
    Issue Type: Improvement  (was: Bug)

> Schedule EC tasks at proper time to reduce the impact of recovery traffic
> -------------------------------------------------------------------------
>
>                 Key: HDFS-9657
>                 URL: https://issues.apache.org/jira/browse/HDFS-9657
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: erasure-coding
>            Reporter: Li Bo
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-9657-001.patch, HDFS-9657-002.patch
>
>
> EC recovery tasks consume a lot of network bandwidth and disk I/O.
> Recovering a corrupt block requires transferring 6 blocks, which creates a
> 6x overhead in network bandwidth and disk I/O. When a datanode fails,
> recovering all of the blocks on that datanode may use up the network
> bandwidth. We need to start recovery tasks at a suitable time so that they
> have less impact on the system.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
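To make the 6x figure in the issue description concrete, here is a minimal back-of-the-envelope sketch. The class and method names are hypothetical and do not come from HDFS or from the attached patches; the block count and block size in the usage note are illustrative assumptions, not measurements.

```java
/**
 * Hypothetical helper (not part of HDFS): estimates how many bytes must be
 * read from surviving datanodes to rebuild lost EC blocks, using the
 * "6 source blocks per recovered block" figure from the issue description.
 */
final class RecoveryTrafficSketch {

    private RecoveryTrafficSketch() {
    }

    /**
     * @param blocksLost              number of blocks to rebuild
     * @param blockSizeBytes          size of one block in bytes (assumed uniform)
     * @param sourceBlocksPerRecovery blocks read to rebuild one block (6 in the issue)
     * @return total bytes read across the cluster; throws ArithmeticException on overflow
     */
    static long recoveryReadBytes(long blocksLost, long blockSizeBytes,
                                  int sourceBlocksPerRecovery) {
        long lostBytes = Math.multiplyExact(blocksLost, blockSizeBytes);
        return Math.multiplyExact(lostBytes, (long) sourceBlocksPerRecovery);
    }
}
```

For example, `RecoveryTrafficSketch.recoveryReadBytes(10_000, 128L * 1024 * 1024, 6)` models a failed datanode that held 10,000 blocks of 128 MiB each (both numbers are assumptions for illustration): the cluster would read six times the lost data to rebuild it.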
[jira] [Updated] (HDFS-9657) Schedule EC tasks at proper time to reduce the impact of recovery traffic
[ https://issues.apache.org/jira/browse/HDFS-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-9657:
------------------------------
    Component/s: erasure-coding

> Schedule EC tasks at proper time to reduce the impact of recovery traffic
> -------------------------------------------------------------------------
>
>                 Key: HDFS-9657
>                 URL: https://issues.apache.org/jira/browse/HDFS-9657
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: erasure-coding
>            Reporter: Li Bo
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-9657-001.patch, HDFS-9657-002.patch
[jira] [Updated] (HDFS-9657) Schedule EC tasks at proper time to reduce the impact of recovery traffic
[ https://issues.apache.org/jira/browse/HDFS-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-9657:
------------------------------
    Issue Type: Bug  (was: Sub-task)
        Parent:     (was: HDFS-8031)

> Schedule EC tasks at proper time to reduce the impact of recovery traffic
> -------------------------------------------------------------------------
>
>                 Key: HDFS-9657
>                 URL: https://issues.apache.org/jira/browse/HDFS-9657
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Li Bo
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-9657-001.patch, HDFS-9657-002.patch
[jira] [Updated] (HDFS-9657) Schedule EC tasks at proper time to reduce the impact of recovery traffic
[ https://issues.apache.org/jira/browse/HDFS-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-9657:
------------------------------
    Status: Open  (was: Patch Available)

> Schedule EC tasks at proper time to reduce the impact of recovery traffic
> -------------------------------------------------------------------------
>
>                 Key: HDFS-9657
>                 URL: https://issues.apache.org/jira/browse/HDFS-9657
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-9657-001.patch, HDFS-9657-002.patch
[jira] [Updated] (HDFS-9657) Schedule EC tasks at proper time to reduce the impact of recovery traffic
[ https://issues.apache.org/jira/browse/HDFS-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-9657:
------------------------------
    Labels: hdfs-ec-3.0-nice-to-have  (was: )

> Schedule EC tasks at proper time to reduce the impact of recovery traffic
> -------------------------------------------------------------------------
>
>                 Key: HDFS-9657
>                 URL: https://issues.apache.org/jira/browse/HDFS-9657
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Li Bo
>              Labels: hdfs-ec-3.0-nice-to-have
>         Attachments: HDFS-9657-001.patch, HDFS-9657-002.patch
[jira] [Updated] (HDFS-9657) Schedule EC tasks at proper time to reduce the impact of recovery traffic
[ https://issues.apache.org/jira/browse/HDFS-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Bo updated HDFS-9657:
------------------------
    Status: Patch Available  (was: Open)

> Schedule EC tasks at proper time to reduce the impact of recovery traffic
> -------------------------------------------------------------------------
>
>                 Key: HDFS-9657
>                 URL: https://issues.apache.org/jira/browse/HDFS-9657
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Li Bo
>         Attachments: HDFS-9657-001.patch, HDFS-9657-002.patch
[jira] [Updated] (HDFS-9657) Schedule EC tasks at proper time to reduce the impact of recovery traffic
[ https://issues.apache.org/jira/browse/HDFS-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Bo updated HDFS-9657:
------------------------
    Attachment: HDFS-9657-002.patch

The patch currently implements only one policy, ECRecoveryPolicyTimeSegment, which I think covers most situations. I will add more policies if needed.

> Schedule EC tasks at proper time to reduce the impact of recovery traffic
> -------------------------------------------------------------------------
>
>                 Key: HDFS-9657
>                 URL: https://issues.apache.org/jira/browse/HDFS-9657
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Li Bo
>         Attachments: HDFS-9657-001.patch, HDFS-9657-002.patch
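The message above names a time-segment policy but does not show it. The following is only a guess at what such a policy might look like: a minimal sketch that allows recovery to start inside a configured daily window. It is not the code from HDFS-9657-002.patch; the class name, method name, and window semantics are all assumptions made for illustration.

```java
import java.time.LocalTime;

/**
 * Hypothetical sketch of a time-window recovery policy. This is NOT the
 * ECRecoveryPolicyTimeSegment class from the patch; names and behaviour
 * here are assumptions.
 */
final class TimeSegmentRecoveryPolicySketch {
    private final LocalTime start; // window start, inclusive
    private final LocalTime end;   // window end, exclusive

    TimeSegmentRecoveryPolicySketch(LocalTime start, LocalTime end) {
        this.start = start;
        this.end = end;
    }

    /** Returns true if a recovery task may start at the given time of day. */
    boolean mayStartRecovery(LocalTime now) {
        if (start.equals(end)) {
            // Assumption: an empty window means "no restriction".
            return true;
        }
        if (start.isBefore(end)) {
            // Window within one day, e.g. 01:00 to 05:00.
            return !now.isBefore(start) && now.isBefore(end);
        }
        // Window wraps past midnight, e.g. 22:00 to 06:00.
        return !now.isBefore(start) || now.isBefore(end);
    }
}
```

With a window of 22:00 to 06:00, `mayStartRecovery(LocalTime.of(23, 30))` would allow a task to start, while `mayStartRecovery(LocalTime.of(12, 0))` would defer it. A fixed window is the simplest option; a policy driven by measured network load would need cluster metrics that this sketch does not model.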
[jira] [Updated] (HDFS-9657) Schedule EC tasks at proper time to reduce the impact of recovery traffic
[ https://issues.apache.org/jira/browse/HDFS-9657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Bo updated HDFS-9657:
------------------------
    Attachment: HDFS-9657-001.patch

> Schedule EC tasks at proper time to reduce the impact of recovery traffic
> -------------------------------------------------------------------------
>
>                 Key: HDFS-9657
>                 URL: https://issues.apache.org/jira/browse/HDFS-9657
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Li Bo
>         Attachments: HDFS-9657-001.patch