[jira] [Updated] (HDFS-14997) BPServiceActor process command from NameNode asynchronously
[ https://issues.apache.org/jira/browse/HDFS-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoqiao He updated HDFS-14997: --- Attachment: (was: HDFS-14997.005.patch) > BPServiceActor process command from NameNode asynchronously > --- > > Key: HDFS-14997 > URL: https://issues.apache.org/jira/browse/HDFS-14997 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Xiaoqiao He >Assignee: Xiaoqiao He >Priority: Major > Attachments: HDFS-14997.001.patch, HDFS-14997.002.patch, > HDFS-14997.003.patch, HDFS-14997.004.patch, HDFS-14997.005.patch > > > There are two core functions, report(#sendHeartbeat, #blockReport, > #cacheReport) and #processCommand in #BPServiceActor main process flow. If > processCommand cost long time it will block send report flow. Meanwhile > processCommand could cost long time(over 1000s the worst case I meet) when IO > load of DataNode is very high. Since some IO operations are under > #datasetLock, So it has to wait to acquire #datasetLock long time when > process some of commands(such as #DNA_INVALIDATE). In such case, #heartbeat > will not send to NameNode in-time, and trigger other disasters. > I propose to improve #processCommand asynchronously and not block > #BPServiceActor to send heartbeat back to NameNode when meet high IO load. > Notes: > 1. Lifeline could be one effective solution, however some old branches are > not support this feature. > 2. IO operations under #datasetLock is another issue, I think we should solve > it at another JIRA. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14997) BPServiceActor process command from NameNode asynchronously
[ https://issues.apache.org/jira/browse/HDFS-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoqiao He updated HDFS-14997: --- Attachment: HDFS-14997.005.patch > BPServiceActor process command from NameNode asynchronously > --- > > Key: HDFS-14997 > URL: https://issues.apache.org/jira/browse/HDFS-14997 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Xiaoqiao He >Assignee: Xiaoqiao He >Priority: Major > Attachments: HDFS-14997.001.patch, HDFS-14997.002.patch, > HDFS-14997.003.patch, HDFS-14997.004.patch, HDFS-14997.005.patch > > > There are two core functions, report(#sendHeartbeat, #blockReport, > #cacheReport) and #processCommand in #BPServiceActor main process flow. If > processCommand cost long time it will block send report flow. Meanwhile > processCommand could cost long time(over 1000s the worst case I meet) when IO > load of DataNode is very high. Since some IO operations are under > #datasetLock, So it has to wait to acquire #datasetLock long time when > process some of commands(such as #DNA_INVALIDATE). In such case, #heartbeat > will not send to NameNode in-time, and trigger other disasters. > I propose to improve #processCommand asynchronously and not block > #BPServiceActor to send heartbeat back to NameNode when meet high IO load. > Notes: > 1. Lifeline could be one effective solution, however some old branches are > not support this feature. > 2. IO operations under #datasetLock is another issue, I think we should solve > it at another JIRA. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14997) BPServiceActor process command from NameNode asynchronously
[ https://issues.apache.org/jira/browse/HDFS-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoqiao He updated HDFS-14997: --- Attachment: HDFS-14997.005.patch > BPServiceActor process command from NameNode asynchronously > --- > > Key: HDFS-14997 > URL: https://issues.apache.org/jira/browse/HDFS-14997 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Xiaoqiao He >Assignee: Xiaoqiao He >Priority: Major > Attachments: HDFS-14997.001.patch, HDFS-14997.002.patch, > HDFS-14997.003.patch, HDFS-14997.004.patch, HDFS-14997.005.patch > > > There are two core functions, report(#sendHeartbeat, #blockReport, > #cacheReport) and #processCommand in #BPServiceActor main process flow. If > processCommand cost long time it will block send report flow. Meanwhile > processCommand could cost long time(over 1000s the worst case I meet) when IO > load of DataNode is very high. Since some IO operations are under > #datasetLock, So it has to wait to acquire #datasetLock long time when > process some of commands(such as #DNA_INVALIDATE). In such case, #heartbeat > will not send to NameNode in-time, and trigger other disasters. > I propose to improve #processCommand asynchronously and not block > #BPServiceActor to send heartbeat back to NameNode when meet high IO load. > Notes: > 1. Lifeline could be one effective solution, however some old branches are > not support this feature. > 2. IO operations under #datasetLock is another issue, I think we should solve > it at another JIRA. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14997) BPServiceActor process command from NameNode asynchronously
[ https://issues.apache.org/jira/browse/HDFS-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoqiao He updated HDFS-14997: --- Attachment: HDFS-14997.004.patch > BPServiceActor process command from NameNode asynchronously > --- > > Key: HDFS-14997 > URL: https://issues.apache.org/jira/browse/HDFS-14997 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Xiaoqiao He >Assignee: Xiaoqiao He >Priority: Major > Attachments: HDFS-14997.001.patch, HDFS-14997.002.patch, > HDFS-14997.003.patch, HDFS-14997.004.patch > > > There are two core functions, report(#sendHeartbeat, #blockReport, > #cacheReport) and #processCommand in #BPServiceActor main process flow. If > processCommand cost long time it will block send report flow. Meanwhile > processCommand could cost long time(over 1000s the worst case I meet) when IO > load of DataNode is very high. Since some IO operations are under > #datasetLock, So it has to wait to acquire #datasetLock long time when > process some of commands(such as #DNA_INVALIDATE). In such case, #heartbeat > will not send to NameNode in-time, and trigger other disasters. > I propose to improve #processCommand asynchronously and not block > #BPServiceActor to send heartbeat back to NameNode when meet high IO load. > Notes: > 1. Lifeline could be one effective solution, however some old branches are > not support this feature. > 2. IO operations under #datasetLock is another issue, I think we should solve > it at another JIRA. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14997) BPServiceActor process command from NameNode asynchronously
[ https://issues.apache.org/jira/browse/HDFS-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoqiao He updated HDFS-14997: --- Attachment: HDFS-14997.003.patch > BPServiceActor process command from NameNode asynchronously > --- > > Key: HDFS-14997 > URL: https://issues.apache.org/jira/browse/HDFS-14997 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Xiaoqiao He >Assignee: Xiaoqiao He >Priority: Major > Attachments: HDFS-14997.001.patch, HDFS-14997.002.patch, > HDFS-14997.003.patch > > > There are two core functions, report(#sendHeartbeat, #blockReport, > #cacheReport) and #processCommand in #BPServiceActor main process flow. If > processCommand cost long time it will block send report flow. Meanwhile > processCommand could cost long time(over 1000s the worst case I meet) when IO > load of DataNode is very high. Since some IO operations are under > #datasetLock, So it has to wait to acquire #datasetLock long time when > process some of commands(such as #DNA_INVALIDATE). In such case, #heartbeat > will not send to NameNode in-time, and trigger other disasters. > I propose to improve #processCommand asynchronously and not block > #BPServiceActor to send heartbeat back to NameNode when meet high IO load. > Notes: > 1. Lifeline could be one effective solution, however some old branches are > not support this feature. > 2. IO operations under #datasetLock is another issue, I think we should solve > it at another JIRA. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14997) BPServiceActor process command from NameNode asynchronously
[ https://issues.apache.org/jira/browse/HDFS-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoqiao He updated HDFS-14997: --- Attachment: HDFS-14997.002.patch > BPServiceActor process command from NameNode asynchronously > --- > > Key: HDFS-14997 > URL: https://issues.apache.org/jira/browse/HDFS-14997 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Xiaoqiao He >Assignee: Xiaoqiao He >Priority: Major > Attachments: HDFS-14997.001.patch, HDFS-14997.002.patch > > > There are two core functions, report(#sendHeartbeat, #blockReport, > #cacheReport) and #processCommand in #BPServiceActor main process flow. If > processCommand cost long time it will block send report flow. Meanwhile > processCommand could cost long time(over 1000s the worst case I meet) when IO > load of DataNode is very high. Since some IO operations are under > #datasetLock, So it has to wait to acquire #datasetLock long time when > process some of commands(such as #DNA_INVALIDATE). In such case, #heartbeat > will not send to NameNode in-time, and trigger other disasters. > I propose to improve #processCommand asynchronously and not block > #BPServiceActor to send heartbeat back to NameNode when meet high IO load. > Notes: > 1. Lifeline could be one effective solution, however some old branches are > not support this feature. > 2. IO operations under #datasetLock is another issue, I think we should solve > it at another JIRA. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14997) BPServiceActor process command from NameNode asynchronously
[ https://issues.apache.org/jira/browse/HDFS-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoqiao He updated HDFS-14997: --- Attachment: HDFS-14997.001.patch > BPServiceActor process command from NameNode asynchronously > --- > > Key: HDFS-14997 > URL: https://issues.apache.org/jira/browse/HDFS-14997 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Xiaoqiao He >Assignee: Xiaoqiao He >Priority: Major > Attachments: HDFS-14997.001.patch > > > There are two core functions, report(#sendHeartbeat, #blockReport, > #cacheReport) and #processCommand in #BPServiceActor main process flow. If > processCommand cost long time it will block send report flow. Meanwhile > processCommand could cost long time(over 1000s the worst case I meet) when IO > load of DataNode is very high. Since some IO operations are under > #datasetLock, So it has to wait to acquire #datasetLock long time when > process some of commands(such as #DNA_INVALIDATE). In such case, #heartbeat > will not send to NameNode in-time, and trigger other disasters. > I propose to improve #processCommand asynchronously and not block > #BPServiceActor to send heartbeat back to NameNode when meet high IO load. > Notes: > 1. Lifeline could be one effective solution, however some old branches are > not support this feature. > 2. IO operations under #datasetLock is another issue, I think we should solve > it at another JIRA. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14997) BPServiceActor process command from NameNode asynchronously
[ https://issues.apache.org/jira/browse/HDFS-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoqiao He updated HDFS-14997: --- Attachment: (was: HDFS-14997.001.patch) > BPServiceActor process command from NameNode asynchronously > --- > > Key: HDFS-14997 > URL: https://issues.apache.org/jira/browse/HDFS-14997 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Xiaoqiao He >Assignee: Xiaoqiao He >Priority: Major > > There are two core functions, report(#sendHeartbeat, #blockReport, > #cacheReport) and #processCommand in #BPServiceActor main process flow. If > processCommand cost long time it will block send report flow. Meanwhile > processCommand could cost long time(over 1000s the worst case I meet) when IO > load of DataNode is very high. Since some IO operations are under > #datasetLock, So it has to wait to acquire #datasetLock long time when > process some of commands(such as #DNA_INVALIDATE). In such case, #heartbeat > will not send to NameNode in-time, and trigger other disasters. > I propose to improve #processCommand asynchronously and not block > #BPServiceActor to send heartbeat back to NameNode when meet high IO load. > Notes: > 1. Lifeline could be one effective solution, however some old branches are > not support this feature. > 2. IO operations under #datasetLock is another issue, I think we should solve > it at another JIRA. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14997) BPServiceActor process command from NameNode asynchronously
[ https://issues.apache.org/jira/browse/HDFS-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoqiao He updated HDFS-14997: --- Attachment: HDFS-14997.001.patch Assignee: Xiaoqiao He (was: Aiphago) Status: Patch Available (was: Open) submit demo patch and change to process commands asynchronously. > BPServiceActor process command from NameNode asynchronously > --- > > Key: HDFS-14997 > URL: https://issues.apache.org/jira/browse/HDFS-14997 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Xiaoqiao He >Assignee: Xiaoqiao He >Priority: Major > Attachments: HDFS-14997.001.patch > > > There are two core functions, report(#sendHeartbeat, #blockReport, > #cacheReport) and #processCommand in #BPServiceActor main process flow. If > processCommand cost long time it will block send report flow. Meanwhile > processCommand could cost long time(over 1000s the worst case I meet) when IO > load of DataNode is very high. Since some IO operations are under > #datasetLock, So it has to wait to acquire #datasetLock long time when > process some of commands(such as #DNA_INVALIDATE). In such case, #heartbeat > will not send to NameNode in-time, and trigger other disasters. > I propose to improve #processCommand asynchronously and not block > #BPServiceActor to send heartbeat back to NameNode when meet high IO load. > Notes: > 1. Lifeline could be one effective solution, however some old branches are > not support this feature. > 2. IO operations under #datasetLock is another issue, I think we should solve > it at another JIRA. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org