[jira] [Commented] (HDFS-10710) In BlockManager#rescanPostponedMisreplicatedBlocks(), postponed misreplicated block counts should be retrieved with NN lock protection
[ https://issues.apache.org/jira/browse/HDFS-10710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413118#comment-15413118 ]

GAO Rui commented on HDFS-10710:
--------------------------------

Thanks for your comment and commit, [~jingzhao]!

> In BlockManager#rescanPostponedMisreplicatedBlocks(), postponed misreplicated
> block counts should be retrieved with NN lock protection
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-10710
>                 URL: https://issues.apache.org/jira/browse/HDFS-10710
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: GAO Rui
>            Assignee: GAO Rui
>             Fix For: 2.8.0
>
>         Attachments: HDFS-10710.1.patch
>
>
> In BlockManager#rescanPostponedMisreplicatedBlocks(), the start and end block
> counts should be retrieved under the protection of the NN lock. Otherwise,
> log records like "-1 blocks are removed", which indicate a negative number of
> removed blocks, could be generated.
> For example, consider the following scenario:
> 1. thread1 runs {{long startPostponedMisReplicatedBlocksCount = getPostponedMisreplicatedBlocksCount();}};
>    startPostponedMisReplicatedBlocksCount gets the value 20.
> 2. Before thread1 runs {{namesystem.writeLock();}}, thread2 increments
>    postponedMisreplicatedBlocksCount by 1, so postponedMisreplicatedBlocksCount is now 21.
> 3. thread1 finishes the iteration, but no postponed block is removed, so after running
>    {{long endPostponedMisReplicatedBlocksCount = getPostponedMisreplicatedBlocksCount();}},
>    endPostponedMisReplicatedBlocksCount gets the value 21.
> 4. thread1 generates the log:
> {noformat}
> LOG.info("Rescan of postponedMisreplicatedBlocks completed in " +
>     (Time.monotonicNow() - startTimeRescanPostponedMisReplicatedBlocks) +
>     " msecs. " + endPostponedMisReplicatedBlocksCount +
>     " blocks are left. " + (startPostponedMisReplicatedBlocksCount -
>     endPostponedMisReplicatedBlocksCount) + " blocks are removed.");
> {noformat}
> Then we get a log record like "-1 blocks are removed."
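The fix implied by the scenario above can be sketched with a minimal, self-contained model (illustrative class and method names, not the actual BlockManager code): both counter reads happen while holding the same lock that writers must take, so the computed "removed" delta can no longer go negative, even with a concurrent incrementer.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
 * Minimal model of the race described above. Both the start and end counter
 * reads happen inside the write lock, so a concurrent increment can never
 * make (start - end) negative. Names are illustrative, not Hadoop's.
 */
public class PostponedCountModel {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long postponedCount = 20;

    /** Writers (e.g. block report processing) bump the counter under the lock. */
    void incrementPostponed() {
        lock.writeLock().lock();
        try {
            postponedCount++;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Rescan: take the lock FIRST, then read both the start and end counts. */
    long rescanAndGetRemoved() {
        lock.writeLock().lock();
        try {
            long start = postponedCount;   // read under the lock
            // ... iterate postponed blocks; none is removed in this model ...
            long end = postponedCount;     // read under the same lock
            return start - end;            // cannot be negative now
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        PostponedCountModel m = new PostponedCountModel();
        // A concurrent writer, racing with the rescan as in step 2 above.
        Thread writer = new Thread(() -> m.incrementPostponed());
        writer.start();
        long removed = m.rescanAndGetRemoved();
        writer.join();
        System.out.println(removed + " blocks are removed.");
    }
}
```

Because no block is removed inside the locked rescan, start and end are read from the same consistent snapshot and the delta is always 0 here, never -1.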
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10710) In BlockManager#rescanPostponedMisreplicatedBlocks(), start and end block counts should be retrieved under lock protection
[ https://issues.apache.org/jira/browse/HDFS-10710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

GAO Rui updated HDFS-10710:
---------------------------
    Status: Patch Available (was: Open)
[jira] [Updated] (HDFS-10710) In BlockManager#rescanPostponedMisreplicatedBlocks(), start and end block counts should be retrieved under lock protection
[ https://issues.apache.org/jira/browse/HDFS-10710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

GAO Rui updated HDFS-10710:
---------------------------
    Attachment: HDFS-10710.1.patch
[jira] [Updated] (HDFS-10710) In BlockManager#rescanPostponedMisreplicatedBlocks(), start and end block counts should be retrieved under lock protection
[ https://issues.apache.org/jira/browse/HDFS-10710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

GAO Rui updated HDFS-10710:
---------------------------
    Description: (edited; full text as in the issue description quoted above)
[jira] [Updated] (HDFS-10710) In BlockManager#rescanPostponedMisreplicatedBlocks(), start and end block counts should be retrieved under lock protection
[ https://issues.apache.org/jira/browse/HDFS-10710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

GAO Rui updated HDFS-10710:
---------------------------
    Description: (edited; full text as in the issue description quoted above)
[jira] [Created] (HDFS-10710) In BlockManager#rescanPostponedMisreplicatedBlocks(), start and end block counts should be retrieved under lock protection
GAO Rui created HDFS-10710:
-------------------------------
             Summary: In BlockManager#rescanPostponedMisreplicatedBlocks(), start and end block counts should be retrieved under lock protection
                 Key: HDFS-10710
                 URL: https://issues.apache.org/jira/browse/HDFS-10710
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode
            Reporter: GAO Rui
            Assignee: GAO Rui
[jira] [Updated] (HDFS-10530) BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
[ https://issues.apache.org/jira/browse/HDFS-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

GAO Rui updated HDFS-10530:
---------------------------
    Attachment: HDFS-10530.3.patch

Patch 3 attached to address the unit test failures. For {{TestBalancer#doTestBalancerWithStripedFile()}}, it adjusts the rack counts to avoid a conflict between {{Balancer}} and {{BlockPlacementPolicyRackFaultTolerant}} in the tests. For {{TestBlockManager#testPlacementPolicySatisfied()}}, it adds {{conf.setLong(DFSConfigKeys.DFS_NAMENODE_REPLICATION_INTERVAL_KEY, Integer.MAX_VALUE)}} to avoid block reconstruction work while testing placement policy satisfaction.

> BlockManager reconstruction work scheduling should correctly adhere to EC
> block placement policy
> -------------------------------------------------------------------------
>
>                 Key: HDFS-10530
>                 URL: https://issues.apache.org/jira/browse/HDFS-10530
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: GAO Rui
>            Assignee: GAO Rui
>         Attachments: HDFS-10530.1.patch, HDFS-10530.2.patch, HDFS-10530.3.patch
>
>
> This issue was found by [~tfukudom].
> Under the RS-DEFAULT-6-3-64k EC policy:
> 1. Create an EC file; the file is written to all 5 racks (2 DNs each) of the cluster.
> 2. Reconstruction work is scheduled when a 6th rack is added.
> 3. However, adding a 7th or further racks does not trigger reconstruction work.
> Based on the default EC block placement policy defined in
> "BlockPlacementPolicyRackFaultTolerant.java", an EC file should be
> distributed across up to 9 racks if possible.
> In *BlockManager#isPlacementPolicySatisfied(BlockInfo storedBlock)*,
> *numReplicas* of striped blocks should probably be *getRealTotalBlockNum()*
> instead of *getRealDataBlockNum()*.
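The last point of the issue description can be illustrated with a small model (simplified check and assumed numbers, not BlockManager's actual logic): under an RS-6-3 policy, checking placement against only the 6 data blocks declares a 6-rack spread on a 7-rack cluster "satisfied", while checking against all 9 internal blocks (data + parity) flags it as unsatisfied, which is what would allow reconstruction toward the 7th rack.

```java
/**
 * Illustrative model (not BlockManager's actual code) of why the placement
 * check for a striped block should use the total internal block count
 * (data + parity), not just the data block count, under an RS-6-3 policy.
 */
public class EcPlacementCheckSketch {
    static final int DATA_BLOCKS = 6;
    static final int PARITY_BLOCKS = 3;

    /**
     * Placement is considered satisfied when the block group spans as many
     * racks as it has required blocks (or all racks, if the cluster has fewer).
     */
    static boolean isPlacementSatisfied(int racksUsed, int totalRacks,
                                        int requiredBlocks) {
        return racksUsed >= Math.min(requiredBlocks, totalRacks);
    }

    public static void main(String[] args) {
        int racksUsed = 6, totalRacks = 7;
        // Check against data blocks only: min(6, 7) = 6 racks required,
        // 6 >= 6 -> "satisfied", so no reconstruction toward the 7th rack.
        System.out.println(
            isPlacementSatisfied(racksUsed, totalRacks, DATA_BLOCKS));
        // Check against all internal blocks: min(9, 7) = 7 racks required,
        // 6 < 7 -> not satisfied, so reconstruction work would be scheduled.
        System.out.println(
            isPlacementSatisfied(racksUsed, totalRacks,
                                 DATA_BLOCKS + PARITY_BLOCKS));
    }
}
```

This matches the reported symptom: with the data-block count, adding a 7th rack never triggers reconstruction, because 6 racks already satisfy the check.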
[jira] [Updated] (HDFS-10530) BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
[ https://issues.apache.org/jira/browse/HDFS-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

GAO Rui updated HDFS-10530:
---------------------------
    Status: Patch Available (was: In Progress)
[jira] [Updated] (HDFS-10530) BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
[ https://issues.apache.org/jira/browse/HDFS-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

GAO Rui updated HDFS-10530:
---------------------------
    Status: In Progress (was: Patch Available)
[jira] [Commented] (HDFS-10530) BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
[ https://issues.apache.org/jira/browse/HDFS-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15354199#comment-15354199 ]

GAO Rui commented on HDFS-10530:
--------------------------------

Patch 2 has been attached to address the TestBalancer failures. The {{hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer}} failure does not seem related to the block placement changes, and the other tests should pass this time.
[jira] [Updated] (HDFS-10530) BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
[ https://issues.apache.org/jira/browse/HDFS-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

GAO Rui updated HDFS-10530:
---------------------------
    Status: In Progress (was: Patch Available)
[jira] [Updated] (HDFS-10530) BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
[ https://issues.apache.org/jira/browse/HDFS-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

GAO Rui updated HDFS-10530:
---------------------------
    Status: Patch Available (was: In Progress)
[jira] [Updated] (HDFS-10530) BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
[ https://issues.apache.org/jira/browse/HDFS-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

GAO Rui updated HDFS-10530:
---------------------------
    Attachment: HDFS-10530.2.patch
[jira] [Commented] (HDFS-10530) BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
[ https://issues.apache.org/jira/browse/HDFS-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15350895#comment-15350895 ]

GAO Rui commented on HDFS-10530:
--------------------------------

I've investigated the failure of {{hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped}} and found that reconstruction work triggered by the block placement policy interfered with {{Balancer}}'s balancing of DatanodeStorage utilization. The block placement policy and the balancer policy can indeed conflict. In the {{TestBlockTokenWithDFSStriped}} scenario, the last datanode added gets filled up with internal/parity blocks of all block groups according to {{BlockPlacementPolicyRackFaultTolerant}}, which makes that datanode permanently classified as {{over-utilized}} by {{Balancer}}, so {{Balancer}} may never finish its work successfully. I suggest making {{Balancer}} tolerate a certain percentage (say 10%) of datanodes being {{over-utilized}}: after {{Balancer#runOneIteration()}} has run 5 times, if fewer than 10% of datanodes are {{over-utilized}}, let {{Balancer}} finish its work successfully. [~zhz], could you share your opinion? Thank you.
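The tolerance heuristic suggested in the comment above could look roughly like this (a sketch with hypothetical names and thresholds, not Balancer's actual API): after a minimum number of iterations, the balancer is allowed to declare success if at most a fixed fraction of datanodes remains over-utilized, instead of requiring that set to be empty.

```java
/**
 * Sketch of the proposed Balancer tolerance heuristic (illustrative only).
 * Constants mirror the numbers suggested in the comment: 5 iterations,
 * 10% tolerated over-utilized datanodes.
 */
public class BalancerToleranceSketch {
    static final int MIN_ITERATIONS = 5;
    static final double TOLERATED_OVERUTILIZED_FRACTION = 0.10;

    /** Decide whether the balancer may finish its work successfully. */
    static boolean canFinish(int iterationsRun, int overUtilizedNodes,
                             int totalNodes) {
        if (overUtilizedNodes == 0) {
            return true;  // fully balanced: always done
        }
        double fraction = (double) overUtilizedNodes / totalNodes;
        // Tolerate a small residue of over-utilized nodes, but only after
        // the balancer has been given enough iterations to try.
        return iterationsRun >= MIN_ITERATIONS
            && fraction <= TOLERATED_OVERUTILIZED_FRACTION;
    }

    public static void main(String[] args) {
        // 1 over-utilized node out of 20 (5%) after 5 iterations: tolerated.
        System.out.println(canFinish(5, 1, 20));
        // Same cluster after only 3 iterations: keep balancing.
        System.out.println(canFinish(3, 1, 20));
        // 3 of 20 nodes (15%) over-utilized: above the 10% tolerance.
        System.out.println(canFinish(5, 3, 20));
    }
}
```

This would let a test like TestBlockTokenWithDFSStriped terminate even when one datanode is kept permanently over-utilized by the placement policy.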
[jira] [Work started] (HDFS-10530) BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
[ https://issues.apache.org/jira/browse/HDFS-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HDFS-10530 started by GAO Rui.
--------------------------------------
[jira] [Updated] (HDFS-10530) BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
[ https://issues.apache.org/jira/browse/HDFS-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-10530:
---
Status: Patch Available (was: In Progress)
[jira] [Updated] (HDFS-10530) BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
[ https://issues.apache.org/jira/browse/HDFS-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-10530:
---
Attachment: HDFS-10530.1.patch
[~zhz], thanks for your ideas. I've attached a patch with an additional unit test. Please feel free to comment on the patch. As for prioritizing reconstruction work related to the rack placement policy, I'd like to take that on and report back to you in another JIRA :D
[jira] [Commented] (HDFS-10530) BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
[ https://issues.apache.org/jira/browse/HDFS-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331473#comment-15331473 ] GAO Rui commented on HDFS-10530:
---
Any comments are welcome. Let's discuss and try to reach an agreement on this issue.
[jira] [Created] (HDFS-10531) Add EC policy and storage policy related usage summarization function to dfs du command
GAO Rui created HDFS-10531:
--
Summary: Add EC policy and storage policy related usage summarization function to dfs du command
Key: HDFS-10531
URL: https://issues.apache.org/jira/browse/HDFS-10531
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: GAO Rui

Currently the du command outputs:
{code}
[ ~]$ hdfs dfs -du -h /home/rgao/
0 /home/rgao/.Trash
0 /home/rgao/.staging
100 M /home/rgao/ds
250 M /home/rgao/ds-2
200 M /home/rgao/noECBackup-ds
500 M /home/rgao/noECBackup-ds-2
{code}
For HDFS users and administrators, EC policy and storage policy related usage summarization would be very helpful when managing cluster storage. The envisioned du output could look like the following.
{code}
[ ~]$ hdfs dfs -du -h -t (total, parameter to be added) /home/rgao
0 /home/rgao/.Trash
0 /home/rgao/.staging
[Archive] [EC:RS-DEFAULT-6-3-64k] 100 M /home/rgao/ds
[DISK] [EC:RS-DEFAULT-6-3-64k] 250 M /home/rgao/ds-2
[DISK] [Replica] 200 M /home/rgao/noECBackup-ds
[DISK] [Replica] 500 M /home/rgao/noECBackup-ds-2
Total:
[Archive][EC:RS-DEFAULT-6-3-64k] 100 M
[Archive][Replica] 0 M
[DISK] [EC:RS-DEFAULT-6-3-64k] 250 M
[DISK] [Replica] 700 M
[Archive][ALL] 100 M
[DISK][ALL] 950 M
[ALL] [EC:RS-DEFAULT-6-3-64k] 350 M
[ALL] [Replica] 700 M
{code}
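The aggregation behind the proposed {{-t}} flag can be sketched as follows. This is a hypothetical model, not FsShell code: it sums usage per (storage policy, redundancy form) pair and adds the per-policy rollups, using the sizes from the example listing above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the "-t" (total) aggregation for "hdfs dfs -du":
// nothing here calls HDFS; policy names and sizes mirror the example above.
public class DuTotalsSketch {
    // entries: {storagePolicy, redundancyForm, sizeInMB}
    static Map<String, Long> totals(String[][] entries) {
        Map<String, Long> out = new LinkedHashMap<>();
        for (String[] e : entries) {
            long mb = Long.parseLong(e[2]);
            out.merge("[" + e[0] + "][" + e[1] + "]", mb, Long::sum); // exact pair
            out.merge("[" + e[0] + "][ALL]", mb, Long::sum);          // per storage policy
            out.merge("[ALL][" + e[1] + "]", mb, Long::sum);          // per redundancy form
        }
        return out;
    }
    public static void main(String[] args) {
        String[][] listing = {
            {"Archive", "EC:RS-DEFAULT-6-3-64k", "100"},
            {"DISK",    "EC:RS-DEFAULT-6-3-64k", "250"},
            {"DISK",    "Replica",               "200"},
            {"DISK",    "Replica",               "500"},
        };
        totals(listing).forEach((k, v) -> System.out.println(k + " " + v + " M"));
    }
}
```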
[jira] [Updated] (HDFS-10530) BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
[ https://issues.apache.org/jira/browse/HDFS-10530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-10530:
---
Description:
This issue was found by [~tfukudom].
Under the RS-DEFAULT-6-3-64k EC policy:
1. Create an EC file; the file was written to all 5 racks (2 DNs each) of the cluster.
2. Reconstruction work would be scheduled if a 6th rack is added.
3. However, adding a 7th or further racks does not trigger reconstruction work.
Based on the default EC block placement policy defined in "BlockPlacementPolicyRackFaultTolerant.java", an EC file should be distributed across 9 racks if possible.
In *BlockManager#isPlacementPolicySatisfied(BlockInfo storedBlock)*, *numReplicas* of striped blocks should arguably be *getRealTotalBlockNum()* instead of *getRealDataBlockNum()*.
was:
Under the RS-DEFAULT-6-3-64k EC policy, this issue was found by [~tfukudom].
1. Create an EC file; the file was written to all 5 racks (2 DNs each) of the cluster.
2. Reconstruction work would be scheduled if a 6th rack is added.
3. However, adding a 7th or further racks does not trigger reconstruction work.
Based on the default EC block placement policy defined in "BlockPlacementPolicyRackFaultTolerant.java", an EC file should be distributed across 9 racks if possible.
In *BlockManager#isPlacementPolicySatisfied(BlockInfo storedBlock)*, *numReplicas* of striped blocks should arguably be *getRealTotalBlockNum()* instead of *getRealDataBlockNum()*.
[jira] [Created] (HDFS-10530) BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
GAO Rui created HDFS-10530:
--
Summary: BlockManager reconstruction work scheduling should correctly adhere to EC block placement policy
Key: HDFS-10530
URL: https://issues.apache.org/jira/browse/HDFS-10530
Project: Hadoop HDFS
Issue Type: Sub-task
Components: namenode
Reporter: GAO Rui
Assignee: GAO Rui

Under the RS-DEFAULT-6-3-64k EC policy, this issue was found by [~tfukudom].
1. Create an EC file; the file was written to all 5 racks (2 DNs each) of the cluster.
2. Reconstruction work would be scheduled if a 6th rack is added.
3. However, adding a 7th or further racks does not trigger reconstruction work.
Based on the default EC block placement policy defined in "BlockPlacementPolicyRackFaultTolerant.java", an EC file should be distributed across 9 racks if possible.
In *BlockManager#isPlacementPolicySatisfied(BlockInfo storedBlock)*, *numReplicas* of striped blocks should arguably be *getRealTotalBlockNum()* instead of *getRealDataBlockNum()*.
[jira] [Commented] (HDFS-10201) Implement undo log in parity datanode for hflush operations
[ https://issues.apache.org/jira/browse/HDFS-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15259341#comment-15259341 ] GAO Rui commented on HDFS-10201:
---
[~liuml07], I have attached the demo patch of the new UndoLog design we discussed last week. The point is to keep the undo-log handling inside FlushUndoLogManager; for the undo log of each internal parity block file, both the writer and the reader access it through the same FlushUndoLog instance. This way the writer and reader are affected as little as possible: almost all undo-log writing and reading is handled on the datanode side and is nearly transparent to them.

> Implement undo log in parity datanode for hflush operations
> ---
> Key: HDFS-10201
> URL: https://issues.apache.org/jira/browse/HDFS-10201
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: erasure-coding
> Reporter: Mingliang Liu
> Assignee: Mingliang Liu
> Attachments: HDFS-10201-demo.patch
>
> According to the current design doc for hflush support in erasure coding (see [HDFS-7661]), the parity datanode (DN) needs an undo log for flush operations. After hflush/hsync, the last cell will be overwritten when 1) the current stripe is full, 2) the file is closed, 3) or hflush/hsync is called again for the current non-full stripe. To serve new reader clients and to tolerate failures between a successful hflush/hsync and the overwrite operation, the parity DN should preserve the old cell in the undo log before overwriting it.
> As parities correspond to block group (BG) length and parity data of different BG lengths may have the same block length, the undo log should also save the respective BG length information for the flushed data.
> This jira is to track the effort of designing and implementing an undo log in the parity DN to support hflush/hsync operations.
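The sharing idea in the comment above can be sketched as follows. The class names FlushUndoLogManager and FlushUndoLog come from the comment; everything inside them is an assumption, not the attached demo patch:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: the manager hands the writer and the reader of one internal parity
// block the SAME FlushUndoLog instance, keyed by block ID, so undo-log state
// stays on the datanode side and is transparent to both clients.
public class FlushUndoLogManagerSketch {
    static class FlushUndoLog {
        private volatile long blockGroupLength; // BG length of the last flushed cell
        synchronized void record(long bgLen) { blockGroupLength = bgLen; }
        long lastFlushedBgLen() { return blockGroupLength; }
    }
    private final Map<Long, FlushUndoLog> logs = new ConcurrentHashMap<>();
    FlushUndoLog logFor(long blockId) {
        // writer and reader both resolve to the same instance
        return logs.computeIfAbsent(blockId, id -> new FlushUndoLog());
    }
}
```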
[jira] [Updated] (HDFS-10201) Implement undo log in parity datanode for hflush operations
[ https://issues.apache.org/jira/browse/HDFS-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-10201:
---
Attachment: HDFS-10201-demo.patch
[jira] [Commented] (HDFS-7661) [umbrella] support hflush and hsync for erasure coded files
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15234667#comment-15234667 ] GAO Rui commented on HDFS-7661:
---
Thanks for your comments, [~drankye], [~tlipcon], [~zhz], [~walter.k.su]. I fully agree that we should not let this JIRA block the 3.0 release. We could mark hflush/hsync for EC files as TO-BE-IMPLEMENTED and release 3.0. In a future phase, with new decode/encode libraries and hardware support for EC, striped EC might even replace replication as the default and major file format of HDFS. hflush/hsync would then become necessary, so [~liuml07] and I should probably continue implementing it based on the latest design. We will break this JIRA down into several sub-tasks and solve them one by one. :D

> [umbrella] support hflush and hsync for erasure coded files
> ---
> Key: HDFS-7661
> URL: https://issues.apache.org/jira/browse/HDFS-7661
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: erasure-coding
> Reporter: Tsz Wo Nicholas Sze
> Assignee: GAO Rui
> Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, HDFS-7661-unitTest-wip-trunk.patch, HDFS-7661-wip.01.patch, HDFS-EC-file-flush-sync-design-v20160323.pdf, HDFS-EC-file-flush-sync-design-version1.1.pdf, Undo-Log-Design-20160406.jpg
>
> We also need to support hflush/hsync and visible length.
[jira] [Commented] (HDFS-7661) [umbrella] support hflush and hsync for erasure coded files
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15227810#comment-15227810 ] GAO Rui commented on HDFS-7661:
---
Hi [~zhz], [~liuml07] and I have discussed the {{two-version-cells undo log design}}; I have attached an illustration, [^Undo-Log-Design-20160406.jpg]. In this design the undo log consists of three parts. The first and second parts store the last flushed parity cell and the currently flushed parity cell; their length depends on the EC policy, and each is long enough to hold a full cell plus its checksum. The third part is a list of flush records, just as described in the design document, except that only a pointer to the latest successfully flushed cell is added. This list can be appended to as many times as needed (generally this will not make the undo log too big). With this third part, a two-phase commit mechanism can protect data safety. For example:
1. The last successfully flushed cell is stored in parity-cell-1 (the second part of the undo log).
2. A new flush happens.
3. The first part of the undo log file is updated with the newly flushed parity cell.
4. A new record is appended to the third part (the record list) of the undo log file.
If a failure happens during step 3 (updating the first part of the undo log file), we still have the last successfully flushed parity cell in the second part of the undo log, and the last record of the record list still points at the second part.
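The two-phase commit in the numbered steps above can be sketched in memory. This is a toy model under the stated assumptions (two fixed cell slots plus an append-only record list), not the attached design:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the three-part undo log: two cell slots plus an append-only record
// list whose last entry points at the slot holding the most recent SUCCESSFUL
// flush. A crash between step 1 (write slot) and step 2 (append record) leaves
// the previously committed slot untouched. Layout details are assumptions.
public class TwoSlotUndoLogSketch {
    private final byte[][] slots = new byte[2][];            // parity-cell-0 / parity-cell-1
    private final List<Integer> records = new ArrayList<>(); // committed slot indexes

    void flush(byte[] parityCell) {
        // write into the slot NOT referenced by the last committed record
        int target = records.isEmpty() ? 0 : 1 - records.get(records.size() - 1);
        slots[target] = parityCell.clone();  // step 1: write the new cell
        records.add(target);                 // step 2: commit by appending a record
    }
    byte[] lastCommittedCell() {
        return records.isEmpty() ? null : slots[records.get(records.size() - 1)];
    }
}
```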
[jira] [Updated] (HDFS-7661) [umbrella] support hflush and hsync for erasure coded files
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-7661:
--
Attachment: Undo-Log-Design-20160406.jpg
[jira] [Commented] (HDFS-7661) [umbrella] support hflush and hsync for erasure coded files
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15223571#comment-15223571 ] GAO Rui commented on HDFS-7661:
---
Hi [~zhz]. The reason for using two versions is that we need to keep both the last flushed parity cells and the currently flushed cells, in case the current flush fails: even if it does, we still keep the last flushed data safe. So the minimum requirement is two partial parity cell files. I think keeping two versions on the parity DNs confines the changes for both writing and reading mostly to datanode source code; only minor changes are needed in the write/read clients. Storing data cells on the parity DNs, by contrast, would require completely different logic in the write/read client implementation.
[jira] [Commented] (HDFS-7661) [umbrella] support hflush and hsync for erasure coded files
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217583#comment-15217583 ] GAO Rui commented on HDFS-7661:
---
Very creative idea, [~zhz]. {{Without any overwriting}} could indeed simplify {{hflush/hsync}}. Inspired by your idea, I have come up with some new thoughts. It may be a little strange to store data cells on parity DNs. Instead, maybe we could store the IPB (internal parity block) file as two parts (two separate files). The first part is parity data that will not be modified. The second part is the flushed parity cell of the stripe currently being written. For the second part, we could keep the latest two versions, e.g. {{last-flushed-parity-cell-0}} and {{last-flushed-parity-cell-1}}, where the structure of {{last-flushed-parity-cell-X}} is: logical block group length + parity cell data. So, for writing, whenever the stripe being written is hflushed/hsynced, we replace the older {{last-flushed-parity-cell-X}} file with the newly flushed logical block group length and parity cell data. For reading, the parity DN locally chooses one of the two {{last-flushed-parity-cell-X}} files based on the read client's request. This design avoids {{overwriting}} the IPB file, which also simplifies the implementation, and it always keeps the last flushed data safe by switching between the two file names ({{last-flushed-parity-cell-0}} and {{last-flushed-parity-cell-1}}).
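The alternating-file scheme above can be sketched as follows. File contents are modeled as in-memory (BG length, cell) pairs; the file names follow the comment, while everything else is an assumption:

```java
// Sketch of the two-file scheme: each last-flushed-parity-cell-X holds a
// (logical BG length, parity cell) pair. A flush overwrites the OLDER file,
// and a reader picks whichever file matches the BG length it was asked to
// read up to; the newer of the two flushed versions always survives a crash.
public class CellFileSwitchSketch {
    static final class CellFile { long bgLen = -1; byte[] cell; }
    private final CellFile[] files = { new CellFile(), new CellFile() }; // ...-0 / ...-1

    void flush(long bgLen, byte[] cell) {
        // replace the file with the smaller (older) BG length
        CellFile older = files[0].bgLen <= files[1].bgLen ? files[0] : files[1];
        older.bgLen = bgLen;
        older.cell = cell.clone();
    }
    byte[] readUpTo(long bgLen) {
        for (CellFile f : files) if (f.bgLen == bgLen) return f.cell;
        return null; // requested version already overwritten: reader must retry
    }
}
```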
[jira] [Commented] (HDFS-7661) [umbrella] support hflush and hsync for erasure coded files
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209581#comment-15209581 ] GAO Rui commented on HDFS-7661:
---
[~liuml07], thanks for uploading the new design doc [^HDFS-EC-file-flush-sync-design-v20160323.pdf]. Will try to add the read client part and illustration figures soon.
[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177578#comment-15177578 ] GAO Rui commented on HDFS-7661:
---
[~liuml07], thanks for updating the design doc. I agree that we should not block the write client, so a lock is not an option. I think the problem has two parts:
1. First, for a frequently flushed file, the read client may never see the same version on all the parity DNs.
{quote}
Last, given the expected BG length, if the parity DN finds that the requested data was already overwritten, it will throw an exception to the reader client, which will also retry to catch up with the latest flushed data.
{quote}
The read client could retry to get the BG length from all the DNs, but it might observe version sequences like {{v1, v3, v5}}, {{v5, v7, v9}}, {{v11, v14, v16}}... This is quite an extreme scenario, but it could still happen.
{quote}
The worst case for retry is that the reader client retries until the stripe is full, when all the parity DNs have the same data.
{quote}
With the current design, even after a full stripe is written, the read client still retries to read up to the latest flushed data, so the version sequences can keep growing while the write client keeps flushing frequently in the new stripe. I suggest that when the exception is thrown, the read client does not retry to catch up with the latest flushed data; instead, it retries to read up to the V1 BG length, because that is the point in time when the read client was created. The worst case is then that the write client has finished the stripe corresponding to the V1 BG length, so that stripe will not be overwritten anymore, and the V1 BG length is good enough for this read client given its creation time.
2. Second, between "the read client gets the BG length from the DNs" and "the read client issues a read request to the parity DNs with the expected BG length", the BG length on each DN could already have changed. So it is not enough to resolve BG length conflicts only before the read client issues the read request; conflicts also need to be resolved while reading data. Again, if the read client cannot get the same version from all DNs before the current stripe becomes full, it should just read up to the first BG length it received. The reason I think it is not good to read up to the finished stripe is illustrated by the following scenario:
1. Reader1 tries to read the file, gets V1, but never gets the same version from all DNs.
2. Reader2 tries to read the file and, by luck, gets a common version, say V5.
For read consistency, Reader1 should not read more data than Reader2, because Reader2 is a newer reader than Reader1. So we should not let Reader1 read up to the stripe end; instead it reads up to the first version it received. Feel free to give more comments.

> Erasure coding: support hflush and hsync
> ---
> Key: HDFS-7661
> URL: https://issues.apache.org/jira/browse/HDFS-7661
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Tsz Wo Nicholas Sze
> Assignee: GAO Rui
> Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, HDFS-7661-unitTest-wip-trunk.patch, HDFS-7661-wip.01.patch, HDFS-EC-file-flush-sync-design-version1.1.pdf, HDFS-EC-file-flush-sync-design-version2.0.pdf, HDFS-EC-file-flush-sync-design-version2.1.pdf
>
> We also need to support hflush/hsync and visible length.
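The fallback proposed above (settle for V1 when the DNs never agree) can be sketched as follows. This is an illustrative model, not patch code; `chooseBgLen` and its inputs are hypothetical names:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: a reader collects the flushed BG lengths it has seen at each parity
// DN; if some length is common to all DNs it reads up to the newest such
// version, otherwise it falls back to V1, the first BG length it observed,
// instead of chasing a frequently flushing writer forever.
public class ReadRetrySketch {
    static long chooseBgLen(List<long[]> versionsPerDn, long firstSeen) {
        Set<Long> common = null;
        for (long[] versions : versionsPerDn) {
            Set<Long> s = new HashSet<>();
            for (long v : versions) s.add(v);
            if (common == null) common = s; else common.retainAll(s);
        }
        if (common != null && !common.isEmpty()) {
            return common.stream().max(Long::compare).get(); // newest agreed version
        }
        return firstSeen; // no agreement: settle for the version at read start
    }
}
```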
[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174860#comment-15174860 ] GAO Rui commented on HDFS-7661:
---
Hi [~liuml07]. For the data consistency issue, I came up with a scenario under the RS-6-3 EC policy:
0. The write client calls flush; we have V1 parity on all parity DNs: DN0, DN1, DN2.
1. Two IDB (internal data block) DNs fail.
2. The read client reads 4 IDBs, and the V1 parity on DN0.
3. The write client calls flush twice.
4. The read client reads V3 parity on DN1.
5. The write client calls flush twice again.
6. The read client reads V5 parity on DN2.
7. The read client has only five internal blocks for each of V1, V3 and V5.
8. The read fails.
This is quite an extreme scenario, but it could still happen. In the current design we only keep the overwritten parity data for the last version, and we do not have a lock, which could cause data consistency problems. If a lock in the NN is too heavy, maybe we could maintain the lock in the write client instead: the read client gets the file info and write client info from the NN, then uses the lock in the write client to control data consistency. Ping [~szetszwo], [~jingzhao], [~zhz], [~drankye], [~walter.k.su] and [~ikki407] for discussion :D
[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158295#comment-15158295 ] GAO Rui commented on HDFS-7661:
---
[~liuml07], OK, I am looking forward to discussing this with you : )
[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156653#comment-15156653 ] GAO Rui commented on HDFS-7661:
---
[~drankye], I have been investigating how HBase uses hflush/hsync. According to this [slide deck | http://www.slideshare.net/enissoz/hbase-and-hdfs-understanding-filesystem-usage] shared by [~enis], pages 12 and 15: HBase uses {{hflush()}} for writing WALs (write-ahead logs), and WAL sync ({{hflush()}}) happens hundreds of times per second. So creating a new block group for each flush call is not a practical option. Maybe we could continue to discuss the earlier option:
{quote}
I was planning to truncate the overwritten data from the end of both the data file and the .meta file on the parity datanode, then store the overwritten data at the end of the .meta file. One possible way to keep the data from before the first flush safe even if the second flush fails would be to add an {{upgrade/rollback}} mechanism, like that of {{DataStorage}}, to the data/checksum files on the parity datanodes.
{quote}
Though, if DN failures cause the writing process to fail, we cannot guarantee the safety of the data from before the first flush. Even with a replicated file, if we flush at some point, continue writing the file to 3 DNs, flush again in the same block, and the write process then fails due to DN failures, we likewise cannot guarantee the safety of the data from before the first flush, I think. [~walter.k.su], does this make sense to you? With an {{upgrade/rollback}} mechanism for the data/checksum files of parity datanodes, we could recover the data from before the first flush only in scenarios like the following:
1. The first flush succeeds.
2. Parity DN0 dies.
3. Data DN4, DN5 and parity DN1 fail during the second flush, but parity DN2 succeeds.
At this point, if parity DN0 comes back, we could roll DN2 back to its status before the second flush. This might be the only kind of scenario that uses the {{upgrade/rollback}} mechanism for the data/checksum files of parity datanodes. Guys, do we need to implement an {{upgrade/rollback}} mechanism for this kind of scenario?
[~liuml07], [~jingzhao], on the data consistency issue: if we do not implement a lock in the NN, maybe we could make the read client check the BG data length in the .meta files of the 3 parity DNs to see whether they have the same version, as [~zhz] suggested. If the read client finds that the BG data lengths differ, it could try to read the .meta file again from the parity DNs with the smaller BG data length. But the BG data length could change several times, so the read client may never observe a consistent BG data length. Am I missing something?
[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156395#comment-15156395 ] GAO Rui commented on HDFS-7661: --- [~zhz], [~walter.k.su], [~drankye] and [~liuml07], I thought maybe we could close the current block group when flush is called, and write the new data to a new block group. This would avoid a lot of trouble, but I am not sure how Hive/HBase and other apps use flush. If these apps call flush/sync many times on one file, we might create too many new, small block groups. Could you share your opinions? > Erasure coding: support hflush and hsync > > > Key: HDFS-7661 > URL: https://issues.apache.org/jira/browse/HDFS-7661 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: GAO Rui > Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, > HDFS-7661-unitTest-wip-trunk.patch, HDFS-7661-wip.01.patch, > HDFS-EC-file-flush-sync-design-version1.1.pdf, > HDFS-EC-file-flush-sync-design-version2.0.pdf > > > We also need to support hflush/hsync and visible length. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154002#comment-15154002 ] GAO Rui commented on HDFS-7661: --- [~zhz], good idea to use {{FsDatasetImpl#truncateBlock}} to overwrite the parity block. While trying to keep the flushed data safe, I found that if the second flush of the same stripe fails, it means we have fewer than {{NumDataBlk}} living DNs, so both the second flush and the file write client fail. In that kind of scenario, we cannot keep the data from the first flush safe anyway. Does that make sense to you, [~zhz], [~walter.k.su], [~liuml07]? If it does, the key point becomes the data consistency issue between readers and writers, right? I attached a rough WIP patch to demonstrate adding {{overwrite}} and {{blockGroupLength}}, and also the place to implement the {{overwrite}} file operation, although I am still working on how to implement {{overwrite}}. > Erasure coding: support hflush and hsync > > > Key: HDFS-7661 > URL: https://issues.apache.org/jira/browse/HDFS-7661 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: GAO Rui > Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, > HDFS-7661-unitTest-wip-trunk.patch, HDFS-7661-wip.01.patch, > HDFS-EC-file-flush-sync-design-version1.1.pdf, > HDFS-EC-file-flush-sync-design-version2.0.pdf > > > We also need to support hflush/hsync and visible length. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-7661: -- Attachment: HDFS-7661-wip.01.patch > Erasure coding: support hflush and hsync > > > Key: HDFS-7661 > URL: https://issues.apache.org/jira/browse/HDFS-7661 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: GAO Rui > Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, > HDFS-7661-unitTest-wip-trunk.patch, HDFS-7661-wip.01.patch, > HDFS-EC-file-flush-sync-design-version1.1.pdf, > HDFS-EC-file-flush-sync-design-version2.0.pdf > > > We also need to support hflush/hsync and visible length. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15152326#comment-15152326 ] GAO Rui commented on HDFS-7661: --- I am running into trouble trying to overwrite, and roll back, the data file's file channel position by directly operating on the underlying {{RandomAccessFile}} of the data file's {{OutputStream}} in {{BlockReceiver#receivePacket()}}, just like what we are doing when overwriting the checksum file. I was planning to truncate the overwritten data at the end of both the data file and the .meta file on the parity datanode, then store the overwritten data at the end of the .meta file. One possible way to keep the data before the first flush safe even if the second flush fails would be to add an {{upgrade/rollback}} mechanism, similar to the one in {{DataStorage}}, to the data/checksum files of the parity datanodes. I will try to address the {{RandomAccessFile}} issue and attach this WIP patch ASAP. > Erasure coding: support hflush and hsync > > > Key: HDFS-7661 > URL: https://issues.apache.org/jira/browse/HDFS-7661 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: GAO Rui > Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, > HDFS-7661-unitTest-wip-trunk.patch, > HDFS-EC-file-flush-sync-design-version1.1.pdf, > HDFS-EC-file-flush-sync-design-version2.0.pdf > > > We also need to support hflush/hsync and visible length. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
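The truncate-and-keep idea above can be sketched in isolation. This is a hypothetical stand-alone illustration with invented names ({{truncateWithBackup}}, {{rollback}}), not the actual {{BlockReceiver}} change: copy the tail bytes that are about to be overwritten, truncate, and keep the tail so the replica can be rolled back if a later flush fails.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

/**
 * Hypothetical sketch of truncating a block file while saving the
 * overwritten tail, so the pre-flush state can be restored. The caller
 * would store the returned tail somewhere durable (the discussion above
 * suggests the end of the .meta file).
 */
public class TailBackupSketch {
    // Copy the last (oldLen - newLen) bytes aside, then truncate to newLen.
    static byte[] truncateWithBackup(File dataFile, long newLen) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(dataFile, "rw")) {
            long oldLen = raf.length();
            byte[] tail = new byte[(int) (oldLen - newLen)];
            raf.seek(newLen);
            raf.readFully(tail);     // save the bytes we are about to drop
            raf.setLength(newLen);   // truncate the overwritten region
            return tail;             // caller persists this for rollback
        }
    }

    // Re-append the saved tail to restore the pre-overwrite state.
    static void rollback(File dataFile, byte[] tail) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(dataFile, "rw")) {
            raf.seek(raf.length());
            raf.write(tail);
        }
    }
}
```

The same pair of operations would have to be applied to the .meta (checksum) file in lockstep for the rollback to be consistent.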
[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151916#comment-15151916 ] GAO Rui commented on HDFS-7661: --- [~liuml07], your work separation looks good. I have added {{isOverwrite}} to the packet header in the WIP patch; as mentioned in design document 2, we could also add {{blockGroupLen}} to the packet header as well. For the data safety issue that [~walter.k.su], [~zhz] and I discussed earlier in this jira, I prefer to store the overwritten data (the last KBs of the partial-stripe parity internal block) in the related .meta file, in case a failure of the second flush on the same partial stripe damages the data written before the first flush. The {{two-phase commit alike method}} is also trying to keep the data written before the first flush safe, right? And, without a lock in the NN, the reader might get two different versions of the parity blocks when an internal data block recovery is issued, right? > Erasure coding: support hflush and hsync > > > Key: HDFS-7661 > URL: https://issues.apache.org/jira/browse/HDFS-7661 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: GAO Rui > Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, > HDFS-7661-unitTest-wip-trunk.patch, > HDFS-EC-file-flush-sync-design-version1.1.pdf, > HDFS-EC-file-flush-sync-design-version2.0.pdf > > > We also need to support hflush/hsync and visible length. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151628#comment-15151628 ] GAO Rui commented on HDFS-7661: --- [~liuml07], thank you very much for joining this task. I will study your design document and share my opinions. I am also working on a WIP patch that contains a unit test, adds an overwrite protocol, and makes some changes to the fsdataset operations in the datanode to support file overwrite. I will polish the patch and attach it soon. To avoid redundant work, could you share more details about the parts of the code you are dealing with? > Erasure coding: support hflush and hsync > > > Key: HDFS-7661 > URL: https://issues.apache.org/jira/browse/HDFS-7661 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: GAO Rui > Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, > HDFS-7661-unitTest-wip-trunk.patch, > HDFS-EC-file-flush-sync-design-version1.1.pdf, > HDFS-EC-file-flush-sync-design-version2.0.pdf > > > We also need to support hflush/hsync and visible length. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9403) Erasure coding: some EC tests are missing timeout
[ https://issues.apache.org/jira/browse/HDFS-9403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15129954#comment-15129954 ] GAO Rui commented on HDFS-9403: --- Thanks for your comments and help, [~zhz] > Erasure coding: some EC tests are missing timeout > - > > Key: HDFS-9403 > URL: https://issues.apache.org/jira/browse/HDFS-9403 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Affects Versions: 3.0.0 >Reporter: Zhe Zhang >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9403-origin-trunk.00.patch, > HDFS-9403-origin-trunk.01.patch, HDFS-9403-origin-trunk.02.patch > > > EC data writing pipeline is still being worked on, and bugs could introduce > program hang. We should add a timeout for all tests involving striped > writing. I see at least the following: > * {{TestErasureCodingPolicies}} > * {{TestFileStatusWithECPolicy}} > * {{TestDFSStripedOutputStream}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125588#comment-15125588 ] GAO Rui commented on HDFS-9494: --- Thank you all very much for your patient comments and kind help, [~jingzhao],[~szetszwo],[~rakeshr]. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs-client >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch, HDFS-9494-origin-trunk.08.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120515#comment-15120515 ] GAO Rui commented on HDFS-9494: --- Thanks for your comments, [~jingzhao]. bq. In DFSStripedOutputStream, we do not need to move all the data fields to the beginning of the class. I moved them to the beginning of the class body because, while doing the field-related coding, it was a little troublesome to find and add fields to {{DFSStripedOutputStream}}. Would moving the fields to the beginning make future coding easier? bq. flushAllExecutor.shutdownNow() can be called in the finally section bq. flushAllFuturesMap does not need to be a class level field. It can be a temporary variable defined and used in flushAllInternals. That makes a lot of sense. I will move {{flushAllExecutor.shutdownNow()}} to the finally section and make {{flushAllFuturesMap}} a local variable in the next patch. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence.
So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
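The shape those two review comments lead to can be sketched roughly as follows. The {{Streamer}} interface and {{flushAll}} method below are simplified stand-ins, not the actual {{DFSStripedOutputStream}} code: every streamer's flush is submitted concurrently, the futures collection is a method-local variable, and the executor is shut down in a finally block.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/**
 * Sketch of the parallel flushAllInternals() shape discussed above:
 * trigger all streamers' flushInternal() concurrently, then wait for
 * every one of them, instead of flushing and waiting in sequence.
 */
public class ParallelFlushSketch {
    interface Streamer {
        void flushInternal() throws Exception;  // stand-in for the real call
    }

    static void flushAll(List<Streamer> streamers) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(streamers.size());
        try {
            ExecutorCompletionService<Void> ecs =
                new ExecutorCompletionService<>(executor);
            // Local variable, not a class-level field, per the review comment.
            List<Future<Void>> futures = new ArrayList<>();
            for (Streamer s : streamers) {
                futures.add(ecs.submit(() -> { s.flushInternal(); return null; }));
            }
            for (Future<Void> f : futures) {
                f.get();  // surfaces any streamer's failure to the caller
            }
        } finally {
            executor.shutdownNow();  // always release the threads
        }
    }
}
```

With 9 streamers this turns 9 sequential flush-then-wait round trips into one round trip bounded by the slowest streamer.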
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120540#comment-15120540 ] GAO Rui commented on HDFS-9494: --- BTW, would keeping {{flushAllExecutorCompletionService}} as a class-level field reduce the resource cost, or is it better to make it a local variable in {{flushAllInternals()}} to limit unnecessary exposure, like {{flushAllFuturesMap}}? > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Status: In Progress (was: Patch Available) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120834#comment-15120834 ] GAO Rui commented on HDFS-9494: --- Oh, I see. I agree with you. I have uploaded a new 08 patch to address the previously mentioned issues. Thank you again for your detailed comments! > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch, HDFS-9494-origin-trunk.08.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Attachment: HDFS-9494-origin-trunk.07.patch > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Status: In Progress (was: Patch Available) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Status: Patch Available (was: In Progress) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch, HDFS-9494-origin-trunk.08.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Attachment: HDFS-9494-origin-trunk.08.patch > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch, HDFS-9494-origin-trunk.08.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Fix Version/s: 3.0.0 Status: Patch Available (was: In Progress) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch, > HDFS-9494-origin-trunk.07.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Attachment: HDFS-9494-origin-trunk.06.patch > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Status: Patch Available (was: In Progress) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Status: In Progress (was: Patch Available) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118504#comment-15118504 ] GAO Rui commented on HDFS-9494: --- Thank you very much for your great comments, [~jingzhao]. I have uploaded a new 06 patch to address them. Please let me know if you have further advice. Thank you. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15118559#comment-15118559 ] GAO Rui commented on HDFS-9494: --- For the javac and checkstyle issues, I would like to fix them as: {code} // Size of each striping cell, must be a multiple of bytesPerChecksum. {code} and {code} Future future = null; {code} > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch, HDFS-9494-origin-trunk.06.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116748#comment-15116748 ] GAO Rui commented on HDFS-9494: --- [~szetszwo],[~rakeshr], thanks a lot for your advice. For {{executor.shutdownNow()}}, if not all the tasks have completed, we might not reach the end of {{flushAllInternals()}}. But there is no harm in ensuring the executor shuts down, so I added {{executor.shutdownNow()}} as well. After checking the related code, it seems that we haven't set a timeout for {{waitForAckedSeqno()}}. Maybe we could consider setting a timeout for it in another new Jira. I have updated the 05 patch. Could you kindly review it? Thank you very much. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Status: In Progress (was: Patch Available) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Status: Patch Available (was: In Progress) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Attachment: HDFS-9494-origin-trunk.05.patch > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch, > HDFS-9494-origin-trunk.05.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114789#comment-15114789 ] GAO Rui commented on HDFS-9494: --- Hi [~szetszwo], Could you share your opinions about the last patch? Thank you very much. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9403) Erasure coding: some EC tests are missing timeout
[ https://issues.apache.org/jira/browse/HDFS-9403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110348#comment-15110348 ] GAO Rui commented on HDFS-9403: --- [~zhz], sorry for the delay. I have updated a new 02 patch. Please review it when you got time. Thank you very much. > Erasure coding: some EC tests are missing timeout > - > > Key: HDFS-9403 > URL: https://issues.apache.org/jira/browse/HDFS-9403 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Affects Versions: 3.0.0 >Reporter: Zhe Zhang >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9403-origin-trunk.00.patch, > HDFS-9403-origin-trunk.01.patch, HDFS-9403-origin-trunk.02.patch > > > EC data writing pipeline is still being worked on, and bugs could introduce > program hang. We should add a timeout for all tests involving striped > writing. I see at least the following: > * {{TestErasureCodingPolicies}} > * {{TestFileStatusWithECPolicy}} > * {{TestDFSStripedOutputStream}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9403) Erasure coding: some EC tests are missing timeout
[ https://issues.apache.org/jira/browse/HDFS-9403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9403: -- Attachment: HDFS-9403-origin-trunk.02.patch > Erasure coding: some EC tests are missing timeout > - > > Key: HDFS-9403 > URL: https://issues.apache.org/jira/browse/HDFS-9403 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Affects Versions: 3.0.0 >Reporter: Zhe Zhang >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9403-origin-trunk.00.patch, > HDFS-9403-origin-trunk.01.patch, HDFS-9403-origin-trunk.02.patch > > > EC data writing pipeline is still being worked on, and bugs could introduce > program hang. We should add a timeout for all tests involving striped > writing. I see at least the following: > * {{TestErasureCodingPolicies}} > * {{TestFileStatusWithECPolicy}} > * {{TestDFSStripedOutputStream}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9403) Erasure coding: some EC tests are missing timeout
[ https://issues.apache.org/jira/browse/HDFS-9403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9403: -- Status: Patch Available (was: Open) > Erasure coding: some EC tests are missing timeout > - > > Key: HDFS-9403 > URL: https://issues.apache.org/jira/browse/HDFS-9403 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Affects Versions: 3.0.0 >Reporter: Zhe Zhang >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9403-origin-trunk.00.patch, > HDFS-9403-origin-trunk.01.patch, HDFS-9403-origin-trunk.02.patch > > > EC data writing pipeline is still being worked on, and bugs could introduce > program hang. We should add a timeout for all tests involving striped > writing. I see at least the following: > * {{TestErasureCodingPolicies}} > * {{TestFileStatusWithECPolicy}} > * {{TestDFSStripedOutputStream}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9403) Erasure coding: some EC tests are missing timeout
[ https://issues.apache.org/jira/browse/HDFS-9403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9403: -- Status: Open (was: Patch Available) > Erasure coding: some EC tests are missing timeout > - > > Key: HDFS-9403 > URL: https://issues.apache.org/jira/browse/HDFS-9403 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Affects Versions: 3.0.0 >Reporter: Zhe Zhang >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9403-origin-trunk.00.patch, > HDFS-9403-origin-trunk.01.patch > > > EC data writing pipeline is still being worked on, and bugs could introduce > program hang. We should add a timeout for all tests involving striped > writing. I see at least the following: > * {{TestErasureCodingPolicies}} > * {{TestFileStatusWithECPolicy}} > * {{TestDFSStripedOutputStream}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085278#comment-15085278 ] GAO Rui commented on HDFS-7661: --- Thank you [~walter.k.su]. I understand your concern now. You are right, in {{Scenario 4: flush/sync was invoked more than one time in the same strip}}, the described handling steps risk the safety of the data written before the 1st flush when parity internal blocks are overwritten. I will consider this problem and try to figure out a better design. Thank you very much for pointing out this problem! > Erasure coding: support hflush and hsync > > > Key: HDFS-7661 > URL: https://issues.apache.org/jira/browse/HDFS-7661 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: GAO Rui > Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, > HDFS-7661-unitTest-wip-trunk.patch, > HDFS-EC-file-flush-sync-design-version1.1.pdf > > > We also need to support hflush/hsync and visible length. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085157#comment-15085157 ] GAO Rui commented on HDFS-7661: --- Based on GS, when the 2nd flush is invoked and a datanode failure happens, if the new internal block (internal block with the new GS) count is less than needed, the writing process should be judged as failed. An error message should be printed to the user, I think. > Erasure coding: support hflush and hsync > > > Key: HDFS-7661 > URL: https://issues.apache.org/jira/browse/HDFS-7661 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: GAO Rui > Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, > HDFS-7661-unitTest-wip-trunk.patch, > HDFS-EC-file-flush-sync-design-version1.1.pdf > > > We also need to support hflush/hsync and visible length. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15085134#comment-15085134 ] GAO Rui commented on HDFS-7661: --- [~zhz], these make sense to me. Let's store data length in {{.meta}} file. For replication block .meta file, data length means the length of the related block. While, for EC internal block .meta file, data length means the related block group data length. > Erasure coding: support hflush and hsync > > > Key: HDFS-7661 > URL: https://issues.apache.org/jira/browse/HDFS-7661 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: GAO Rui > Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, > HDFS-7661-unitTest-wip-trunk.patch, > HDFS-EC-file-flush-sync-design-version1.1.pdf > > > We also need to support hflush/hsync and visible length. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9403) Erasure coding: some EC tests are missing timeout
[ https://issues.apache.org/jira/browse/HDFS-9403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082069#comment-15082069 ] GAO Rui commented on HDFS-9403: --- It is marked as *Deprecated* in the Junit 4.12 document: http://junit.org/apidocs/org/junit/rules/Timeout.html#seconds(long {code} Timeout @Deprecated public Timeout(int millis) Deprecated. Create a Timeout instance with the timeout specified in milliseconds. This constructor is deprecated. Instead use Timeout(long, java.util.concurrent.TimeUnit), millis(long), or seconds(long). Parameters: millis - the maximum time in milliseconds to allow the test to run before it should timeout {code} > Erasure coding: some EC tests are missing timeout > - > > Key: HDFS-9403 > URL: https://issues.apache.org/jira/browse/HDFS-9403 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Affects Versions: 3.0.0 >Reporter: Zhe Zhang >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9403-origin-trunk.00.patch, > HDFS-9403-origin-trunk.01.patch > > > EC data writing pipeline is still being worked on, and bugs could introduce > program hang. We should add a timeout for all tests involving striped > writing. I see at least the following: > * {{TestErasureCodingPolicies}} > * {{TestFileStatusWithECPolicy}} > * {{TestDFSStripedOutputStream}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082366#comment-15082366 ] GAO Rui commented on HDFS-7661: --- See the {{4. Scenario 4: flush/sync was invoked more than one time in the same strip.}} in [^HDFS-EC-file-flush-sync-design-version1.1.pdf]. Consider the following event sequence: 1. flush at {{Flush point 2.1}}: parity internal blocks PIB1, PIB2 and PIB3 were calculated. 2. flush at {{Flush point 2.1.1}}: PIB1 and PIB2 were updated to PIB1` and PIB2`, but PIB3 failed to update. 3. PIB1` and the first and third data internal blocks went down. 4. Read the file. 5. Recover the missing data internal blocks using PIB2` and PIB3. Here we have a problem recovering the data correctly. > Erasure coding: support hflush and hsync > > > Key: HDFS-7661 > URL: https://issues.apache.org/jira/browse/HDFS-7661 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: GAO Rui > Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, > HDFS-7661-unitTest-wip-trunk.patch, > HDFS-EC-file-flush-sync-design-version1.1.pdf > > > We also need to support hflush/hsync and visible length. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
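The inconsistency in that scenario can be illustrated with a toy single-parity (XOR) code rather than the real Reed-Solomon codec used by HDFS EC. This is a hypothetical sketch, not HDFS code: if one parity copy is stale after a partial update, recovery with the stale copy silently reconstructs the pre-flush data.

```java
public class StaleParitySketch {
  public static void main(String[] args) {
    int d0 = 1, d1 = 2, d2 = 3;        // data cells at the 1st flush
    int oldParity = d0 ^ d1 ^ d2;      // parity written at the 1st flush

    d0 = 9;                            // data changed by the 2nd flush
    int newParity = d0 ^ d1 ^ d2;      // parity update that succeeded
    // oldParity plays the role of the parity block whose update failed

    // d0 is lost; recover it by XOR-ing the surviving cells with parity
    int recoveredWithNew = newParity ^ d1 ^ d2;  // correct current value
    int recoveredWithOld = oldParity ^ d1 ^ d2;  // silently yields stale data
    System.out.println(recoveredWithNew + " " + recoveredWithOld);
  }
}
```

The reader cannot tell the two reconstructions apart, which is why mixed-version parity blocks are dangerous for later recovery.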
[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15082145#comment-15082145 ] GAO Rui commented on HDFS-7661: --- Hi [~zhz], it's a good idea to store the length info in the {{.meta}} file. The additional locks are for situations in which failures happen. For example, we have a block group consisting of 6 data internal blocks and 3 parity internal blocks. When updating the length of this block group, we might fail to update 1 of the 3 parity internal blocks. But if the *older* parity internal block comes back later, then we have parity blocks of different versions. This could cause problems when a later internal block recovery happens. > Erasure coding: support hflush and hsync > > > Key: HDFS-7661 > URL: https://issues.apache.org/jira/browse/HDFS-7661 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: GAO Rui > Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, > HDFS-7661-unitTest-wip-trunk.patch, > HDFS-EC-file-flush-sync-design-version1.1.pdf > > > We also need to support hflush/hsync and visible length. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15067772#comment-15067772 ] GAO Rui commented on HDFS-9494: --- Hi [~szetszwo], I agree processing failures sequentially is safer than trying to confirm {{checkStreamers()}}'s thread safety. I have updated a new patch. Could you review it again? I use a {{ConcurrentHashMap}} to keep both the streamer and the related exception of this streamer inside of {{call()}}, so that we could provide more specific information to {{handleStreamerFailure()}} later. For the handling code: {code} Iterator<Map.Entry<StripedDataStreamer, Exception>> iterator = streamersExceptionMap.entrySet().iterator(); while (iterator.hasNext()) { Map.Entry<StripedDataStreamer, Exception> entry = iterator.next(); StripedDataStreamer s = entry.getKey(); Exception e = entry.getValue(); handleStreamerFailure("flushInternal " + s, e, s); iterator.remove(); } {code} I think maybe we could move this code to the position after the for loop, so the code will look like: {code} for (int i = 0; i < healthyStreamerCount; i++) { try { executorCompletionService.take().get(); } catch (InterruptedException ie) { throw DFSUtilClient.toInterruptedIOException( "Interrupted during waiting all streamer flush, ", ie); } catch (ExecutionException ee) { LOG.warn( "Caught ExecutionException while waiting all streamer flush, ", ee); } } Iterator<Map.Entry<StripedDataStreamer, Exception>> iterator = streamersExceptionMap.entrySet().iterator(); while (iterator.hasNext()) { Map.Entry<StripedDataStreamer, Exception> entry = iterator.next(); StripedDataStreamer s = entry.getKey(); Exception e = entry.getValue(); handleStreamerFailure("flushInternal " + s, e, s); iterator.remove(); } {code} But to be safe, in the new 04 patch, I put this handling code inside the for loop, as a finally sub-case. Could you share your opinions? Thank you. 
> Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Status: Patch Available (was: In Progress) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Status: In Progress (was: Patch Available) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Attachment: HDFS-9494-origin-trunk.04.patch > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch, HDFS-9494-origin-trunk.04.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15067363#comment-15067363 ] GAO Rui commented on HDFS-9494: --- Thanks Nicholas. I have considered the {{checkStreamers()}} issue before too. Because in {{checkStreamers()}}, we just read {{streamers}} and {{failedStreamers}}; there is no operation that changes them. So, even if different threads call {{checkStreamers()}}, there may not be a conflict, I think. For the race condition, could you share more details? > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066214#comment-15066214 ] GAO Rui commented on HDFS-9494: --- [~szetszwo], great catch! Sorry, I ignored that handleStreamerFailure itself might throw an IOException. I will upload a new patch to address this soon. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066224#comment-15066224 ] GAO Rui commented on HDFS-9494: --- Additionally, I think we might encounter other thread-related ExecutionExceptions as well, right? So, I suggest keeping the warning log while also re-throwing the ExecutionException as an IOException. {code} } catch (ExecutionException ee) { LOG.warn( "Caught ExecutionException while waiting all streamer flush, ", ee); throw new IOException(ee); } {code} Do you think the above code is OK? > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
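That log-and-rethrow pattern can be shown in a self-contained form. The class and method names here are hypothetical stand-ins, not the patch's actual code; the point is that callers of the flush path see a single exception type ({{IOException}}) even when the failure originates inside a worker thread.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RethrowSketch {
  // Wait on a flush future, converting thread-pool failures into IOExceptions.
  static void waitFlush(Future<Void> f) throws IOException {
    try {
      f.get();
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
      throw new IOException("Interrupted during waiting all streamer flush", ie);
    } catch (ExecutionException ee) {
      // a real implementation would also log a warning here before rethrowing
      throw new IOException(ee);
    }
  }

  public static void main(String[] args) {
    ExecutorService ex = Executors.newSingleThreadExecutor();
    Future<Void> failing = ex.submit(
        (Callable<Void>) () -> { throw new IllegalStateException("flush failed"); });
    try {
      waitFlush(failing);
    } catch (IOException e) {
      // IOException wraps ExecutionException, which wraps the original failure
      System.out.println("Caught: " + e.getCause().getCause().getMessage());
    } finally {
      ex.shutdownNow();
    }
  }
}
```

Restoring the interrupt flag before rethrowing the InterruptedException-derived IOException keeps the thread's interruption status visible to upstream code.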
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066009#comment-15066009 ] GAO Rui commented on HDFS-9494: --- [~rakeshr], thank you very much for your kind review and advice. Let's wait for [~szetszwo] and [~jingzhao]'s comment to see if it's OK to commit the latest patch :) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9403) Erasure coding: some EC tests are missing timeout
[ https://issues.apache.org/jira/browse/HDFS-9403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15065993#comment-15065993 ] GAO Rui commented on HDFS-9403: --- Thank you very much for teaching me about the JUnit Rule. I think generally we could set the timeout rule for test classes to 1 min. And for those test classes which include test methods with timeouts longer than 1 min, we could set the class timeout rule to the longest test method timeout. I will upload a new patch soon. > Erasure coding: some EC tests are missing timeout > - > > Key: HDFS-9403 > URL: https://issues.apache.org/jira/browse/HDFS-9403 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Affects Versions: 3.0.0 >Reporter: Zhe Zhang >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9403-origin-trunk.00.patch, > HDFS-9403-origin-trunk.01.patch > > > EC data writing pipeline is still being worked on, and bugs could introduce > program hang. We should add a timeout for all tests involving striped > writing. I see at least the following: > * {{TestErasureCodingPolicies}} > * {{TestFileStatusWithECPolicy}} > * {{TestDFSStripedOutputStream}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9403) Erasure coding: some EC tests are missing timeout
[ https://issues.apache.org/jira/browse/HDFS-9403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066001#comment-15066001 ] GAO Rui commented on HDFS-9403: --- Hi [~zhz], I just found out that, according to the maven settings, we currently use JUnit 4.11 on the trunk branch. So we can only create an org.junit.rules.Timeout instance with: {code} @Rule public Timeout globalTimeout = new Timeout(60000); {code} But to make the test code more readable we might prefer {{Timeout.seconds(60)}}, which is not supported until JUnit 4.12 ( http://junit.org/apidocs/org/junit/rules/Timeout.html#seconds(long) ). In this situation, should we use the millisecond constructor {{new Timeout(60000)}}, which has been marked as *Deprecated*, or should we try to upgrade the JUnit version? > Erasure coding: some EC tests are missing timeout > - > > Key: HDFS-9403 > URL: https://issues.apache.org/jira/browse/HDFS-9403 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Affects Versions: 3.0.0 >Reporter: Zhe Zhang >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9403-origin-trunk.00.patch, > HDFS-9403-origin-trunk.01.patch > > > EC data writing pipeline is still being worked on, and bugs could introduce > program hang. We should add a timeout for all tests involving striped > writing. I see at least the following: > * {{TestErasureCodingPolicies}} > * {{TestFileStatusWithECPolicy}} > * {{TestDFSStripedOutputStream}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
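Conceptually, JUnit's class-level Timeout rule races each test body against a deadline. The following is a rough, self-contained sketch of that idea using only java.util.concurrent, not JUnit's actual implementation; the class and the {{runWithTimeout}} helper are hypothetical names for illustration:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutSketch {
    // Runs a task with a deadline, roughly what a class-level Timeout
    // rule does for each @Test method: fail the test if it overruns.
    static boolean runWithTimeout(Runnable task, long millis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<?> f = pool.submit(task);
        try {
            f.get(millis, TimeUnit.MILLISECONDS);
            return true;                    // finished within the deadline
        } catch (TimeoutException e) {
            f.cancel(true);                 // interrupt the runaway test
            return false;
        } catch (InterruptedException | ExecutionException e) {
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A fast "test" passes; a hanging one is cut off at 200 ms.
        System.out.println(runWithTimeout(() -> { }, 1000));
        System.out.println(runWithTimeout(() -> {
            try { Thread.sleep(5000); } catch (InterruptedException e) { }
        }, 200));
    }
}
```

A single rule like this protects every method in the class, which is why one class-level value (set to the longest per-method timeout) is enough.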
[jira] [Commented] (HDFS-7661) Erasure coding: support hflush and hsync
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059507#comment-15059507 ] GAO Rui commented on HDFS-7661: --- Thank you [~vinayrpet]! Would you like to review this design document [^HDFS-EC-file-flush-sync-design-version1.1.pdf] and share your opinions? > Erasure coding: support hflush and hsync > > > Key: HDFS-7661 > URL: https://issues.apache.org/jira/browse/HDFS-7661 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: GAO Rui > Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, > HDFS-7661-unitTest-wip-trunk.patch, > HDFS-EC-file-flush-sync-design-version1.1.pdf > > > We also need to support hflush/hsync and visible length. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059388#comment-15059388 ] GAO Rui commented on HDFS-9494: --- [~szetszwo], [~rakeshr], and [~jingzhao], thank you very much for your comments. I agree that we should refactor {{handleStreamerFailure()}}, handle the exception inside {{call()}}, and also make {{handleStreamerFailure()}} synchronized. In the previous 02 patch: {code} for (int i = 0; i < healthyStreamerCount; i++) { try { executorCompletionService.take().get(); } catch (InterruptedException ie) { throw DFSUtilClient.toInterruptedIOException( "Interrupted during waiting all streamer flush, ", ie); } catch (ExecutionException ee) { LOG.warn( "Caught ExecutionException while waiting all streamer flush, ", ee); } } {code} the exception that might be thrown by {{waitForAckedSeqno}} was not caught. I will draft a new patch soon. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
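The point about uncaught exceptions can be seen in isolation: anything thrown inside {{Callable.call()}} is wrapped in an {{ExecutionException}} and only surfaces at {{Future.get()}}, so a catch block that merely logs it swallows the underlying flush failure. A minimal demonstration (the IOException message below is made up for illustration):

```java
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ExecutionExceptionDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        ExecutorCompletionService<Void> ecs =
            new ExecutorCompletionService<>(pool);

        // A Callable may throw a checked exception; a Runnable cannot.
        ecs.submit(() -> {
            throw new IOException("simulated waitForAckedSeqno failure");
        });

        try {
            ecs.take().get();   // the Callable's exception surfaces here
        } catch (ExecutionException ee) {
            // getCause() recovers the original IOException, so the caller
            // can rethrow it instead of only logging a warning.
            System.out.println("cause: " + ee.getCause().getMessage());
        } finally {
            pool.shutdownNow();
        }
    }
}
```

This is why handling the streamer failure inside {{call()}} (as the review suggested) is safer than relying on the take()/get() loop alone.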
[jira] [Updated] (HDFS-9403) Erasure coding: some EC tests are missing timeout
[ https://issues.apache.org/jira/browse/HDFS-9403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9403: -- Status: Patch Available (was: In Progress) > Erasure coding: some EC tests are missing timeout > - > > Key: HDFS-9403 > URL: https://issues.apache.org/jira/browse/HDFS-9403 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Affects Versions: 3.0.0 >Reporter: Zhe Zhang >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9403-origin-trunk.00.patch, > HDFS-9403-origin-trunk.01.patch > > > EC data writing pipeline is still being worked on, and bugs could introduce > program hang. We should add a timeout for all tests involving striped > writing. I see at least the following: > * {{TestErasureCodingPolicies}} > * {{TestFileStatusWithECPolicy}} > * {{TestDFSStripedOutputStream}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Status: In Progress (was: Patch Available) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9403) Erasure coding: some EC tests are missing timeout
[ https://issues.apache.org/jira/browse/HDFS-9403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9403: -- Status: In Progress (was: Patch Available) > Erasure coding: some EC tests are missing timeout > - > > Key: HDFS-9403 > URL: https://issues.apache.org/jira/browse/HDFS-9403 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Affects Versions: 3.0.0 >Reporter: Zhe Zhang >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9403-origin-trunk.00.patch > > > EC data writing pipeline is still being worked on, and bugs could introduce > program hang. We should add a timeout for all tests involving striped > writing. I see at least the following: > * {{TestErasureCodingPolicies}} > * {{TestFileStatusWithECPolicy}} > * {{TestDFSStripedOutputStream}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Attachment: HDFS-9494-origin-trunk.03.patch > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Status: Patch Available (was: In Progress) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9403) Erasure coding: some EC tests are missing timeout
[ https://issues.apache.org/jira/browse/HDFS-9403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059563#comment-15059563 ] GAO Rui commented on HDFS-9403: --- Hi [~zhz], The failed tests seem unrelated to the patch. Still, I ran the failed tests on my local machine with java version {{1.7.0_75}}, and they all passed. Could these tests be failing because of the java version? I have merged the current branch with trunk, and will upload a new patch. > Erasure coding: some EC tests are missing timeout > - > > Key: HDFS-9403 > URL: https://issues.apache.org/jira/browse/HDFS-9403 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Affects Versions: 3.0.0 >Reporter: Zhe Zhang >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9403-origin-trunk.00.patch > > > EC data writing pipeline is still being worked on, and bugs could introduce > program hang. We should add a timeout for all tests involving striped > writing. I see at least the following: > * {{TestErasureCodingPolicies}} > * {{TestFileStatusWithECPolicy}} > * {{TestDFSStripedOutputStream}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9403) Erasure coding: some EC tests are missing timeout
[ https://issues.apache.org/jira/browse/HDFS-9403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9403: -- Attachment: HDFS-9403-origin-trunk.01.patch Hi [~zhz], the new patch (01) is attached. Please review it when you are available. Thank you very much. > Erasure coding: some EC tests are missing timeout > - > > Key: HDFS-9403 > URL: https://issues.apache.org/jira/browse/HDFS-9403 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: erasure-coding, test >Affects Versions: 3.0.0 >Reporter: Zhe Zhang >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9403-origin-trunk.00.patch, > HDFS-9403-origin-trunk.01.patch > > > EC data writing pipeline is still being worked on, and bugs could introduce > program hang. We should add a timeout for all tests involving striped > writing. I see at least the following: > * {{TestErasureCodingPolicies}} > * {{TestFileStatusWithECPolicy}} > * {{TestDFSStripedOutputStream}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059643#comment-15059643 ] GAO Rui commented on HDFS-9494: --- [~szetszwo], [~rakeshr] and [~jingzhao], the 03 patch is attached. Streamer-related exceptions are handled inside {{call()}}. ExecutorCompletionService-related exceptions are handled in the same way as in the previous 02 patch. Please review the new patch and feel free to give further comments when you are available. Thank you very much. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch, > HDFS-9494-origin-trunk.03.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Status: In Progress (was: Patch Available) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Status: Patch Available (was: In Progress) > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057377#comment-15057377 ] GAO Rui commented on HDFS-9494: --- [~rakeshr], thank you very much for your detailed comments! I have addressed 1, 3 and 4 in my previous comment. For the second one, I drafted the following code: {code} for (int i = 0; i < healthyStreamerCount; i++) { try { executorCompletionService.take().get(); } catch (InterruptedException ie) { throw DFSUtilClient.toInterruptedIOException( "Interrupted during waiting all streamer flush. ", ie); } catch (ExecutionException ee) { LOG.warn("Caught ExecutionException during waiting all streamer " + "flush", ee); } } {code} I think it should be enough to handle {{ExecutionException}}. What do you think? > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7344) [umbrella] Erasure Coding worker and support in DataNode
[ https://issues.apache.org/jira/browse/HDFS-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057515#comment-15057515 ] GAO Rui commented on HDFS-7344: --- [~zhz] Thank you very much for the information. I am trying to draft a design and then implement the {{Converter}} mentioned in HDFS-7717. I think maybe we should design it so that {{Converter}} distributes conversion tasks to several {{ErasureCodingWorker}}s, and these workers then actually carry out the conversion tasks. Could you share your opinions? If Erasure Coding is put on production clusters, {{Converter}} should be among the most used functions: lots of replicated files would be converted to EC files. Without {{Converter}} we can only use distcp, and {{Converter}} could be much more efficient, right? > [umbrella] Erasure Coding worker and support in DataNode > > > Key: HDFS-7344 > URL: https://issues.apache.org/jira/browse/HDFS-7344 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Kai Zheng >Assignee: Li Bo > Attachments: ECWorker-design-v2.pdf, HDFS ECWorker Design.pdf, > hdfs-ec-datanode.0108.zip, hdfs-ec-datanode.0108.zip > > > According to HDFS-7285 and the design, this handles DataNode side extension > and related support for Erasure Coding. More specifically, it implements > {{ECWorker}}, which reconstructs lost blocks (in striping layout). > It generally needs to restore BlockGroup and schema information from coding > commands from NameNode or other entities, and construct specific coding work > to execute. The required block reader, writer, either local or remote, > encoder and decoder, will be implemented separately as sub-tasks. > This JIRA will track all the linked sub-tasks, and is responsible for general > discussions and integration for ECWorker. It won't resolve until all the > related tasks are done. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7661) Support read when a EC file is being written
[ https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057439#comment-15057439 ] GAO Rui commented on HDFS-7661: --- Hi [~zhz], I agree with you about renaming this JIRA to "Support hflush/hsync". Let's wait for the reply of [~vinayrpet], then merge these two JIRAs and work together to nail hflush/hsync down :) > Support read when a EC file is being written > > > Key: HDFS-7661 > URL: https://issues.apache.org/jira/browse/HDFS-7661 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Tsz Wo Nicholas Sze >Assignee: GAO Rui > Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, > HDFS-7661-unitTest-wip-trunk.patch, > HDFS-EC-file-flush-sync-design-version1.1.pdf > > > We also need to support hflush/hsync and visible length. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Attachment: HDFS-9494-origin-trunk.02.patch New patch attached. Please feel free to give any comments, thanks. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch, HDFS-9494-origin-trunk.02.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7344) [umbrella] Erasure Coding worker and support in DataNode
[ https://issues.apache.org/jira/browse/HDFS-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055600#comment-15055600 ] GAO Rui commented on HDFS-7344: --- Hi [~drankye], [~libo-intel], do we still use this JIRA to track the implementation of {{ErasureCodingWorker}}? Could you give me some advice, and point me to JIRAs that could help me catch up with the {{ErasureCodingWorker}} implementation? > [umbrella] Erasure Coding worker and support in DataNode > > > Key: HDFS-7344 > URL: https://issues.apache.org/jira/browse/HDFS-7344 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode >Reporter: Kai Zheng >Assignee: Li Bo > Attachments: ECWorker-design-v2.pdf, HDFS ECWorker Design.pdf, > hdfs-ec-datanode.0108.zip, hdfs-ec-datanode.0108.zip > > > According to HDFS-7285 and the design, this handles DataNode side extension > and related support for Erasure Coding. More specifically, it implements > {{ECWorker}}, which reconstructs lost blocks (in striping layout). > It generally needs to restore BlockGroup and schema information from coding > commands from NameNode or other entities, and construct specific coding work > to execute. The required block reader, writer, either local or remote, > encoder and decoder, will be implemented separately as sub-tasks. > This JIRA will track all the linked sub-tasks, and is responsible for general > discussions and integration for ECWorker. It won't resolve until all the > related tasks are done. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055659#comment-15055659 ] GAO Rui commented on HDFS-9494: --- I have fixed the checkstyle issues found by Hadoop QA, but I haven't figured out the javac issue yet. I'll upload the next patch, which will not cause checkstyle issues, along with addressing any further comments. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GAO Rui updated HDFS-9494: -- Attachment: HDFS-9494-origin-trunk.01.patch > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-9494) Parallel optimization of DFSStripedOutputStream#flushAllInternals( )
[ https://issues.apache.org/jira/browse/HDFS-9494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1502#comment-1502 ] GAO Rui commented on HDFS-9494: --- Thank you very much [~szetszwo]. I have refactored the code to use {{ExecutorCompletionService.take()}} to wait for all the streamers to flush. Because {{ExecutorCompletionService}} cannot tell us how many Future objects are still pending, I use {{healthyStreamerCount}} to count how many times take() should be called. And since {{Runnable.run()}} cannot throw a checked Exception, I use {{Callable.call() throws Exception}} and return null. > Parallel optimization of DFSStripedOutputStream#flushAllInternals( ) > > > Key: HDFS-9494 > URL: https://issues.apache.org/jira/browse/HDFS-9494 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: GAO Rui >Assignee: GAO Rui >Priority: Minor > Attachments: HDFS-9494-origin-trunk.00.patch, > HDFS-9494-origin-trunk.01.patch > > > Currently, in DFSStripedOutputStream#flushAllInternals( ), we trigger and > wait for flushInternal( ) in sequence. So the runtime flow is like: > {code} > Streamer0#flushInternal( ) > Streamer0#waitForAckedSeqno( ) > Streamer1#flushInternal( ) > Streamer1#waitForAckedSeqno( ) > … > Streamer8#flushInternal( ) > Streamer8#waitForAckedSeqno( ) > {code} > It could be better to trigger all the streamers to flushInternal( ) and > wait for all of them to return from waitForAckedSeqno( ), and then > flushAllInternals( ) returns. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
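Putting the pieces from this thread together (Callable tasks, an ExecutorCompletionService, and a take()/get() loop bounded by the submission count), a simplified sketch of the parallel flush could look like the following. The {{Streamer}} interface and the method signature here are stand-ins for illustration, not the actual DFSStripedOutputStream code:

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelFlushSketch {
    // Hypothetical stand-in for one streamer's flushInternal()
    // followed by waitForAckedSeqno().
    interface Streamer { void flushInternal() throws IOException; }

    // Trigger all streamers in parallel, then wait for each to finish.
    // Callable is used instead of Runnable because call() may throw.
    static void flushAllInternals(Streamer[] streamers, ExecutorService pool)
            throws IOException {
        ExecutorCompletionService<Void> ecs =
            new ExecutorCompletionService<>(pool);
        int healthyStreamerCount = 0;
        for (Streamer s : streamers) {
            ecs.submit(() -> { s.flushInternal(); return null; });
            healthyStreamerCount++;
        }
        // ExecutorCompletionService does not expose how many futures are
        // pending, so count submissions and take() exactly that many times.
        for (int i = 0; i < healthyStreamerCount; i++) {
            try {
                ecs.take().get();
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException(
                    "Interrupted while waiting for all streamers to flush");
            } catch (ExecutionException ee) {
                // In the real patch the streamer failure is handled inside
                // call(); here we simply surface the cause.
                throw new IOException(ee.getCause());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Streamer[] streamers = new Streamer[9];
        for (int i = 0; i < streamers.length; i++) {
            final int id = i;
            streamers[i] = () -> System.out.println("streamer " + id + " flushed");
        }
        flushAllInternals(streamers, pool);
        pool.shutdown();
        System.out.println("all flushed");
    }
}
```

With 9 streamers (as in the RS-6-3 layout sketched in the issue description), the flushes overlap instead of running one after another, so the wall-clock cost approaches the slowest single streamer rather than the sum of all of them.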