[jira] [Updated] (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
     [ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Srinivas updated HDFS-142:
---------------------------------
    Fix Version/s: 0.20.205.0

> In 0.20, move blocks being written into a blocksBeingWritten directory
> ----------------------------------------------------------------------
>
>                 Key: HDFS-142
>                 URL: https://issues.apache.org/jira/browse/HDFS-142
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.20-append
>            Reporter: Raghu Angadi
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.20-append, 0.20.205.0
>
>         Attachments: HDFS-142-deaddn-fix.patch, HDFS-142-finalize-fix.txt,
> HDFS-142-multiple-blocks-datanode-exception.patch,
> HDFS-142.20-security.1.patch, HDFS-142.20-security.2.patch,
> HDFS-142_20-append2.patch, HDFS-142_20.patch, appendFile-recheck-lease.txt,
> appendQuestions.txt, deleteTmp.patch, deleteTmp2.patch, deleteTmp5_20.txt,
> deleteTmp_0.18.patch, dont-recover-rwr-when-rbw-available.txt,
> handleTmp1.patch, hdfs-142-commitBlockSynchronization-unknown-datanode.txt,
> hdfs-142-minidfs-fix-from-409.txt,
> hdfs-142-recovery-reassignment-and-bbw-cleanup.txt, hdfs-142-testcases.txt,
> hdfs-142-testleaserecovery-fix.txt, recentInvalidateSets-assertion-fix.txt,
> recover-rbw-v2.txt, testfileappend4-deaddn.txt,
> validateBlockMetaData-synchronized.txt
>
> Before 0.18, when the Datanode restarted it deleted files under the
> data-dir/tmp directory, since those files were no longer valid. But in 0.18
> it moves these files into the normal directory, incorrectly making them
> valid blocks. Either of the following would work:
> - remove the tmp files during upgrade, or
> - if the files under /tmp are in the pre-0.18 format (i.e. no generation
> stamp), delete them.
> Currently the effect of this bug is that these files end up failing block
> verification and eventually get deleted, but before that they cause
> incorrect over-replication at the namenode.
> Also, it looks like our policy regarding the treatment of files under tmp
> needs to be defined better. Right now there are probably one or two more
> bugs in it. Dhruba, please file them if you remember.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
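The second option above hinges on telling pre-0.18 block metadata files apart from newer ones by their names. A minimal sketch of that check, assuming the usual naming convention (post-0.18 meta files carry a generation stamp, `blk_<id>_<genStamp>.meta`, while pre-0.18 ones do not); the helper name is hypothetical, not the actual DataNode code:

```java
import java.util.regex.Pattern;

// Hypothetical helper: decides whether a meta file under data-dir/tmp is in
// the pre-0.18 on-disk format and should therefore be deleted during upgrade.
public class TmpBlockCleaner {
    // Post-0.18 meta files look like blk_<blockId>_<generationStamp>.meta;
    // pre-0.18 meta files omit the generation stamp: blk_<blockId>.meta.
    private static final Pattern WITH_GEN_STAMP =
        Pattern.compile("blk_(-?\\d+)_(\\d+)\\.meta");
    private static final Pattern WITHOUT_GEN_STAMP =
        Pattern.compile("blk_(-?\\d+)\\.meta");

    public static boolean isPreGenerationStampMeta(String fileName) {
        return !WITH_GEN_STAMP.matcher(fileName).matches()
            && WITHOUT_GEN_STAMP.matcher(fileName).matches();
    }
}
```

A file matching the stamp-less pattern would be a candidate for deletion; everything else under tmp would fall under whatever the clarified policy decides.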
[jira] [Updated] (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
     [ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HDFS-142:
--------------------------------------
    Attachment: HDFS-142.20-security.2.patch

Added Apache License header.
[jira] [Updated] (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
     [ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HDFS-142:
--------------------------------------
    Attachment: HDFS-142.20-security.1.patch

Patch for the 20-security branch uploaded.
[jira] Updated: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
     [ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg updated HDFS-142:
-------------------------------------
    Attachment: HDFS-142_20-append2.patch

Patch for the 0.20-append branch. Starts with HDFS-142_20.patch and includes all patches up to appendFile-recheck-lease.txt. Had trouble adding the relatively new recover-rbw-v2.txt, so left that for Todd. Assumes that the 0.20-append patches in HDFS-826, HDFS-988, & HDFS-101 have been applied previously.
[jira] Updated: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
     [ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-142:
-----------------------------
    Attachment: recover-rbw-v2.txt

New version of the dont-recover-rbw patch (this is what I've been testing against).
[jira] Updated: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
     [ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HDFS-142:
----------------------------------
    Fix Version/s: 0.20-append
[jira] Updated: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
     [ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HDFS-142:
----------------------------------
    Affects Version/s: 0.20-append
[jira] Updated: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
     [ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-142:
-----------------------------
    Attachment: dont-recover-rwr-when-rbw-available.txt

The attached patch treats replicas recovered during DN startup as possibly truncated, and thus recovers from those replicas only when no replica on a still-running DN is available. (The included test case explains this better.)
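The selection rule Todd describes can be sketched as follows. This is a hypothetical illustration of the policy, not the patch itself; `RBW` stands for a replica-being-written on a still-running DN, `RWR` for a replica recovered after a DN restart (possibly truncated):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: replicas recovered after a DataNode restart (RWR) may
// be truncated, so they are used as recovery sources only when no replica on
// a still-running DataNode (RBW) is available.
public class RecoverySourceSelector {
    public enum ReplicaState { RBW, RWR }

    public static List<ReplicaState> selectSources(List<ReplicaState> replicas) {
        boolean haveRbw = replicas.contains(ReplicaState.RBW);
        List<ReplicaState> sources = new ArrayList<>();
        for (ReplicaState r : replicas) {
            // Drop RWR replicas whenever at least one RBW replica exists.
            if (r == ReplicaState.RBW || !haveRbw) {
                sources.add(r);
            }
        }
        return sources;
    }
}
```

With a mixed set of replicas only the RBW ones survive as recovery sources; with RWR replicas alone, they are still used, since a possibly truncated replica beats no replica at all.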
[jira] Updated: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
     [ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-142:
-----------------------------
    Attachment: appendFile-recheck-lease.txt

appendFile() is made up of two synchronized blocks, and there is no re-check of the file's existence (or lease) when entering the second one. Uploading a test case and fix.
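The race Todd points out is a classic check-then-act gap: between the two synchronized blocks another client can delete the file or take over the lease. A minimal sketch of the shape of the fix, with illustrative names rather than the actual FSNamesystem code:

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: appendFile() spans two synchronized blocks, so the
// file (or its lease) can disappear between them; the fix is to re-check on
// entry to the second block.
public class AppendLeaseRecheck {
    private final Object lock = new Object();
    private final Map<String, String> leaseHolderByPath = new HashMap<>();

    public void addLease(String path, String holder) {
        synchronized (lock) { leaseHolderByPath.put(path, holder); }
    }

    public void removeLease(String path) {
        synchronized (lock) { leaseHolderByPath.remove(path); }
    }

    public void appendFile(String path, String holder) throws IOException {
        synchronized (lock) {
            checkLease(path, holder);   // first check
            // ... set up the append; the lock is released afterwards ...
        }
        // Another client may delete the file or reclaim the lease here.
        synchronized (lock) {
            checkLease(path, holder);   // the fix: re-check before continuing
            // ... locate and return the last block to the client ...
        }
    }

    private void checkLease(String path, String holder) throws IOException {
        String current = leaseHolderByPath.get(path);
        if (current == null) throw new IOException("No lease on " + path);
        if (!current.equals(holder)) throw new IOException("Lease held by " + current);
    }
}
```

Without the second `checkLease`, the method would happily operate on a file whose lease was lost while the lock was released.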
[jira] Updated: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
     [ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-142:
-----------------------------
    Attachment: recentInvalidateSets-assertion-fix.txt

A small fix for another test failure exposed by TestFileAppend2.testComplexAppend (when run with Java assertions enabled). When we removed blocks from recentInvalidateSets, we didn't remove the collections once they became empty, which triggered an assertion at the top of DatanodeDescriptor.addBlocksToBeInvalidated. (This is not an issue in trunk; trunk already has this same fix.)
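The pattern behind this fix is general: when a map holds per-key collections and an invariant says every mapped collection is non-empty, removal must clean up emptied collections. A small sketch under that assumption (names are illustrative, not the actual FSNamesystem fields):

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

// Hypothetical sketch of the fix: when a block is removed from a
// per-datanode invalidation set, an emptied set must also be removed from
// the map, otherwise code that asserts "every mapped set is non-empty" trips.
public class InvalidateSets {
    private final Map<String, Collection<Long>> recentInvalidateSets = new HashMap<>();

    public void addBlock(String storageId, long blockId) {
        recentInvalidateSets
            .computeIfAbsent(storageId, k -> new HashSet<>())
            .add(blockId);
    }

    public void removeBlock(String storageId, long blockId) {
        Collection<Long> blocks = recentInvalidateSets.get(storageId);
        if (blocks == null) return;
        blocks.remove(blockId);
        if (blocks.isEmpty()) {
            recentInvalidateSets.remove(storageId);  // the fix: drop the empty set
        }
    }

    public boolean hasEntryFor(String storageId) {
        return recentInvalidateSets.containsKey(storageId);
    }
}
```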
[jira] Updated: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
     [ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-142:
-----------------------------
    Attachment: validateBlockMetaData-synchronized.txt
[jira] Updated: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
     [ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-142:
-----------------------------
    Attachment: hdfs-142-testleaserecovery-fix.txt

Here's a fix for TestLeaseRecovery so that it passes even with the new safeguards in FSDataset.updateBlock.
[jira] Updated: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
     [ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-142:
-----------------------------
    Attachment: hdfs-142-recovery-reassignment-and-bbw-cleanup.txt

Attaching a patch with two more fixes:
- If a block is received that is part of a file that no longer exists, remove it. This prevents blocks from getting orphaned in the blocksBeingWritten directory forever.
- File recovery happens after reassigning the lease to an NN_Recovery client.

This also includes safeguards and tests to ensure that straggling commitBlockSynchronization calls cannot incorrectly overwrite the last block of a file with an old generation stamp or a different block ID.
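The safeguard against straggling commitBlockSynchronization calls reduces to two comparisons before any state is touched. A hedged sketch of that check (names and signature are illustrative, not the actual NameNode code):

```java
// Hypothetical sketch of the safeguard: a straggling
// commitBlockSynchronization call must not overwrite a file's last block
// with an older generation stamp or a different block ID.
public class CommitBlockSyncGuard {
    public static boolean mayCommit(long storedBlockId, long storedGenStamp,
                                    long reportedBlockId, long newGenStamp) {
        if (reportedBlockId != storedBlockId) {
            return false;  // recovery report for some other (stale) block
        }
        if (newGenStamp <= storedGenStamp) {
            return false;  // an older or duplicate recovery attempt
        }
        return true;
    }
}
```

Rejecting early here means a late-arriving recovery from a previous attempt cannot clobber a block that a newer, successful recovery already committed.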
[jira] Updated: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
     [ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-142:
-----------------------------
    Attachment: hdfs-142-commitBlockSynchronization-unknown-datanode.txt
                hdfs-142-testcases.txt

Uploading two more patches for 0.20 append:
- hdfs-142-commitBlockSynchronization-unknown-datanode.txt fixes a case where FSN.getDatanode was throwing an UnregisteredDatanodeException because one of the original recovery targets had departed the cluster (in this case it had been replaced by a new DN with the same storage but a different port). This exception was causing commitBlockSynchronization to fail after removing the old block from blocksMap but before putting in the new one, making both the old and new blocks inaccessible, and causing any further nextGenerationStamp calls to fail.
- hdfs-142-testcases.txt includes two new test cases:
  - testRecoverFinalizedBlock stops a writer just before it calls completeFile() and then has another client recover the file.
  - testDatanodeFailsToCommit() injects an IOE when the DN calls commitBlockSynchronization for the first time, to make sure that the retry succeeds even though updateBlocks() was already called during the first synchronization attempt.
  - These tests pass after applying Sam's patch to fix refinalization of a finalized block.
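The unknown-datanode fix amounts to tolerating departed recovery targets instead of aborting mid-update, so the blocksMap transition completes atomically from the caller's point of view. A hedged sketch of that tolerance (method and collection names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: while committing a block synchronization, a recovery
// target that has left the cluster (e.g. replaced by a DN with the same
// storage but a different port) is skipped rather than aborting the whole
// commit with an UnregisteredDatanodeException mid-update.
public class CommitSyncTargets {
    public static List<String> resolveLiveTargets(List<String> recoveryTargets,
                                                  Set<String> registeredDatanodes) {
        List<String> live = new ArrayList<>();
        for (String target : recoveryTargets) {
            if (registeredDatanodes.contains(target)) {
                live.add(target);
            }
            // else: departed node; skip instead of throwing, so the
            // blocksMap update completes and the new block stays reachable.
        }
        return live;
    }
}
```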
[jira] Updated: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
[ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam rash updated HDFS-142: -- Attachment: HDFS-142-finalize-fix.txt

Fixes an issue with lease recovery failing on blocks that are already finalized.
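The fix above can be sketched as a state check: if lease recovery reaches a replica that is already finalized, treat it as already recovered instead of failing. This is a hypothetical illustration with simplified replica states, not the actual DataNode recovery code.

```java
// Hypothetical sketch of the finalize fix in lease recovery.
public class LeaseRecoverySketch {
    enum State { BEING_WRITTEN, FINALIZED }

    // Buggy: only knows how to finalize an in-progress replica, so an
    // already-finalized block makes recovery fail outright.
    static boolean recoverBuggy(State s) {
        if (s == State.BEING_WRITTEN) {
            return true; // finalize the replica, report success
        }
        throw new IllegalStateException("cannot re-finalize " + s);
    }

    // Fixed: an already-finalized replica counts as a successful recovery.
    static boolean recoverFixed(State s) {
        if (s == State.FINALIZED) {
            return true; // nothing left to do
        }
        return true; // finalize the in-progress replica
    }
}
```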
[jira] Updated: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
[ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicolas Spiegelberg updated HDFS-142: - Attachment: HDFS-142-deaddn-fix.patch

Added a patch to fix Todd's dead-DN problem. The main problem: DFSOutputStream called processDatanodeError() but then ignored the return value. This meant that any slew of pipeline-creation exceptions would be ignored and the client would think that append() had succeeded. Good catch!
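The ignored-return-value bug above can be sketched in a few lines: the error handler reports whether the pipeline was rebuilt, but the caller discards that result, so repeated pipeline failures look like a successful append. A minimal sketch with invented method shapes, not the actual DFSOutputStream code:

```java
// Hypothetical sketch of the dead-DN bug: dropping the boolean result
// of the datanode error handler swallows pipeline-creation failures.
public class PipelineSketch {
    // Stands in for processDatanodeError(): true means the pipeline was
    // successfully rebuilt and streaming may continue.
    static boolean processDatanodeError(boolean pipelineRecovered) {
        return pipelineRecovered;
    }

    // Buggy caller: return value dropped, so failure is never surfaced.
    static boolean appendBuggy(boolean pipelineRecovered) {
        processDatanodeError(pipelineRecovered);
        return true; // always reports success to the client
    }

    // Fixed caller: propagate the handler's result to the client.
    static boolean appendFixed(boolean pipelineRecovered) {
        return processDatanodeError(pipelineRecovered);
    }
}
```

With the buggy caller, a dead datanode still yields a "successful" append; the fixed caller lets the client see the failure and react.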
[jira] Updated: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory
[ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-142: - Summary: In 0.20, move blocks being written into a blocksBeingWritten directory (was: Datanode should delete files under tmp when upgraded from 0.17)

Renaming the JIRA to reflect the actual scope of this issue in the branch-20 sync work.