[jira] [Updated] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDFS-11817:
-----------------------------------
    Attachment: HDFS-11817.branch-2.7.001.patch

> A faulty node can cause a lease leak and NPE on accessing data
> ---------------------------------------------------------------
>
>                 Key: HDFS-11817
>                 URL: https://issues.apache.org/jira/browse/HDFS-11817
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.8.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>             Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2
>
>         Attachments: HDFS-11817.branch-2.7.001.patch,
> HDFS-11817.branch-2.patch, HDFS-11817.v2.branch-2.8.patch,
> HDFS-11817.v2.branch-2.patch, HDFS-11817.v2.trunk.patch,
> hdfs-11817_supplement.txt
>
>
> When the namenode performs a lease recovery for a failed write,
> {{commitBlockSynchronization()}} will fail if none of the new targets has
> sent a received-IBR. At this point, the data is inaccessible, as the
> namenode will throw a {{NullPointerException}} upon {{getBlockLocations()}}.
>
> The lease recovery will be retried in about an hour by the namenode. If the
> nodes are faulty (usually when there is only one new target), they may not
> have sent a block report by this point. If this happens, lease recovery
> throws an {{AlreadyBeingCreatedException}}, which causes LeaseManager to
> simply remove the lease without finalizing the inode.
>
> This results in an inconsistent lease state. The inode stays
> under-construction, but no more lease recovery is attempted. A manual lease
> recovery is also not allowed.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
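
The failure sequence in the issue description can be sketched as a small toy model. This is only an illustration of the described state transitions, not Hadoop code: the class LeaseLeakSketch, its ToyInode and AlreadyBeingCreated types, and the methods attemptRecovery, leaseMonitorPass and isLeaked are all hypothetical names, and the rule that a second unsuccessful attempt throws is an assumption drawn from the description rather than from the namenode source.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the lease leak described in HDFS-11817. Not Hadoop code. */
public class LeaseLeakSketch {

    /** Hypothetical stand-in for AlreadyBeingCreatedException. */
    static class AlreadyBeingCreated extends Exception {
        AlreadyBeingCreated(String msg) { super(msg); }
    }

    /** Hypothetical inode: tracks only the two flags relevant to this issue. */
    static class ToyInode {
        boolean underConstruction = true;
        boolean recoveryAttempted = false;
    }

    final Map<String, ToyInode> inodes = new HashMap<>();
    final Set<String> leases = new HashSet<>();

    /** Opens a file for write: creates an under-construction inode and a lease. */
    void openForWrite(String path) {
        inodes.put(path, new ToyInode());
        leases.add(path);
    }

    /**
     * One recovery attempt. targetReported models whether any new target
     * has sent a received-IBR. Assumption: a repeated attempt without a
     * report fails with AlreadyBeingCreated, as the description states.
     */
    boolean attemptRecovery(String path, boolean targetReported)
            throws AlreadyBeingCreated {
        ToyInode inode = inodes.get(path);
        if (targetReported) {
            inode.underConstruction = false; // inode finalized
            leases.remove(path);             // lease released normally
            return true;
        }
        if (inode.recoveryAttempted) {
            throw new AlreadyBeingCreated("recovery already started for " + path);
        }
        inode.recoveryAttempted = true;      // commit failed; retry later
        return false;
    }

    /** Models the problematic behavior: on exception the lease is simply dropped. */
    void leaseMonitorPass(String path, boolean targetReported) {
        try {
            attemptRecovery(path, targetReported);
        } catch (AlreadyBeingCreated e) {
            leases.remove(path);             // lease gone, inode NOT finalized
        }
    }

    /** A file is leaked when it is under construction but holds no lease. */
    boolean isLeaked(String path) {
        ToyInode inode = inodes.get(path);
        return inode != null && inode.underConstruction && !leases.contains(path);
    }

    public static void main(String[] args) {
        LeaseLeakSketch nn = new LeaseLeakSketch();
        nn.openForWrite("/data/f");
        nn.leaseMonitorPass("/data/f", false); // first attempt: no received-IBR
        System.out.println("leaked after 1st pass: " + nn.isLeaked("/data/f"));
        nn.leaseMonitorPass("/data/f", false); // retry: still no report
        System.out.println("leaked after 2nd pass: " + nn.isLeaked("/data/f"));
    }
}
```

In this model the file is not leaked after the first pass, because the lease is still held, and becomes leaked after the second pass, when the lease is removed while the inode remains under construction. That matches the inconsistent state the description reports.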
[jira] [Updated] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira Ajisaka updated HDFS-11817:
---------------------------------
    Fix Version/s: 2.9.0
[jira] [Updated] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kihwal Lee updated HDFS-11817:
------------------------------
    Attachment: HDFS-11817.v2.branch-2.8.patch

Attaching what's committed to 2.8 as reference. The cherry-pick from branch-2 was clean, but had to update one method call in the test, since the containing class has been changed.
[jira] [Updated] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kihwal Lee updated HDFS-11817:
------------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 2.8.2
                   3.0.0-alpha3
           Status: Resolved  (was: Patch Available)

Thanks for the review, Daryn.
[jira] [Updated] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kihwal Lee updated HDFS-11817:
------------------------------
    Attachment: HDFS-11817.v2.trunk.patch
                HDFS-11817.v2.branch-2.patch
[jira] [Updated] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kihwal Lee updated HDFS-11817:
------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kihwal Lee updated HDFS-11817:
------------------------------
    Attachment: HDFS-11817.branch-2.patch

I started the patch for branch-2.8 and branch-2. The trunk version is not ready yet, but want to run the branch-2 version through precommit before the weekend.
[jira] [Updated] (HDFS-11817) A faulty node can cause a lease leak and NPE on accessing data
[ https://issues.apache.org/jira/browse/HDFS-11817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kihwal Lee updated HDFS-11817:
------------------------------
    Attachment: hdfs-11817_supplement.txt

Attaching supplemental information including stack traces.