[jira] [Updated] (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN

2015-03-09 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-1262:
---
Fix Version/s: (was: 0.20-append)

> Failed pipeline creation during append leaves lease hanging on NN
> -
>
> Key: HDFS-1262
> URL: https://issues.apache.org/jira/browse/HDFS-1262
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, namenode
>Affects Versions: 0.20-append
>Reporter: Todd Lipcon
>Assignee: sam rash
>Priority: Critical
> Attachments: hdfs-1262-1.txt, hdfs-1262-2.txt, hdfs-1262-3.txt, 
> hdfs-1262-4.txt, hdfs-1262-5.txt
>
>
> Ryan Rawson came upon this nasty bug in HBase cluster testing. What happened 
> was the following:
> 1) File's original writer died
> 2) Recovery client tried to open file for append - looped for a minute or so 
> until soft lease expired, then append call initiated recovery
> 3) Recovery completed successfully
> 4) Recovery client called append again, which succeeded on the NN
> 5) For some reason, the block recovery that happens at the start of append 
> pipeline creation failed on all datanodes 6 times, causing the append() call 
> to throw an exception back to HBase master. HBase assumed the file wasn't 
> open and put it back on a queue to try later
> 6) Some time later, it tried append again, but the lease was still assigned 
> to the same DFS client, so it wasn't able to recover.
> The recovery failure in step 5 is a separate issue, but the problem for this 
> JIRA is that the client can think it failed to open a file for append when the NN 
> thinks the writer holds a lease. Since the writer keeps renewing its lease, 
> recovery never happens, and no one can open or recover the file until the DFS 
> client shuts down.
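
The sequence above can be reproduced in a toy model. This is only a sketch under stated assumptions: ToyNameNode, ToyClient and the releaseOnFailure flag are hypothetical names invented for illustration and do not mirror real HDFS classes or the attached patches. It shows how a lease granted on the NN side outlives a client-side pipeline failure, and how releasing it on failure would let a later append proceed.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: every class here is hypothetical, not real HDFS code.
class LeaseLeakSketch {
    /** Stand-in for the NameNode lease table: path -> lease holder. */
    static class ToyNameNode {
        private final Map<String, String> leases = new HashMap<>();

        /** Grants the lease, or fails if any client (even the caller) already holds it. */
        void append(String path, String client) {
            String holder = leases.get(path);
            if (holder != null) {
                throw new IllegalStateException(path + " already leased by " + holder);
            }
            leases.put(path, client);
        }

        /** Drops the lease only if the caller is the current holder. */
        void abandon(String path, String client) {
            leases.remove(path, client);
        }

        String holderOf(String path) {
            return leases.get(path);
        }
    }

    /** Stand-in for a client whose pipeline setup can fail after the NN call succeeded. */
    static class ToyClient {
        final String name;
        final ToyNameNode nn;
        final boolean releaseOnFailure; // assumed remedy, for comparison only

        ToyClient(String name, ToyNameNode nn, boolean releaseOnFailure) {
            this.name = name;
            this.nn = nn;
            this.releaseOnFailure = releaseOnFailure;
        }

        void append(String path, boolean pipelineFails) {
            nn.append(path, name); // step 4: succeeds on the NN
            if (pipelineFails) {   // step 5: pipeline creation fails on the client
                if (releaseOnFailure) {
                    nn.abandon(path, name);
                }
                throw new RuntimeException("pipeline creation failed");
            }
        }
    }

    /** Returns true if the append succeeded, false if it threw. */
    static boolean tryAppend(ToyClient client, String path, boolean pipelineFails) {
        try {
            client.append(path, pipelineFails);
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ToyNameNode nn = new ToyNameNode();
        ToyClient leaky = new ToyClient("client-1", nn, false);
        System.out.println("first append ok: " + tryAppend(leaky, "/f", true));
        System.out.println("lease holder afterwards: " + nn.holderOf("/f"));
        System.out.println("retry ok: " + tryAppend(leaky, "/f", false)); // step 6
    }
}
```

In this model the retry in step 6 fails because the table still lists the same client as the holder. Constructing the client with releaseOnFailure set to true clears the entry before the exception propagates.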



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] Updated: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN

2010-06-28 Thread sam rash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sam rash updated HDFS-1262:
---

Attachment: hdfs-1262-1.txt

- Test case for append and create failures.
- Tried to make both cases fail fast, but the create case will hit the test timeout 
(by default, a create that gets AlreadyBeingCreatedException does 5 retries with a 
60s sleep).
- Without the fix, the append case fails within 30s in the worst case.

> Failed pipeline creation during append leaves lease hanging on NN
> -
>
> Key: HDFS-1262
> URL: https://issues.apache.org/jira/browse/HDFS-1262
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, name-node
>Affects Versions: 0.20-append
>Reporter: Todd Lipcon
>Assignee: sam rash
>Priority: Critical
> Fix For: 0.20-append
>
> Attachments: hdfs-1262-1.txt

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN

2010-06-30 Thread sam rash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sam rash updated HDFS-1262:
---

Attachment: hdfs-1262-2.txt

removed hdfs-894 change from patch (commit this to 0.20-append separately)

> Failed pipeline creation during append leaves lease hanging on NN
> -
>
> Key: HDFS-1262
> URL: https://issues.apache.org/jira/browse/HDFS-1262
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, name-node
>Affects Versions: 0.20-append
>Reporter: Todd Lipcon
>Assignee: sam rash
>Priority: Critical
> Fix For: 0.20-append
>
> Attachments: hdfs-1262-1.txt, hdfs-1262-2.txt



[jira] Updated: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN

2010-07-22 Thread sam rash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sam rash updated HDFS-1262:
---

Attachment: hdfs-1262-3.txt

removed empty file MockitoUtil

> Failed pipeline creation during append leaves lease hanging on NN
> -
>
> Key: HDFS-1262
> URL: https://issues.apache.org/jira/browse/HDFS-1262
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, name-node
>Affects Versions: 0.20-append
>Reporter: Todd Lipcon
>Assignee: sam rash
>Priority: Critical
> Fix For: 0.20-append
>
> Attachments: hdfs-1262-1.txt, hdfs-1262-2.txt, hdfs-1262-3.txt



[jira] Updated: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN

2010-07-22 Thread sam rash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sam rash updated HDFS-1262:
---

Attachment: hdfs-1262-4.txt

Fixed a bug where calling append() to trigger lease recovery resulted in a 
client-side exception (from trying to abandon a file that the client doesn't own 
the lease on).

DFSClient now catches this exception and logs it.
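
The catch-and-log behaviour described above might look roughly like the following. This is a sketch under assumptions, not the patch itself: ToyNameNode, abandonFile, PipelineSetup and setupOrAbandon are hypothetical names, and java.util.logging stands in for whatever logger the client uses. The point it illustrates is that a failed cleanup is logged while the original failure is still the one reported to the caller.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch only: interface and method names are hypothetical, not the HDFS API.
class AbandonOnFailureSketch {
    private static final Logger LOG = Logger.getLogger("AbandonOnFailureSketch");

    /** Stand-in for the NN call that gives up a file the client opened. */
    interface ToyNameNode {
        void abandonFile(String path, String client) throws IOException;
    }

    /** Stand-in for the client-side pipeline setup that may fail. */
    interface PipelineSetup {
        void run() throws IOException;
    }

    /** Runs setup; on failure tries to abandon the file, logging any cleanup error. */
    static void setupOrAbandon(ToyNameNode nn, String path, String client,
                               PipelineSetup setup) throws IOException {
        try {
            setup.run();
        } catch (IOException original) {
            try {
                nn.abandonFile(path, client);
            } catch (IOException cleanupFailure) {
                // For example, the client never owned the lease because the
                // append call only triggered recovery. Log and carry on.
                LOG.log(Level.WARNING, "Could not abandon " + path, cleanupFailure);
            }
            throw original; // the caller still sees the real failure
        }
    }

    public static void main(String[] args) {
        ToyNameNode nn = (path, client) -> {
            throw new IOException(client + " does not hold the lease on " + path);
        };
        try {
            setupOrAbandon(nn, "/f", "client-1", () -> {
                throw new IOException("pipeline creation failed");
            });
        } catch (IOException e) {
            System.out.println("caller saw: " + e.getMessage());
        }
    }
}
```

Rethrowing the original exception rather than the cleanup one keeps the caller's view of the failure unchanged, whether or not the abandon call succeeds.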

> Failed pipeline creation during append leaves lease hanging on NN
> -
>
> Key: HDFS-1262
> URL: https://issues.apache.org/jira/browse/HDFS-1262
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, name-node
>Affects Versions: 0.20-append
>Reporter: Todd Lipcon
>Assignee: sam rash
>Priority: Critical
> Fix For: 0.20-append
>
> Attachments: hdfs-1262-1.txt, hdfs-1262-2.txt, hdfs-1262-3.txt, 
> hdfs-1262-4.txt



[jira] Updated: (HDFS-1262) Failed pipeline creation during append leaves lease hanging on NN

2010-08-22 Thread sam rash (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sam rash updated HDFS-1262:
---

Attachment: hdfs-1262-5.txt

Addressed Todd's comments (except for RPC compatibility, which is pending discussion).

> Failed pipeline creation during append leaves lease hanging on NN
> -
>
> Key: HDFS-1262
> URL: https://issues.apache.org/jira/browse/HDFS-1262
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, name-node
>Affects Versions: 0.20-append
>Reporter: Todd Lipcon
>Assignee: sam rash
>Priority: Critical
> Fix For: 0.20-append
>
> Attachments: hdfs-1262-1.txt, hdfs-1262-2.txt, hdfs-1262-3.txt, 
> hdfs-1262-4.txt, hdfs-1262-5.txt