[jira] [Updated] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-15 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-11674:
---
Labels:   (was: release-blocker)

> reserveSpaceForReplicas is not released if append request failed due to 
> mirror down and replica recovered
> -
>
> Key: HDFS-11674
> URL: https://issues.apache.org/jira/browse/HDFS-11674
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Critical
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha3, 2.8.1
>
> Attachments: HDFS-11674-01.patch, HDFS-11674-02.patch, 
> HDFS-11674-03.patch, HDFS-11674-branch-2.7-03.patch
>
>
> Scenario:
> 1. A 3-node cluster with 
> "dfs.client.block.write.replace-datanode-on-failure.policy" set to DEFAULT.
> A block is written with x bytes of data.
> 2. One of the DataNodes, NOT the first DN in the pipeline, goes down.
> 3. The client tries to append data to the block and fails, since one DN is down.
> 4. The client calls recoverLease() on the file.
> 5. Recovery succeeds.
> Issue:
> 1. The DNs the client had connected to before hitting the failed mirror have 
> reservedSpaceForReplicas incremented, BUT it is never decremented. 
> 2. So over time, all of a DN's space ends up held in reservedSpaceForReplicas, 
> resulting in OutOfSpace errors.
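
For context, a minimal client-side sketch of the reproduction above (not the patch or its test): it assumes fs.defaultFS points at a running 3-DataNode cluster configured with the DEFAULT replace-datanode-on-failure policy; the path and byte counts are arbitrary, and stopping a non-first pipeline DataNode is assumed to be done by an operator outside the program.

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class AppendReservedSpaceLeakRepro {
  public static void main(String[] args) throws Exception {
    // Assumes fs.defaultFS points at a running 3-DataNode cluster and
    // dfs.client.block.write.replace-datanode-on-failure.policy = DEFAULT.
    Configuration conf = new Configuration();
    DistributedFileSystem fs = (DistributedFileSystem) FileSystem.get(conf);
    Path file = new Path("/repro/append-leak.dat"); // arbitrary example path

    // Step 1: write some data so the block has replicas on all three DNs.
    try (FSDataOutputStream out = fs.create(file, (short) 3)) {
      out.write(new byte[1024]);
    }

    // Step 2 happens outside this program: stop one DataNode that is NOT
    // the first node of the write pipeline.

    // Step 3: the append fails. With one of only three DNs down, the
    // DEFAULT replace-datanode-on-failure policy cannot find a replacement.
    try (FSDataOutputStream out = fs.append(file)) {
      out.write(new byte[1024]);
      out.hflush();
    } catch (IOException expected) {
      System.out.println("Append failed as expected: " + expected);
    }

    // Steps 4 and 5: recover the lease and wait until recovery completes.
    while (!fs.recoverLease(file)) {
      Thread.sleep(1000);
    }

    // Before the fix, the surviving DNs that had prepared for the append
    // still count the block in reservedSpaceForReplicas, and that
    // reservation is never released.
  }
}
{code}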






[jira] [Updated] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-12 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-11674:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.1
   3.0.0-alpha3
   2.7.4
   2.9.0
   Status: Resolved  (was: Patch Available)

Thanks [~arpitagarwal] and [~brahmareddy] for reviews.
Committed to branch-2.7 as well.

> reserveSpaceForReplicas is not released if append request failed due to 
> mirror down and replica recovered
> -
>
> Key: HDFS-11674
> URL: https://issues.apache.org/jira/browse/HDFS-11674
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Critical
>  Labels: release-blocker
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha3, 2.8.1
>
> Attachments: HDFS-11674-01.patch, HDFS-11674-02.patch, 
> HDFS-11674-03.patch, HDFS-11674-branch-2.7-03.patch
>
>
> Scenario:
> 1. A 3-node cluster with 
> "dfs.client.block.write.replace-datanode-on-failure.policy" set to DEFAULT.
> A block is written with x bytes of data.
> 2. One of the DataNodes, NOT the first DN in the pipeline, goes down.
> 3. The client tries to append data to the block and fails, since one DN is down.
> 4. The client calls recoverLease() on the file.
> 5. Recovery succeeds.
> Issue:
> 1. The DNs the client had connected to before hitting the failed mirror have 
> reservedSpaceForReplicas incremented, BUT it is never decremented. 
> 2. So over time, all of a DN's space ends up held in reservedSpaceForReplicas, 
> resulting in OutOfSpace errors.






[jira] [Updated] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-11 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-11674:
-
Attachment: HDFS-11674-branch-2.7-03.patch

> reserveSpaceForReplicas is not released if append request failed due to 
> mirror down and replica recovered
> -
>
> Key: HDFS-11674
> URL: https://issues.apache.org/jira/browse/HDFS-11674
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Critical
>  Labels: release-blocker
> Attachments: HDFS-11674-01.patch, HDFS-11674-02.patch, 
> HDFS-11674-03.patch, HDFS-11674-branch-2.7-03.patch
>
>
> Scenario:
> 1. A 3-node cluster with 
> "dfs.client.block.write.replace-datanode-on-failure.policy" set to DEFAULT.
> A block is written with x bytes of data.
> 2. One of the DataNodes, NOT the first DN in the pipeline, goes down.
> 3. The client tries to append data to the block and fails, since one DN is down.
> 4. The client calls recoverLease() on the file.
> 5. Recovery succeeds.
> Issue:
> 1. The DNs the client had connected to before hitting the failed mirror have 
> reservedSpaceForReplicas incremented, BUT it is never decremented. 
> 2. So over time, all of a DN's space ends up held in reservedSpaceForReplicas, 
> resulting in OutOfSpace errors.






[jira] [Updated] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-10 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-11674:
-
Attachment: HDFS-11674-03.patch

Attached the updated patch.

There was a chance that the test could time out when the stopped datanode was 
chosen as the primary datanode for block recovery.
The test now marks that node as dead just before triggering recovery.
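
To make the intent concrete, here is a rough sketch of that test flow, not the committed test: it assumes MiniDFSCluster exposes a setDataNodeDead() helper to mark a DataNode dead on the NameNode, and the file name and sizes are placeholders.

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DFSTestUtil;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
import org.junit.Test;

public class TestAppendReservedSpaceSketch {
  @Test
  public void testReservedSpaceReleasedAfterRecovery() throws Exception {
    Configuration conf = new Configuration();
    MiniDFSCluster cluster =
        new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
    try {
      DistributedFileSystem fs = cluster.getFileSystem();
      Path file = new Path("/test/append-leak.dat"); // placeholder path
      DFSTestUtil.createFile(fs, file, 1024, (short) 3, 0L);

      // Pick a replica holder that is NOT the first DN in the pipeline
      // and stop it.
      DatanodeInfo victim =
          DFSTestUtil.getAllBlocks(fs, file).get(0).getLocations()[1];
      cluster.stopDataNode(victim.getXferAddr());

      // The append fails: no replacement DN is available under the
      // DEFAULT replace-datanode-on-failure policy.
      try (FSDataOutputStream out = fs.append(file)) {
        out.write(new byte[512]);
        out.hflush();
      } catch (IOException expected) {
        // expected
      }

      // Mark the stopped DN as dead on the NameNode *before* recovery so
      // it can never be chosen as the primary DN for block recovery; this
      // is the change that avoids the timeout. (Assumed helper.)
      cluster.setDataNodeDead(victim);

      // Lease recovery should now complete promptly; afterwards the
      // surviving DNs' reservedSpaceForReplicas should drop back to zero.
      while (!fs.recoverLease(file)) {
        Thread.sleep(1000);
      }
    } finally {
      cluster.shutdown();
    }
  }
}
{code}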

> reserveSpaceForReplicas is not released if append request failed due to 
> mirror down and replica recovered
> -
>
> Key: HDFS-11674
> URL: https://issues.apache.org/jira/browse/HDFS-11674
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Critical
>  Labels: release-blocker
> Attachments: HDFS-11674-01.patch, HDFS-11674-02.patch, 
> HDFS-11674-03.patch
>
>
> Scenario:
> 1. A 3-node cluster with 
> "dfs.client.block.write.replace-datanode-on-failure.policy" set to DEFAULT.
> A block is written with x bytes of data.
> 2. One of the DataNodes, NOT the first DN in the pipeline, goes down.
> 3. The client tries to append data to the block and fails, since one DN is down.
> 4. The client calls recoverLease() on the file.
> 5. Recovery succeeds.
> Issue:
> 1. The DNs the client had connected to before hitting the failed mirror have 
> reservedSpaceForReplicas incremented, BUT it is never decremented. 
> 2. So over time, all of a DN's space ends up held in reservedSpaceForReplicas, 
> resulting in OutOfSpace errors.






[jira] [Updated] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-04 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-11674:
---
Labels: release-blocker  (was: )

> reserveSpaceForReplicas is not released if append request failed due to 
> mirror down and replica recovered
> -
>
> Key: HDFS-11674
> URL: https://issues.apache.org/jira/browse/HDFS-11674
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Critical
>  Labels: release-blocker
> Attachments: HDFS-11674-01.patch, HDFS-11674-02.patch
>
>
> Scenario:
> 1. A 3-node cluster with 
> "dfs.client.block.write.replace-datanode-on-failure.policy" set to DEFAULT.
> A block is written with x bytes of data.
> 2. One of the DataNodes, NOT the first DN in the pipeline, goes down.
> 3. The client tries to append data to the block and fails, since one DN is down.
> 4. The client calls recoverLease() on the file.
> 5. Recovery succeeds.
> Issue:
> 1. The DNs the client had connected to before hitting the failed mirror have 
> reservedSpaceForReplicas incremented, BUT it is never decremented. 
> 2. So over time, all of a DN's space ends up held in reservedSpaceForReplicas, 
> resulting in OutOfSpace errors.






[jira] [Updated] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-05-02 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-11674:
-
Attachment: HDFS-11674-02.patch

Updated the patch.
Please review.

> reserveSpaceForReplicas is not released if append request failed due to 
> mirror down and replica recovered
> -
>
> Key: HDFS-11674
> URL: https://issues.apache.org/jira/browse/HDFS-11674
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Critical
> Attachments: HDFS-11674-01.patch, HDFS-11674-02.patch
>
>
> Scenario:
> 1. A 3-node cluster with 
> "dfs.client.block.write.replace-datanode-on-failure.policy" set to DEFAULT.
> A block is written with x bytes of data.
> 2. One of the DataNodes, NOT the first DN in the pipeline, goes down.
> 3. The client tries to append data to the block and fails, since one DN is down.
> 4. The client calls recoverLease() on the file.
> 5. Recovery succeeds.
> Issue:
> 1. The DNs the client had connected to before hitting the failed mirror have 
> reservedSpaceForReplicas incremented, BUT it is never decremented. 
> 2. So over time, all of a DN's space ends up held in reservedSpaceForReplicas, 
> resulting in OutOfSpace errors.






[jira] [Updated] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-04-19 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-11674:
-
Target Version/s: 2.7.4, 2.8.1
  Status: Patch Available  (was: Open)

> reserveSpaceForReplicas is not released if append request failed due to 
> mirror down and replica recovered
> -
>
> Key: HDFS-11674
> URL: https://issues.apache.org/jira/browse/HDFS-11674
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Critical
> Attachments: HDFS-11674-01.patch
>
>
> Scenario:
> 1. A 3-node cluster with 
> "dfs.client.block.write.replace-datanode-on-failure.policy" set to DEFAULT.
> A block is written with x bytes of data.
> 2. One of the DataNodes, NOT the first DN in the pipeline, goes down.
> 3. The client tries to append data to the block and fails, since one DN is down.
> 4. The client calls recoverLease() on the file.
> 5. Recovery succeeds.
> Issue:
> 1. The DNs the client had connected to before hitting the failed mirror have 
> reservedSpaceForReplicas incremented, BUT it is never decremented. 
> 2. So over time, all of a DN's space ends up held in reservedSpaceForReplicas, 
> resulting in OutOfSpace errors.






[jira] [Updated] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-04-19 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-11674:
-
Priority: Critical  (was: Major)

> reserveSpaceForReplicas is not released if append request failed due to 
> mirror down and replica recovered
> -
>
> Key: HDFS-11674
> URL: https://issues.apache.org/jira/browse/HDFS-11674
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Critical
> Attachments: HDFS-11674-01.patch
>
>
> Scenario:
> 1. A 3-node cluster with 
> "dfs.client.block.write.replace-datanode-on-failure.policy" set to DEFAULT.
> A block is written with x bytes of data.
> 2. One of the DataNodes, NOT the first DN in the pipeline, goes down.
> 3. The client tries to append data to the block and fails, since one DN is down.
> 4. The client calls recoverLease() on the file.
> 5. Recovery succeeds.
> Issue:
> 1. The DNs the client had connected to before hitting the failed mirror have 
> reservedSpaceForReplicas incremented, BUT it is never decremented. 
> 2. So over time, all of a DN's space ends up held in reservedSpaceForReplicas, 
> resulting in OutOfSpace errors.






[jira] [Updated] (HDFS-11674) reserveSpaceForReplicas is not released if append request failed due to mirror down and replica recovered

2017-04-19 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-11674:
-
Attachment: HDFS-11674-01.patch

Attached the patch.

Please review.

> reserveSpaceForReplicas is not released if append request failed due to 
> mirror down and replica recovered
> -
>
> Key: HDFS-11674
> URL: https://issues.apache.org/jira/browse/HDFS-11674
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
>Priority: Critical
> Attachments: HDFS-11674-01.patch
>
>
> Scenario:
> 1. A 3-node cluster with 
> "dfs.client.block.write.replace-datanode-on-failure.policy" set to DEFAULT.
> A block is written with x bytes of data.
> 2. One of the DataNodes, NOT the first DN in the pipeline, goes down.
> 3. The client tries to append data to the block and fails, since one DN is down.
> 4. The client calls recoverLease() on the file.
> 5. Recovery succeeds.
> Issue:
> 1. The DNs the client had connected to before hitting the failed mirror have 
> reservedSpaceForReplicas incremented, BUT it is never decremented. 
> 2. So over time, all of a DN's space ends up held in reservedSpaceForReplicas, 
> resulting in OutOfSpace errors.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org