[jira] [Assigned] (HDFS-16077) OIV parsing tool throws NPE for an FSImage with multiple InodeSections

2021-06-18 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree reassigned HDFS-16077:
-

Assignee: (was: Renukaprasad C)

> OIV parsing tool throws NPE for an FSImage with multiple InodeSections
> -
>
> Key: HDFS-16077
> URL: https://issues.apache.org/jira/browse/HDFS-16077
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Priority: Major
>
> An FSImage with multiple InodeSections results in an NPE when accessed 
> through the OIV tool with the default parser (WEB).
> This issue is reproducible only with multiple InodeSections (writing more 
> than 1 million files).
> On analyzing the code further, we found that the NPE is caused in 
> org.apache.hadoop.hdfs.tools.offlineImageViewer.FSImageLoader.fromINodeId(long): 
> fromINodeId(long) searches for the inode in an InodeSection that does not 
> contain it (the inode exists in another InodeSection).






[jira] [Created] (HDFS-16077) OIV parsing tool throws NPE for an FSImage with multiple InodeSections

2021-06-18 Thread Ravuri Sushma sree (Jira)
Ravuri Sushma sree created HDFS-16077:
-

 Summary: OIV parsing tool throws NPE for an FSImage with multiple 
InodeSections
 Key: HDFS-16077
 URL: https://issues.apache.org/jira/browse/HDFS-16077
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravuri Sushma sree
Assignee: Renukaprasad C


An FSImage with multiple InodeSections results in an NPE when accessed 
through the OIV tool with the default parser (WEB).

This issue is reproducible only with multiple InodeSections (writing more than 
1 million files).

On analyzing the code further, we found that the NPE is caused in 
org.apache.hadoop.hdfs.tools.offlineImageViewer.FSImageLoader.fromINodeId(long): 
fromINodeId(long) searches for the inode in an InodeSection that does not 
contain it (the inode exists in another InodeSection).
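
Below is a minimal, hypothetical sketch of the failure mode. The real FSImageLoader works over serialized inode byte arrays; the ids and names here are illustrative only, not the actual implementation. A lookup that consults a single section reports an existing inode as missing, and the caller then dereferences the null result.

{code:java}
import java.util.Arrays;

// Illustrative model only: each section holds a sorted slice of inode ids.
class SectionLookupSketch {
  static final long[][] SECTIONS = {
      {1L, 2L, 3L},           // first InodeSection
      {1000001L, 1000002L}    // second InodeSection (after ~1M files)
  };

  // Buggy shape: only one section is searched, so inodes that live in
  // the second section are reported missing and the caller NPEs.
  static boolean containsBuggy(long id) {
    return Arrays.binarySearch(SECTIONS[0], id) >= 0;
  }

  // Fixed shape: every section is consulted before giving up.
  static boolean containsFixed(long id) {
    for (long[] section : SECTIONS) {
      if (Arrays.binarySearch(section, id) >= 0) {
        return true;
      }
    }
    return false;
  }
}
{code}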






[jira] [Commented] (HDFS-15893) Logs are flooded when dfs.ha.tail-edits.in-progress is set to true or dfs.ha.tail-edits.period to 0ms

2021-04-02 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17313852#comment-17313852
 ] 

Ravuri Sushma sree commented on HDFS-15893:
---

Thank you very much for reviewing this JIRA [~jianghuazhu]

Please correct me if I'm wrong, but IIUC you are suggesting that if the 
configured value of dfs.ha.tail-edits.period is 0 or less, we can reset it to 
the default value in order to avoid flooding the logs.

I added a check for the Observer role and pushed the logs to debug level 
because, for an Observer-enabled cluster, it is highly recommended to configure 
dfs.ha.tail-edits.period to a very low value (0 ms) so as to decrease the 
waiting time of client requests in the RPC queue.
Because of this low value, the logs are flooded on any Observer cluster. 
Checking only for a value of 0 or below would prevent the flooding only when 
exactly those values are configured; it would not help with a value such as 
0.1 ms, which causes the same problem.
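
A minimal, hypothetical sketch of that check (the class, flag, and message are illustrative, not the actual patch): the message is kept, but demoted to debug when the NameNode runs as an Observer, where a ~0 ms tailing period is expected.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative only: demote per-iteration edit-tailing messages when the
// NameNode serves as an Observer and dfs.ha.tail-edits.period is ~0 ms.
class TailEditsLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(TailEditsLoggingSketch.class);

  static void logTailed(boolean isObserver, long numEdits, long lastTxId) {
    if (isObserver) {
      LOG.debug("Loaded {} edits starting from txid {}", numEdits, lastTxId);
    } else {
      LOG.info("Loaded {} edits starting from txid {}", numEdits, lastTxId);
    }
  }
}
{code}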

> Logs are flooded when dfs.ha.tail-edits.in-progress is set to true or 
> dfs.ha.tail-edits.period to 0ms
> --
>
> Key: HDFS-15893
> URL: https://issues.apache.org/jira/browse/HDFS-15893
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15893.001.patch
>
>
> When we set dfs.ha.tail-edits.in-progress to true and dfs.ha.tail-edits.period 
> to 0ms, almost all the log output on the standby and Observer NN comes from 
> edit tailing, which drowns out useful logs.
> We can adjust the log level of a few of these messages to debug while the 
> Observer node is in operation.






[jira] [Commented] (HDFS-15222) HDFS: Output message of "hdfs fsck -list-corruptfileblocks" command is not correct

2021-04-01 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17313406#comment-17313406
 ] 

Ravuri Sushma sree commented on HDFS-15222:
---

Thanks [~brahmareddy]

As you mentioned, it is more meaningful omitting "file". I have attached a 
patch incorporating the changes from your review comment. Please 
review.

> HDFS: Output message of "hdfs fsck -list-corruptfileblocks" command is not 
> correct
> ---
>
> Key: HDFS-15222
> URL: https://issues.apache.org/jira/browse/HDFS-15222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, tools
>Affects Versions: 3.1.1
> Environment: 3 node HA cluster
>Reporter: Souryakanta Dwivedy
>Assignee: Ravuri Sushma sree
>Priority: Minor
> Attachments: HDFS-15222.001.patch, HDFS-15222.002.patch, 
> HDFS-15222.003.patch, output1.PNG, output2.PNG
>
>
> Output message of the "hdfs fsck -list-corruptfileblocks" command is not correct.
>  
> Steps:
>  * Create a directory and put files
>  * Corrupt the file blocks
>  * Check the corrupted file blocks with the "hdfs fsck 
> -list-corruptfileblocks" command
> It displays the corrupted file blocks with the message "The list of corrupt 
> files under path '/path' are:" at the beginning, which is wrong.
> The message at the end of the output is also wrong: "The filesystem under 
> path '/path' has  CORRUPT files"
>  
> Actual output: "The list of corrupt files under path '/path' are:"
>                "The filesystem under path '/path' has  CORRUPT files"
> Expected output: "The list of corrupted file blocks under path '/path' are:"
>                  "The filesystem under path '/path' has  CORRUPT file blocks"
>  
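
A hypothetical sketch of the wording fix (class and method names are illustrative, not the actual patch): choose the "file blocks" wording whenever -list-corruptfileblocks is in effect.

{code:java}
// Illustrative only: message selection for -list-corruptfileblocks.
class FsckMessagesSketch {
  static String header(String path, boolean listCorruptFileBlocks) {
    return "The list of " + (listCorruptFileBlocks
        ? "corrupted file blocks" : "corrupt files")
        + " under path '" + path + "' are:";
  }

  static String summary(String path, long count,
      boolean listCorruptFileBlocks) {
    return "The filesystem under path '" + path + "' has " + count
        + (listCorruptFileBlocks ? " CORRUPT file blocks" : " CORRUPT files");
  }
}
{code}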






[jira] [Updated] (HDFS-15222) HDFS: Output message of "hdfs fsck -list-corruptfileblocks" command is not correct

2021-04-01 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15222:
--
Attachment: HDFS-15222.003.patch

> HDFS: Output message of "hdfs fsck -list-corruptfileblocks" command is not 
> correct
> ---
>
> Key: HDFS-15222
> URL: https://issues.apache.org/jira/browse/HDFS-15222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, tools
>Affects Versions: 3.1.1
> Environment: 3 node HA cluster
>Reporter: Souryakanta Dwivedy
>Assignee: Ravuri Sushma sree
>Priority: Minor
> Attachments: HDFS-15222.001.patch, HDFS-15222.002.patch, 
> HDFS-15222.003.patch, output1.PNG, output2.PNG
>
>
> Output message of the "hdfs fsck -list-corruptfileblocks" command is not correct.
>  
> Steps:
>  * Create a directory and put files
>  * Corrupt the file blocks
>  * Check the corrupted file blocks with the "hdfs fsck 
> -list-corruptfileblocks" command
> It displays the corrupted file blocks with the message "The list of corrupt 
> files under path '/path' are:" at the beginning, which is wrong.
> The message at the end of the output is also wrong: "The filesystem under 
> path '/path' has  CORRUPT files"
>  
> Actual output: "The list of corrupt files under path '/path' are:"
>                "The filesystem under path '/path' has  CORRUPT files"
> Expected output: "The list of corrupted file blocks under path '/path' are:"
>                  "The filesystem under path '/path' has  CORRUPT file blocks"
>  






[jira] [Updated] (HDFS-15893) Logs are flooded when dfs.ha.tail-edits.in-progress is set to true or dfs.ha.tail-edits.period to 0ms

2021-03-13 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15893:
--
Attachment: HDFS-15893.001.patch
Status: Patch Available  (was: Open)

> Logs are flooded when dfs.ha.tail-edits.in-progress is set to true or 
> dfs.ha.tail-edits.period to 0ms
> --
>
> Key: HDFS-15893
> URL: https://issues.apache.org/jira/browse/HDFS-15893
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15893.001.patch
>
>
> When we set dfs.ha.tail-edits.in-progress to true and dfs.ha.tail-edits.period 
> to 0ms, almost all the log output on the standby and Observer NN comes from 
> edit tailing, which drowns out useful logs.
> We can adjust the log level of a few of these messages to debug while the 
> Observer node is in operation.






[jira] [Created] (HDFS-15893) Logs are flooded when dfs.ha.tail-edits.in-progress is set to true or dfs.ha.tail-edits.period to 0ms

2021-03-13 Thread Ravuri Sushma sree (Jira)
Ravuri Sushma sree created HDFS-15893:
-

 Summary: Logs are flooded when dfs.ha.tail-edits.in-progress is set 
to true or dfs.ha.tail-edits.period to 0ms
 Key: HDFS-15893
 URL: https://issues.apache.org/jira/browse/HDFS-15893
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravuri Sushma sree
Assignee: Ravuri Sushma sree


When we set dfs.ha.tail-edits.in-progress to true and dfs.ha.tail-edits.period to 
0ms, almost all the log output on the standby and Observer NN comes from edit 
tailing, which drowns out useful logs.

We can adjust the log level of a few of these messages to debug while the 
Observer node is in operation.






[jira] [Commented] (HDFS-15134) Any write calls with REST API on Standby NN print error message with wrong online help URL

2021-03-03 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17294762#comment-17294762
 ] 

Ravuri Sushma sree commented on HDFS-15134:
---

The failed test cases are not related to this JIRA. Please review.

> Any write calls with REST API on Standby NN print error message with wrong 
> online help URL
> --
>
> Key: HDFS-15134
> URL: https://issues.apache.org/jira/browse/HDFS-15134
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15134.001.patch
>
>
> vm2:/opt# curl -k -i --negotiate -u : 
> "http://IP:PORT/webhdfs/v1/test?op=MKDIRS;
> HTTP/1.1 403 Forbidden
> Date: Mon, 20 Jan 2020 07:28:19 GMT
> Cache-Control: no-cache
> Expires: Mon, 20 Jan 2020 07:28:20 GMT
> Date: Mon, 20 Jan 2020 07:28:20 GMT
> Pragma: no-cache
> X-FRAME-OPTIONS: SAMEORIGIN
> Content-Type: application/json
> Transfer-Encoding: chunked
> {"RemoteException":{"exception":"StandbyException","javaClassName":"org.apache.hadoop.ipc.StandbyException","message":"Operation
>  category WRITE is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error"}}
> The link https://s.apache.org/sbnn-error is invalid (it doesn't exist). This 
> needs to be updated.






[jira] [Commented] (HDFS-15494) TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica Fails on Windows

2021-03-03 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17294761#comment-17294761
 ] 

Ravuri Sushma sree commented on HDFS-15494:
---

The failed test cases are not related to this JIRA. Please review.

> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> Fails on Windows
> ---
>
> Key: HDFS-15494
> URL: https://issues.apache.org/jira/browse/HDFS-15494
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15494.001.patch
>
>
> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> fails on Windows because renaming a replica from RBW to Finalized is not 
> supported there.
> The test should be skipped on Windows.
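
A minimal sketch of how such a platform skip is commonly written in Hadoop tests, assuming the PlatformAssumptions test helper from hadoop-common (a sketch, not the actual patch):

{code:java}
import static org.apache.hadoop.test.PlatformAssumptions.assumeNotWindows;

import org.junit.Before;

public class TestReplicaCachingGetSpaceUsedSketch {
  @Before
  public void checkPlatform() {
    // Marks the test as skipped (rather than failed) on Windows,
    // where the RBW -> Finalized rename is not supported.
    assumeNotWindows();
  }
}
{code}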






[jira] [Updated] (HDFS-15134) Any write calls with REST API on Standby NN print error message with wrong online help URL

2021-03-03 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15134:
--
Status: Patch Available  (was: Open)

> Any write calls with REST API on Standby NN print error message with wrong 
> online help URL
> --
>
> Key: HDFS-15134
> URL: https://issues.apache.org/jira/browse/HDFS-15134
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15134.001.patch
>
>
> vm2:/opt# curl -k -i --negotiate -u : 
> "http://IP:PORT/webhdfs/v1/test?op=MKDIRS;
> HTTP/1.1 403 Forbidden
> Date: Mon, 20 Jan 2020 07:28:19 GMT
> Cache-Control: no-cache
> Expires: Mon, 20 Jan 2020 07:28:20 GMT
> Date: Mon, 20 Jan 2020 07:28:20 GMT
> Pragma: no-cache
> X-FRAME-OPTIONS: SAMEORIGIN
> Content-Type: application/json
> Transfer-Encoding: chunked
> {"RemoteException":{"exception":"StandbyException","javaClassName":"org.apache.hadoop.ipc.StandbyException","message":"Operation
>  category WRITE is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error"}}
> The link https://s.apache.org/sbnn-error is invalid (it doesn't exist). This 
> needs to be updated.






[jira] [Commented] (HDFS-15222) HDFS: Output message of "hdfs fsck -list-corruptfileblocks" command is not correct

2021-02-14 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17284488#comment-17284488
 ] 

Ravuri Sushma sree commented on HDFS-15222:
---

Thank you for the review [~brahmareddy]

> HDFS: Output message of "hdfs fsck -list-corruptfileblocks" command is not 
> correct
> ---
>
> Key: HDFS-15222
> URL: https://issues.apache.org/jira/browse/HDFS-15222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, tools
>Affects Versions: 3.1.1
> Environment: 3 node HA cluster
>Reporter: Souryakanta Dwivedy
>Assignee: Ravuri Sushma sree
>Priority: Minor
> Attachments: HDFS-15222.001.patch, HDFS-15222.002.patch, output1.PNG, 
> output2.PNG
>
>
> Output message of the "hdfs fsck -list-corruptfileblocks" command is not correct.
>  
> Steps:
>  * Create a directory and put files
>  * Corrupt the file blocks
>  * Check the corrupted file blocks with the "hdfs fsck 
> -list-corruptfileblocks" command
> It displays the corrupted file blocks with the message "The list of corrupt 
> files under path '/path' are:" at the beginning, which is wrong.
> The message at the end of the output is also wrong: "The filesystem under 
> path '/path' has  CORRUPT files"
>  
> Actual output: "The list of corrupt files under path '/path' are:"
>                "The filesystem under path '/path' has  CORRUPT files"
> Expected output: "The list of corrupted file blocks under path '/path' are:"
>                  "The filesystem under path '/path' has  CORRUPT file blocks"
>  






[jira] [Commented] (HDFS-15494) TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica Fails on Windows

2021-02-14 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17284485#comment-17284485
 ] 

Ravuri Sushma sree commented on HDFS-15494:
---

Thank you for reviewing [~brahmareddy]

> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> Fails on Windows
> ---
>
> Key: HDFS-15494
> URL: https://issues.apache.org/jira/browse/HDFS-15494
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15494.001.patch
>
>
> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> fails on Windows because renaming a replica from RBW to Finalized is not 
> supported there.
> The test should be skipped on Windows.






[jira] [Created] (HDFS-15804) Support listOpenFiles API in Router

2021-01-30 Thread Ravuri Sushma sree (Jira)
Ravuri Sushma sree created HDFS-15804:
-

 Summary: Support listOpenFiles API in Router
 Key: HDFS-15804
 URL: https://issues.apache.org/jira/browse/HDFS-15804
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravuri Sushma sree
Assignee: Ravuri Sushma sree


Currently, the Router does not support the listOpenFiles API and returns null:

@Override
public BatchedEntries<OpenFileEntry> listOpenFiles(long prevId,
    EnumSet<OpenFilesType> openFilesTypes, String path)
    throws IOException {
  rpcServer.checkOperation(NameNode.OperationCategory.READ, false);
  return null;
}
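
A rough, hedged sketch of what Router-side support could look like, assuming the usual RBF plumbing (RemoteMethod, RemoteParam, RouterRpcServer#getLocationsForPath, RouterRpcClient#invokeSequential); this is a sketch under those assumptions, not the committed implementation, and merging results across subclusters is deliberately left out.

{code:java}
// Hypothetical sketch only, not the committed HDFS-15804 implementation.
@Override
public BatchedEntries<OpenFileEntry> listOpenFiles(long prevId,
    EnumSet<OpenFilesType> openFilesTypes, String path) throws IOException {
  rpcServer.checkOperation(NameNode.OperationCategory.READ, true);
  // Resolve the mount-table path to its backing namespace location(s).
  List<RemoteLocation> locations = rpcServer.getLocationsForPath(path, false);
  RemoteMethod method = new RemoteMethod("listOpenFiles",
      new Class<?>[] {long.class, EnumSet.class, String.class},
      prevId, openFilesTypes, new RemoteParam());
  // Ask the namespaces one by one and return the first successful answer;
  // an unchecked cast is needed here in this simplified sketch.
  return (BatchedEntries<OpenFileEntry>) rpcClient.invokeSequential(
      locations, method, BatchedEntries.class, null);
}
{code}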

 






[jira] [Commented] (HDFS-15735) NameNode memory Leak on frequent execution of fsck

2020-12-18 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251762#comment-17251762
 ] 

Ravuri Sushma sree commented on HDFS-15735:
---

[~ayushtkn] [~John Smith], thank you for the reviews.

Tracing here is configurable, so I opted to close the tracer instead of removing 
it; and how frequently fsck is called may vary depending on different business 
requirements.

> NameNode memory Leak on frequent execution of fsck  
> 
>
> Key: HDFS-15735
> URL: https://issues.apache.org/jira/browse/HDFS-15735
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15735.001.patch
>
>
> The memory of the cluster NameNode continues to grow, and full GC eventually 
> leads to failure of both the active and standby NameNodes.
> HTrace is used to track the processing time of fsck.
> Checking the code, we found that the tracer object in NamenodeFsck.java is 
> only created but never closed; because of this, the memory footprint 
> continues to grow.
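
A minimal sketch of the fix described above, assuming the usual HTrace builder pattern (the configuration prefix and surrounding method are illustrative, not the exact patch): ensure the Tracer built per fsck invocation is closed.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.tracing.TraceUtils;
import org.apache.htrace.core.Tracer;

// Illustrative only: close the Tracer created for each fsck invocation.
class FsckTracerSketch {
  static void fsck(Configuration conf) {
    Tracer tracer = new Tracer.Builder("NamenodeFsck").
        conf(TraceUtils.wrapHadoopConf("namenode.fsck.htrace.", conf)).
        build();
    try {
      // ... run the fsck checks ...
    } finally {
      tracer.close();  // without this, every fsck call leaks tracer state
    }
  }
}
{code}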






[jira] [Commented] (HDFS-15735) NameNode memory Leak on frequent execution of fsck

2020-12-17 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17251200#comment-17251200
 ] 

Ravuri Sushma sree commented on HDFS-15735:
---

Attached a patch that closes the tracer object in fsck(). Please review.

> NameNode memory Leak on frequent execution of fsck  
> 
>
> Key: HDFS-15735
> URL: https://issues.apache.org/jira/browse/HDFS-15735
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15735.001.patch
>
>
> The memory of the cluster NameNode continues to grow, and full GC eventually 
> leads to failure of both the active and standby NameNodes.
> HTrace is used to track the processing time of fsck.
> Checking the code, we found that the tracer object in NamenodeFsck.java is 
> only created but never closed; because of this, the memory footprint 
> continues to grow.






[jira] [Updated] (HDFS-15735) NameNode memory Leak on frequent execution of fsck

2020-12-17 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15735:
--
Attachment: HDFS-15735.001.patch
Status: Patch Available  (was: Open)

> NameNode memory Leak on frequent execution of fsck  
> 
>
> Key: HDFS-15735
> URL: https://issues.apache.org/jira/browse/HDFS-15735
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15735.001.patch
>
>
> The memory of the cluster NameNode continues to grow, and full GC eventually 
> leads to failure of both the active and standby NameNodes.
> HTrace is used to track the processing time of fsck.
> Checking the code, we found that the tracer object in NamenodeFsck.java is 
> only created but never closed; because of this, the memory footprint 
> continues to grow.






[jira] [Created] (HDFS-15735) NameNode memory Leak on frequent execution of fsck

2020-12-17 Thread Ravuri Sushma sree (Jira)
Ravuri Sushma sree created HDFS-15735:
-

 Summary: NameNode memory Leak on frequent execution of fsck  
 Key: HDFS-15735
 URL: https://issues.apache.org/jira/browse/HDFS-15735
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravuri Sushma sree
Assignee: Ravuri Sushma sree


The memory of the cluster NameNode continues to grow, and full GC eventually 
leads to failure of both the active and standby NameNodes.

HTrace is used to track the processing time of fsck.

Checking the code, we found that the tracer object in NamenodeFsck.java is only 
created but never closed; because of this, the memory footprint continues to 
grow.






[jira] [Commented] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not

2020-12-06 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17244711#comment-17244711
 ] 

Ravuri Sushma sree commented on HDFS-14437:
---

[~angerszhuuu]


 I'm using Hadoop version 3.1.1.

ERROR :
{code:java}
java.lang.IllegalArgumentException: LastWrittenTxId 891 is expected to be the 
same as lastSyncedTxId 888
at 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.endCurrentLogSegment(FSEditLog.java:1473)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.rollEditLog(FSEditLog.java:1360)
at 
org.apache.hadoop.hdfs.server.namenode.TestEditLog$TransactionRollEditLog.run(TestEditLog.java:270)
at java.lang.Thread.run(Thread.java:748)
{code}
I encountered this when dfs.namenode.edits.asynclogging is set to true.

In the parameterized test case testMultiThreadedEditLog[1], when we set 
dfs.namenode.edits.asynclogging to true, endCurrentLogSegment calls 
logSyncAll() in FSEditLogAsync, which only ensures that the queues are drained.

> Exception happened when rollEditLog expects empty 
> EditsDoubleBuffer.bufCurrent but not
> -
>
> Key: HDFS-14437
> URL: https://issues.apache.org/jira/browse/HDFS-14437
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode, qjm
>Reporter: angerszhu
>Priority: Major
> Attachments: HDFS-14437.reproduction.patch, 
> HDFS-14437.reproductionwithlog.patch, screenshot-1.png
>
>
> For the problem mentioned in https://issues.apache.org/jira/browse/HDFS-10943, 
> I traced the process of writing and flushing the EditLog and some of the 
> important functions. I found that in the FSEditLog class, the close() function 
> calls the following:
>  
> {code:java}
> waitForSyncToFinish();
> endCurrentLogSegment(true);{code}
> Since we hold the object lock in close(), when waitForSyncToFinish() returns, 
> it means all logSync work has finished and all data in bufReady has been 
> flushed out. And since the current thread holds the lock on this object, when 
> endCurrentLogSegment() is called, no other thread can acquire the lock, so no 
> other thread can write new edit log entries into currentBuf.
> But if we don't call waitForSyncToFinish() before endCurrentLogSegment(), 
> an auto-scheduled logSync()'s flush may still be in progress, since that 
> step does not need synchronization, as mentioned in the comment of the 
> logSync() method:
>  
> {code:java}
> /**
>  * Sync all modifications done by this thread.
>  *
>  * The internal concurrency design of this class is as follows:
>  *   - Log items are written synchronized into an in-memory buffer,
>  * and each assigned a transaction ID.
>  *   - When a thread (client) would like to sync all of its edits, logSync()
>  * uses a ThreadLocal transaction ID to determine what edit number must
>  * be synced to.
>  *   - The isSyncRunning volatile boolean tracks whether a sync is currently
>  * under progress.
>  *
>  * The data is double-buffered within each edit log implementation so that
>  * in-memory writing can occur in parallel with the on-disk writing.
>  *
>  * Each sync occurs in three steps:
>  *   1. synchronized, it swaps the double buffer and sets the isSyncRunning
>  *  flag.
>  *   2. unsynchronized, it flushes the data to storage
>  *   3. synchronized, it resets the flag and notifies anyone waiting on the
>  *  sync.
>  *
>  * The lack of synchronization on step 2 allows other threads to continue
>  * to write into the memory buffer while the sync is in progress.
>  * Because this step is unsynchronized, actions that need to avoid
>  * concurrency with sync() should be synchronized and also call
>  * waitForSyncToFinish() before assuming they are running alone.
>  */
> public void logSync() {
>   long syncStart = 0;
>   // Fetch the transactionId of this thread. 
>   long mytxid = myTransactionId.get().txid;
>   
>   boolean sync = false;
>   try {
> EditLogOutputStream logStream = null;
> synchronized (this) {
>   try {
> printStatistics(false);
> // if somebody is already syncing, then wait
> while (mytxid > synctxid && isSyncRunning) {
>   try {
> wait(1000);
>   } catch (InterruptedException ie) {
>   }
> }
> //
> // If this transaction was already flushed, then nothing to do
> //
> if (mytxid <= synctxid) {
>   numTransactionsBatchedInSync++;
>   if (metrics != null) {
> // Metrics is non-null only when used inside name node
> metrics.incrTransactionsBatchedInSync();
>   }
>   return;
> }
>
> 

[jira] [Commented] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not

2020-12-04 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17244117#comment-17244117
 ] 

Ravuri Sushma sree commented on HDFS-14437:
---

Thanks for the reply [~angerszhuuu].
Sounds great!
I just have a doubt regarding the UT provided.
Without the fix in the PR, I encountered java.lang.IllegalArgumentException in 
both testMultiThreadedEditLog[0] and testMultiThreadedEditLog[1].
With the fix, the issue is resolved in testMultiThreadedEditLog[0], but in the 
testMultiThreadedEditLog[1] logs it is still reproducible. Can you please let 
me know if you faced the same?

> Exception happened when rollEditLog expects empty 
> EditsDoubleBuffer.bufCurrent but not
> -
>
> Key: HDFS-14437
> URL: https://issues.apache.org/jira/browse/HDFS-14437
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode, qjm
>Reporter: angerszhu
>Priority: Major
> Attachments: HDFS-14437.reproduction.patch, 
> HDFS-14437.reproductionwithlog.patch, screenshot-1.png
>
>
> For the problem mentioned in https://issues.apache.org/jira/browse/HDFS-10943, 
> I traced the process of writing and flushing the EditLog and some of the 
> important functions. I found that in the FSEditLog class, the close() function 
> calls the following:
>  
> {code:java}
> waitForSyncToFinish();
> endCurrentLogSegment(true);{code}
> Since we hold the object lock in close(), when waitForSyncToFinish() returns, 
> it means all logSync work has finished and all data in bufReady has been 
> flushed out. And since the current thread holds the lock on this object, when 
> endCurrentLogSegment() is called, no other thread can acquire the lock, so no 
> other thread can write new edit log entries into currentBuf.
> But if we don't call waitForSyncToFinish() before endCurrentLogSegment(), 
> an auto-scheduled logSync()'s flush may still be in progress, since that 
> step does not need synchronization, as mentioned in the comment of the 
> logSync() method:
>  
> {code:java}
> /**
>  * Sync all modifications done by this thread.
>  *
>  * The internal concurrency design of this class is as follows:
>  *   - Log items are written synchronized into an in-memory buffer,
>  * and each assigned a transaction ID.
>  *   - When a thread (client) would like to sync all of its edits, logSync()
>  * uses a ThreadLocal transaction ID to determine what edit number must
>  * be synced to.
>  *   - The isSyncRunning volatile boolean tracks whether a sync is currently
>  * under progress.
>  *
>  * The data is double-buffered within each edit log implementation so that
>  * in-memory writing can occur in parallel with the on-disk writing.
>  *
>  * Each sync occurs in three steps:
>  *   1. synchronized, it swaps the double buffer and sets the isSyncRunning
>  *  flag.
>  *   2. unsynchronized, it flushes the data to storage
>  *   3. synchronized, it resets the flag and notifies anyone waiting on the
>  *  sync.
>  *
>  * The lack of synchronization on step 2 allows other threads to continue
>  * to write into the memory buffer while the sync is in progress.
>  * Because this step is unsynchronized, actions that need to avoid
>  * concurrency with sync() should be synchronized and also call
>  * waitForSyncToFinish() before assuming they are running alone.
>  */
> public void logSync() {
>   long syncStart = 0;
>   // Fetch the transactionId of this thread. 
>   long mytxid = myTransactionId.get().txid;
>   
>   boolean sync = false;
>   try {
> EditLogOutputStream logStream = null;
> synchronized (this) {
>   try {
> printStatistics(false);
> // if somebody is already syncing, then wait
> while (mytxid > synctxid && isSyncRunning) {
>   try {
> wait(1000);
>   } catch (InterruptedException ie) {
>   }
> }
> //
> // If this transaction was already flushed, then nothing to do
> //
> if (mytxid <= synctxid) {
>   numTransactionsBatchedInSync++;
>   if (metrics != null) {
> // Metrics is non-null only when used inside name node
> metrics.incrTransactionsBatchedInSync();
>   }
>   return;
> }
>
> // now, this thread will do the sync
> syncStart = txid;
> isSyncRunning = true;
> sync = true;
> // swap buffers
> try {
>   if (journalSet.isEmpty()) {
> throw new IOException("No journals available to flush");
>   }
>   editLogStream.setReadyToFlush();
> } catch (IOException e) {
>   final String msg =
>   "Could not sync enough journals to persistent storage " +
>   "due 

[jira] [Comment Edited] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not

2020-12-03 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243763#comment-17243763
 ] 

Ravuri Sushma sree edited comment on HDFS-14437 at 12/4/20, 6:54 AM:
-

I used the HDFS-14437.reproduction.patch provided, and it seems that the PR 
resolved the "java.lang.IllegalArgumentException: LastWrittenTxId 558 is 
expected to be the same as lastSyncedTxId 532" error. [~angerszhuuu] 
[~hexiaoqiao] Do you think there is any impact?


was (Author: sushma_28):
I used the HDFS-14437.reproduction.patch provided and it seems that the PR 
resolved the  "java.lang.IllegalArgumentException: LastWrittenTxId 558 is 
expected to be the same as lastSyncedTxId 532" error . [~angerszhuuu] 
[~hexiaoqiao] Do you think any impact .

> Exception happened when rollEditLog expects empty 
> EditsDoubleBuffer.bufCurrent but not
> -
>
> Key: HDFS-14437
> URL: https://issues.apache.org/jira/browse/HDFS-14437
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode, qjm
>Reporter: angerszhu
>Priority: Major
> Attachments: HDFS-14437.reproduction.patch, 
> HDFS-14437.reproductionwithlog.patch, screenshot-1.png
>
>
> For the problem mentioned in https://issues.apache.org/jira/browse/HDFS-10943, 
> I traced the process of writing and flushing the EditLog and some of the 
> important functions. I found that in the FSEditLog class, the close() function 
> calls the following:
>  
> {code:java}
> waitForSyncToFinish();
> endCurrentLogSegment(true);{code}
> Since we hold the object lock in close(), when waitForSyncToFinish() returns, 
> it means all logSync work has finished and all data in bufReady has been 
> flushed out. And since the current thread holds the lock on this object, when 
> endCurrentLogSegment() is called, no other thread can acquire the lock, so no 
> other thread can write new edit log entries into currentBuf.
> But if we don't call waitForSyncToFinish() before endCurrentLogSegment(), 
> an auto-scheduled logSync()'s flush may still be in progress, since that 
> step does not need synchronization, as mentioned in the comment of the 
> logSync() method:
>  
> {code:java}
> /**
>  * Sync all modifications done by this thread.
>  *
>  * The internal concurrency design of this class is as follows:
>  *   - Log items are written synchronized into an in-memory buffer,
>  * and each assigned a transaction ID.
>  *   - When a thread (client) would like to sync all of its edits, logSync()
>  * uses a ThreadLocal transaction ID to determine what edit number must
>  * be synced to.
>  *   - The isSyncRunning volatile boolean tracks whether a sync is currently
>  * under progress.
>  *
>  * The data is double-buffered within each edit log implementation so that
>  * in-memory writing can occur in parallel with the on-disk writing.
>  *
>  * Each sync occurs in three steps:
>  *   1. synchronized, it swaps the double buffer and sets the isSyncRunning
>  *  flag.
>  *   2. unsynchronized, it flushes the data to storage
>  *   3. synchronized, it resets the flag and notifies anyone waiting on the
>  *  sync.
>  *
>  * The lack of synchronization on step 2 allows other threads to continue
>  * to write into the memory buffer while the sync is in progress.
>  * Because this step is unsynchronized, actions that need to avoid
>  * concurrency with sync() should be synchronized and also call
>  * waitForSyncToFinish() before assuming they are running alone.
>  */
> public void logSync() {
>   long syncStart = 0;
>   // Fetch the transactionId of this thread. 
>   long mytxid = myTransactionId.get().txid;
>   
>   boolean sync = false;
>   try {
> EditLogOutputStream logStream = null;
> synchronized (this) {
>   try {
> printStatistics(false);
> // if somebody is already syncing, then wait
> while (mytxid > synctxid && isSyncRunning) {
>   try {
> wait(1000);
>   } catch (InterruptedException ie) {
>   }
> }
> //
> // If this transaction was already flushed, then nothing to do
> //
> if (mytxid <= synctxid) {
>   numTransactionsBatchedInSync++;
>   if (metrics != null) {
> // Metrics is non-null only when used inside name node
> metrics.incrTransactionsBatchedInSync();
>   }
>   return;
> }
>
> // now, this thread will do the sync
> syncStart = txid;
> isSyncRunning = true;
> sync = true;
> // swap buffers
> try {
>   if (journalSet.isEmpty()) {
> throw new IOException("No journals available to flush");
>   }
>   

[jira] [Commented] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not

2020-12-03 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17243763#comment-17243763
 ] 

Ravuri Sushma sree commented on HDFS-14437:
---

I used the HDFS-14437.reproduction.patch provided, and it seems that the PR 
resolved the "java.lang.IllegalArgumentException: LastWrittenTxId 558 is 
expected to be the same as lastSyncedTxId 532" error. [~angerszhuuu] 
[~hexiaoqiao] Do you think there is any impact?

> Exception happened when rollEditLog expects empty 
> EditsDoubleBuffer.bufCurrent but not
> -
>
> Key: HDFS-14437
> URL: https://issues.apache.org/jira/browse/HDFS-14437
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode, qjm
>Reporter: angerszhu
>Priority: Major
> Attachments: HDFS-14437.reproduction.patch, 
> HDFS-14437.reproductionwithlog.patch, screenshot-1.png
>
>
> For the problem mentioned in https://issues.apache.org/jira/browse/HDFS-10943, 
> I traced the process of writing and flushing the EditLog and some of the 
> important functions. I found that in the FSEditLog class, the close() function 
> calls the following:
>  
> {code:java}
> waitForSyncToFinish();
> endCurrentLogSegment(true);{code}
> Since we hold the object lock in close(), when waitForSyncToFinish() returns, 
> it means all logSync work has finished and all data in bufReady has been 
> flushed out. And since the current thread holds the lock on this object, when 
> endCurrentLogSegment() is called, no other thread can acquire the lock, so no 
> other thread can write new edit log entries into currentBuf.
> But if we don't call waitForSyncToFinish() before endCurrentLogSegment(), 
> an auto-scheduled logSync()'s flush may still be in progress, since that 
> step does not need synchronization, as mentioned in the comment of the 
> logSync() method:
>  
> {code:java}
> /**
>  * Sync all modifications done by this thread.
>  *
>  * The internal concurrency design of this class is as follows:
>  *   - Log items are written synchronized into an in-memory buffer,
>  * and each assigned a transaction ID.
>  *   - When a thread (client) would like to sync all of its edits, logSync()
>  * uses a ThreadLocal transaction ID to determine what edit number must
>  * be synced to.
>  *   - The isSyncRunning volatile boolean tracks whether a sync is currently
>  * under progress.
>  *
>  * The data is double-buffered within each edit log implementation so that
>  * in-memory writing can occur in parallel with the on-disk writing.
>  *
>  * Each sync occurs in three steps:
>  *   1. synchronized, it swaps the double buffer and sets the isSyncRunning
>  *  flag.
>  *   2. unsynchronized, it flushes the data to storage
>  *   3. synchronized, it resets the flag and notifies anyone waiting on the
>  *  sync.
>  *
>  * The lack of synchronization on step 2 allows other threads to continue
>  * to write into the memory buffer while the sync is in progress.
>  * Because this step is unsynchronized, actions that need to avoid
>  * concurrency with sync() should be synchronized and also call
>  * waitForSyncToFinish() before assuming they are running alone.
>  */
> public void logSync() {
>   long syncStart = 0;
>   // Fetch the transactionId of this thread. 
>   long mytxid = myTransactionId.get().txid;
>   
>   boolean sync = false;
>   try {
> EditLogOutputStream logStream = null;
> synchronized (this) {
>   try {
> printStatistics(false);
> // if somebody is already syncing, then wait
> while (mytxid > synctxid && isSyncRunning) {
>   try {
> wait(1000);
>   } catch (InterruptedException ie) {
>   }
> }
> //
> // If this transaction was already flushed, then nothing to do
> //
> if (mytxid <= synctxid) {
>   numTransactionsBatchedInSync++;
>   if (metrics != null) {
> // Metrics is non-null only when used inside name node
> metrics.incrTransactionsBatchedInSync();
>   }
>   return;
> }
>
> // now, this thread will do the sync
> syncStart = txid;
> isSyncRunning = true;
> sync = true;
> // swap buffers
> try {
>   if (journalSet.isEmpty()) {
> throw new IOException("No journals available to flush");
>   }
>   editLogStream.setReadyToFlush();
> } catch (IOException e) {
>   final String msg =
>   "Could not sync enough journals to persistent storage " +
>   "due to " + e.getMessage() + ". " +
>   "Unsynced transactions: " + (txid - synctxid);
>   LOG.fatal(msg, new Exception());
>   

[jira] [Commented] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not

2020-11-29 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240493#comment-17240493
 ] 

Ravuri Sushma sree commented on HDFS-14437:
---

Thank you, everyone, for the discussion here. I have tried applying the patch and 
the solution in the PR. [~hexiaoqiao], can you please let me know if any more 
updates are available on this JIRA?

> Exception happened when rollEditLog expects empty 
> EditsDoubleBuffer.bufCurrent but not
> -
>
> Key: HDFS-14437
> URL: https://issues.apache.org/jira/browse/HDFS-14437
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode, qjm
>Reporter: angerszhu
>Priority: Major
> Attachments: HDFS-14437.reproduction.patch, 
> HDFS-14437.reproductionwithlog.patch, screenshot-1.png
>
>
> For the problem mentioned in https://issues.apache.org/jira/browse/HDFS-10943, 
> I traced the process of writing and flushing the EditLog and some of the 
> important functions. I found that in the FSEditLog class, the close() function 
> calls the following:
>  
> {code:java}
> waitForSyncToFinish();
> endCurrentLogSegment(true);{code}
> Since we hold the object lock in close(), when waitForSyncToFinish() returns, 
> it means all logSync work has finished and all data in bufReady has been 
> flushed out. And since the current thread holds the lock on this object, when 
> endCurrentLogSegment() is called, no other thread can acquire the lock, so no 
> other thread can write new edit log entries into currentBuf.
> But if we don't call waitForSyncToFinish() before endCurrentLogSegment(), 
> an auto-scheduled logSync()'s flush may still be in progress, since that 
> step does not need synchronization, as mentioned in the comment of the 
> logSync() method:
>  
> {code:java}
> /**
>  * Sync all modifications done by this thread.
>  *
>  * The internal concurrency design of this class is as follows:
>  *   - Log items are written synchronized into an in-memory buffer,
>  * and each assigned a transaction ID.
>  *   - When a thread (client) would like to sync all of its edits, logSync()
>  * uses a ThreadLocal transaction ID to determine what edit number must
>  * be synced to.
>  *   - The isSyncRunning volatile boolean tracks whether a sync is currently
>  * under progress.
>  *
>  * The data is double-buffered within each edit log implementation so that
>  * in-memory writing can occur in parallel with the on-disk writing.
>  *
>  * Each sync occurs in three steps:
>  *   1. synchronized, it swaps the double buffer and sets the isSyncRunning
>  *  flag.
>  *   2. unsynchronized, it flushes the data to storage
>  *   3. synchronized, it resets the flag and notifies anyone waiting on the
>  *  sync.
>  *
>  * The lack of synchronization on step 2 allows other threads to continue
>  * to write into the memory buffer while the sync is in progress.
>  * Because this step is unsynchronized, actions that need to avoid
>  * concurrency with sync() should be synchronized and also call
>  * waitForSyncToFinish() before assuming they are running alone.
>  */
> public void logSync() {
>   long syncStart = 0;
>   // Fetch the transactionId of this thread. 
>   long mytxid = myTransactionId.get().txid;
>   
>   boolean sync = false;
>   try {
> EditLogOutputStream logStream = null;
> synchronized (this) {
>   try {
> printStatistics(false);
> // if somebody is already syncing, then wait
> while (mytxid > synctxid && isSyncRunning) {
>   try {
> wait(1000);
>   } catch (InterruptedException ie) {
>   }
> }
> //
> // If this transaction was already flushed, then nothing to do
> //
> if (mytxid <= synctxid) {
>   numTransactionsBatchedInSync++;
>   if (metrics != null) {
> // Metrics is non-null only when used inside name node
> metrics.incrTransactionsBatchedInSync();
>   }
>   return;
> }
>
> // now, this thread will do the sync
> syncStart = txid;
> isSyncRunning = true;
> sync = true;
> // swap buffers
> try {
>   if (journalSet.isEmpty()) {
> throw new IOException("No journals available to flush");
>   }
>   editLogStream.setReadyToFlush();
> } catch (IOException e) {
>   final String msg =
>   "Could not sync enough journals to persistent storage " +
>   "due to " + e.getMessage() + ". " +
>   "Unsynced transactions: " + (txid - synctxid);
>   LOG.fatal(msg, new Exception());
>   synchronized(journalSetLock) {
> IOUtils.cleanup(LOG, journalSet);
>  

[jira] [Commented] (HDFS-15422) Reported IBR is partially replaced with stored info when queuing.

2020-11-29 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240491#comment-17240491
 ] 

Ravuri Sushma sree commented on HDFS-15422:
---

Thank you, everyone, for the discussion here. Can anyone let me know if this 
issue is reproducible in a UT?

> Reported IBR is partially replaced with stored info when queuing.
> -
>
> Key: HDFS-15422
> URL: https://issues.apache.org/jira/browse/HDFS-15422
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Kihwal Lee
>Priority: Critical
> Attachments: HDFS-15422-branch-2.10.001.patch
>
>
> When queueing an IBR (incremental block report) on a standby namenode, some 
> of the reported information is being replaced with the existing stored 
> information.  This can lead to false block corruption.
> We had a namenode, after transitioning to active, started reporting missing 
> blocks with "SIZE_MISMATCH" as corrupt reason. These were blocks that were 
> appended and the sizes were actually correct on the datanodes. Upon further 
> investigation, it was determined that the namenode was queueing IBRs with 
> altered information.
> Although it sounds bad, I am not making it blocker 






[jira] [Commented] (HDFS-15550) Remove unused imports from TestFileTruncate.java

2020-08-31 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17187493#comment-17187493
 ] 

Ravuri Sushma sree commented on HDFS-15550:
---

Thanks for the review [~hexiaoqiao]

> Remove unused imports from TestFileTruncate.java
> 
>
> Key: HDFS-15550
> URL: https://issues.apache.org/jira/browse/HDFS-15550
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Minor
> Attachments: HDFS-15550.001.patch
>
>
> {{import org.apache.hadoop.fs.BlockLocation and import org.junit.Assert 
> remain unused in }}{{TestFileTruncate.java}}






[jira] [Commented] (HDFS-15494) TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica Fails on Windows

2020-08-31 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17187463#comment-17187463
 ] 

Ravuri Sushma sree commented on HDFS-15494:
---

The test failures reported above are not related to this JIRA.

> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> Fails on Windows
> ---
>
> Key: HDFS-15494
> URL: https://issues.apache.org/jira/browse/HDFS-15494
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15494.001.patch
>
>
> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> fails on Windows because renaming a replica from RBW to Finalized is not 
> supported there.
> The test should be skipped on Windows.






[jira] [Commented] (HDFS-15550) Remove unused imports from TestFileTruncate.java

2020-08-30 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17187297#comment-17187297
 ] 

Ravuri Sushma sree commented on HDFS-15550:
---

Thanks for the review [~brahmareddy]

> Remove unused imports from TestFileTruncate.java
> 
>
> Key: HDFS-15550
> URL: https://issues.apache.org/jira/browse/HDFS-15550
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Minor
> Attachments: HDFS-15550.001.patch
>
>
> {{import org.apache.hadoop.fs.BlockLocation and import org.junit.Assert 
> remain unused in }}{{TestFileTruncate.java}}






[jira] [Updated] (HDFS-15550) Remove unused imports from TestFileTruncate.java

2020-08-30 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15550:
--
Attachment: HDFS-15550.001.patch
Status: Patch Available  (was: Open)

> Remove unused imports from TestFileTruncate.java
> 
>
> Key: HDFS-15550
> URL: https://issues.apache.org/jira/browse/HDFS-15550
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Minor
> Attachments: HDFS-15550.001.patch
>
>
> {{import org.apache.hadoop.fs.BlockLocation and import org.junit.Assert 
> remain unused in }}{{TestFileTruncate.java}}






[jira] [Created] (HDFS-15550) Remove unused imports from TestFileTruncate.java

2020-08-30 Thread Ravuri Sushma sree (Jira)
Ravuri Sushma sree created HDFS-15550:
-

 Summary: Remove unused imports from TestFileTruncate.java
 Key: HDFS-15550
 URL: https://issues.apache.org/jira/browse/HDFS-15550
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravuri Sushma sree
Assignee: Ravuri Sushma sree


{{import org.apache.hadoop.fs.BlockLocation and import org.junit.Assert remain 
unused in}}

{{ }}






[jira] [Updated] (HDFS-15550) Remove unused imports from TestFileTruncate.java

2020-08-30 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15550:
--
Description: {{import org.apache.hadoop.fs.BlockLocation and import 
org.junit.Assert remain unused in }}{{TestFileTruncate.java}}  (was: {{import 
org.apache.hadoop.fs.BlockLocation and import org.junit.Assert remain unused 
in}}

{{ }})

> Remove unused imports from TestFileTruncate.java
> 
>
> Key: HDFS-15550
> URL: https://issues.apache.org/jira/browse/HDFS-15550
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Minor
>
> {{import org.apache.hadoop.fs.BlockLocation and import org.junit.Assert 
> remain unused in }}{{TestFileTruncate.java}}






[jira] [Updated] (HDFS-15222) HDFS: Output message of "hdfs fsck -list-corruptfileblocks" command is not correct

2020-08-15 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15222:
--
Attachment: HDFS-15222.002.patch
Status: Patch Available  (was: Open)

> HDFS: Output message of "hdfs fsck -list-corruptfileblocks" command is not 
> correct
> ---
>
> Key: HDFS-15222
> URL: https://issues.apache.org/jira/browse/HDFS-15222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, tools
>Affects Versions: 3.1.1
> Environment: 3 node HA cluster
>Reporter: Souryakanta Dwivedy
>Assignee: Ravuri Sushma sree
>Priority: Minor
> Attachments: HDFS-15222.001.patch, HDFS-15222.002.patch, output1.PNG, 
> output2.PNG
>
>
> Output message of the "hdfs fsck -list-corruptfileblocks" command is not correct.
>  
> Steps:
>  * Create a directory and put files
>  * Corrupt the file blocks
>  * Check the corrupted file blocks with the "hdfs fsck 
> -list-corruptfileblocks" command
> It displays the corrupted file blocks with the message "The list of corrupt 
> files under path '/path' are:" at the beginning, which is wrong.
> The message at the end of the output is also wrong: "The filesystem under 
> path '/path' has  CORRUPT files"
>  
> Actual output: "The list of corrupt files under path '/path' are:"
>                "The filesystem under path '/path' has  CORRUPT files"
> Expected output: "The list of corrupted file blocks under path '/path' are:"
>                  "The filesystem under path '/path' has  CORRUPT file blocks"
>  






[jira] [Updated] (HDFS-15222) HDFS: Output message of "hdfs fsck -list-corruptfileblocks" command is not correct

2020-08-15 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15222:
--
Status: Open  (was: Patch Available)

> HDFS: Output message of "hdfs fsck -list-corruptfileblocks" command is not 
> correct
> ---
>
> Key: HDFS-15222
> URL: https://issues.apache.org/jira/browse/HDFS-15222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, tools
>Affects Versions: 3.1.1
> Environment: 3 node HA cluster
>Reporter: Souryakanta Dwivedy
>Assignee: Ravuri Sushma sree
>Priority: Minor
> Attachments: HDFS-15222.001.patch, output1.PNG, output2.PNG
>
>
> Output message of the "hdfs fsck -list-corruptfileblocks" command is not correct.
>  
> Steps:-
>  * Create a directory and put files
>  * Corrupt the file blocks
>  * Check the corrupted file blocks with the "hdfs fsck -list-corruptfileblocks" 
> command
> It displays the corrupted file blocks with the message "The list of corrupt 
> files under path '/path' are:" at the beginning, which is wrong.
> At the end of the output, the wrong message "The filesystem under path 
> '/path' has  CORRUPT files" is also displayed.
>  
> Actual output : "The list of corrupt files under path '/path' are:"
>                            "The filesystem under path '/path' has  
> CORRUPT files"
> Expected output : "The list of corrupted file blocks under path '/path' are:"
>                               "The filesystem under path '/path' has  
> CORRUPT file blocks"
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15534) FsShell prints usage for all dfs commands for illegal arguments

2020-08-15 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15534:
--
Attachment: HDFS-15534.001.patch
Status: Patch Available  (was: Open)

> FsShell prints usage for all dfs commands for illegal arguments
> ---
>
> Key: HDFS-15534
> URL: https://issues.apache.org/jira/browse/HDFS-15534
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15534.001.patch
>
>
> FsShell.java prints usage for all commands when incorrect arguments are 
> passed for any dfs command. Usage should be printed only for the command 
> that was actually used.
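
A small standalone illustration of the intended behavior; the command table 
and method names below are hypothetical, not FsShell's actual fields:

{code:java}
import java.util.Map;

public class UsageDemo {
  // Hypothetical command table; FsShell's real command registry differs.
  private static final Map<String, String> USAGE = Map.of(
      "-mkdir", "-mkdir [-p] <path> ...",
      "-cat", "-cat [-ignoreCrc] <src> ...");

  static void reportIllegalArgs(String cmd, String message) {
    System.err.println(cmd + ": " + message);
    // Print usage for the failing command only, not the whole command table.
    String usage = USAGE.get(cmd);
    if (usage != null) {
      System.err.println("Usage: hadoop fs " + usage);
    }
  }

  public static void main(String[] args) {
    reportIllegalArgs("-mkdir", "Illegal option -q");
  }
}
{code}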



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15534) FsShell prints usage for all dfs commands for illegal arguments

2020-08-15 Thread Ravuri Sushma sree (Jira)
Ravuri Sushma sree created HDFS-15534:
-

 Summary: FsShell prints usage for all dfs commands for illegal 
arguments
 Key: HDFS-15534
 URL: https://issues.apache.org/jira/browse/HDFS-15534
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravuri Sushma sree
Assignee: Ravuri Sushma sree


FsShell.java prints usage for all commands when incorrect arguments are passed 
for any dfs command. Usage should be printed only for the command that was 
actually used.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15222) HDFS: Output message of ""hdfs fsck -list-corruptfileblocks" command is not correct

2020-08-09 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15222:
--
Attachment: HDFS-15222.001.patch
Status: Patch Available  (was: Open)

> HDFS: Output message of ""hdfs fsck -list-corruptfileblocks" command is not 
> correct
> ---
>
> Key: HDFS-15222
> URL: https://issues.apache.org/jira/browse/HDFS-15222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, tools
>Affects Versions: 3.1.1
> Environment: 3 node HA cluster
>Reporter: Souryakanta Dwivedy
>Assignee: Ravuri Sushma sree
>Priority: Minor
> Attachments: HDFS-15222.001.patch, output1.PNG, output2.PNG
>
>
> Output message of the "hdfs fsck -list-corruptfileblocks" command is not correct.
>  
> Steps:-
>  * Create a directory and put files
>  * Corrupt the file blocks
>  * Check the corrupted file blocks with the "hdfs fsck -list-corruptfileblocks" 
> command
> It displays the corrupted file blocks with the message "The list of corrupt 
> files under path '/path' are:" at the beginning, which is wrong.
> At the end of the output, the wrong message "The filesystem under path 
> '/path' has  CORRUPT files" is also displayed.
>  
> Actual output : "The list of corrupt files under path '/path' are:"
>                            "The filesystem under path '/path' has  
> CORRUPT files"
> Expected output : "The list of corrupted file blocks under path '/path' are:"
>                               "The filesystem under path '/path' has  
> CORRUPT file blocks"
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15229) Truncate info should be logged at INFO level

2020-08-01 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169376#comment-17169376
 ] 

Ravuri Sushma sree commented on HDFS-15229:
---

Hi [~hemanthboyina], thank you for reviewing.

I have added a rebased patch; please have a look.

>  Truncate info should be logged at INFO level
> -
>
> Key: HDFS-15229
> URL: https://issues.apache.org/jira/browse/HDFS-15229
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15229.001.patch, HDFS-15229.002.patch
>
>
> In the NN log and audit log, we can't find the truncate size.
> Logs related to truncate are captured at DEBUG level, and it is important 
> that the NN log the newLength of the truncate.
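
A minimal sketch of the proposed change, assuming an SLF4J-style logger as 
used in the NameNode; the logger name and message format here are 
illustrative, not the exact FSNamesystem code:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hedged sketch: log truncate at INFO, including the new length, so it is
// visible in default NN logs instead of only at DEBUG.
public class TruncateLogSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(TruncateLogSketch.class);

  static void logTruncate(String src, long newLength) {
    LOG.info("DIR* NameSystem.truncate: src={} newLength={}", src, newLength);
  }
}
{code}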



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15229) Truncate info should be logged at INFO level

2020-08-01 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15229:
--
Attachment: HDFS-15229.002.patch

>  Truncate info should be logged at INFO level
> -
>
> Key: HDFS-15229
> URL: https://issues.apache.org/jira/browse/HDFS-15229
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15229.001.patch, HDFS-15229.002.patch
>
>
> In the NN log and audit log, we can't find the truncate size.
> Logs related to truncate are captured at DEBUG level, and it is important 
> that the NN log the newLength of the truncate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15229) Truncate info should be logged at INFO level

2020-07-28 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15229:
--
Attachment: HDFS-15229.001.patch
Status: Patch Available  (was: Open)

>  Truncate info should be logged at INFO level
> -
>
> Key: HDFS-15229
> URL: https://issues.apache.org/jira/browse/HDFS-15229
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15229.001.patch
>
>
> In the NN log and audit log, we can't find the truncate size.
> Logs related to truncate are captured at DEBUG level, and it is important 
> that the NN log the newLength of the truncate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15229) Truncate info should be logged at INFO level

2020-07-28 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15229:
--
Status: Open  (was: Patch Available)

>  Truncate info should be logged at INFO level
> -
>
> Key: HDFS-15229
> URL: https://issues.apache.org/jira/browse/HDFS-15229
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
>
> In the NN log and audit log, we can't find the truncate size.
> Logs related to truncate are captured at DEBUG level, and it is important 
> that the NN log the newLength of the truncate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15229) Truncate info should be logged at INFO level

2020-07-28 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15229:
--
Attachment: (was: HDFS-15229.001.patch)

>  Truncate info should be logged at INFO level
> -
>
> Key: HDFS-15229
> URL: https://issues.apache.org/jira/browse/HDFS-15229
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
>
> In the NN log and audit log, we can't find the truncate size.
> Logs related to truncate are captured at DEBUG level, and it is important 
> that the NN log the newLength of the truncate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15123) Remove unnecessary null check in FoldedTreeSet

2020-07-28 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15123:
--
Attachment: (was: HDFS-15123.001.patch)

> Remove unnecessary null check in FoldedTreeSet
> --
>
> Key: HDFS-15123
> URL: https://issues.apache.org/jira/browse/HDFS-15123
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Minor
> Fix For: 3.1.1
>
> Attachments: HDFS-15123.001.patch
>
>
> The *if (toMoveUp.left != null)* and *if (toMoveUp.right != null)* null checks 
> are not necessary, as the enclosing if and else if conditions already 
> guarantee that node.left and node.right are non-null in this branch
> {code:java}
> private void deleteNode(final Node node) {
>   if (node.right == null) {
>     if (node.left != null) {
>       attachToParent(node, node.left);
>     } else {
>       attachNullToParent(node);
>     }
>   } else if (node.left == null) {
>     attachToParent(node, node.right);
>   } else {
>     // node.left != null && node.right != null
>     // node.next should replace node in tree
>     // node.next != null guaranteed since node.left != null
>     // node.next.left == null since node.next.prev is node
>     // node.next.right may be null or non-null
>     Node toMoveUp = node.next;
>     if (toMoveUp.right == null) {
>       attachNullToParent(toMoveUp);
>     } else {
>       attachToParent(toMoveUp, toMoveUp.right);
>     }
>     toMoveUp.left = node.left;
>     if (toMoveUp.left != null) {
>       toMoveUp.left.parent = toMoveUp;
>     }
>     toMoveUp.right = node.right;
>     if (toMoveUp.right != null) {
>       toMoveUp.right.parent = toMoveUp;
>     }
>     attachToParentNoBalance(node, toMoveUp);
>     toMoveUp.color = node.color;
>   }
> }
> {code}
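
A sketch of the simplified tail of the else branch once the redundant checks 
are dropped (a fragment of the method above, not a standalone compile unit):

{code:java}
// node.left and node.right are guaranteed non-null in this branch, so the
// parent pointers can be rewired unconditionally.
toMoveUp.left = node.left;
toMoveUp.left.parent = toMoveUp;
toMoveUp.right = node.right;
toMoveUp.right.parent = toMoveUp;
attachToParentNoBalance(node, toMoveUp);
toMoveUp.color = node.color;
{code}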



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15123) Remove unnecessary null check in FoldedTreeSet

2020-07-28 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15123:
--
Attachment: HDFS-15123.001.patch
Status: Patch Available  (was: Open)

> Remove unnecessary null check in FoldedTreeSet
> --
>
> Key: HDFS-15123
> URL: https://issues.apache.org/jira/browse/HDFS-15123
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Minor
> Fix For: 3.1.1
>
> Attachments: HDFS-15123.001.patch
>
>
> The *if (toMoveUp.left != null)* and *if (toMoveUp.right != null)* null checks 
> are not necessary, as the enclosing if and else if conditions already 
> guarantee that node.left and node.right are non-null in this branch
> {code:java}
> private void deleteNode(final Node node) {
>   if (node.right == null) {
>     if (node.left != null) {
>       attachToParent(node, node.left);
>     } else {
>       attachNullToParent(node);
>     }
>   } else if (node.left == null) {
>     attachToParent(node, node.right);
>   } else {
>     // node.left != null && node.right != null
>     // node.next should replace node in tree
>     // node.next != null guaranteed since node.left != null
>     // node.next.left == null since node.next.prev is node
>     // node.next.right may be null or non-null
>     Node toMoveUp = node.next;
>     if (toMoveUp.right == null) {
>       attachNullToParent(toMoveUp);
>     } else {
>       attachToParent(toMoveUp, toMoveUp.right);
>     }
>     toMoveUp.left = node.left;
>     if (toMoveUp.left != null) {
>       toMoveUp.left.parent = toMoveUp;
>     }
>     toMoveUp.right = node.right;
>     if (toMoveUp.right != null) {
>       toMoveUp.right.parent = toMoveUp;
>     }
>     attachToParentNoBalance(node, toMoveUp);
>     toMoveUp.color = node.color;
>   }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15494) TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica Fails on Windows

2020-07-28 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15494:
--
Attachment: HDFS-15494.001.patch
Status: Patch Available  (was: Open)

> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> Fails on Windows
> ---
>
> Key: HDFS-15494
> URL: https://issues.apache.org/jira/browse/HDFS-15494
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15494.001.patch
>
>
> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> fails on Windows because renaming an RBW replica to finalized is not 
> supported there.
> The test should be skipped on Windows.
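
A minimal sketch of the skip guard, assuming JUnit 4's Assume and Hadoop's 
Shell.WINDOWS flag; the exact placement inside the test is an assumption:

{code:java}
import org.apache.hadoop.util.Shell;
import org.junit.Assume;
import org.junit.Test;

public class SkipOnWindowsSketch {
  @Test
  public void testReplicaCachingGetSpaceUsedByRBWReplica() throws Exception {
    // Skip (rather than fail) on Windows, where the RBW -> finalized rename
    // performed by this test is not supported.
    Assume.assumeFalse("RBW rename to finalized is unsupported on Windows",
        Shell.WINDOWS);
    // ... rest of the test ...
  }
}
{code}

Hadoop's test utilities also provide an assumeNotWindows() helper 
(org.apache.hadoop.test.PlatformAssumptions) that could be used instead, if 
available on the branch.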



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15494) TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica Fails on Windows

2020-07-28 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15494:
--
Attachment: (was: HDFS-15494.001.patch)

> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> Fails on Windows
> ---
>
> Key: HDFS-15494
> URL: https://issues.apache.org/jira/browse/HDFS-15494
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15494.001.patch
>
>
> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> fails on Windows because renaming an RBW replica to finalized is not 
> supported there.
> The test should be skipped on Windows.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15229) Truncate info should be logged at INFO level

2020-07-28 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15229:
--
Attachment: HDFS-15229.001.patch
Status: Patch Available  (was: Open)

>  Truncate info should be logged at INFO level
> -
>
> Key: HDFS-15229
> URL: https://issues.apache.org/jira/browse/HDFS-15229
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15229.001.patch
>
>
> In the NN log and audit log, we can't find the truncate size.
> Logs related to truncate are captured at DEBUG level, and it is important 
> that the NN log the newLength of the truncate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15229) Truncate info should be logged at INFO level

2020-07-28 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15229:
--
Attachment: (was: HDFS-15229.001.patch)

>  Truncate info should be logged at INFO level
> -
>
> Key: HDFS-15229
> URL: https://issues.apache.org/jira/browse/HDFS-15229
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15229.001.patch
>
>
> In the NN log and audit log, we can't find the truncate size.
> Logs related to truncate are captured at DEBUG level, and it is important 
> that the NN log the newLength of the truncate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15229) Truncate info should be logged at INFO level

2020-07-27 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165911#comment-17165911
 ] 

Ravuri Sushma sree commented on HDFS-15229:
---

[~brahma],

Can you please check this and review the attached patch?

>  Truncate info should be logged at INFO level
> -
>
> Key: HDFS-15229
> URL: https://issues.apache.org/jira/browse/HDFS-15229
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15229.001.patch
>
>
> In the NN log and audit log, we can't find the truncate size.
> Logs related to truncate are captured at DEBUG level, and it is important 
> that the NN log the newLength of the truncate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15494) TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica Fails on Windows

2020-07-27 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17165817#comment-17165817
 ] 

Ravuri Sushma sree commented on HDFS-15494:
---

Attached the patch skipping the test on Windows. Please review.

> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> Fails on Windows
> ---
>
> Key: HDFS-15494
> URL: https://issues.apache.org/jira/browse/HDFS-15494
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15494.001.patch
>
>
> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> fails on Windows because renaming an RBW replica to finalized is not 
> supported there.
> The test should be skipped on Windows.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15494) TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica Fails on Windows

2020-07-27 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15494:
--
Attachment: HDFS-15494.001.patch

> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> Fails on Windows
> ---
>
> Key: HDFS-15494
> URL: https://issues.apache.org/jira/browse/HDFS-15494
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15494.001.patch
>
>
> TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
> fails on Windows because renaming an RBW replica to finalized is not 
> supported there.
> The test should be skipped on Windows.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15494) TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica Fails on Windows

2020-07-27 Thread Ravuri Sushma sree (Jira)
Ravuri Sushma sree created HDFS-15494:
-

 Summary: TestReplicaCachingGetSpaceUsed 
#testReplicaCachingGetSpaceUsedByRBWReplica Fails on Windows
 Key: HDFS-15494
 URL: https://issues.apache.org/jira/browse/HDFS-15494
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravuri Sushma sree
Assignee: Ravuri Sushma sree


TestReplicaCachingGetSpaceUsed #testReplicaCachingGetSpaceUsedByRBWReplica 
fails on Windows because renaming an RBW replica to finalized is not supported 
there.

The test should be skipped on Windows.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15229) Truncate info should be logged at INFO level

2020-03-21 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15229:
--
Attachment: HDFS-15229.001.patch

>  Truncate info should be logged at INFO level
> -
>
> Key: HDFS-15229
> URL: https://issues.apache.org/jira/browse/HDFS-15229
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15229.001.patch
>
>
> In the NN log and audit log, we can't find the truncate size.
> Logs related to truncate are captured at DEBUG level, and it is important 
> that the NN log the newLength of the truncate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15229) Truncate info should be logged at INFO level

2020-03-18 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15229:
--
Attachment: (was: HDFS-15229.001.patch)

>  Truncate info should be logged at INFO level
> -
>
> Key: HDFS-15229
> URL: https://issues.apache.org/jira/browse/HDFS-15229
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
>
> In the NN log and audit log, we can't find the truncate size.
> Logs related to truncate are captured at DEBUG level, and it is important 
> that the NN log the newLength of the truncate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15229) Truncate info should be logged at INFO level

2020-03-18 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15229:
--
Attachment: HDFS-15229.001.patch

>  Truncate info should be logged at INFO level
> -
>
> Key: HDFS-15229
> URL: https://issues.apache.org/jira/browse/HDFS-15229
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15229.001.patch
>
>
> In the NN log and audit log, we can't find the truncate size.
> Logs related to truncate are captured at DEBUG level, and it is important 
> that the NN log the newLength of the truncate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15229) Truncate info should be logged at INFO level

2020-03-18 Thread Ravuri Sushma sree (Jira)
Ravuri Sushma sree created HDFS-15229:
-

 Summary:  Truncate info should be logged at INFO level
 Key: HDFS-15229
 URL: https://issues.apache.org/jira/browse/HDFS-15229
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravuri Sushma sree
Assignee: Ravuri Sushma sree


In the NN log and audit log, we can't find the truncate size.

Logs related to truncate are captured at DEBUG level, and it is important that 
the NN log the newLength of the truncate.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-15222) HDFS: Output message of ""hdfs fsck -list-corruptfileblocks" command is not correct

2020-03-15 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree reassigned HDFS-15222:
-

Assignee: Ravuri Sushma sree

> HDFS: Output message of ""hdfs fsck -list-corruptfileblocks" command is not 
> correct
> ---
>
> Key: HDFS-15222
> URL: https://issues.apache.org/jira/browse/HDFS-15222
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, tools
>Affects Versions: 3.1.1
> Environment: 3 node HA cluster
>Reporter: Souryakanta Dwivedy
>Assignee: Ravuri Sushma sree
>Priority: Minor
> Attachments: output1.PNG, output2.PNG
>
>
> Output message of the "hdfs fsck -list-corruptfileblocks" command is not correct.
>  
> Steps:-
>  * Create a directory and put files
>  * Corrupt the file blocks
>  * Check the corrupted file blocks with the "hdfs fsck -list-corruptfileblocks" 
> command
> It displays the corrupted file blocks with the message "The list of corrupt 
> files under path '/path' are:" at the beginning, which is wrong.
> At the end of the output, the wrong message "The filesystem under path 
> '/path' has  CORRUPT files" is also displayed.
>  
> Actual output : "The list of corrupt files under path '/path' are:"
>                            "The filesystem under path '/path' has  
> CORRUPT files"
> Expected output : "The list of corrupted file blocks under path '/path' are:"
>                               "The filesystem under path '/path' has  
> CORRUPT file blocks"
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15134) Any write calls with REST API on Standby NN print error message with wrong online help URL

2020-03-15 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15134:
--
Attachment: HDFS-15134.001.patch

> Any write calls with REST API on Standby NN print error message with wrong 
> online help URL
> --
>
> Key: HDFS-15134
> URL: https://issues.apache.org/jira/browse/HDFS-15134
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Renukaprasad C
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15134.001.patch
>
>
> vm2:/opt# curl -k -i --negotiate -u : 
> "http://IP:PORT/webhdfs/v1/test?op=MKDIRS"
> HTTP/1.1 403 Forbidden
> Date: Mon, 20 Jan 2020 07:28:19 GMT
> Cache-Control: no-cache
> Expires: Mon, 20 Jan 2020 07:28:20 GMT
> Date: Mon, 20 Jan 2020 07:28:20 GMT
> Pragma: no-cache
> X-FRAME-OPTIONS: SAMEORIGIN
> Content-Type: application/json
> Transfer-Encoding: chunked
> {"RemoteException":{"exception":"StandbyException","javaClassName":"org.apache.hadoop.ipc.StandbyException","message":"Operation
>  category WRITE is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error"}}
> The link https://s.apache.org/sbnn-error is invalid (the page does not 
> exist). This needs to be updated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15203) A bug in ViewFileSystemBaseTest

2020-03-15 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17059655#comment-17059655
 ] 

Ravuri Sushma sree commented on HDFS-15203:
---

Hi [~kihwal], I have added a patch. Can you please review?

> A bug in ViewFileSystemBaseTest
> ---
>
> Key: HDFS-15203
> URL: https://issues.apache.org/jira/browse/HDFS-15203
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Priority: Trivial
> Attachments: HDFS-15203.001.patch
>
>
> Missing an assignment here:
> {code:java}
>   @Test
>   public void testGetBlockLocations() throws IOException {
> ...
> // Same test but now get it via the FileStatus Parameter
> fsView.getFileBlockLocations(
> fsView.getFileStatus(viewFilePath), 0, 10240+100);
> targetBL = fsTarget.getFileBlockLocations(
> fsTarget.getFileStatus(targetFilePath), 0, 10240+100);
> compareBLs(viewBL, targetBL);
>  {code}
> But more importantly, I am not sure what is the difference between this and 
> the previous check. Are they redundant?
> {code:java}
> BlockLocation[] viewBL = 
> fsView.getFileBlockLocations(fsView.getFileStatus(viewFilePath), 0, 
> 10240+100);
> Assert.assertEquals(SupportsBlocks ? 10 : 1, viewBL.length);
> BlockLocation[] targetBL = 
> fsTarget.getFileBlockLocations(fsTarget.getFileStatus(targetFilePath), 0, 
> 10240+100);
> compareBLs(viewBL, targetBL);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15203) A bug in ViewFileSystemBaseTest

2020-03-15 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15203:
--
Attachment: HDFS-15203.001.patch

> A bug in ViewFileSystemBaseTest
> ---
>
> Key: HDFS-15203
> URL: https://issues.apache.org/jira/browse/HDFS-15203
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Priority: Trivial
> Attachments: HDFS-15203.001.patch
>
>
> Missing an assignment here:
> {code:java}
>   @Test
>   public void testGetBlockLocations() throws IOException {
> ...
> // Same test but now get it via the FileStatus Parameter
> fsView.getFileBlockLocations(
> fsView.getFileStatus(viewFilePath), 0, 10240+100);
> targetBL = fsTarget.getFileBlockLocations(
> fsTarget.getFileStatus(targetFilePath), 0, 10240+100);
> compareBLs(viewBL, targetBL);
>  {code}
> But more importantly, I am not sure what is the difference between this and 
> the previous check. Are they redundant?
> {code:java}
> BlockLocation[] viewBL = 
> fsView.getFileBlockLocations(fsView.getFileStatus(viewFilePath), 0, 
> 10240+100);
> Assert.assertEquals(SupportsBlocks ? 10 : 1, viewBL.length);
> BlockLocation[] targetBL = 
> fsTarget.getFileBlockLocations(fsTarget.getFileStatus(targetFilePath), 0, 
> 10240+100);
> compareBLs(viewBL, targetBL);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-15220) FSCK calls are redirecting to Active NN

2020-03-12 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree reassigned HDFS-15220:
-

Assignee: Ravuri Sushma sree

> FSCK calls are redirecting to Active NN
> ---
>
> Key: HDFS-15220
> URL: https://issues.apache.org/jira/browse/HDFS-15220
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: krishna reddy
>Assignee: Ravuri Sushma sree
>Priority: Major
>
> Any fsck call except -delete & -move should go to the ONN, as it is a read 
> operation. Run "hdfs fsck / -storagepolicies" and check the RPC calls for the 
> observer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15216) Wrong Use Case of -showprogress in fsck

2020-03-12 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17057856#comment-17057856
 ] 

Ravuri Sushma sree commented on HDFS-15216:
---

Hi [~sodonnell],

Thank you for following it up and reviewing. The failed test cases are not 
related to this Jira.

> Wrong Use Case of -showprogress in fsck 
> 
>
> Key: HDFS-15216
> URL: https://issues.apache.org/jira/browse/HDFS-15216
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15216.001.patch
>
>
> *-showprogress* is deprecated and progress is now shown by default, but fsck 
> --help still shows the incorrect usage description for it.
>  
> Usage: hdfs fsck  [-list-corruptfileblocks | [-move | -delete | 
> -openforwrite] [-files [-blocks [-locations | -racks | -replicaDetails | 
> -upgradedomains [-includeSnapshots] [-showprogress] [-storagepolicies] 
> [-maintenance] [-blockId ]
>   start checking from this path
> h4. 
>  -move move corrupted files to /lost+found
>  -delete delete corrupted files
>  -files print out files being checked
>  -openforwrite print out files opened for write
>  -includeSnapshots include snapshot data if the given path indicates a 
> snapshottable directory or there are snapshottable directories under it
>  -list-corruptfileblocks print out list of missing blocks and files they 
> belong to
>  -files -blocks print out block report
>  -files -blocks -locations print out locations for every block
>  -files -blocks -racks print out network topology for data-node locations
>  -files -blocks -replicaDetails print out each replica details
>  -files -blocks -upgradedomains print out upgrade domains for every block
>  -storagepolicies print out storage policy summary for the blocks
>  -maintenance print out maintenance state node details
>  *-showprogress show progress in output. Default is OFF (no progress)*
>  -blockId print out which file this blockId belongs to, locations (nodes, 
> racks) of this block, and other diagnostics info (under replicated, corrupted 
> or not, etc)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15216) Wrong Use Case of -showprogress in fsck

2020-03-10 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15216:
--
Attachment: HDFS-15216.001.patch

> Wrong Use Case of -showprogress in fsck 
> 
>
> Key: HDFS-15216
> URL: https://issues.apache.org/jira/browse/HDFS-15216
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15216.001.patch
>
>
> *-showprogress* is deprecated and progress is now shown by default, but fsck 
> --help still shows the incorrect usage description for it.
>  
> Usage: hdfs fsck  [-list-corruptfileblocks | [-move | -delete | 
> -openforwrite] [-files [-blocks [-locations | -racks | -replicaDetails | 
> -upgradedomains [-includeSnapshots] [-showprogress] [-storagepolicies] 
> [-maintenance] [-blockId ]
>   start checking from this path
> h4. 
>  -move move corrupted files to /lost+found
>  -delete delete corrupted files
>  -files print out files being checked
>  -openforwrite print out files opened for write
>  -includeSnapshots include snapshot data if the given path indicates a 
> snapshottable directory or there are snapshottable directories under it
>  -list-corruptfileblocks print out list of missing blocks and files they 
> belong to
>  -files -blocks print out block report
>  -files -blocks -locations print out locations for every block
>  -files -blocks -racks print out network topology for data-node locations
>  -files -blocks -replicaDetails print out each replica details
>  -files -blocks -upgradedomains print out upgrade domains for every block
>  -storagepolicies print out storage policy summary for the blocks
>  -maintenance print out maintenance state node details
>  *-showprogress show progress in output. Default is OFF (no progress)*
>  -blockId print out which file this blockId belongs to, locations (nodes, 
> racks) of this block, and other diagnostics info (under replicated, corrupted 
> or not, etc)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15216) Wrong Use Case of -showprogress in fsck

2020-03-10 Thread Ravuri Sushma sree (Jira)
Ravuri Sushma sree created HDFS-15216:
-

 Summary: Wrong Use Case of -showprogress in fsck 
 Key: HDFS-15216
 URL: https://issues.apache.org/jira/browse/HDFS-15216
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravuri Sushma sree


*-showprogress* is deprecated and progress is now shown by default, but fsck 
--help still shows the incorrect usage description for it.

 

Usage: hdfs fsck  [-list-corruptfileblocks | [-move | -delete | 
-openforwrite] [-files [-blocks [-locations | -racks | -replicaDetails | 
-upgradedomains [-includeSnapshots] [-showprogress] [-storagepolicies] 
[-maintenance] [-blockId ]
  start checking from this path
h4. 
 -move move corrupted files to /lost+found
 -delete delete corrupted files
 -files print out files being checked
 -openforwrite print out files opened for write
 -includeSnapshots include snapshot data if the given path indicates a 
snapshottable directory or there are snapshottable directories under it
 -list-corruptfileblocks print out list of missing blocks and files they belong 
to
 -files -blocks print out block report
 -files -blocks -locations print out locations for every block
 -files -blocks -racks print out network topology for data-node locations
 -files -blocks -replicaDetails print out each replica details
 -files -blocks -upgradedomains print out upgrade domains for every block
 -storagepolicies print out storage policy summary for the blocks
 -maintenance print out maintenance state node details
 *-showprogress show progress in output. Default is OFF (no progress)*
 -blockId print out which file this blockId belongs to, locations (nodes, 
racks) of this block, and other diagnostics info (under replicated, corrupted 
or not, etc)
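
A hedged sketch of what the corrected help line could look like; the exact 
wording is an assumption, and the flag would be kept only for backward 
compatibility:

{code}
 -showprogress  Deprecated. Progress is shown by default; this option has no effect.
{code}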



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-15216) Wrong Use Case of -showprogress in fsck

2020-03-10 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree reassigned HDFS-15216:
-

Assignee: Ravuri Sushma sree

> Wrong Use Case of -showprogress in fsck 
> 
>
> Key: HDFS-15216
> URL: https://issues.apache.org/jira/browse/HDFS-15216
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
>
> *-showprogress* is deprecated and progress is now shown by default, but fsck 
> --help still shows the incorrect usage description for it.
>  
> Usage: hdfs fsck  [-list-corruptfileblocks | [-move | -delete | 
> -openforwrite] [-files [-blocks [-locations | -racks | -replicaDetails | 
> -upgradedomains [-includeSnapshots] [-showprogress] [-storagepolicies] 
> [-maintenance] [-blockId ]
>   start checking from this path
> h4. 
>  -move move corrupted files to /lost+found
>  -delete delete corrupted files
>  -files print out files being checked
>  -openforwrite print out files opened for write
>  -includeSnapshots include snapshot data if the given path indicates a 
> snapshottable directory or there are snapshottable directories under it
>  -list-corruptfileblocks print out list of missing blocks and files they 
> belong to
>  -files -blocks print out block report
>  -files -blocks -locations print out locations for every block
>  -files -blocks -racks print out network topology for data-node locations
>  -files -blocks -replicaDetails print out each replica details
>  -files -blocks -upgradedomains print out upgrade domains for every block
>  -storagepolicies print out storage policy summary for the blocks
>  -maintenance print out maintenance state node details
>  *-showprogress show progress in output. Default is OFF (no progress)*
>  -blockId print out which file this blockId belongs to, locations (nodes, 
> racks) of this block, and other diagnostics info (under replicated, corrupted 
> or not, etc)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15135) EC : ArrayIndexOutOfBoundsException in BlockRecoveryWorker#RecoveryTaskStriped.

2020-03-08 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15135:
--
Attachment: HDFS-15135-branch-3.2.002.patch

> EC : ArrayIndexOutOfBoundsException in 
> BlockRecoveryWorker#RecoveryTaskStriped.
> ---
>
> Key: HDFS-15135
> URL: https://issues.apache.org/jira/browse/HDFS-15135
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15135-branch-3.2.001.patch, 
> HDFS-15135-branch-3.2.002.patch, HDFS-15135.001.patch, HDFS-15135.002.patch, 
> HDFS-15135.003.patch, HDFS-15135.004.patch, HDFS-15135.005.patch
>
>
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 8
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskStriped.recover(BlockRecoveryWorker.java:464)
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1.run(BlockRecoveryWorker.java:602)
>at java.lang.Thread.run(Thread.java:745) {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15203) A bug in ViewFileSystemBaseTest

2020-03-04 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17051342#comment-17051342
 ] 

Ravuri Sushma sree commented on HDFS-15203:
---

Hi [~kihwal],


I believe the intention behind this unit test was to get block locations 
using the following two overloaded methods in FileSystem.java:

 
{code:java}
public BlockLocation[] getFileBlockLocations(FileStatus file,
 long start, long len) throws IOException {{code}
and 
{code:java}
public BlockLocation[] getFileBlockLocations(Path p,
 long start, long len) throws IOException {{code}
Coming to ViewFsBaseTest, there is no such getFileBlockLocations method 
accepting a FileStatus parameter in FileContext.java, so this check:

 
{code:java}
// Same test but now get it via the FileStatus Parameter
 fcView.getFileBlockLocations(viewFilePath, 0, 10240+100);
 targetBL = fcTarget.getFileBlockLocations(targetFilePath, 0, 10240+100);{code}
looks redundant, and in my opinion it should be removed.
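
For the ViewFileSystemBaseTest side, a sketch of the missing-assignment fix 
flagged in the description below, using the same variables as the quoted 
snippet:

{code:java}
// Capture the FileStatus-based results so compareBLs() actually compares
// them, instead of re-using the values from the previous check.
viewBL = fsView.getFileBlockLocations(
    fsView.getFileStatus(viewFilePath), 0, 10240 + 100);
targetBL = fsTarget.getFileBlockLocations(
    fsTarget.getFileStatus(targetFilePath), 0, 10240 + 100);
compareBLs(viewBL, targetBL);
{code}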

> A bug in ViewFileSystemBaseTest
> ---
>
> Key: HDFS-15203
> URL: https://issues.apache.org/jira/browse/HDFS-15203
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Priority: Trivial
>
> Missing an assignment here:
> {code:java}
>   @Test
>   public void testGetBlockLocations() throws IOException {
> ...
> // Same test but now get it via the FileStatus Parameter
> fsView.getFileBlockLocations(
> fsView.getFileStatus(viewFilePath), 0, 10240+100);
> targetBL = fsTarget.getFileBlockLocations(
> fsTarget.getFileStatus(targetFilePath), 0, 10240+100);
> compareBLs(viewBL, targetBL);
>  {code}
> But more importantly, I am not sure what is the difference between this and 
> the previous check. Are they redundant?
> {code:java}
> BlockLocation[] viewBL = 
> fsView.getFileBlockLocations(fsView.getFileStatus(viewFilePath), 0, 
> 10240+100);
> Assert.assertEquals(SupportsBlocks ? 10 : 1, viewBL.length);
> BlockLocation[] targetBL = 
> fsTarget.getFileBlockLocations(fsTarget.getFileStatus(targetFilePath), 0, 
> 10240+100);
> compareBLs(viewBL, targetBL);
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14442) Disagreement between HAUtil.getAddressOfActive and RpcInvocationHandler.getConnectionId

2020-03-04 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17051284#comment-17051284
 ] 

Ravuri Sushma sree commented on HDFS-14442:
---

[~surendrasingh], can you please review?

> Disagreement between HAUtil.getAddressOfActive and 
> RpcInvocationHandler.getConnectionId
> ---
>
> Key: HDFS-14442
> URL: https://issues.apache.org/jira/browse/HDFS-14442
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-14442.001.patch, HDFS-14442.002.patch, 
> HDFS-14442.003.patch, HDFS-14442.004.patch
>
>
> While working on HDFS-14245, we noticed a discrepancy in some proxy-handling 
> code.
> The description of {{RpcInvocationHandler.getConnectionId()}} states:
> {code}
>   /**
>* Returns the connection id associated with the InvocationHandler instance.
>* @return ConnectionId
>*/
>   ConnectionId getConnectionId();
> {code}
> It does not make any claims about whether this connection ID will be an 
> active proxy or not. Yet in {{HAUtil}} we have:
> {code}
>   /**
>* Get the internet address of the currently-active NN. This should rarely 
> be
>* used, since callers of this method who connect directly to the NN using 
> the
>* resulting InetSocketAddress will not be able to connect to the active NN 
> if
>* a failover were to occur after this method has been called.
>* 
>* @param fs the file system to get the active address of.
>* @return the internet address of the currently-active NN.
>* @throws IOException if an error occurs while resolving the active NN.
>*/
>   public static InetSocketAddress getAddressOfActive(FileSystem fs)
>   throws IOException {
> if (!(fs instanceof DistributedFileSystem)) {
>   throw new IllegalArgumentException("FileSystem " + fs + " is not a 
> DFS.");
> }
> // force client address resolution.
> fs.exists(new Path("/"));
> DistributedFileSystem dfs = (DistributedFileSystem) fs;
> DFSClient dfsClient = dfs.getClient();
> return RPC.getServerAddress(dfsClient.getNamenode());
>   }
> {code}
> Where the call {{RPC.getServerAddress()}} eventually terminates into 
> {{RpcInvocationHandler#getConnectionId()}}, via {{RPC.getServerAddress()}} -> 
> {{RPC.getConnectionIdForProxy()}} -> 
> {{RpcInvocationHandler#getConnectionId()}}. {{HAUtil}} appears to be making 
> an incorrect assumption that {{RpcInvocationHandler}} will necessarily return 
> an _active_ connection ID. {{ObserverReadProxyProvider}} demonstrates a 
> counter-example to this, since the current connection ID may be pointing at, 
> for example, an Observer NameNode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15135) EC : ArrayIndexOutOfBoundsException in BlockRecoveryWorker#RecoveryTaskStriped.

2020-03-01 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15135:
--
Attachment: HDFS-15135-branch-3.2.001.patch

> EC : ArrayIndexOutOfBoundsException in 
> BlockRecoveryWorker#RecoveryTaskStriped.
> ---
>
> Key: HDFS-15135
> URL: https://issues.apache.org/jira/browse/HDFS-15135
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15135-branch-3.2.001.patch, HDFS-15135.001.patch, 
> HDFS-15135.002.patch, HDFS-15135.003.patch, HDFS-15135.004.patch, 
> HDFS-15135.005.patch
>
>
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 8
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskStriped.recover(BlockRecoveryWorker.java:464)
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1.run(BlockRecoveryWorker.java:602)
>at java.lang.Thread.run(Thread.java:745) {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15135) EC : ArrayIndexOutOfBoundsException in BlockRecoveryWorker#RecoveryTaskStriped.

2020-02-12 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15135:
--
Attachment: HDFS-15135.005.patch

> EC : ArrayIndexOutOfBoundsException in 
> BlockRecoveryWorker#RecoveryTaskStriped.
> ---
>
> Key: HDFS-15135
> URL: https://issues.apache.org/jira/browse/HDFS-15135
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15135.001.patch, HDFS-15135.002.patch, 
> HDFS-15135.003.patch, HDFS-15135.004.patch, HDFS-15135.005.patch
>
>
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 8
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskStriped.recover(BlockRecoveryWorker.java:464)
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1.run(BlockRecoveryWorker.java:602)
>at java.lang.Thread.run(Thread.java:745) {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15135) EC : ArrayIndexOutOfBoundsException in BlockRecoveryWorker#RecoveryTaskStriped.

2020-02-12 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15135:
--
Attachment: HDFS-15135.004.patch

> EC : ArrayIndexOutOfBoundsException in 
> BlockRecoveryWorker#RecoveryTaskStriped.
> ---
>
> Key: HDFS-15135
> URL: https://issues.apache.org/jira/browse/HDFS-15135
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15135.001.patch, HDFS-15135.002.patch, 
> HDFS-15135.003.patch, HDFS-15135.004.patch
>
>
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 8
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskStriped.recover(BlockRecoveryWorker.java:464)
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1.run(BlockRecoveryWorker.java:602)
>at java.lang.Thread.run(Thread.java:745) {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15135) EC : ArrayIndexOutOfBoundsException in BlockRecoveryWorker#RecoveryTaskStriped.

2020-02-11 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15135:
--
Attachment: HDFS-15135.003.patch

> EC : ArrayIndexOutOfBoundsException in 
> BlockRecoveryWorker#RecoveryTaskStriped.
> ---
>
> Key: HDFS-15135
> URL: https://issues.apache.org/jira/browse/HDFS-15135
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15135.001.patch, HDFS-15135.002.patch, 
> HDFS-15135.003.patch
>
>
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 8
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskStriped.recover(BlockRecoveryWorker.java:464)
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1.run(BlockRecoveryWorker.java:602)
>at java.lang.Thread.run(Thread.java:745) {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15135) EC : ArrayIndexOutOfBoundsException in BlockRecoveryWorker#RecoveryTaskStriped.

2020-02-05 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17030800#comment-17030800
 ] 

Ravuri Sushma sree commented on HDFS-15135:
---

[~surendrasingh], attached a patch adding a UT.

> EC : ArrayIndexOutOfBoundsException in 
> BlockRecoveryWorker#RecoveryTaskStriped.
> ---
>
> Key: HDFS-15135
> URL: https://issues.apache.org/jira/browse/HDFS-15135
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15135.001.patch, HDFS-15135.002.patch
>
>
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 8
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskStriped.recover(BlockRecoveryWorker.java:464)
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1.run(BlockRecoveryWorker.java:602)
>at java.lang.Thread.run(Thread.java:745) {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15135) EC : ArrayIndexOutOfBoundsException in BlockRecoveryWorker#RecoveryTaskStriped.

2020-02-05 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15135:
--
Attachment: HDFS-15135.002.patch

> EC : ArrayIndexOutOfBoundsException in 
> BlockRecoveryWorker#RecoveryTaskStriped.
> ---
>
> Key: HDFS-15135
> URL: https://issues.apache.org/jira/browse/HDFS-15135
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15135.001.patch, HDFS-15135.002.patch
>
>
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 8
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskStriped.recover(BlockRecoveryWorker.java:464)
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1.run(BlockRecoveryWorker.java:602)
>at java.lang.Thread.run(Thread.java:745) {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15135) EC : ArrayIndexOutOfBoundsException in BlockRecoveryWorker#RecoveryTaskStriped.

2020-01-31 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15135:
--
Status: Patch Available  (was: Open)

> EC : ArrayIndexOutOfBoundsException in 
> BlockRecoveryWorker#RecoveryTaskStriped.
> ---
>
> Key: HDFS-15135
> URL: https://issues.apache.org/jira/browse/HDFS-15135
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15135.001.patch
>
>
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 8
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskStriped.recover(BlockRecoveryWorker.java:464)
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1.run(BlockRecoveryWorker.java:602)
>at java.lang.Thread.run(Thread.java:745) {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15135) EC : ArrayIndexOutOfBoundsException in BlockRecoveryWorker#RecoveryTaskStriped.

2020-01-29 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026014#comment-17026014
 ] 

Ravuri Sushma sree commented on HDFS-15135:
---

Hi [~surendrasingh] 

Thanks for the report; I have attached a patch.
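
For reference, a minimal, self-contained sketch of one way to guard against 
this (the names are illustrative, not necessarily what the attached patch 
does):
{code:java}
// Keep only block indices that fit in the stripe; an out-of-range index
// previously flowed straight into array indexing and caused the
// ArrayIndexOutOfBoundsException seen in the stack trace.
static int[] keepValidIndices(int[] reported, int totalBlkNum) {
  return java.util.Arrays.stream(reported)
      .filter(i -> i >= 0 && i < totalBlkNum)
      .toArray();
}
{code}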

> EC : ArrayIndexOutOfBoundsException in 
> BlockRecoveryWorker#RecoveryTaskStriped.
> ---
>
> Key: HDFS-15135
> URL: https://issues.apache.org/jira/browse/HDFS-15135
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15135.001.patch
>
>
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 8
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskStriped.recover(BlockRecoveryWorker.java:464)
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1.run(BlockRecoveryWorker.java:602)
>at java.lang.Thread.run(Thread.java:745) {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15135) EC : ArrayIndexOutOfBoundsException in BlockRecoveryWorker#RecoveryTaskStriped.

2020-01-29 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15135:
--
Attachment: HDFS-15135.001.patch

> EC : ArrayIndexOutOfBoundsException in 
> BlockRecoveryWorker#RecoveryTaskStriped.
> ---
>
> Key: HDFS-15135
> URL: https://issues.apache.org/jira/browse/HDFS-15135
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15135.001.patch
>
>
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 8
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskStriped.recover(BlockRecoveryWorker.java:464)
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1.run(BlockRecoveryWorker.java:602)
>at java.lang.Thread.run(Thread.java:745) {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-15135) EC : ArrayIndexOutOfBoundsException in BlockRecoveryWorker#RecoveryTaskStriped.

2020-01-21 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree reassigned HDFS-15135:
-

Assignee: Ravuri Sushma sree

> EC : ArrayIndexOutOfBoundsException in 
> BlockRecoveryWorker#RecoveryTaskStriped.
> ---
>
> Key: HDFS-15135
> URL: https://issues.apache.org/jira/browse/HDFS-15135
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
>
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException: 8
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$RecoveryTaskStriped.recover(BlockRecoveryWorker.java:464)
>at 
> org.apache.hadoop.hdfs.server.datanode.BlockRecoveryWorker$1.run(BlockRecoveryWorker.java:602)
>at java.lang.Thread.run(Thread.java:745) {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14442) Disagreement between HAUtil.getAddressOfActive and RpcInvocationHandler.getConnectionId

2020-01-19 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018951#comment-17018951
 ] 

Ravuri Sushma sree commented on HDFS-14442:
---

Thank you [~ayushtkn] for the review.

The failed tests are not related to this JIRA.

> Disagreement between HAUtil.getAddressOfActive and 
> RpcInvocationHandler.getConnectionId
> ---
>
> Key: HDFS-14442
> URL: https://issues.apache.org/jira/browse/HDFS-14442
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-14442.001.patch, HDFS-14442.002.patch, 
> HDFS-14442.003.patch, HDFS-14442.004.patch
>
>
> While working on HDFS-14245, we noticed a discrepancy in some proxy-handling 
> code.
> The description of {{RpcInvocationHandler.getConnectionId()}} states:
> {code}
>   /**
>* Returns the connection id associated with the InvocationHandler instance.
>* @return ConnectionId
>*/
>   ConnectionId getConnectionId();
> {code}
> It does not make any claims about whether this connection ID will be an 
> active proxy or not. Yet in {{HAUtil}} we have:
> {code}
>   /**
>* Get the internet address of the currently-active NN. This should rarely 
> be
>* used, since callers of this method who connect directly to the NN using 
> the
>* resulting InetSocketAddress will not be able to connect to the active NN 
> if
>* a failover were to occur after this method has been called.
>* 
>* @param fs the file system to get the active address of.
>* @return the internet address of the currently-active NN.
>* @throws IOException if an error occurs while resolving the active NN.
>*/
>   public static InetSocketAddress getAddressOfActive(FileSystem fs)
>   throws IOException {
> if (!(fs instanceof DistributedFileSystem)) {
>   throw new IllegalArgumentException("FileSystem " + fs + " is not a 
> DFS.");
> }
> // force client address resolution.
> fs.exists(new Path("/"));
> DistributedFileSystem dfs = (DistributedFileSystem) fs;
> DFSClient dfsClient = dfs.getClient();
> return RPC.getServerAddress(dfsClient.getNamenode());
>   }
> {code}
> Where the call {{RPC.getServerAddress()}} eventually terminates into 
> {{RpcInvocationHandler#getConnectionId()}}, via {{RPC.getServerAddress()}} -> 
> {{RPC.getConnectionIdForProxy()}} -> 
> {{RpcInvocationHandler#getConnectionId()}}. {{HAUtil}} appears to be making 
> an incorrect assumption that {{RpcInvocationHandler}} will necessarily return 
> an _active_ connection ID. {{ObserverReadProxyProvider}} demonstrates a 
> counter-example to this, since the current connection ID may be pointing at, 
> for example, an Observer NameNode.
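> 
> A minimal sketch of how that address is actually derived (the standalone 
> helper shape is illustrative; the ipc types are existing Hadoop classes):
> {code:java}
> import java.lang.reflect.Proxy;
> import java.net.InetSocketAddress;
> import org.apache.hadoop.ipc.Client.ConnectionId;
> import org.apache.hadoop.ipc.RpcInvocationHandler;
> 
> // The handler's current ConnectionId carries no "active" guarantee; with
> // ObserverReadProxyProvider it may point at an Observer NameNode.
> static InetSocketAddress addressSeenByHAUtil(Object namenodeProxy) {
>   RpcInvocationHandler handler =
>       (RpcInvocationHandler) Proxy.getInvocationHandler(namenodeProxy);
>   ConnectionId connId = handler.getConnectionId();
>   return connId.getAddress();
> }
> {code}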



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15123) Remove unnecessary null check in FoldedTreeSet

2020-01-14 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15123:
--
Attachment: HDFS-15123.001.patch

> Remove unnecessary null check in FoldedTreeSet
> --
>
> Key: HDFS-15123
> URL: https://issues.apache.org/jira/browse/HDFS-15123
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Minor
> Fix For: 3.1.1
>
> Attachments: HDFS-15123.001.patch
>
>
> The *if (toMoveUp.left != null)* and *if (toMoveUp.right != null)* null checks 
> are not necessary, as those cases are already handled by the preceding if and 
> else-if branches.
> {code:java}
> private void deleteNode(final Node node) {
>   if (node.right == null) {
>     if (node.left != null) {
>       attachToParent(node, node.left);
>     } else {
>       attachNullToParent(node);
>     }
>   } else if (node.left == null) {
>     attachToParent(node, node.right);
>   } else {
>     // node.left != null && node.right != null
>     // node.next should replace node in tree
>     // node.next != null guaranteed since node.left != null
>     // node.next.left == null since node.next.prev is node
>     // node.next.right may be null or non-null
>     Node toMoveUp = node.next;
>     if (toMoveUp.right == null) {
>       attachNullToParent(toMoveUp);
>     } else {
>       attachToParent(toMoveUp, toMoveUp.right);
>     }
>     toMoveUp.left = node.left;
>     if (toMoveUp.left != null) {
>       toMoveUp.left.parent = toMoveUp;
>     }
>     toMoveUp.right = node.right;
>     if (toMoveUp.right != null) {
>       toMoveUp.right.parent = toMoveUp;
>     }
>     attachToParentNoBalance(node, toMoveUp);
>     toMoveUp.color = node.color;
>   }
> }
> {code}
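> 
> A minimal sketch of the final else branch with both redundant checks removed 
> (method names as in the snippet above; surrounding FoldedTreeSet state is 
> assumed):
> {code:java}
> Node toMoveUp = node.next;
> if (toMoveUp.right == null) {
>   attachNullToParent(toMoveUp);
> } else {
>   attachToParent(toMoveUp, toMoveUp.right);
> }
> // on this branch node.left != null and node.right != null are guaranteed,
> // so the parent pointers can be fixed up unconditionally
> toMoveUp.left = node.left;
> toMoveUp.left.parent = toMoveUp;
> toMoveUp.right = node.right;
> toMoveUp.right.parent = toMoveUp;
> attachToParentNoBalance(node, toMoveUp);
> toMoveUp.color = node.color;
> {code}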



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15123) Remove unnecessary null check in FoldedTreeSet

2020-01-14 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15123:
--
Description: 
The *if (toMoveUp.left != null)* and *if (toMoveUp.right != null)* null checks 
are not necessary, as those cases are already handled by the preceding if and 
else-if branches.
{code:java}
private void deleteNode(final Node node) {
  if (node.right == null) {
    if (node.left != null) {
      attachToParent(node, node.left);
    } else {
      attachNullToParent(node);
    }
  } else if (node.left == null) {
    attachToParent(node, node.right);
  } else {
    // node.left != null && node.right != null
    // node.next should replace node in tree
    // node.next != null guaranteed since node.left != null
    // node.next.left == null since node.next.prev is node
    // node.next.right may be null or non-null
    Node toMoveUp = node.next;
    if (toMoveUp.right == null) {
      attachNullToParent(toMoveUp);
    } else {
      attachToParent(toMoveUp, toMoveUp.right);
    }
    toMoveUp.left = node.left;
    if (toMoveUp.left != null) {
      toMoveUp.left.parent = toMoveUp;
    }
    toMoveUp.right = node.right;
    if (toMoveUp.right != null) {
      toMoveUp.right.parent = toMoveUp;
    }
    attachToParentNoBalance(node, toMoveUp);
    toMoveUp.color = node.color;
  }
}
{code}

  was:
private void deleteNode(final Node node) {
 +*if (node.right == null) {*+
 if (node.left != null) {
 attachToParent(node, node.left);
 } else {
 attachNullToParent(node);
 }
 +*} else if (node.left == null) {*+
 attachToParent(node, node.right);
 } *else {*
 // node.left != null && node.right != null
 // node.next should replace node in tree
 // node.next != null guaranteed since node.left != null
 // node.next.left == null since node.next.prev is node
 // node.next.right may be null or non-null
 Node toMoveUp = node.next;
 if (toMoveUp.right == null) {
 attachNullToParent(toMoveUp);
 } else {
 attachToParent(toMoveUp, toMoveUp.right);
 }
 toMoveUp.left = node.left;
 *if (toMoveUp.left != null) {*
 toMoveUp.left.parent = toMoveUp;
}
 toMoveUp.right = node.right;
*if (toMoveUp.right != null) {*
 toMoveUp.right.parent = toMoveUp;
 }
 attachToParentNoBalance(node, toMoveUp);
 toMoveUp.color = node.color;
 }

The *if (toMoveUp.left != null)* and *if (toMoveUp.right != null)* null checks 
are not necessary, as those cases are already handled by the preceding if and 
else-if branches.


> Remove unnecessary null check in FoldedTreeSet
> --
>
> Key: HDFS-15123
> URL: https://issues.apache.org/jira/browse/HDFS-15123
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Minor
> Fix For: 3.1.1
>
>
> The *if (toMoveUp.left != null)* and *if (toMoveUp.right != null)* null checks 
> are not necessary, as those cases are already handled by the preceding if and 
> else-if branches.
> {code:java}
> private void deleteNode(final Node node) {
>   if (node.right == null) {
>     if (node.left != null) {
>       attachToParent(node, node.left);
>     } else {
>       attachNullToParent(node);
>     }
>   } else if (node.left == null) {
>     attachToParent(node, node.right);
>   } else {
>     // node.left != null && node.right != null
>     // node.next should replace node in tree
>     // node.next != null guaranteed since node.left != null
>     // node.next.left == null since node.next.prev is node
>     // node.next.right may be null or non-null
>     Node toMoveUp = node.next;
>     if (toMoveUp.right == null) {
>       attachNullToParent(toMoveUp);
>     } else {
>       attachToParent(toMoveUp, toMoveUp.right);
>     }
>     toMoveUp.left = node.left;
>     if (toMoveUp.left != null) {
>       toMoveUp.left.parent = toMoveUp;
>     }
>     toMoveUp.right = node.right;
>     if (toMoveUp.right != null) {
>       toMoveUp.right.parent = toMoveUp;
>     }
>     attachToParentNoBalance(node, toMoveUp);
>     toMoveUp.color = node.color;
>   }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15123) Remove unnecessary null check in FoldedTreeSet

2020-01-14 Thread Ravuri Sushma sree (Jira)
Ravuri Sushma sree created HDFS-15123:
-

 Summary: Remove unnecessary null check in FoldedTreeSet
 Key: HDFS-15123
 URL: https://issues.apache.org/jira/browse/HDFS-15123
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravuri Sushma sree
Assignee: Ravuri Sushma sree
 Fix For: 3.1.1


private void deleteNode(final Node node) {
 +*if (node.right == null) {*+
 if (node.left != null) {
 attachToParent(node, node.left);
 } else {
 attachNullToParent(node);
 }
 +*} else if (node.left == null) {*+
 attachToParent(node, node.right);
 } *else {*
 // node.left != null && node.right != null
 // node.next should replace node in tree
 // node.next != null guaranteed since node.left != null
 // node.next.left == null since node.next.prev is node
 // node.next.right may be null or non-null
 Node toMoveUp = node.next;
 if (toMoveUp.right == null) {
 attachNullToParent(toMoveUp);
 } else {
 attachToParent(toMoveUp, toMoveUp.right);
 }
 toMoveUp.left = node.left;
 *if (toMoveUp.left != null) {*
 toMoveUp.left.parent = toMoveUp;
}
 toMoveUp.right = node.right;
*if (toMoveUp.right != null) {*
 toMoveUp.right.parent = toMoveUp;
 }
 attachToParentNoBalance(node, toMoveUp);
 toMoveUp.color = node.color;
 }

The *if (toMoveUp.left != null)* and *if (toMoveUp.right != null)* null checks 
are not necessary, as those cases are already handled by the preceding if and 
else-if branches.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14442) Disagreement between HAUtil.getAddressOfActive and RpcInvocationHandler.getConnectionId

2019-12-26 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14442:
--
Attachment: HDFS-14442.004.patch

> Disagreement between HAUtil.getAddressOfActive and 
> RpcInvocationHandler.getConnectionId
> ---
>
> Key: HDFS-14442
> URL: https://issues.apache.org/jira/browse/HDFS-14442
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-14442.001.patch, HDFS-14442.002.patch, 
> HDFS-14442.003.patch, HDFS-14442.004.patch
>
>
> While working on HDFS-14245, we noticed a discrepancy in some proxy-handling 
> code.
> The description of {{RpcInvocationHandler.getConnectionId()}} states:
> {code}
>   /**
>* Returns the connection id associated with the InvocationHandler instance.
>* @return ConnectionId
>*/
>   ConnectionId getConnectionId();
> {code}
> It does not make any claims about whether this connection ID will be an 
> active proxy or not. Yet in {{HAUtil}} we have:
> {code}
>   /**
>* Get the internet address of the currently-active NN. This should rarely 
> be
>* used, since callers of this method who connect directly to the NN using 
> the
>* resulting InetSocketAddress will not be able to connect to the active NN 
> if
>* a failover were to occur after this method has been called.
>* 
>* @param fs the file system to get the active address of.
>* @return the internet address of the currently-active NN.
>* @throws IOException if an error occurs while resolving the active NN.
>*/
>   public static InetSocketAddress getAddressOfActive(FileSystem fs)
>   throws IOException {
> if (!(fs instanceof DistributedFileSystem)) {
>   throw new IllegalArgumentException("FileSystem " + fs + " is not a 
> DFS.");
> }
> // force client address resolution.
> fs.exists(new Path("/"));
> DistributedFileSystem dfs = (DistributedFileSystem) fs;
> DFSClient dfsClient = dfs.getClient();
> return RPC.getServerAddress(dfsClient.getNamenode());
>   }
> {code}
> Where the call {{RPC.getServerAddress()}} eventually terminates into 
> {{RpcInvocationHandler#getConnectionId()}}, via {{RPC.getServerAddress()}} -> 
> {{RPC.getConnectionIdForProxy()}} -> 
> {{RpcInvocationHandler#getConnectionId()}}. {{HAUtil}} appears to be making 
> an incorrect assumption that {{RpcInvocationHandler}} will necessarily return 
> an _active_ connection ID. {{ObserverReadProxyProvider}} demonstrates a 
> counter-example to this, since the current connection ID may be pointing at, 
> for example, an Observer NameNode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14442) Disagreement between HAUtil.getAddressOfActive and RpcInvocationHandler.getConnectionId

2019-12-26 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17003692#comment-17003692
 ] 

Ravuri Sushma sree commented on HDFS-14442:
---

Hi [~xkrogen], thank you for your inputs on simplifying the test; I have 
uploaded a patch implementing them. Please review.

> Disagreement between HAUtil.getAddressOfActive and 
> RpcInvocationHandler.getConnectionId
> ---
>
> Key: HDFS-14442
> URL: https://issues.apache.org/jira/browse/HDFS-14442
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-14442.001.patch, HDFS-14442.002.patch, 
> HDFS-14442.003.patch
>
>
> While working on HDFS-14245, we noticed a discrepancy in some proxy-handling 
> code.
> The description of {{RpcInvocationHandler.getConnectionId()}} states:
> {code}
>   /**
>* Returns the connection id associated with the InvocationHandler instance.
>* @return ConnectionId
>*/
>   ConnectionId getConnectionId();
> {code}
> It does not make any claims about whether this connection ID will be an 
> active proxy or not. Yet in {{HAUtil}} we have:
> {code}
>   /**
>* Get the internet address of the currently-active NN. This should rarely 
> be
>* used, since callers of this method who connect directly to the NN using 
> the
>* resulting InetSocketAddress will not be able to connect to the active NN 
> if
>* a failover were to occur after this method has been called.
>* 
>* @param fs the file system to get the active address of.
>* @return the internet address of the currently-active NN.
>* @throws IOException if an error occurs while resolving the active NN.
>*/
>   public static InetSocketAddress getAddressOfActive(FileSystem fs)
>   throws IOException {
> if (!(fs instanceof DistributedFileSystem)) {
>   throw new IllegalArgumentException("FileSystem " + fs + " is not a 
> DFS.");
> }
> // force client address resolution.
> fs.exists(new Path("/"));
> DistributedFileSystem dfs = (DistributedFileSystem) fs;
> DFSClient dfsClient = dfs.getClient();
> return RPC.getServerAddress(dfsClient.getNamenode());
>   }
> {code}
> Where the call {{RPC.getServerAddress()}} eventually terminates into 
> {{RpcInvocationHandler#getConnectionId()}}, via {{RPC.getServerAddress()}} -> 
> {{RPC.getConnectionIdForProxy()}} -> 
> {{RpcInvocationHandler#getConnectionId()}}. {{HAUtil}} appears to be making 
> an incorrect assumption that {{RpcInvocationHandler}} will necessarily return 
> an _active_ connection ID. {{ObserverReadProxyProvider}} demonstrates a 
> counter-example to this, since the current connection ID may be pointing at, 
> for example, an Observer NameNode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15045) DataStreamer#createBlockOutputStream() should log exception in warn.

2019-12-10 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-15045:
--
Attachment: HDFS-15045.001.patch

> DataStreamer#createBlockOutputStream() should log exception in warn.
> 
>
> Key: HDFS-15045
> URL: https://issues.apache.org/jira/browse/HDFS-15045
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsclient
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-15045.001.patch
>
>
> {code:java}
> } catch (IOException ie) {
>   if (!errorState.isRestartingNode()) {
>     LOG.info("Exception in createBlockOutputStream " + this, ie);
>   }
> }
> {code}
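> 
> A sketch of the proposed change (same fragment as above, with the level 
> raised to warn so the exception is surfaced):
> {code:java}
> } catch (IOException ie) {
>   if (!errorState.isRestartingNode()) {
>     LOG.warn("Exception in createBlockOutputStream " + this, ie);
>   }
> }
> {code}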



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-15045) DataStreamer#createBlockOutputStream() should log exception in warn.

2019-12-10 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree reassigned HDFS-15045:
-

Assignee: Ravuri Sushma sree

> DataStreamer#createBlockOutputStream() should log exception in warn.
> 
>
> Key: HDFS-15045
> URL: https://issues.apache.org/jira/browse/HDFS-15045
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: dfsclient
>Affects Versions: 3.1.1
>Reporter: Surendra Singh Lilhore
>Assignee: Ravuri Sushma sree
>Priority: Major
>
> {code:java}
> } catch (IOException ie) {
>   if (!errorState.isRestartingNode()) {
>     LOG.info("Exception in createBlockOutputStream " + this, ie);
>   }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14528) Failover from Active to Standby Failed

2019-11-21 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14528:
--
Attachment: HDFS-14528.007.patch

> Failover from Active to Standby Failed  
> 
>
> Key: HDFS-14528
> URL: https://issues.apache.org/jira/browse/HDFS-14528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
>  Labels: multi-sbnn
> Attachments: HDFS-14528.003.patch, HDFS-14528.004.patch, 
> HDFS-14528.005.patch, HDFS-14528.006.patch, HDFS-14528.007.patch, 
> HDFS-14528.2.Patch, ZKFC_issue.patch
>
>
>  *In a cluster with more than one Standby NameNode, manual failover throws an 
> exception in some cases.*
> *When trying to execute the failover command from active to standby,* 
> *_./hdfs haadmin -failover nn1 nn2_, the below Exception is thrown:*
>   Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
> connection exception: java.net.ConnectException: Connection refused
> This is encountered in the following cases :
>  Scenario 1 : 
> Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> When trying to manually failover from NN1 to NN2 if NN3 is down, Exception is 
> thrown
> Scenario 2 :
>  Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> ZKFC's -              ZKFC1,            ZKFC2,            ZKFC3
> When trying to manually failover from NN1 to NN3 if NN3's ZKFC (ZKFC3) is 
> down, Exception is thrown
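> 
> A rough sketch of the tolerant probing a fix would need (otherStandbys, 
> conf, timeout and LOG are assumed surrounding state, not actual 
> ZKFC/HAAdmin code):
> {code:java}
> for (HAServiceTarget target : otherStandbys) {
>   try {
>     // HAServiceTarget#getProxy and HAServiceProtocol#getServiceStatus
>     // are existing APIs; probe the peer before involving it in failover
>     target.getProxy(conf, timeout).getServiceStatus();
>   } catch (IOException e) {
>     // an unreachable standby (or its ZKFC) should be logged and skipped,
>     // not fail the whole failover
>     LOG.warn("Skipping unreachable standby " + target, e);
>   }
> }
> {code}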



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-14972) HDFS: fsck "-blockId" option not giving expected output

2019-11-14 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree reassigned HDFS-14972:
-

Assignee: Ravuri Sushma sree

> HDFS: fsck "-blockId" option not giving expected output
> ---
>
> Key: HDFS-14972
> URL: https://issues.apache.org/jira/browse/HDFS-14972
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 3.1.2
> Environment: HA Cluster
>Reporter: Souryakanta Dwivedy
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: image-2019-11-08-19-10-18-057.png, 
> image-2019-11-08-19-12-21-307.png, image-2019-11-14-14-50-15-032.png
>
>
> HDFS: fsck "-blockId" option not giving expected output
> HDFS fsck displays the correct output for corrupted files and blocks: 
> !image-2019-11-08-19-10-18-057.png!
>  
>  
> The hdfs fsck -blockId command does not give the expected output for a corrupted replica:
>  
> !image-2019-11-08-19-12-21-307.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-14987) EC: EC file blockId location info displaying as "null" with hdfs fsck -blockId command

2019-11-13 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree reassigned HDFS-14987:
-

Assignee: Ravuri Sushma sree

> EC: EC file blockId location info displaying as "null" with hdfs fsck 
> -blockId command
> --
>
> Key: HDFS-14987
> URL: https://issues.apache.org/jira/browse/HDFS-14987
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ec, tools
>Affects Versions: 3.1.2
>Reporter: Souryakanta Dwivedy
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: EC_file_block_info.PNG, 
> image-2019-11-13-18-34-00-067.png, image-2019-11-13-18-36-29-063.png, 
> image-2019-11-13-18-38-18-899.png
>
>
> EC file blockId location info displaying as "null" with hdfs fsck -blockId 
> command
>  * Check the blockId information of an EC enabled file with "hdfs fsck 
> -blockId": the blockId location related info will display as null, which 
> needs to be rectified.
>              Check the attachment "EC_file_block_info"
> ===
> !image-2019-11-13-18-34-00-067.png!
>  
>  *  Check the output of a normal file block to compare
>                   !image-2019-11-13-18-36-29-063.png!       
> ===
>              !image-2019-11-13-18-38-18-899.png!
>  * Actual Output :-     null   
>  * Expected output :- It should display the blockId location related info as 
> (nodes, racks) of the block  as specified in the usage info of fsck -blockId 
> option.                                    [like : Block replica on 
> datanode/rack: BLR1xx038/default-rack is HEALTHY]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14985) FSCK for a block of EC Files doesn't display status at the end

2019-11-13 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14985:
--
Description: 
*Environment:* a cluster of 2 NameNodes and 5 DataNodes with an EC policy enabled

fsck -blockId of a block associated with an EC File does not print status at 
the end and displays null instead

{color:#de350b}*Result :*{color}

./hdfs fsck -blockId blk_-x
 Connecting to namenode via 
 FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST 
2019

Block Id: blk_-x
 Block belongs to: /ecdir/f2
 No. of Expected Replica: 3
 No. of live Replica: 3
 No. of excess Replica: 0
 No. of stale Replica: 2
 No. of decommissioned Replica: 0
 No. of decommissioning Replica: 0
 No. of corrupted Replica: 0
 null

{color:#de350b}*Expected :*{color} 

./hdfs fsck -blockId blk_-x
 Connecting to namenode via 
 FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST 
2019

Block Id: blk_-x
 Block belongs to: /ecdir/f2
 No. of Expected Replica: 3
 No. of live Replica: 3
 No. of excess Replica: 0
 No. of stale Replica: 2
 No. of decommissioned Replica: 0
 No. of decommissioning Replica: 0
 No. of corrupted Replica: 0

Block replica on datanode/rack: vm10/default-rack is HEALTHY

 

  was:
FSCK of a blockId which belongs to an EC file does not print status at the end 
and displays null instead.
 
./hdfs fsck -blockId blk_-x
Connecting to namenode via 
FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST 2019

Block Id: blk_-x
Block belongs to: /ecdir/f2
No. of Expected Replica: 3
No. of live Replica: 3
No. of excess Replica: 0
No. of stale Replica: 2
No. of decommissioned Replica: 0
No. of decommissioning Replica: 0
No. of corrupted Replica: 0
null


> FSCK for a block of EC Files doesn't display status at the end
> -
>
> Key: HDFS-14985
> URL: https://issues.apache.org/jira/browse/HDFS-14985
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
>
> *Environment:* a cluster of 2 NameNodes and 5 DataNodes with an EC policy enabled
> fsck -blockId of a block associated with an EC File does not print status at 
> the end and displays null instead
> {color:#de350b}*Result :*{color}
> ./hdfs fsck -blockId blk_-x
>  Connecting to namenode via 
>  FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST 
> 2019
> Block Id: blk_-x
>  Block belongs to: /ecdir/f2
>  No. of Expected Replica: 3
>  No. of live Replica: 3
>  No. of excess Replica: 0
>  No. of stale Replica: 2
>  No. of decommissioned Replica: 0
>  No. of decommissioning Replica: 0
>  No. of corrupted Replica: 0
>  null
> {color:#de350b}*Expected :*{color} 
> ./hdfs fsck -blockId blk_-x
>  Connecting to namenode via 
>  FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST 
> 2019
> Block Id: blk_-x
>  Block belongs to: /ecdir/f2
>  No. of Expected Replica: 3
>  No. of live Replica: 3
>  No. of excess Replica: 0
>  No. of stale Replica: 2
>  No. of decommissioned Replica: 0
>  No. of decommissioning Replica: 0
>  No. of corrupted Replica: 0
> Block replica on datanode/rack: vm10/default-rack is HEALTHY
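> 
> A reproduction sketch (assumes a MiniDFSCluster with an EC policy enabled 
> and a known blockId of the EC file; DFSck and ToolRunner are existing 
> Hadoop classes, the rest is illustrative):
> {code:java}
> ByteArrayOutputStream bout = new ByteArrayOutputStream();
> PrintStream ps = new PrintStream(bout);
> // equivalent of: hdfs fsck -blockId <blockId>
> int rc = ToolRunner.run(new DFSck(conf, ps), new String[] {"-blockId", blockId});
> // for a replicated file the report ends with a replica-status line;
> // for a block of an EC file the last line is currently "null"
> System.out.println("rc=" + rc + "\n" + bout);
> {code}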
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14985) FSCK for a block of EC Files doesn't display status at the end

2019-11-13 Thread Ravuri Sushma sree (Jira)
Ravuri Sushma sree created HDFS-14985:
-

 Summary: FSCK for a block of EC Files doesn't display status at the 
end
 Key: HDFS-14985
 URL: https://issues.apache.org/jira/browse/HDFS-14985
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravuri Sushma sree


FSCK of a blockId which belongs to an EC file does not print status at the end 
and displays null instead.
 
./hdfs fsck -blockId blk_-x
Connecting to namenode via 
FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST 2019

Block Id: blk_-x
Block belongs to: /ecdir/f2
No. of Expected Replica: 3
No. of live Replica: 3
No. of excess Replica: 0
No. of stale Replica: 2
No. of decommissioned Replica: 0
No. of decommissioning Replica: 0
No. of corrupted Replica: 0
null



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-14985) FSCK for a block of EC Files doesn't display status at the end

2019-11-13 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree reassigned HDFS-14985:
-

Assignee: Ravuri Sushma sree

> FSCK for a block of EC Files doesn't display status at the end
> -
>
> Key: HDFS-14985
> URL: https://issues.apache.org/jira/browse/HDFS-14985
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
>
> FSCK of a blockId which belongs to an EC file does not print status at the end 
> and displays null instead.
>  
> ./hdfs fsck -blockId blk_-x
> Connecting to namenode via 
> FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST 
> 2019
> Block Id: blk_-x
> Block belongs to: /ecdir/f2
> No. of Expected Replica: 3
> No. of live Replica: 3
> No. of excess Replica: 0
> No. of stale Replica: 2
> No. of decommissioned Replica: 0
> No. of decommissioning Replica: 0
> No. of corrupted Replica: 0
> null



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14442) Disagreement between HAUtil.getAddressOfActive and RpcInvocationHandler.getConnectionId

2019-11-13 Thread Ravuri Sushma sree (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16973133#comment-16973133
 ] 

Ravuri Sushma sree commented on HDFS-14442:
---

Hi [~xkrogen], I have uploaded the patch following your suggestions. Can you 
please review?

 

> Disagreement between HAUtil.getAddressOfActive and 
> RpcInvocationHandler.getConnectionId
> ---
>
> Key: HDFS-14442
> URL: https://issues.apache.org/jira/browse/HDFS-14442
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-14442.001.patch, HDFS-14442.002.patch, 
> HDFS-14442.003.patch
>
>
> While working on HDFS-14245, we noticed a discrepancy in some proxy-handling 
> code.
> The description of {{RpcInvocationHandler.getConnectionId()}} states:
> {code}
>   /**
>* Returns the connection id associated with the InvocationHandler instance.
>* @return ConnectionId
>*/
>   ConnectionId getConnectionId();
> {code}
> It does not make any claims about whether this connection ID will be an 
> active proxy or not. Yet in {{HAUtil}} we have:
> {code}
>   /**
>* Get the internet address of the currently-active NN. This should rarely 
> be
>* used, since callers of this method who connect directly to the NN using 
> the
>* resulting InetSocketAddress will not be able to connect to the active NN 
> if
>* a failover were to occur after this method has been called.
>* 
>* @param fs the file system to get the active address of.
>* @return the internet address of the currently-active NN.
>* @throws IOException if an error occurs while resolving the active NN.
>*/
>   public static InetSocketAddress getAddressOfActive(FileSystem fs)
>   throws IOException {
> if (!(fs instanceof DistributedFileSystem)) {
>   throw new IllegalArgumentException("FileSystem " + fs + " is not a 
> DFS.");
> }
> // force client address resolution.
> fs.exists(new Path("/"));
> DistributedFileSystem dfs = (DistributedFileSystem) fs;
> DFSClient dfsClient = dfs.getClient();
> return RPC.getServerAddress(dfsClient.getNamenode());
>   }
> {code}
> Where the call {{RPC.getServerAddress()}} eventually terminates into 
> {{RpcInvocationHandler#getConnectionId()}}, via {{RPC.getServerAddress()}} -> 
> {{RPC.getConnectionIdForProxy()}} -> 
> {{RpcInvocationHandler#getConnectionId()}}. {{HAUtil}} appears to be making 
> an incorrect assumption that {{RpcInvocationHandler}} will necessarily return 
> an _active_ connection ID. {{ObserverReadProxyProvider}} demonstrates a 
> counter-example to this, since the current connection ID may be pointing at, 
> for example, an Observer NameNode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14442) Disagreement between HAUtil.getAddressOfActive and RpcInvocationHandler.getConnectionId

2019-11-12 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14442:
--
Attachment: HDFS-14442.003.patch

> Disagreement between HAUtil.getAddressOfActive and 
> RpcInvocationHandler.getConnectionId
> ---
>
> Key: HDFS-14442
> URL: https://issues.apache.org/jira/browse/HDFS-14442
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-14442.001.patch, HDFS-14442.002.patch, 
> HDFS-14442.003.patch
>
>
> While working on HDFS-14245, we noticed a discrepancy in some proxy-handling 
> code.
> The description of {{RpcInvocationHandler.getConnectionId()}} states:
> {code}
>   /**
>* Returns the connection id associated with the InvocationHandler instance.
>* @return ConnectionId
>*/
>   ConnectionId getConnectionId();
> {code}
> It does not make any claims about whether this connection ID will be an 
> active proxy or not. Yet in {{HAUtil}} we have:
> {code}
>   /**
>* Get the internet address of the currently-active NN. This should rarely 
> be
>* used, since callers of this method who connect directly to the NN using 
> the
>* resulting InetSocketAddress will not be able to connect to the active NN 
> if
>* a failover were to occur after this method has been called.
>* 
>* @param fs the file system to get the active address of.
>* @return the internet address of the currently-active NN.
>* @throws IOException if an error occurs while resolving the active NN.
>*/
>   public static InetSocketAddress getAddressOfActive(FileSystem fs)
>   throws IOException {
> if (!(fs instanceof DistributedFileSystem)) {
>   throw new IllegalArgumentException("FileSystem " + fs + " is not a 
> DFS.");
> }
> // force client address resolution.
> fs.exists(new Path("/"));
> DistributedFileSystem dfs = (DistributedFileSystem) fs;
> DFSClient dfsClient = dfs.getClient();
> return RPC.getServerAddress(dfsClient.getNamenode());
>   }
> {code}
> Where the call {{RPC.getServerAddress()}} eventually terminates into 
> {{RpcInvocationHandler#getConnectionId()}}, via {{RPC.getServerAddress()}} -> 
> {{RPC.getConnectionIdForProxy()}} -> 
> {{RpcInvocationHandler#getConnectionId()}}. {{HAUtil}} appears to be making 
> an incorrect assumption that {{RpcInvocationHandler}} will necessarily return 
> an _active_ connection ID. {{ObserverReadProxyProvider}} demonstrates a 
> counter-example to this, since the current connection ID may be pointing at, 
> for example, an Observer NameNode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14442) Disagreement between HAUtil.getAddressOfActive and RpcInvocationHandler.getConnectionId

2019-11-12 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14442:
--
Attachment: (was: HDFS-14442.003.PATCH)

> Disagreement between HAUtil.getAddressOfActive and 
> RpcInvocationHandler.getConnectionId
> ---
>
> Key: HDFS-14442
> URL: https://issues.apache.org/jira/browse/HDFS-14442
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-14442.001.patch, HDFS-14442.002.patch
>
>
> While working on HDFS-14245, we noticed a discrepancy in some proxy-handling 
> code.
> The description of {{RpcInvocationHandler.getConnectionId()}} states:
> {code}
>   /**
>* Returns the connection id associated with the InvocationHandler instance.
>* @return ConnectionId
>*/
>   ConnectionId getConnectionId();
> {code}
> It does not make any claims about whether this connection ID will be an 
> active proxy or not. Yet in {{HAUtil}} we have:
> {code}
>   /**
>* Get the internet address of the currently-active NN. This should rarely 
> be
>* used, since callers of this method who connect directly to the NN using 
> the
>* resulting InetSocketAddress will not be able to connect to the active NN 
> if
>* a failover were to occur after this method has been called.
>* 
>* @param fs the file system to get the active address of.
>* @return the internet address of the currently-active NN.
>* @throws IOException if an error occurs while resolving the active NN.
>*/
>   public static InetSocketAddress getAddressOfActive(FileSystem fs)
>   throws IOException {
> if (!(fs instanceof DistributedFileSystem)) {
>   throw new IllegalArgumentException("FileSystem " + fs + " is not a 
> DFS.");
> }
> // force client address resolution.
> fs.exists(new Path("/"));
> DistributedFileSystem dfs = (DistributedFileSystem) fs;
> DFSClient dfsClient = dfs.getClient();
> return RPC.getServerAddress(dfsClient.getNamenode());
>   }
> {code}
> Where the call {{RPC.getServerAddress()}} eventually terminates into 
> {{RpcInvocationHandler#getConnectionId()}}, via {{RPC.getServerAddress()}} -> 
> {{RPC.getConnectionIdForProxy()}} -> 
> {{RpcInvocationHandler#getConnectionId()}}. {{HAUtil}} appears to be making 
> an incorrect assumption that {{RpcInvocationHandler}} will necessarily return 
> an _active_ connection ID. {{ObserverReadProxyProvider}} demonstrates a 
> counter-example to this, since the current connection ID may be pointing at, 
> for example, an Observer NameNode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14442) Disagreement between HAUtil.getAddressOfActive and RpcInvocationHandler.getConnectionId

2019-11-12 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14442:
--
Attachment: HDFS-14442.003.PATCH

> Disagreement between HAUtil.getAddressOfActive and 
> RpcInvocationHandler.getConnectionId
> ---
>
> Key: HDFS-14442
> URL: https://issues.apache.org/jira/browse/HDFS-14442
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-14442.001.patch, HDFS-14442.002.patch, 
> HDFS-14442.003.PATCH
>
>
> While working on HDFS-14245, we noticed a discrepancy in some proxy-handling 
> code.
> The description of {{RpcInvocationHandler.getConnectionId()}} states:
> {code}
>   /**
>* Returns the connection id associated with the InvocationHandler instance.
>* @return ConnectionId
>*/
>   ConnectionId getConnectionId();
> {code}
> It does not make any claims about whether this connection ID will be an 
> active proxy or not. Yet in {{HAUtil}} we have:
> {code}
>   /**
>* Get the internet address of the currently-active NN. This should rarely 
> be
>* used, since callers of this method who connect directly to the NN using 
> the
>* resulting InetSocketAddress will not be able to connect to the active NN 
> if
>* a failover were to occur after this method has been called.
>* 
>* @param fs the file system to get the active address of.
>* @return the internet address of the currently-active NN.
>* @throws IOException if an error occurs while resolving the active NN.
>*/
>   public static InetSocketAddress getAddressOfActive(FileSystem fs)
>   throws IOException {
> if (!(fs instanceof DistributedFileSystem)) {
>   throw new IllegalArgumentException("FileSystem " + fs + " is not a 
> DFS.");
> }
> // force client address resolution.
> fs.exists(new Path("/"));
> DistributedFileSystem dfs = (DistributedFileSystem) fs;
> DFSClient dfsClient = dfs.getClient();
> return RPC.getServerAddress(dfsClient.getNamenode());
>   }
> {code}
> Where the call {{RPC.getServerAddress()}} eventually terminates into 
> {{RpcInvocationHandler#getConnectionId()}}, via {{RPC.getServerAddress()}} -> 
> {{RPC.getConnectionIdForProxy()}} -> 
> {{RpcInvocationHandler#getConnectionId()}}. {{HAUtil}} appears to be making 
> an incorrect assumption that {{RpcInvocationHandler}} will necessarily return 
> an _active_ connection ID. {{ObserverReadProxyProvider}} demonstrates a 
> counter-example to this, since the current connection ID may be pointing at, 
> for example, an Observer NameNode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14442) Disagreement between HAUtil.getAddressOfActive and RpcInvocationHandler.getConnectionId

2019-11-12 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14442:
--
Attachment: (was: HDFS-14442.003.patch)

> Disagreement between HAUtil.getAddressOfActive and 
> RpcInvocationHandler.getConnectionId
> ---
>
> Key: HDFS-14442
> URL: https://issues.apache.org/jira/browse/HDFS-14442
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-14442.001.patch, HDFS-14442.002.patch, 
> HDFS-14442.003.PATCH
>
>
> While working on HDFS-14245, we noticed a discrepancy in some proxy-handling 
> code.
> The description of {{RpcInvocationHandler.getConnectionId()}} states:
> {code}
>   /**
>* Returns the connection id associated with the InvocationHandler instance.
>* @return ConnectionId
>*/
>   ConnectionId getConnectionId();
> {code}
> It does not make any claims about whether this connection ID will be an 
> active proxy or not. Yet in {{HAUtil}} we have:
> {code}
>   /**
>* Get the internet address of the currently-active NN. This should rarely 
> be
>* used, since callers of this method who connect directly to the NN using 
> the
>* resulting InetSocketAddress will not be able to connect to the active NN 
> if
>* a failover were to occur after this method has been called.
>* 
>* @param fs the file system to get the active address of.
>* @return the internet address of the currently-active NN.
>* @throws IOException if an error occurs while resolving the active NN.
>*/
>   public static InetSocketAddress getAddressOfActive(FileSystem fs)
>   throws IOException {
> if (!(fs instanceof DistributedFileSystem)) {
>   throw new IllegalArgumentException("FileSystem " + fs + " is not a 
> DFS.");
> }
> // force client address resolution.
> fs.exists(new Path("/"));
> DistributedFileSystem dfs = (DistributedFileSystem) fs;
> DFSClient dfsClient = dfs.getClient();
> return RPC.getServerAddress(dfsClient.getNamenode());
>   }
> {code}
> Where the call {{RPC.getServerAddress()}} eventually terminates into 
> {{RpcInvocationHandler#getConnectionId()}}, via {{RPC.getServerAddress()}} -> 
> {{RPC.getConnectionIdForProxy()}} -> 
> {{RpcInvocationHandler#getConnectionId()}}. {{HAUtil}} appears to be making 
> an incorrect assumption that {{RpcInvocationHandler}} will necessarily return 
> an _active_ connection ID. {{ObserverReadProxyProvider}} demonstrates a 
> counter-example to this, since the current connection ID may be pointing at, 
> for example, an Observer NameNode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14528) Failover from Active to Standby Failed

2019-11-12 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14528:
--
Attachment: HDFS-14528.006.patch

> Failover from Active to Standby Failed  
> 
>
> Key: HDFS-14528
> URL: https://issues.apache.org/jira/browse/HDFS-14528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
>  Labels: multi-sbnn
> Attachments: HDFS-14528.003.patch, HDFS-14528.004.patch, 
> HDFS-14528.005.patch, HDFS-14528.006.patch, HDFS-14528.2.Patch, 
> ZKFC_issue.patch
>
>
>  *In a cluster with more than one Standby NameNode, manual failover throws an 
> exception in some cases.*
> *When trying to execute the failover command from active to standby,* 
> *_./hdfs haadmin -failover nn1 nn2_, the below Exception is thrown:*
>   Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
> connection exception: java.net.ConnectException: Connection refused
> This is encountered in the following cases :
>  Scenario 1 : 
> Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> When trying to manually failover from NN1 to NN2 if NN3 is down, Exception is 
> thrown
> Scenario 2 :
>  Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> ZKFC's -              ZKFC1,            ZKFC2,            ZKFC3
> When trying to manually failover from NN1 to NN3 if NN3's ZKFC (ZKFC3) is 
> down, Exception is thrown



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14528) Failover from Active to Standby Failed

2019-11-12 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14528:
--
Attachment: (was: HDFS-14528.006.patch)

> Failover from Active to Standby Failed  
> 
>
> Key: HDFS-14528
> URL: https://issues.apache.org/jira/browse/HDFS-14528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
>  Labels: multi-sbnn
> Attachments: HDFS-14528.003.patch, HDFS-14528.004.patch, 
> HDFS-14528.005.patch, HDFS-14528.2.Patch, ZKFC_issue.patch
>
>
>  *In a cluster with more than one Standby NameNode, manual failover throws an 
> exception in some cases.*
> *When trying to execute the failover command from active to standby,* 
> *_./hdfs haadmin -failover nn1 nn2_, the below Exception is thrown:*
>   Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
> connection exception: java.net.ConnectException: Connection refused
> This is encountered in the following cases :
>  Scenario 1 : 
> Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> When trying to manually failover from NN1 to NN2 if NN3 is down, Exception is 
> thrown
> Scenario 2 :
>  Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> ZKFC's -              ZKFC1,            ZKFC2,            ZKFC3
> When trying to manually failover from NN1 to NN3 if NN3's ZKFC (ZKFC3) is 
> down, Exception is thrown



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14528) Failover from Active to Standby Failed

2019-11-12 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14528:
--
Attachment: HDFS-14528.006.patch

> Failover from Active to Standby Failed  
> 
>
> Key: HDFS-14528
> URL: https://issues.apache.org/jira/browse/HDFS-14528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
>  Labels: multi-sbnn
> Attachments: HDFS-14528.003.patch, HDFS-14528.004.patch, 
> HDFS-14528.005.patch, HDFS-14528.006.patch, HDFS-14528.2.Patch, 
> ZKFC_issue.patch
>
>
>  *In a cluster with more than one Standby NameNode, manual failover throws an 
> exception in some cases.*
> *When trying to execute the failover command from active to standby,* 
> *_./hdfs haadmin -failover nn1 nn2_, the below Exception is thrown:*
>   Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
> connection exception: java.net.ConnectException: Connection refused
> This is encountered in the following cases :
>  Scenario 1 : 
> Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> When trying to manually failover from NN1 to NN2 if NN3 is down, Exception is 
> thrown
> Scenario 2 :
>  Namenodes - NN1(Active) , NN2(Standby), NN3(Standby)
> ZKFC's -              ZKFC1,            ZKFC2,            ZKFC3
> When trying to manually failover from NN1 to NN3 if NN3's ZKFC (ZKFC3) is 
> down, Exception is thrown



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14528) Failover from Active to Standby Failed

2019-11-12 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14528:
--
Attachment: (was: HDFS-14528.006.patch)

> Failover from Active to Standby Failed  
> 
>
> Key: HDFS-14528
> URL: https://issues.apache.org/jira/browse/HDFS-14528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
>  Labels: multi-sbnn
> Attachments: HDFS-14528.003.patch, HDFS-14528.004.patch, 
> HDFS-14528.005.patch, HDFS-14528.2.Patch, ZKFC_issue.patch
>
>
>  *In a cluster with more than one Standby namenode, a manual failover throws 
> an exception in some cases.*
> *When trying to execute the failover command from active to standby 
> (_./hdfs haadmin -failover nn1 nn2_), the below exception is thrown:*
>   Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
> connection exception: java.net.ConnectException: Connection refused
> This is encountered in the following cases:
> Scenario 1:
> Namenodes - NN1 (Active), NN2 (Standby), NN3 (Standby)
> When trying to manually failover from NN1 to NN2 while NN3 is down, the 
> exception is thrown.
> Scenario 2:
> Namenodes - NN1 (Active), NN2 (Standby), NN3 (Standby)
> ZKFCs     - ZKFC1,        ZKFC2,         ZKFC3
> When trying to manually failover from NN1 to NN3 while NN3's ZKFC (ZKFC3) is 
> down, the exception is thrown.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14528) Failover from Active to Standby Failed

2019-11-12 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14528:
--
Attachment: HDFS-14528.006.patch

> Failover from Active to Standby Failed  
> 
>
> Key: HDFS-14528
> URL: https://issues.apache.org/jira/browse/HDFS-14528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: Ravuri Sushma sree
>Assignee: Ravuri Sushma sree
>Priority: Major
>  Labels: multi-sbnn
> Attachments: HDFS-14528.003.patch, HDFS-14528.004.patch, 
> HDFS-14528.005.patch, HDFS-14528.006.patch, HDFS-14528.2.Patch, 
> ZKFC_issue.patch
>
>
>  *In a cluster with more than one Standby namenode, a manual failover throws 
> an exception in some cases.*
> *When trying to execute the failover command from active to standby 
> (_./hdfs haadmin -failover nn1 nn2_), the below exception is thrown:*
>   Operation failed: Call From X-X-X-X/X-X-X-X to Y-Y-Y-Y: failed on 
> connection exception: java.net.ConnectException: Connection refused
> This is encountered in the following cases:
> Scenario 1:
> Namenodes - NN1 (Active), NN2 (Standby), NN3 (Standby)
> When trying to manually failover from NN1 to NN2 while NN3 is down, the 
> exception is thrown.
> Scenario 2:
> Namenodes - NN1 (Active), NN2 (Standby), NN3 (Standby)
> ZKFCs     - ZKFC1,        ZKFC2,         ZKFC3
> When trying to manually failover from NN1 to NN3 while NN3's ZKFC (ZKFC3) is 
> down, the exception is thrown.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14442) Disagreement between HAUtil.getAddressOfActive and RpcInvocationHandler.getConnectionId

2019-11-12 Thread Ravuri Sushma sree (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravuri Sushma sree updated HDFS-14442:
--
Attachment: HDFS-14442.003.patch

> Disagreement between HAUtil.getAddressOfActive and 
> RpcInvocationHandler.getConnectionId
> ---
>
> Key: HDFS-14442
> URL: https://issues.apache.org/jira/browse/HDFS-14442
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Ravuri Sushma sree
>Priority: Major
> Attachments: HDFS-14442.001.patch, HDFS-14442.002.patch, 
> HDFS-14442.003.patch
>
>
> While working on HDFS-14245, we noticed a discrepancy in some proxy-handling 
> code.
> The description of {{RpcInvocationHandler.getConnectionId()}} states:
> {code}
>   /**
>* Returns the connection id associated with the InvocationHandler instance.
>* @return ConnectionId
>*/
>   ConnectionId getConnectionId();
> {code}
> It does not make any claims about whether this connection ID will belong to 
> the active NameNode or not. Yet in {{HAUtil}} we have:
> {code}
>   /**
>* Get the internet address of the currently-active NN. This should rarely 
> be
>* used, since callers of this method who connect directly to the NN using 
> the
>* resulting InetSocketAddress will not be able to connect to the active NN 
> if
>* a failover were to occur after this method has been called.
>* 
>* @param fs the file system to get the active address of.
>* @return the internet address of the currently-active NN.
>* @throws IOException if an error occurs while resolving the active NN.
>*/
>   public static InetSocketAddress getAddressOfActive(FileSystem fs)
>   throws IOException {
> if (!(fs instanceof DistributedFileSystem)) {
>   throw new IllegalArgumentException("FileSystem " + fs + " is not a 
> DFS.");
> }
> // force client address resolution.
> fs.exists(new Path("/"));
> DistributedFileSystem dfs = (DistributedFileSystem) fs;
> DFSClient dfsClient = dfs.getClient();
> return RPC.getServerAddress(dfsClient.getNamenode());
>   }
> {code}
> The call {{RPC.getServerAddress()}} eventually terminates in 
> {{RpcInvocationHandler#getConnectionId()}}, via {{RPC.getServerAddress()}} -> 
> {{RPC.getConnectionIdForProxy()}} -> 
> {{RpcInvocationHandler#getConnectionId()}}. {{HAUtil}} therefore appears to 
> be making an incorrect assumption that {{RpcInvocationHandler}} will 
> necessarily return an _active_ connection ID. {{ObserverReadProxyProvider}} 
> demonstrates a counter-example to this, since the current connection ID may 
> be pointing at, for example, an Observer NameNode.
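
A minimal sketch of a caller that trips over this assumption (the class name 
is hypothetical; only {{HAUtil.getAddressOfActive}} and the standard 
FileSystem API are used):

{code}
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.HAUtil;

// Hypothetical probe: prints the address the client proxy is bound to.
public class ActiveAddressProbe {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // getAddressOfActive() trusts getConnectionId() to point at the Active,
    // but with ObserverReadProxyProvider the proxy may currently be bound
    // to an Observer, so this address need not be the active NN's.
    InetSocketAddress addr = HAUtil.getAddressOfActive(fs);
    System.out.println("Proxy currently bound to: " + addr);
  }
}
{code}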



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


