[jira] [Updated] (HDFS-15116) Correct spelling of comments for NNStorage.setRestoreFailedStorage

2020-01-13 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15116:
--
Attachment: (was: HDFS-15116.001.patch)

> Correct spelling of comments for NNStorage.setRestoreFailedStorage
> --
>
> Key: HDFS-15116
> URL: https://issues.apache.org/jira/browse/HDFS-15116
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> Correct spelling of comments for NNStorage.setRestoreFailedStorage.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15116) Correct spelling of comments for NNStorage.setRestoreFailedStorage

2020-01-13 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15116:
--
Attachment: HDFS-15116.000.patch

> Correct spelling of comments for NNStorage.setRestoreFailedStorage
> --
>
> Key: HDFS-15116
> URL: https://issues.apache.org/jira/browse/HDFS-15116
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15116.000.patch
>
>
> Correct spelling of comments for NNStorage.setRestoreFailedStorage.






[jira] [Updated] (HDFS-15116) Correct spelling of comments for NNStorage.setRestoreFailedStorage

2020-01-13 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15116:
--
Attachment: HDFS-15116.001.patch

> Correct spelling of comments for NNStorage.setRestoreFailedStorage
> --
>
> Key: HDFS-15116
> URL: https://issues.apache.org/jira/browse/HDFS-15116
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15116.001.patch
>
>
> Correct spelling of comments for NNStorage.setRestoreFailedStorage.






[jira] [Updated] (HDFS-15116) Correct spelling of comments for NNStorage.setRestoreFailedStorage

2020-01-13 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15116:
--
Attachment: (was: HDFS-15116.000.patch)

> Correct spelling of comments for NNStorage.setRestoreFailedStorage
> --
>
> Key: HDFS-15116
> URL: https://issues.apache.org/jira/browse/HDFS-15116
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15116.001.patch
>
>
> Correct spelling of comments for NNStorage.setRestoreFailedStorage.






[jira] [Updated] (HDFS-15116) Correct spelling of comments for NNStorage.setRestoreFailedStorage

2020-01-13 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15116:
--
Status: Patch Available  (was: Open)

> Correct spelling of comments for NNStorage.setRestoreFailedStorage
> --
>
> Key: HDFS-15116
> URL: https://issues.apache.org/jira/browse/HDFS-15116
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15116.000.patch
>
>
> Correct spelling of comments for NNStorage.setRestoreFailedStorage.






[jira] [Updated] (HDFS-15116) Correct spelling of comments for NNStorage.setRestoreFailedStorage

2020-01-13 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15116:
--
Attachment: HDFS-15116.000.patch

> Correct spelling of comments for NNStorage.setRestoreFailedStorage
> --
>
> Key: HDFS-15116
> URL: https://issues.apache.org/jira/browse/HDFS-15116
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15116.000.patch
>
>
> Correct spelling of comments for NNStorage.setRestoreFailedStorage.






[jira] [Created] (HDFS-15116) Correct spelling of comments for NNStorage.setRestoreFailedStorage

2020-01-13 Thread Xudong Cao (Jira)
Xudong Cao created HDFS-15116:
-

 Summary: Correct spelling of comments for 
NNStorage.setRestoreFailedStorage
 Key: HDFS-15116
 URL: https://issues.apache.org/jira/browse/HDFS-15116
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.2.1
Reporter: Xudong Cao
Assignee: Xudong Cao


Correct spelling of comments for NNStorage.setRestoreFailedStorage.






[jira] [Commented] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2020-01-12 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013957#comment-17013957
 ] 

Xudong Cao commented on HDFS-14963:
---

The above-mentioned issue HDFS-15024 seems stuck, so can we proceed with this 
patch first?

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> with the 1st NameNode, simply polls through the NameNodes, and finally 
> determines the current Active NameNode.
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts its RPC with the 1st NN: it stays silent when failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs, and in some scenarios these logs 
> are very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every HDFS cluster, cache its current Active NameNode index in a separate 
> cache file named after its URI. *Note these cache files are shared by all 
> HDFS client processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp; then:
>  # the cache file for the ns1 cluster is /tmp/ns1
>  # the cache file for the ns2 cluster is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target HDFS URI, and then directly 
> makes an RPC call to the right ANN.
>  # After each failover, the client needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target HDFS URI (see the 
> sketch below).
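A minimal sketch of this per-cluster cache file idea (the class name, file format, and 
fallback behavior below are illustrative assumptions, not the actual HDFS-14963 patch):

{code:java}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch only: one small file per nameservice, e.g. /tmp/ns1,
// holding the index of the last known Active NameNode.
public class ActiveNNIndexCache {
  private final Path cacheFile;

  public ActiveNNIndexCache(String cacheDir, String nameservice) {
    this.cacheFile = Paths.get(cacheDir, nameservice);
  }

  /** Read the last known active index; fall back to 0 (today's behavior) if unknown. */
  public int read() {
    try {
      String s = new String(Files.readAllBytes(cacheFile), StandardCharsets.UTF_8).trim();
      return Integer.parseInt(s);
    } catch (IOException | NumberFormatException e) {
      return 0; // no usable cache yet, start probing from the first NameNode
    }
  }

  /** Persist the new active index after a successful failover. */
  public void write(int activeIndex) {
    try {
      Files.write(cacheFile, Integer.toString(activeIndex).getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      // best effort: losing the cache only costs extra failovers next time
    }
  }
}
{code}

Since the file is shared by every HDFS client process on the machine, a real implementation 
would also need atomic writes (write to a temp file and rename) and some tolerance for 
concurrent readers and writers.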






[jira] [Commented] (HDFS-15027) Correct target DN's log while balancing.

2020-01-12 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17013954#comment-17013954
 ] 

Xudong Cao commented on HDFS-15027:
---

[~weichiu] can this patch be merged now?

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch, HDFS-15027.001.patch
>
>
> During HDFS balancing, after the target DN has copied a block from the proxy 
> DN, it prints a log message following the pattern below:
> *Moved BLOCK from BALANCER*
> This is wrong and misleading; maybe we can improve the pattern to something 
> like:
> *Moved BLOCK complete, copied from PROXY DN, initiated by BALANCER*
>  
> An example log from the target DN during balancing:
> 1. Wrong log printing before this jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> 2. Correct log printing after this jira:
> {code:java}
> 2019-12-12 10:06:34,791 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1360308441-192.168.202.11-1576116241828:blk_1073741872_1048 
> complete, copied from /192.168.202.11:9866, initiated by 
> /192.168.202.13:53536, delHint=c70406f8-a815-4f6f-bdf0-fd3661bd6920{code}
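To make the wording change concrete, here is a purely illustrative snippet that prints both 
message formats; the variable values are taken from the example logs above, and none of 
this is the actual DataNode code:

{code:java}
public class BalancerLogMessage {
  public static void main(String[] args) {
    String block = "BP-1360308441-192.168.202.11-1576116241828:blk_1073741872_1048";
    String proxy = "/192.168.202.11:9866";      // DN the replica was actually copied from
    String initiator = "/192.168.202.13:53536"; // balancer client that initiated the move
    String delHint = "c70406f8-a815-4f6f-bdf0-fd3661bd6920";

    // Before: reads as if the block came from the balancer itself.
    System.out.println("Moved " + block + " from " + initiator + ", delHint=" + delHint);

    // After: names the proxy DN as the copy source and the balancer as the initiator.
    System.out.println("Moved " + block + " complete, copied from " + proxy
        + ", initiated by " + initiator + ", delHint=" + delHint);
  }
}
{code}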






[jira] [Commented] (HDFS-15069) DecommissionMonitor-0 thread will block forever while its timer task scheduled encountered any unchecked exceptions.

2019-12-18 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16999656#comment-16999656
 ] 

Xudong Cao commented on HDFS-15069:
---

Sorry for not noticing HDFS-12703; they are indeed the same issue.

> DecommissionMonitor-0 thread will block forever while its timer task 
> scheduled encountered any unchecked exceptions.
> 
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that while decommissioning a large number 
> of DNs, the DecommissionMonitor-0 thread stops being scheduled and blocks 
> for a long time, with no exception logs or notifications at all.
> e.g. recently we were decommissioning 65 DNs at the same time, each DN 
> holding about 10 TB, and the DecommissionMonitor-0 thread was blocked for 
> about 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during this half hour the thread was not scheduled at 
> all; its Waited count did not change.
> We think the cause of the problem is:
>  # The DecommissionMonitor task submitted by the NameNode encounters an 
> unchecked exception during its run, and then the task is never executed 
> again.
>  # But the NameNode does not keep the ScheduledFuture of this task and never 
> calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
> is silently held in the future and nobody ever sees it.
> After that, the observable symptoms are:
>  # The ScheduledExecutorService thread DecommissionMonitor-0 blocks forever 
> in ThreadPoolExecutor.getTask().
>  # The previously submitted DecommissionMonitor task is never executed again.
>  # No logs or notifications let us know what happened.
> Possible solutions:
>  # Do not use a thread pool to execute the decommission monitor task; 
> instead, introduce a separate thread to do this, just like HeartbeatManager, 
> ReplicationMonitor, LeaseManager, BlockReportThread, and so on.
>        OR
>        2. Catch all exceptions in the decommission monitor task's run() 
> method, so that it never throws (see the sketch after this message).
> I prefer the second option.
>  
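For context, the java.util.concurrent.ScheduledExecutorService contract says that if any 
execution of a periodic task throws, subsequent executions are suppressed, and the 
exception only surfaces when someone calls get() on the returned ScheduledFuture, which 
matches the symptoms above. Below is a minimal sketch of option 2; the class and method 
names are illustrative only, not the actual decommission monitor code.

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch of option 2: the periodic task never lets an unchecked exception
// escape run(), so the executor keeps scheduling it.
public class SafeDecommissionMonitor implements Runnable {
  @Override
  public void run() {
    try {
      check(); // the real decommission-progress scan would go here
    } catch (Throwable t) {
      // Without this catch, the first unchecked exception silently cancels all
      // future executions and is only visible via ScheduledFuture.get().
      System.err.println("DecommissionMonitor caught unexpected exception: " + t);
    }
  }

  private void check() {
    // placeholder for the periodic work
  }

  public static void main(String[] args) {
    ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);
    executor.scheduleAtFixedRate(new SafeDecommissionMonitor(), 0, 30, TimeUnit.SECONDS);
  }
}
{code}

The first alternative (a dedicated, self-managed thread like HeartbeatManager's) avoids the 
problem in a different way: an uncaught exception in a plain Thread at least reaches the 
thread's UncaughtExceptionHandler (by default a stack trace on stderr) instead of being 
silently stored in a Future.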






[jira] [Resolved] (HDFS-15069) DecommissionMonitor-0 thread will block forever while its timer task scheduled encountered any unchecked exceptions.

2019-12-18 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao resolved HDFS-15069.
---
Resolution: Duplicate

> DecommissionMonitor-0 thread will block forever while its timer task 
> scheduled encountered any unchecked exceptions.
> 
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that while decommissioning a large number 
> of DNs, the DecommissionMonitor-0 thread stops being scheduled and blocks 
> for a long time, with no exception logs or notifications at all.
> e.g. recently we were decommissioning 65 DNs at the same time, each DN 
> holding about 10 TB, and the DecommissionMonitor-0 thread was blocked for 
> about 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during this half hour the thread was not scheduled at 
> all; its Waited count did not change.
> We think the cause of the problem is:
>  # The DecommissionMonitor task submitted by the NameNode encounters an 
> unchecked exception during its run, and then the task is never executed 
> again.
>  # But the NameNode does not keep the ScheduledFuture of this task and never 
> calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
> is silently held in the future and nobody ever sees it.
> After that, the observable symptoms are:
>  # The ScheduledExecutorService thread DecommissionMonitor-0 blocks forever 
> in ThreadPoolExecutor.getTask().
>  # The previously submitted DecommissionMonitor task is never executed again.
>  # No logs or notifications let us know what happened.
> Possible solutions:
>  # Do not use a thread pool to execute the decommission monitor task; 
> instead, introduce a separate thread to do this, just like HeartbeatManager, 
> ReplicationMonitor, LeaseManager, BlockReportThread, and so on.
>        OR
>        2. Catch all exceptions in the decommission monitor task's run() 
> method, so that it never throws any exceptions.
> I prefer the second option.
>  






[jira] [Updated] (HDFS-15069) DecommissionMonitor-0 thread will block forever while its timer task scheduled encountered any unchecked exceptions.

2019-12-18 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15069:
--
Description: 
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications can let us know exactly what had happened.

Possible solutions:
 # Do not use thread pool to execute decommission monitor task, alternatively 
we can introduce a separate thread to do this, just like HeartbeatManager, 
ReplicationMonitor, LeaseManager, BlockReportThread, and so on.

       OR

       2. Catch all exceptions in decommission monitor task's run() method, so 
it does not throw any exceptions.

I prefer the second option.

 

  was:
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications can let us know exactly what had happened.

Possible solutions:
 # Do not use thread pool to execute decommission monitor task, alternatively 
we can introduce a separate thread to do this, just like HeartbeatManager, 
ReplicationMonitor, LeaseManager, BlockReportThread, and so on.

       OR

       2. Catch all exceptions in decommission monitor task's run() method, so 
it does not throw any exceptions.


> DecommissionMonitor-0 thread will block forever while its timer task 
> scheduled encountered any unchecked exceptions.
> 
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that during decommissioning of a large 
> number of DNs, the thread DecommissionMonitor-0 will stop scheduling, 
> blocking for a long time, and there will be no exception logs or 
> notifications at all.
> e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
> 10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during half an hour, this thread has not been scheduled 
> at 

[jira] [Updated] (HDFS-15069) DecommissionMonitor-0 thread will block forever while its timer task scheduled encountered any unchecked exceptions.

2019-12-18 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15069:
--
Description: 
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications can let us know exactly what had happened.

Possible solutions:
 # Do not use thread pool to execute decommission monitor task, alternatively 
we can introduce a separate thread to do this, just like HeartbeatManager, 
ReplicationMonitor, LeaseManager, BlockReportThread, and so on.

       OR

       2. Catch all exceptions in decommission monitor task's run() method, so 
it does not throw any exceptions.

  was:
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications can let us know exactly what had happened.

Possible solutions:
 # Do not use thread pool to execute decommission monitor task, alternatively 
we can introduce a separate thread to do this, just like HeartbeatManager, 
ReplicationMonitor, LeaseManager, BlockReportThread, and so on.

       OR

       2. Catch all exceptions in decommission monitor task's run() method, so 
he does not throw any exceptions.


> DecommissionMonitor-0 thread will block forever while its timer task 
> scheduled encountered any unchecked exceptions.
> 
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that during decommissioning of a large 
> number of DNs, the thread DecommissionMonitor-0 will stop scheduling, 
> blocking for a long time, and there will be no exception logs or 
> notifications at all.
> e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
> 10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during half an hour, this thread has not been scheduled 
> at all, its Waited count has not 

[jira] [Updated] (HDFS-15069) DecommissionMonitor-0 thread will block forever while its timer task scheduled encountered any unchecked exceptions.

2019-12-18 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15069:
--
Description: 
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications can let us know exactly what had happened.

Possible solutions:
 # Do not use thread pool to execute decommission monitor task, alternatively 
we can introduce a separate thread to do this, just like HeartbeatManager, 
ReplicationMonitor, LeaseManager, BlockReportThread, and so on.

       OR

       2. Catch all exceptions in decommission monitor task's run() method, so 
he does not throw any exceptions.

  was:
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications can let us know exactly what had happened.

A possible solution:
 # Do not use thread pool to execute decommission monitor task, alternatively 
we can introduce a separate thread to do this, just like HeartbeatManager, 
ReplicationMonitor, LeaseManager, BlockReportThread, and so on.


> DecommissionMonitor-0 thread will block forever while its timer task 
> scheduled encountered any unchecked exceptions.
> 
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that during decommissioning of a large 
> number of DNs, the thread DecommissionMonitor-0 will stop scheduling, 
> blocking for a long time, and there will be no exception logs or 
> notifications at all.
> e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
> 10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during half an hour, this thread has not been scheduled 
> at all, its Waited count has not changed.
> We think the cause of the problem is:
>  # The DecommissionMonitor task submitted by NameNode encounters an unchecked 

[jira] [Updated] (HDFS-15069) DecommissionMonitor-0 thread will block forever while its timer task scheduled encountered any unchecked exceptions.

2019-12-18 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15069:
--
Description: 
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications can let us know exactly what had happened.

A possible solution:
 # Do not use thread pool to execute decommission monitor task, alternatively 
we can introduce a separate thread to do this, just like HeartbeatManager, 
ReplicationMonitor, LeaseManager, BlockReportThread, and so on.

  was:
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications can let us know exactly what had happened.

A possible solution:
 # Do not use thread pool to execute decommission monitor task, alternatively 
we can introduce a separate thread to do this, just like HeartbeatManager, 
ReplicationMonitor, BlockReportThread, and so on.


> DecommissionMonitor-0 thread will block forever while its timer task 
> scheduled encountered any unchecked exceptions.
> 
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that during decommissioning of a large 
> number of DNs, the thread DecommissionMonitor-0 will stop scheduling, 
> blocking for a long time, and there will be no exception logs or 
> notifications at all.
> e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
> 10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during half an hour, this thread has not been scheduled 
> at all, its Waited count has not changed.
> We think the cause of the problem is:
>  # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
> exception during its running , and then this task will be never executed 
> again.
>  # But NameNode does not care about the 

[jira] [Updated] (HDFS-15069) DecommissionMonitor-0 thread will block forever while its timer task scheduled encountered any unchecked exceptions.

2019-12-18 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15069:
--
Description: 
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications can let us know exactly what had happened.

A possible solution:
 # Do not use thread pool to execute decommission monitor task, alternatively 
we can introduce a separate thread to do this, just like HeartbeatManager, 
ReplicationMonitor, BlockReportThread, and so on.

  was:
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications can let us know exactly what had happened.

A possible solution:
 # Do not use thread pool to execute decommission monitor task, alternatively 
we can introduce a separate thread to do this, just like heartbeatManager, 
ReplicationMonitor, blockReportThread, and so on.


> DecommissionMonitor-0 thread will block forever while its timer task 
> scheduled encountered any unchecked exceptions.
> 
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that during decommissioning of a large 
> number of DNs, the thread DecommissionMonitor-0 will stop scheduling, 
> blocking for a long time, and there will be no exception logs or 
> notifications at all.
> e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
> 10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during half an hour, this thread has not been scheduled 
> at all, its Waited count has not changed.
> We think the cause of the problem is:
>  # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
> exception during its running , and then this task will be never executed 
> again.
>  # But NameNode does not care about the ScheduledFuture of this 

[jira] [Updated] (HDFS-15069) DecommissionMonitor-0 thread will block forever while its timer task scheduled encountered any unchecked exceptions.

2019-12-18 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15069:
--
Description: 
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications can let us know exactly what had happened.

A possible solution:
 # Do not use thread pool to execute decommission monitor task, alternatively 
we can introduce a separate thread to do this, just like heartbeatManager, 
ReplicationMonitor, blockReportThread, and so on.

  was:
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications can let us know exactly what had happened.

A possible solution:
 # Do not use thread pool to execute decommission monitor task, alternatively 
we can introduce a separate thread to do this.


> DecommissionMonitor-0 thread will block forever while its timer task 
> scheduled encountered any unchecked exceptions.
> 
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that during decommissioning of a large 
> number of DNs, the thread DecommissionMonitor-0 will stop scheduling, 
> blocking for a long time, and there will be no exception logs or 
> notifications at all.
> e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
> 10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during half an hour, this thread has not been scheduled 
> at all, its Waited count has not changed.
> We think the cause of the problem is:
>  # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
> exception during its running , and then this task will be never executed 
> again.
>  # But NameNode does not care about the ScheduledFuture of this task, and 
> never calls ScheduledFuture.get(), so the unchecked exception 

[jira] [Updated] (HDFS-15069) DecommissionMonitor-0 thread will block forever while its timer task scheduled encountered any unchecked exceptions.

2019-12-18 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15069:
--
Summary: DecommissionMonitor-0 thread will block forever while its timer 
task scheduled encountered any unchecked exceptions.  (was: 
DecommissionMonitor-0 thread will block forever while its timer task scheduled 
encountered any unchecked exception.)

> DecommissionMonitor-0 thread will block forever while its timer task 
> scheduled encountered any unchecked exceptions.
> 
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that during decommissioning of a large 
> number of DNs, the thread DecommissionMonitor-0 will stop scheduling, 
> blocking for a long time, and there will be no exception logs or 
> notifications at all.
> e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
> 10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during half an hour, this thread has not been scheduled 
> at all, its Waited count has not changed.
> We think the cause of the problem is:
>  # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
> exception during its running , and then this task will be never executed 
> again.
>  # But NameNode does not care about the ScheduledFuture of this task, and 
> never calls ScheduledFuture.get(), so the unchecked exception thrown by the 
> task above will always be placed there, no one knows.
> After that, the subsequent phenomenon is:
>  # The ScheduledExecutorService thread DecommissionMonitor-0 will block 
> forever in ThreadPoolExecutor.getTask().
>  # The previously submitted task DecommissionMonitor will be never executed 
> again.
>  # No logs or notifications can let us know exactly what had happened.
> A possible solution:
>  # Do not use thread pool to execute decommission monitor task, alternatively 
> we can introduce a separate thread to do this.






[jira] [Updated] (HDFS-15069) DecommissionMonitor-0 thread will block forever while its timer task scheduled encountered any unchecked exception.

2019-12-18 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15069:
--
Description: 
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications can let us know exactly what had happened.

A possible solution:
 # Do not use thread pool to execute decommission monitor task, alternatively 
we can introduce a separate thread to do this.

  was:
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications can let us know exactly what had happened.


> DecommissionMonitor-0 thread will block forever while its timer task 
> scheduled encountered any unchecked exception.
> ---
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that during decommissioning of a large 
> number of DNs, the thread DecommissionMonitor-0 will stop scheduling, 
> blocking for a long time, and there will be no exception logs or 
> notifications at all.
> e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
> 10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during half an hour, this thread has not been scheduled 
> at all, its Waited count has not changed.
> We think the cause of the problem is:
>  # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
> exception during its running , and then this task will be never executed 
> again.
>  # But NameNode does not care about the ScheduledFuture of this task, and 
> never calls ScheduledFuture.get(), so the unchecked exception thrown by the 
> task above will always be placed there, no one knows.
> After that, the subsequent phenomenon is:
>  # The ScheduledExecutorService thread DecommissionMonitor-0 will block 
> forever in ThreadPoolExecutor.getTask().
> 

[jira] [Updated] (HDFS-15069) DecommissionMonitor-0 thread will block forever while its timer task scheduled encountered any unchecked exception.

2019-12-18 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15069:
--
Summary: DecommissionMonitor-0 thread will block forever while its timer 
task scheduled encountered any unchecked exception.  (was: 
DecommissionMonitor-0 thread will block forever while the timer moniter task 
encountered an unchecked exception.)

> DecommissionMonitor-0 thread will block forever while its timer task 
> scheduled encountered any unchecked exception.
> ---
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that during decommissioning of a large 
> number of DNs, the thread DecommissionMonitor-0 will stop scheduling, 
> blocking for a long time, and there will be no exception logs or 
> notifications at all.
> e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
> 10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during half an hour, this thread has not been scheduled 
> at all, its Waited count has not changed.
> We think the cause of the problem is:
>  # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
> exception during its running , and then this task will be never executed 
> again.
>  # But NameNode does not care about the ScheduledFuture of this task, and 
> never calls ScheduledFuture.get(), so the unchecked exception thrown by the 
> task above will always be placed there, no one knows.
> After that, the subsequent phenomenon is:
>  # The ScheduledExecutorService thread DecommissionMonitor-0 will block 
> forever in ThreadPoolExecutor.getTask().
>  # The previously submitted task DecommissionMonitor will be never executed 
> again.
>  # No logs or notifications can let us know exactly what had happened.






[jira] [Updated] (HDFS-15069) DecommissionMonitor-0 thread will block forever while the timer moniter task encountered an unchecked exception.

2019-12-18 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15069:
--
Summary: DecommissionMonitor-0 thread will block forever while the timer 
monitor task encountered an unchecked exception.  (was: DecommissionMonitor 
thread will block forever while it encountered an unchecked exception.)

> DecommissionMonitor-0 thread will block forever while the timer monitor task 
> encountered an unchecked exception.
> 
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that during decommissioning of a large 
> number of DNs, the thread DecommissionMonitor-0 will stop scheduling, 
> blocking for a long time, and there will be no exception logs or 
> notifications at all.
> For example, we recently decommissioned 65 DNs at the same time, each DN 
> holding about 10 TB, and the DecommissionMonitor-0 thread blocked for about 
> 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during this half hour the thread was not scheduled at 
> all; its Waited count did not change.
> We think the cause of the problem is:
>  # The DecommissionMonitor task submitted by the NameNode encounters an 
> unchecked exception during its run, and then this task will never be executed 
> again.
>  # But the NameNode never inspects the ScheduledFuture of this task and never 
> calls ScheduledFuture.get(), so the unchecked exception thrown by the task is 
> silently swallowed and no one ever sees it.
> After that, the observable symptoms are:
>  # The ScheduledExecutorService thread DecommissionMonitor-0 will block 
> forever in ThreadPoolExecutor.getTask().
>  # The previously submitted DecommissionMonitor task will never be executed 
> again.
>  # No logs or notifications let us know what actually happened.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15069) DecommissionMonitor thread will block forever while it encountered an unchecked exception.

2019-12-18 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15069:
--
Description: 
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

For example, we recently decommissioned 65 DNs at the same time, each DN 
holding about 10 TB, and the DecommissionMonitor-0 thread blocked for about 15 
days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during this half hour the thread was not scheduled at all; 
its Waited count did not change.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by the NameNode encounters an 
unchecked exception during its run, and then this task will never be executed 
again.
 # But the NameNode never inspects the ScheduledFuture of this task and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task is 
silently swallowed and no one ever sees it.

After that, the observable symptoms are:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted DecommissionMonitor task will never be executed 
again.
 # No logs or notifications let us know what actually happened.

  was:
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask ().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications let us know exactly what had happened.


> DecommissionMonitor thread will block forever while it encountered an 
> unchecked exception.
> --
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that during decommissioning of a large 
> number of DNs, the thread DecommissionMonitor-0 will stop scheduling, 
> blocking for a long time, and there will be no exception logs or 
> notifications at all.
> For example, we recently decommissioned 65 DNs at the same time, each DN 
> holding about 10 TB, and the DecommissionMonitor-0 thread blocked for about 
> 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during this half hour the thread was not scheduled at 
> all; its Waited count did not change.
> We think the cause of the problem is:
>  # The DecommissionMonitor task submitted by the NameNode encounters an 
> unchecked exception during its run, and then this task will never be executed 
> again.
>  # But the NameNode never inspects the ScheduledFuture of this task and never 
> calls ScheduledFuture.get(), so the unchecked exception thrown by the task is 
> silently swallowed and no one ever sees it.
> After that, the observable symptoms are:
>  # The ScheduledExecutorService thread DecommissionMonitor-0 will block 
> forever in ThreadPoolExecutor.getTask().
>  # The previously submitted DecommissionMonitor task will never be executed 
> again.
>  # No logs or notifications let us know what actually happened.



--
This message was sent by Atlassian 

[jira] [Updated] (HDFS-15069) DecommissionMonitor thread will block forever while it encountered an unchecked exception.

2019-12-18 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15069:
--
Description: 
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

For example, we recently decommissioned 65 DNs at the same time, each DN 
holding about 10 TB, and the DecommissionMonitor-0 thread blocked for about 15 
days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during this half hour the thread was not scheduled at all; 
its Waited count did not change.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by the NameNode encounters an 
unchecked exception during its run, and then this task will never be executed 
again.
 # But the NameNode never inspects the ScheduledFuture of this task and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task is 
silently swallowed and no one ever sees it.

After that, the observable symptoms are:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted DecommissionMonitor task will never be executed 
again.
 # No logs or notifications let us know what actually happened.

  was:
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 #  stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask ().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications let us know exactly what had happened.


> DecommissionMonitor thread will block forever while it encountered an 
> unchecked exception.
> --
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that during decommissioning of a large 
> number of DNs, the thread DecommissionMonitor-0 will stop scheduling, 
> blocking for a long time, and there will be no exception logs or 
> notifications at all.
> For example, we recently decommissioned 65 DNs at the same time, each DN 
> holding about 10 TB, and the DecommissionMonitor-0 thread blocked for about 
> 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during this half hour the thread was not scheduled at 
> all; its Waited count did not change.
> We think the cause of the problem is:
>  # The DecommissionMonitor task submitted by the NameNode encounters an 
> unchecked exception during its run, and then this task will never be executed 
> again.
>  # But the NameNode never inspects the ScheduledFuture of this task and never 
> calls ScheduledFuture.get(), so the unchecked exception thrown by the task is 
> silently swallowed and no one ever sees it.
> After that, the observable symptoms are:
>  # The ScheduledExecutorService thread DecommissionMonitor-0 will block 
> forever in ThreadPoolExecutor.getTask().
>  # The previously submitted DecommissionMonitor task will never be executed 
> again.
>  # No logs or notifications let us know what actually happened.



--
This message was sent by Atlassian Jira

[jira] [Updated] (HDFS-15069) DecommissionMonitor thread will block forever while it encountered an unchecked exception.

2019-12-18 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15069:
--
Description: 
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

For example, we recently decommissioned 65 DNs at the same time, each DN 
holding about 10 TB, and the DecommissionMonitor-0 thread blocked for about 15 
days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during this half hour the thread was not scheduled at all; 
its Waited count did not change.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by the NameNode encounters an 
unchecked exception during its run, and then this task will never be executed 
again.
 # But the NameNode never inspects the ScheduledFuture of this task and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task is 
silently swallowed and no one ever sees it.

After that, the observable symptoms are:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted DecommissionMonitor task will never be executed 
again.
 # No logs or notifications let us know what actually happened.

  was:
More than once, we have observed that during decommissioning of a large number 
of dns, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
1. stack on 2019.12.17 16:12 !stack_on_16_12.png!

2. stack on 2019.12.17 16:42 !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
1. The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
2. But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
1.The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask ().
2. The previously submitted task DecommissionMonitor will be never executed 
again.
3. No logs or notifications let us know exactly what had happened.


> DecommissionMonitor thread will block forever while it encountered an 
> unchecked exception.
> --
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that during decommissioning of a large 
> number of DNs, the thread DecommissionMonitor-0 will stop scheduling, 
> blocking for a long time, and there will be no exception logs or 
> notifications at all.
> For example, we recently decommissioned 65 DNs at the same time, each DN 
> holding about 10 TB, and the DecommissionMonitor-0 thread blocked for about 
> 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during this half hour the thread was not scheduled at 
> all; its Waited count did not change.
> We think the cause of the problem is:
>  # The DecommissionMonitor task submitted by the NameNode encounters an 
> unchecked exception during its run, and then this task will never be executed 
> again.
>  # But the NameNode never inspects the ScheduledFuture of this task and never 
> calls ScheduledFuture.get(), so the unchecked exception thrown by the task is 
> silently swallowed and no one ever sees it.
> After that, the observable symptoms are:
>  # The ScheduledExecutorService thread DecommissionMonitor-0 will block 
> forever in ThreadPoolExecutor.getTask().
>  # The previously submitted DecommissionMonitor task will never be executed 
> again.
>  # No logs or notifications let us know what actually happened.



--
This message was sent by Atlassian Jira

[jira] [Updated] (HDFS-15069) DecommissionMonitor thread will block forever while it encountered an unchecked exception.

2019-12-18 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15069:
--
Description: 
More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

For example, we recently decommissioned 65 DNs at the same time, each DN 
holding about 10 TB, and the DecommissionMonitor-0 thread blocked for about 15 
days.

The stack of DecommissionMonitor-0 looks like this:
 # stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during this half hour the thread was not scheduled at all; 
its Waited count did not change.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by the NameNode encounters an 
unchecked exception during its run, and then this task will never be executed 
again.
 # But the NameNode never inspects the ScheduledFuture of this task and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task is 
silently swallowed and no one ever sees it.

After that, the observable symptoms are:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
 # The previously submitted DecommissionMonitor task will never be executed 
again.
 # No logs or notifications let us know what actually happened.

  was:
More than once, we have observed that during decommissioning of a large number 
of dns, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

e.g. Recently, we are decommissioning 65 DNs at the same time, each DN about 
10TB, and the DecommissionMonitor-0 thread blocked for about 15 days.

The stack of DecommissionMonitor-0 looks like this:
 #  stack on 2019.12.17 16:12  !stack_on_16_12.png!
 # stack on 2019.12.17 16:42  !stack_on_16_42.png!

It can be seen that during half an hour, this thread has not been scheduled at 
all, its Waited count has not changed.

We think the cause of the problem is:
 # The DecommissionMonitor task submitted by NameNode encounters an unchecked 
exception during its running , and then this task will be never executed again.
 # But NameNode does not care about the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task 
above will always be placed there, no one knows.

After that, the subsequent phenomenon is:
 # The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask ().
 # The previously submitted task DecommissionMonitor will be never executed 
again.
 # No logs or notifications let us know exactly what had happened.


> DecommissionMonitor thread will block forever while it encountered an 
> unchecked exception.
> --
>
> Key: HDFS-15069
> URL: https://issues.apache.org/jira/browse/HDFS-15069
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Major
> Attachments: stack_on_16_12.png, stack_on_16_42.png
>
>
> More than once, we have observed that during decommissioning of a large 
> number of DNs, the thread DecommissionMonitor-0 will stop scheduling, 
> blocking for a long time, and there will be no exception logs or 
> notifications at all.
> For example, we recently decommissioned 65 DNs at the same time, each DN 
> holding about 10 TB, and the DecommissionMonitor-0 thread blocked for about 
> 15 days.
> The stack of DecommissionMonitor-0 looks like this:
>  # stack on 2019.12.17 16:12  !stack_on_16_12.png!
>  # stack on 2019.12.17 16:42  !stack_on_16_42.png!
> It can be seen that during this half hour the thread was not scheduled at 
> all; its Waited count did not change.
> We think the cause of the problem is:
>  # The DecommissionMonitor task submitted by the NameNode encounters an 
> unchecked exception during its run, and then this task will never be executed 
> again.
>  # But the NameNode never inspects the ScheduledFuture of this task and never 
> calls ScheduledFuture.get(), so the unchecked exception thrown by the task is 
> silently swallowed and no one ever sees it.
> After that, the observable symptoms are:
>  # The ScheduledExecutorService thread DecommissionMonitor-0 will block 
> forever in ThreadPoolExecutor.getTask().
>  # The previously submitted DecommissionMonitor task will never be executed 
> again.
>  # No logs or notifications let us know what actually happened.



--
This message was sent by Atlassian 

[jira] [Created] (HDFS-15069) DecommissionMonitor thread will block forever while it encountered an unchecked exception.

2019-12-18 Thread Xudong Cao (Jira)
Xudong Cao created HDFS-15069:
-

 Summary: DecommissionMonitor thread will block forever while it 
encountered an unchecked exception.
 Key: HDFS-15069
 URL: https://issues.apache.org/jira/browse/HDFS-15069
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.1.3
Reporter: Xudong Cao
Assignee: Xudong Cao
 Attachments: stack_on_16_12.png, stack_on_16_42.png

More than once, we have observed that during decommissioning of a large number 
of DNs, the thread DecommissionMonitor-0 will stop scheduling, blocking for a 
long time, and there will be no exception logs or notifications at all.

For example, we recently decommissioned 65 DNs at the same time, each DN 
holding about 10 TB, and the DecommissionMonitor-0 thread blocked for about 15 
days.

The stack of DecommissionMonitor-0 looks like this:
1. stack on 2019.12.17 16:12 !stack_on_16_12.png!

2. stack on 2019.12.17 16:42 !stack_on_16_42.png!

It can be seen that during this half hour the thread was not scheduled at all; 
its Waited count did not change.

We think the cause of the problem is:
1. The DecommissionMonitor task submitted by the NameNode encounters an 
unchecked exception during its run, and then this task will never be executed 
again.
2. But the NameNode never inspects the ScheduledFuture of this task, and never 
calls ScheduledFuture.get(), so the unchecked exception thrown by the task is 
silently swallowed and no one ever sees it.

After that, the observable symptoms are:
1. The ScheduledExecutorService thread DecommissionMonitor-0 will block forever 
in ThreadPoolExecutor.getTask().
2. The previously submitted task DecommissionMonitor will never be executed 
again.
3. No logs or notifications let us know what actually happened.
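One general-purpose mitigation for this class of problem, sketched below purely for illustration (it is not the patch that was eventually committed for HDFS-15069), is to wrap the periodic task so that unchecked exceptions are caught and logged instead of silently cancelling the schedule:
{code:java}
// Illustrative sketch only: a wrapper that keeps a periodic task alive after
// unchecked exceptions by logging them instead of letting them escape run().
import java.util.concurrent.*;

public class GuardedPeriodicTask {

  static Runnable guarded(Runnable task) {
    return () -> {
      try {
        task.run();
      } catch (Throwable t) {
        // Log and swallow so the ScheduledExecutorService keeps rescheduling us.
        System.err.println("Periodic task failed, will run again next period: " + t);
      }
    };
  }

  public static void main(String[] args) throws Exception {
    ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);
    executor.scheduleAtFixedRate(guarded(() -> {
      System.out.println("monitor pass");
      if (Math.random() < 0.5) {
        throw new IllegalStateException("transient failure");  // no longer fatal
      }
    }), 0, 1, TimeUnit.SECONDS);

    Thread.sleep(5000);  // "monitor pass" keeps printing despite the exceptions
    executor.shutdownNow();
  }
}
{code}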



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15027) Correct target DN's log while balancing.

2019-12-11 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994082#comment-16994082
 ] 

Xudong Cao edited comment on HDFS-15027 at 12/12/19 2:14 AM:
-

[~weichiu] Perhaps we should keep the keyword "Moved" to reflect the meaning of 
moving a block during balancing, just like:

*Moved BLOCK complete, copied from PROXY DN, initiated by* *BALANCER*


was (Author: xudongcao):
[~weichiu] Perhaps we should keep the keyword "Moved" to reflect the meaning of 
moving block during balance, just like:
{code:java}
2019-12-12 10:06:34,791 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1360308441-192.168.202.11-1576116241828:blk_1073741872_1048 complete, 
copied from /192.168.202.11:9866, initiated by /192.168.202.13:53536, 
delHint=c70406f8-a815-4f6f-bdf0-fd3661bd6920{code}

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch, HDFS-15027.001.patch
>
>
> During HDFS balancing, after the target DN copied a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is wrong and misleading; maybe we can improve the pattern to something like:
> *Moved BLOCK complete, copied from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
> 1. Wrong log printing before jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> 2. Correct log printing after jira:
> {code:java}
> 2019-12-12 10:06:34,791 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1360308441-192.168.202.11-1576116241828:blk_1073741872_1048 
> complete, copied from /192.168.202.11:9866, initiated by 
> /192.168.202.13:53536, delHint=c70406f8-a815-4f6f-bdf0-fd3661bd6920{code}
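For reference, a minimal sketch of what the corrected DataNode log statement could look like, assuming an SLF4J-style logger; the class, method, and variable names below are hypothetical placeholders, and this is not the actual HDFS-15027 patch:
{code:java}
// Illustrative sketch only: the shape of the corrected "Moved ... complete" log line.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class BalancerMoveLogExample {
  private static final Logger LOG =
      LoggerFactory.getLogger(BalancerMoveLogExample.class);

  // All parameters are hypothetical placeholders for values available where the
  // DataNode finishes handling a block move requested by the Balancer.
  static void logMoveComplete(String block, String proxySource,
                              String balancerPeer, String delHint) {
    // Old, misleading pattern: "Moved BLOCK from BALANCER"
    // New pattern: the move completed, the block was copied from the proxy DN,
    // and the operation was initiated by the balancer peer.
    LOG.info("Moved {} complete, copied from {}, initiated by {}, delHint={}",
        block, proxySource, balancerPeer, delHint);
  }
}
{code}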



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-11 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Attachment: HDFS-15027.001.patch

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch, HDFS-15027.001.patch
>
>
> During HDFS balancing, after the target DN copied a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is wrong and misleading; maybe we can improve the pattern to something like:
> *Moved BLOCK complete, copied from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
> 1. Wrong log printing before jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> 2. Correct log printing after jira:
> {code:java}
> 2019-12-12 10:06:34,791 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1360308441-192.168.202.11-1576116241828:blk_1073741872_1048 
> complete, copied from /192.168.202.11:9866, initiated by 
> /192.168.202.13:53536, delHint=c70406f8-a815-4f6f-bdf0-fd3661bd6920{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15027) Correct target DN's log while balancing.

2019-12-11 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994082#comment-16994082
 ] 

Xudong Cao edited comment on HDFS-15027 at 12/12/19 2:11 AM:
-

[~weichiu] Perhaps we should keep the keyword "Moved" to reflect the meaning of 
moving a block during balancing, just like:
{code:java}
2019-12-12 10:06:34,791 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1360308441-192.168.202.11-1576116241828:blk_1073741872_1048 complete, 
copied from /192.168.202.11:9866, initiated by /192.168.202.13:53536, 
delHint=c70406f8-a815-4f6f-bdf0-fd3661bd6920{code}


was (Author: xudongcao):
[~weichiu] Perhaps we should keep the keyword "Moved" to reflect the meaning of 
moving block during balance, just like:
{code:java}
2019-12-12 10:06:34,791 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1360308441-192.168.202.11-1576116241828:blk_1073741872_1048 complete, 
copied from /192.168.202.11:9866, initiated by 
/192.168.202.13:53536,delHint=c70406f8-a815-4f6f-bdf0-fd3661bd6920{code}

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN copied a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is wrong and misleading; maybe we can improve the pattern to something like:
> *Moved BLOCK complete, copied from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
> 1. Wrong log printing before jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> 2. Correct log printing after jira:
> {code:java}
> 2019-12-12 10:06:34,791 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1360308441-192.168.202.11-1576116241828:blk_1073741872_1048 
> complete, copied from /192.168.202.11:9866, initiated by 
> /192.168.202.13:53536, delHint=c70406f8-a815-4f6f-bdf0-fd3661bd6920{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-11 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN copied a block from the proxy DN, it 
prints a log following the pattern below:

*Moved BLOCK from BALANCER*

This is wrong and misleading; maybe we can improve the pattern to something like:

*Moved BLOCK complete, copied from PROXY DN, initiated by* *BALANCER*

 

An example log of target DN during balancing:

1. Wrong log printing before jira:
{code:java}
2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
2. Correct log printing after jira:
{code:java}
2019-12-12 10:06:34,791 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1360308441-192.168.202.11-1576116241828:blk_1073741872_1048 complete, 
copied from /192.168.202.11:9866, initiated by /192.168.202.13:53536, 
delHint=c70406f8-a815-4f6f-bdf0-fd3661bd6920{code}

  was:
During HDFS balancing, after the target DN copied a block from the proxy DN, it 
prints a log following the pattern below:

*Moved BLOCK from BALANCER*

This is wrong and misleading, maybe we can improve the pattern like:

*Moved BLOCK complete, copied from PROXY DN, initiated by* *BALANCER*

 

An example log of target DN during balancing:

1. Wrong log printing before jira:
{code:java}
2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
2. Correct log printing after jira:
{code:java}
2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
/192.168.202.11:9866, initiated by /192.168.202.13:44502, 
delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN copied a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is wrong and misleading; maybe we can improve the pattern to something like:
> *Moved BLOCK complete, copied from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
> 1. Wrong log printing before jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> 2. Correct log printing after jira:
> {code:java}
> 2019-12-12 10:06:34,791 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1360308441-192.168.202.11-1576116241828:blk_1073741872_1048 
> complete, copied from /192.168.202.11:9866, initiated by 
> /192.168.202.13:53536, delHint=c70406f8-a815-4f6f-bdf0-fd3661bd6920{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-11 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN copied a block from the proxy DN, it 
prints a log following the pattern below:

*Moved BLOCK from BALANCER*

This is wrong and misleading; maybe we can improve the pattern to something like:

*Moved BLOCK complete, copied from PROXY DN, initiated by* *BALANCER*

 

An example log of target DN during balancing:

1. Wrong log printing before jira:
{code:java}
2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
2. Correct log printing after jira:
{code:java}
2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
/192.168.202.11:9866, initiated by /192.168.202.13:44502, 
delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}

  was:
During HDFS balancing, after the target DN copied a block from the proxy DN, it 
prints a log following the pattern below:

*Moved BLOCK from BALANCER*

This is somehow misleading, maybe we can improve the pattern like:

*Copied BLOCK from PROXY DN, initiated by* *BALANCER*

 

An example log of target DN during balancing:

1. Wrong log printing before jira:
{code:java}
2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
2. Correct log printing after jira:
{code:java}
2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
/192.168.202.11:9866, initiated by /192.168.202.13:44502, 
delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN copied a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is wrong and misleading; maybe we can improve the pattern to something like:
> *Moved BLOCK complete, copied from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
> 1. Wrong log printing before jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> 2. Correct log printing after jira:
> {code:java}
> 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
> /192.168.202.11:9866, initiated by /192.168.202.13:44502, 
> delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15027) Correct target DN's log while balancing.

2019-12-11 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16994082#comment-16994082
 ] 

Xudong Cao commented on HDFS-15027:
---

[~weichiu] Perhaps we should keep the keyword "Moved" to reflect the meaning of 
moving a block during balancing, just like:
{code:java}
2019-12-12 10:06:34,791 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1360308441-192.168.202.11-1576116241828:blk_1073741872_1048 complete, 
copied from /192.168.202.11:9866, initiated by 
/192.168.202.13:53536,delHint=c70406f8-a815-4f6f-bdf0-fd3661bd6920{code}

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN copied a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is somewhat misleading; maybe we can improve the pattern to something like:
> *Copied BLOCK from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
> 1. Wrong log printing before jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> 2. Correct log printing after jira:
> {code:java}
> 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
> /192.168.202.11:9866, initiated by /192.168.202.13:44502, 
> delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-12-05 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989307#comment-16989307
 ] 

Xudong Cao edited comment on HDFS-14963 at 12/6/19 7:54 AM:


Yes, [~weichiu]'s comment is what I want to say. The patch uses a read/write 
file lock together with a tryLock mechanism, so:
 # When clients start, they all obtain the Read Lock successfully.
 # When a failover occurs, only one client can obtain the Write Lock 
successfully; the other clients simply skip writing because their tryLock 
failed.
 # When a client is writing the file while holding the Write Lock, a newly 
starting client cannot obtain the Read Lock and will simply begin with index 0 
immediately.

And for the issue of keeping state on the local file system, this is indeed a 
point. At least currently, regardless of whether the cache file was 
accidentally deleted, the content was maliciously modified, or file permission 
problems arose, it will not abort a client's RPC invocations. The worst case is 
to fall back to the existing situation: simply starting from index 0.

So in a sense, it can still be considered "stateless".


was (Author: xudongcao):
Yes, [~weichiu]'s comment is what I want to say, the patch uses Read/Write File 
Lock as well as tyrLock mechanism, so :
 # while clients starting, they will all obtain the Read Lock successfully.
 # while failover occurs, only one client can obtain the Write Lock 
successfully, other clients will directly skip writing because they tryLock 
failed.
 # while a client is writing file holding Write Lock, then a newly starting 
client can not obtain the Read Lock and it will simply begin with index 0 
immediately.

And for the issue of keeping state on local FileSystem, this is indeed a point, 
at least currently, regardless of whether the cache file was accidentally 
deleted or the content was maliciously modified, it will not abort a client's 
rpc invocations. The worst case is to fall back to the existing situation:  
simply starting from index 0.

So in a sense, it can still be considered "stateless".

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> from the 1st NameNode, simply polls, and finally determines the current 
> Active NameNode.
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing: suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts its RPC with the 1st NN. It stays silent when failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs; in some scenarios these logs can 
> be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every HDFS cluster, cache its current Active NameNode index in a separate 
> cache file named after its uri. *Note these cache files are shared by all 
> hdfs client processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine cache file directory is /tmp, then:
>  # the ns1 cluster related cache file is /tmp/ns1
>  # the ns2 cluster related cache file is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly 
> makes an RPC call to the right ANN.
>  # After each failover, the client needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri (see the 
> sketch below).
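A rough, standalone sketch of the cache-file handling described above (illustration only, written against plain java.nio; it is not taken from the HDFS-14963 patch, and the /tmp cache directory, class, and method names are hypothetical): readers take a shared lock when loading the index at client start-up, and after a failover the client attempts a non-blocking exclusive lock and silently skips the write if it loses the race, so the worst case always degrades to starting from index 0.
{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ActiveNnIndexCache {
  private final Path cacheFile;

  ActiveNnIndexCache(String nameservice) {
    this.cacheFile = Paths.get("/tmp", nameservice);   // hypothetical cache dir
  }

  /** Read the cached index under a shared lock; fall back to 0 on any problem. */
  int readIndex() {
    try (FileChannel ch = FileChannel.open(cacheFile, StandardOpenOption.READ);
         FileLock lock = ch.tryLock(0, Long.MAX_VALUE, true)) {
      if (lock == null) {
        return 0;                                       // a writer is active, start from 0
      }
      ByteBuffer buf = ByteBuffer.allocate(16);
      ch.read(buf);
      buf.flip();
      return Integer.parseInt(StandardCharsets.UTF_8.decode(buf).toString().trim());
    } catch (IOException | NumberFormatException e) {
      return 0;                                         // missing or corrupt file: fall back
    }
  }

  /** After a failover, record the new active index; skip silently if we lose the race. */
  void writeIndex(int activeIndex) {
    try (FileChannel ch = FileChannel.open(cacheFile,
            StandardOpenOption.CREATE, StandardOpenOption.WRITE);
         FileLock lock = ch.tryLock()) {                // exclusive, non-blocking
      if (lock == null) {
        return;                                         // another client is writing
      }
      ch.truncate(0);
      ch.write(ByteBuffer.wrap(Integer.toString(activeIndex)
          .getBytes(StandardCharsets.UTF_8)));
    } catch (IOException e) {
      // best effort only: never fail the RPC path because of the cache
    }
  }
}
{code}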



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 

[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-12-05 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989307#comment-16989307
 ] 

Xudong Cao edited comment on HDFS-14963 at 12/6/19 7:50 AM:


Yes, [~weichiu]'s comment is what I want to say. The patch uses a read/write 
file lock together with a tryLock mechanism, so:
 # When clients start, they all obtain the Read Lock successfully.
 # When a failover occurs, only one client can obtain the Write Lock 
successfully; the other clients simply skip writing because their tryLock 
failed.
 # When a client is writing the file while holding the Write Lock, a newly 
starting client cannot obtain the Read Lock and will simply begin with index 0 
immediately.

And for the issue of keeping state on the local file system, this is indeed a 
point; at least currently, regardless of whether the cache file was 
accidentally deleted or the content was maliciously modified, it will not abort 
a client's RPC invocations. The worst case is to fall back to the existing 
situation: simply starting from index 0.

So in a sense, it can still be considered "stateless".


was (Author: xudongcao):
Yes, [~weichiu]'s comment is what I want to say, the patch uses Read/Write File 
Lock as well as tyrLock mechanism, so :
 # while clients starting, they will all obtain the Read Lock successfully.
 # while failover occurs, only one client can obtain the Write Lock 
successfully, other clients will directly skip writing because they tryLock 
failed.
 # while a client is writing file holding Write Lock, then a newly starting 
client can not obtain the Read Lock and it will simply begin with index 0 
immediately.

And for the issue of keeping state on local FileSystem, this is indeed a point, 
at least currently, regardless of whether the cache file was accidentally 
deleted or the content was maliciously modified, it will not abort a client's 
rpc invocations. The worst case is to fall back to the existing situation:  
simply starting from index 0.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> from the 1st NameNode, simply polls, and finally determines the current 
> Active NameNode.
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing: suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts its RPC with the 1st NN. It stays silent when failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs; in some scenarios these logs can 
> be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every HDFS cluster, cache its current Active NameNode index in a separate 
> cache file named after its uri. *Note these cache files are shared by all 
> hdfs client processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine cache file directory is /tmp, then:
>  # the ns1 cluster related cache file is /tmp/ns1
>  # the ns2 cluster related cache file is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly 
> makes an RPC call to the right ANN.
>  # After each failover, the client needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-12-05 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989307#comment-16989307
 ] 

Xudong Cao edited comment on HDFS-14963 at 12/6/19 7:50 AM:


Yes, [~weichiu]'s comment is what I want to say. The patch uses a read/write 
file lock together with a tryLock mechanism, so:
 # When clients start, they all obtain the Read Lock successfully.
 # When a failover occurs, only one client can obtain the Write Lock 
successfully; the other clients simply skip writing because their tryLock 
failed.
 # When a client is writing the file while holding the Write Lock, a newly 
starting client cannot obtain the Read Lock and will simply begin with index 0 
immediately.

And for the issue of keeping state on the local file system, this is indeed a 
point; at least currently, regardless of whether the cache file was 
accidentally deleted or the content was maliciously modified, it will not abort 
a client's RPC invocations. The worst case is to fall back to the existing 
situation: simply starting from index 0.

So in a sense, it can still be considered "stateless".


was (Author: xudongcao):
Yes, [~weichiu]'s comment is what I want to say, the patch uses Read/Write File 
Lock as well as tyrLock mechanism, so :
 # while clients starting, they will all obtain the Read Lock successfully.
 # while failover occurs, only one client can obtain the Write Lock 
successfully, other clients will directly skip writing because they tryLock 
failed.
 # while a client is writing file holding Write Lock, then a newly starting 
client can not obtain the Read Lock and it will simply begin with index 0 
immediately.

And for the issue of keeping state on local FileSystem, this is indeed a point, 
at least currently, regardless of whether the cache file was accidentally 
deleted or the content was maliciously modified, it will not abort a client's 
rpc invocations. The worst case is to fall back to the existing situation:  
simply starting from index 0.

 

So in a sense, it can still be considered "stateless".

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> from the 1st NameNode, simply polls, and finally determines the current 
> Active NameNode.
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing: suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts its RPC with the 1st NN. It stays silent when failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs; in some scenarios these logs can 
> be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every HDFS cluster, cache its current Active NameNode index in a separate 
> cache file named after its uri. *Note these cache files are shared by all 
> hdfs client processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine cache file directory is /tmp, then:
>  # the ns1 cluster related cache file is /tmp/ns1
>  # the ns2 cluster related cache file is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly 
> makes an RPC call to the right ANN.
>  # After each failover, the client needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-12-05 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989307#comment-16989307
 ] 

Xudong Cao edited comment on HDFS-14963 at 12/6/19 2:25 AM:


Yes, [~weichiu]'s comment is what I want to say. The patch uses a read/write 
file lock together with a tryLock mechanism, so:
 # When clients start, they all obtain the Read Lock successfully.
 # When a failover occurs, only one client can obtain the Write Lock 
successfully; the other clients simply skip writing because their tryLock 
failed.
 # When a client is writing the file while holding the Write Lock, a newly 
starting client cannot obtain the Read Lock and will simply begin with index 0 
immediately.

And for the issue of keeping state on the local file system, this is indeed a 
point; at least currently, regardless of whether the cache file was 
accidentally deleted or the content was maliciously modified, it will not abort 
a client's RPC invocations. The worst case is to fall back to the existing 
situation: simply starting from index 0.


was (Author: xudongcao):
Yes, [~weichiu]'s comment is what I want to say, the patch uses Read/Write File 
Lock as well as tyrLock mechanism, so :
 # while clients starting, they will all obtain the Read Lock successfully.
 # while failover occurs, only one client can obtain the Write Lock 
successfully, other clients will directly skip writing because they tryLock 
failed.
 # while a client is writing file holding Write Lock, then a newly starting 
client can not obtain the Read Lock and it will simply begin with index 0 
immediately.

And for the issue of keeping state on local FileSystem, this is indeed a point, 
at least currently , regardless of whether the cache file was accidentally 
deleted or the content was maliciously modified, it will not abort a client's 
rpc invocations. The worst case is to fall back to the existing situation:  
simply starting from index 0.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls 
> from the 1st namenode, simply polls through the namenodes, and finally 
> determines the current Active namenode.
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts its rpc with the 1st NN: it stays silent when failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs, and in some scenarios these logs 
> can be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every hdfs cluster, cache its current Active NameNode index in a separate 
> cache file named after its uri. *Note these cache files are shared by all 
> hdfs client processes on the machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp, then:
>  # the cache file for the ns1 cluster is /tmp/ns1
>  # the cache file for the ns2 cluster is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly 
> makes its rpc calls toward the right ANN.
>  # After each failover, the client needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-12-05 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989307#comment-16989307
 ] 

Xudong Cao edited comment on HDFS-14963 at 12/6/19 2:23 AM:


Yes, [~weichiu]'s comment is what I want to say. The patch uses a Read/Write 
file lock together with a tryLock mechanism, so:
 # When clients start, they all obtain the read lock successfully.
 # When a failover occurs, only one client obtains the write lock; the other 
clients skip the write because their tryLock fails.
 # While a client holds the write lock and is writing the file, a newly 
starting client cannot obtain the read lock, so it simply begins with index 0 
immediately.

And as for keeping state on the local FileSystem, this is indeed a valid 
point. At least currently, regardless of whether the cache file is 
accidentally deleted or its content is maliciously modified, it will not 
abort a client's rpc invocations. The worst case is falling back to the 
existing behavior: simply starting from index 0.


was (Author: xudongcao):
Yes, [~weichiu]'s comment is what I want to say, the patch uses Read/Write File 
Lock as well as tyrLock mechanism, so :
 # while clients starting, they will all obtain the Read Lock successfully.
 # while failover occurs, only one client can obtain the Write Lock 
successfully, other clients will directly skip writing because they tryLock 
failed.
 # while a client is writing file holding Write Lock, then a newly starting 
client can not obtain the Read Lock and it will simply begin with index 0 
immediately.

And for the issue of keeping state on local FileSystem, this is indeed a point, 
at least currently , regardless of whether the cache file was accidentally 
deleted or the content was maliciously modified, it will not affect the normal 
operation of the client. 

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls 
> from the 1st namenode, simply polls through the namenodes, and finally 
> determines the current Active namenode.
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts its rpc with the 1st NN: it stays silent when failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs, and in some scenarios these logs 
> can be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every hdfs cluster, cache its current Active NameNode index in a separate 
> cache file named after its uri. *Note these cache files are shared by all 
> hdfs client processes on the machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp, then:
>  # the cache file for the ns1 cluster is /tmp/ns1
>  # the cache file for the ns2 cluster is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly 
> makes its rpc calls toward the right ANN.
>  # After each failover, the client needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-12-05 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989307#comment-16989307
 ] 

Xudong Cao edited comment on HDFS-14963 at 12/6/19 2:15 AM:


Yes, [~weichiu]'s comment is what I want to say. The patch uses a Read/Write 
file lock together with a tryLock mechanism, so:
 # When clients start, they all obtain the read lock successfully.
 # When a failover occurs, only one client obtains the write lock; the other 
clients skip the write because their tryLock fails.
 # While a client holds the write lock and is writing the file, a newly 
starting client cannot obtain the read lock, so it simply begins with index 0 
immediately.

And as for keeping state on the local FileSystem, this is indeed a valid 
point. At least currently, regardless of whether the cache file is 
accidentally deleted or its content is maliciously modified, it will not 
affect the normal operation of the client.


was (Author: xudongcao):
Yes, [~weichiu]'s comment is what I want to say, the patch uses Read/Write File 
Lock as well as tyrLock mechanism, so :
 # while clients starting, they will all obtain the Read Lock successfully.
 # while failover occurs, only one client can obtain the Write Lock 
successfully, other clients will directly skip writing because they tryLock 
failed.
 # while a client is writing file holding Write Lock, then a newly starting 
client can not obtain the Read Lock and it will simply begin with index 0 
immediately.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls 
> from the 1st namenode, simply polls through the namenodes, and finally 
> determines the current Active namenode.
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts its rpc with the 1st NN: it stays silent when failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs, and in some scenarios these logs 
> can be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every hdfs cluster, cache its current Active NameNode index in a separate 
> cache file named after its uri. *Note these cache files are shared by all 
> hdfs client processes on the machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp, then:
>  # the cache file for the ns1 cluster is /tmp/ns1
>  # the cache file for the ns2 cluster is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly 
> makes its rpc calls toward the right ANN.
>  # After each failover, the client needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-12-05 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989307#comment-16989307
 ] 

Xudong Cao edited comment on HDFS-14963 at 12/6/19 2:05 AM:


Yes, [~weichiu]'s comment is what I want to say. The patch uses a Read/Write 
file lock together with a tryLock mechanism, so:
 # When clients start, they all obtain the read lock successfully.
 # When a failover occurs, only one client obtains the write lock; the other 
clients skip the write because their tryLock fails.
 # While a client holds the write lock and is writing the file, a newly 
starting client cannot obtain the read lock, so it simply begins with index 0 
immediately.


was (Author: xudongcao):
Yes, [~weichiu]'s comment is what I want to say, the patch uses Read/Write File 
Lock as well as tyrLock mechanism, so :
 # while clients starting, they will all obtain the Read Lock successfully.
 # while failover occures, only one client can obtain the Write Lock 
successfully, other clients will directly skip writing because they tryLock 
failed.
 # while a client is writing file holding Write Lock, then a newly starting 
client can not obtain the Read Lock and it will simply begin with index 0 
immediately.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls 
> from the 1st namenode, simply polls through the namenodes, and finally 
> determines the current Active namenode.
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts its rpc with the 1st NN: it stays silent when failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs, and in some scenarios these logs 
> can be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every hdfs cluster, cache its current Active NameNode index in a separate 
> cache file named after its uri. *Note these cache files are shared by all 
> hdfs client processes on the machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp, then:
>  # the cache file for the ns1 cluster is /tmp/ns1
>  # the cache file for the ns2 cluster is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly 
> makes its rpc calls toward the right ANN.
>  # After each failover, the client needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-12-05 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989307#comment-16989307
 ] 

Xudong Cao edited comment on HDFS-14963 at 12/6/19 1:58 AM:


Yes, [~weichiu]'s comment is what I want to say. The patch uses a Read/Write 
file lock together with a tryLock mechanism, so:
 # When clients start, they all obtain the read lock successfully.
 # When a failover occurs, only one client obtains the write lock; the other 
clients skip the write because their tryLock fails.
 # While a client holds the write lock and is writing the file, a newly 
starting client cannot obtain the read lock, so it simply begins with index 0 
immediately.


was (Author: xudongcao):
Yes, [~weichiu]'s comment is what I want to say, the patch uses ReadWrite File 
Lock as well as tyrLock() mechanism, so :

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls 
> from the 1st namenode, simply polls through the namenodes, and finally 
> determines the current Active namenode.
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts its rpc with the 1st NN: it stays silent when failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs, and in some scenarios these logs 
> can be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every hdfs cluster, cache its current Active NameNode index in a separate 
> cache file named after its uri. *Note these cache files are shared by all 
> hdfs client processes on the machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp, then:
>  # the cache file for the ns1 cluster is /tmp/ns1
>  # the cache file for the ns2 cluster is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly 
> makes its rpc calls toward the right ANN.
>  # After each failover, the client needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-12-05 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989307#comment-16989307
 ] 

Xudong Cao commented on HDFS-14963:
---

Yes, [~weichiu]'s comment is what I want to say. The patch uses a Read/Write 
file lock as well as a tryLock() mechanism, so:

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls 
> from the 1st namenode, simply polls through the namenodes, and finally 
> determines the current Active namenode.
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts its rpc with the 1st NN: it stays silent when failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs, and in some scenarios these logs 
> can be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every hdfs cluster, cache its current Active NameNode index in a separate 
> cache file named after its uri. *Note these cache files are shared by all 
> hdfs client processes on the machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp, then:
>  # the cache file for the ns1 cluster is /tmp/ns1
>  # the cache file for the ns2 cluster is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly 
> makes its rpc calls toward the right ANN.
>  # After each failover, the client needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15027) Correct target DN's log while balancing.

2019-12-03 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987459#comment-16987459
 ] 

Xudong Cao commented on HDFS-15027:
---

cc [~weichiu] Sorry, patch uploaded again. This is just a minor log 
improvement, so I think there's no need for a unit test.

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN copies a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is somewhat misleading; maybe we can improve the pattern to something 
> like:
> *Copied BLOCK from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
> 1. Wrong log printing before jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> 2. Correct log printing after jira:
> {code:java}
> 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
> /192.168.202.11:9866, initiated by /192.168.202.13:44502, 
> delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}
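For clarity, a hypothetical, self-contained illustration of the two log 
patterns (this is not the actual DataNode code; the class name and the block, 
address and delHint values below are made up in the style of the examples 
above):
{code:java}
public class BalancerLogPatternDemo {
  public static void main(String[] args) {
    String block = "BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094";
    String proxyDn = "/192.168.202.11:9866";
    String balancer = "/192.168.202.13:44502";
    String delHint = "84a0626a-5fa4-4c66-a776-074f537d4235";

    // Old pattern: the only address printed is the balancer's, which is misleading.
    System.out.println(String.format(
        "Moved %s from %s, delHint=%s", block, balancer, delHint));

    // Proposed pattern: the proxy DN (copy source) and the initiator are both explicit.
    System.out.println(String.format(
        "Copied %s from %s, initiated by %s, delHint=%s",
        block, proxyDn, balancer, delHint));
  }
}
{code}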



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15027) Correct target DN's log while balancing.

2019-12-03 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987459#comment-16987459
 ] 

Xudong Cao edited comment on HDFS-15027 at 12/4/19 2:40 AM:


cc [~weichiu] Sorry, patch uploaded again. This is just a minor log 
improvement, so I think there's no need for a unit test.


was (Author: xudongcao):
cc [~weichiu] Sorry, patch uploaded again, this is just a minor log improve, I 
think there's no need for unit test.

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN copies a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is somewhat misleading; maybe we can improve the pattern to something 
> like:
> *Copied BLOCK from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
> 1. Wrong log printing before jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> 2. Correct log printing after jira:
> {code:java}
> 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
> /192.168.202.11:9866, initiated by /192.168.202.13:44502, 
> delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-03 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Attachment: HDFS-15027.000.patch

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN copies a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is somewhat misleading; maybe we can improve the pattern to something 
> like:
> *Copied BLOCK from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
> 1. Wrong log printing before jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> 2. Correct log printing after jira:
> {code:java}
> 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
> /192.168.202.11:9866, initiated by /192.168.202.13:44502, 
> delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-03 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN copies a block from the proxy DN, 
it prints a log following the pattern below:

*Moved BLOCK from BALANCER*

This is somewhat misleading; maybe we can improve the pattern to something 
like:

*Copied BLOCK from PROXY DN, initiated by* *BALANCER*

 

An example log of target DN during balancing:

1. Wrong log printing before jira:
{code:java}
2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
2. Correct log printing after jira:
{code:java}
2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
/192.168.202.11:9866, initiated by /192.168.202.13:44502, 
delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}

  was:
During HDFS balancing, after the target DN copied a block from the proxy DN, it 
prints a log following the pattern below:

*Moved BLOCK from BALANCER*

This is somehow misleading, maybe we can improve the pattern like:

*Copied BLOCK from PROXY DN, initiated by* *BALANCER*

 

An example log of target DN during balancing:
 # Wrong log printing before jira:

{code:java}
2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}

 # Correct log printing after jira:

{code:java}
2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
/192.168.202.11:9866, initiated by /192.168.202.13:44502, 
delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> During HDFS balancing, after the target DN copies a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is somewhat misleading; maybe we can improve the pattern to something 
> like:
> *Copied BLOCK from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
> 1. Wrong log printing before jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> 2. Correct log printing after jira:
> {code:java}
> 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
> /192.168.202.11:9866, initiated by /192.168.202.13:44502, 
> delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-03 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN copies a block from the proxy DN, 
it prints a log following the pattern below:

*Moved BLOCK from BALANCER*

This is somewhat misleading; maybe we can improve the pattern to something 
like:

*Copied BLOCK from PROXY DN, initiated by* *BALANCER*

 

An example log of target DN during balancing:
 # Wrong log printing before jira:

{code:java}
2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}

 # Correct log printing after jira:

{code:java}
2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
/192.168.202.11:9866, initiated by /192.168.202.13:44502, 
delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}

  was:
During HDFS balancing, after the target DN copied a block from the proxy DN, it 
prints a log following the pattern below:

*Moved BLOCK from BALANCER*

This is somehow misleading, maybe we can improve the pattern like:

*Copied BLOCK from PROXY DN, initiated by* *BALANCER*


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> During HDFS balancing, after the target DN copies a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is somewhat misleading; maybe we can improve the pattern to something 
> like:
> *Copied BLOCK from PROXY DN, initiated by* *BALANCER*
>  
> An example log of target DN during balancing:
>  # Wrong log printing before jira:
> {code:java}
> 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
>  # Correct log printing after jira:
> {code:java}
> 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from 
> /192.168.202.11:9866, initiated by /192.168.202.13:44502, 
> delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-12-03 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987411#comment-16987411
 ] 

Xudong Cao commented on HDFS-14963:
---

cc [~weichiu] All review comments have been resolved, could someone please 
merge this patch? Then we can proceed with HDFS-14969 (as they modify some of 
the same files).

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls 
> from the 1st namenode, simply polls through the namenodes, and finally 
> determines the current Active namenode.
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts its rpc with the 1st NN: it stays silent when failing 
> over from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to 
> the 3rd NN it prints some unnecessary logs, and in some scenarios these logs 
> can be very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for 
> every hdfs cluster, cache its current Active NameNode index in a separate 
> cache file named after its uri. *Note these cache files are shared by all 
> hdfs client processes on the machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp, then:
>  # the cache file for the ns1 cluster is /tmp/ns1
>  # the cache file for the ns2 cluster is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly 
> makes its rpc calls toward the right ANN.
>  # After each failover, the client needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN copies a block from the proxy DN, 
it prints a log following the pattern below:

*Moved BLOCK from BALANCER*

This is somewhat misleading; maybe we can improve the pattern to something 
like:

*Copied BLOCK from PROXY DN, initiated by* *BALANCER*

  was:
During HDFS balancing, after the target DN moves a block from the proxy DN, it 
prints a log following the pattern below:

*Moved BLOCK from BALANCER*

This is somehow misleading, maybe we can improve the pattern like:

*Copied BLOCK from PROXY DN, initiated by* *BALANCER***


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> During HDFS balancing, after the target DN copies a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is somewhat misleading; maybe we can improve the pattern to something 
> like:
> *Copied BLOCK from PROXY DN, initiated by* *BALANCER*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN moves a block from the proxy DN, it 
prints a log following the pattern below:

*Moved BLOCK from BALANCER*

This is somewhat misleading; maybe we can improve the pattern to something 
like:

*Copied BLOCK from PROXY DN, initiated by* *BALANCER*

  was:
During HDFS balancing, after the target DN moves a block from the proxy DN, it 
prints a log following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading, the right pattern should be:

*Moved BLOCK from PROXY DN*

An example (Source & Proxy DN: 192.168.202.11, Target DN: 192.168.202.12, 
Balancer: 192.168.202.13) :

Target DN's log before jira (this is wrong because 192.168.202.13 is the 
balancer, not the proxy DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Target DN's log after jira ( this is right, 192.168.202.11 is the real proxy 
DN) :
{code:java}
2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
/192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Correspondingly, the proxy DN's log is:
{code:java}
2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
/192.168.202.12:33486{code}
This is a minor log correction, no need for unit test.


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> During HDFS balancing, after the target DN moves a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> This is somewhat misleading; maybe we can improve the pattern to something 
> like:
> *Copied BLOCK from PROXY DN, initiated by* *BALANCER*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Attachment: (was: HDFS-15027.000.patch)

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> During HDFS balancing, after the target DN moves a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading, the right pattern should be:
> *Moved BLOCK from PROXY DN*
> An example (Source & Proxy DN: 192.168.202.11, Target DN: 192.168.202.12, 
> Balancer: 192.168.202.13) :
> Target DN's log before jira (this is wrong because 192.168.202.13 is the 
> balancer, not the proxy DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Target DN's log after jira ( this is right, 192.168.202.11 is the real proxy 
> DN) :
> {code:java}
> 2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
> /192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Correspondingly, the proxy DN's log is:
> {code:java}
> 2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
> /192.168.202.12:33486{code}
> This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN moves a block from the proxy DN, it 
prints a log following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading, the right pattern should be:

*Moved BLOCK from PROXY DN*

An example (Source & Proxy DN: 192.168.202.11, Target DN: 192.168.202.12, 
Balancer: 192.168.202.13) :

Target DN's log before jira (this is wrong because 192.168.202.13 is the 
balancer, not the proxy DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Target DN's log after jira ( this is right, 192.168.202.11 is the real proxy 
DN) :
{code:java}
2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
/192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Correspondingly, the proxy DN's log is:
{code:java}
2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
/192.168.202.12:33486{code}
This is a minor log correction, no need for unit test.

  was:
During HDFS balancing, after target DN moves a block from proxy DN, it prints a 
log following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading, the right pattern should be:

*Moved BLOCK from PROXY DN*

An example (Source & Proxy DN: 192.168.202.11, Target DN: 192.168.202.12, 
Balancer: 192.168.202.13) :

Target DN's log before jira (this is wrong because 192.168.202.13 is the 
balancer, not the proxy DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Target DN's log after jira ( this is right, 192.168.202.11 is the real proxy 
DN) :
{code:java}
2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
/192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Correspondingly, the proxy DN's log is:
{code:java}
2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
/192.168.202.12:33486{code}
This is a minor log correction, no need for unit test.


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN moves a block from the proxy DN, 
> it prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading, the right pattern should be:
> *Moved BLOCK from PROXY DN*
> An example (Source & Proxy DN: 192.168.202.11, Target DN: 192.168.202.12, 
> Balancer: 192.168.202.13) :
> Target DN's log before jira (this is wrong because 192.168.202.13 is the 
> balancer, not the proxy DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Target DN's log after jira ( this is right, 192.168.202.11 is the real proxy 
> DN) :
> {code:java}
> 2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
> /192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Correspondingly, the proxy DN's log is:
> {code:java}
> 2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
> /192.168.202.12:33486{code}
> This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after target DN moves a block from proxy DN, it prints a 
log following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading, the right pattern should be:

*Moved BLOCK from PROXY DN*

An example (Source & Proxy DN: 192.168.202.11, Target DN: 192.168.202.12, 
Balancer: 192.168.202.13) :

Target DN's log before jira (this is wrong because 192.168.202.13 is the 
balancer, not the proxy DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Target DN's log after jira ( this is right, 192.168.202.11 is the real proxy 
DN) :
{code:java}
2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
/192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Correspondingly, the proxy DN's log is:
{code:java}
2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
/192.168.202.12:33486{code}
This is a minor log correction, no need for unit test.

  was:
During HDFS balancing, after target DN moves a block from source DN, it prints 
a log following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading, the right pattern should be:

*Moved BLOCK from SOURCE DN*

An example (Source DN: 192.168.202.11, Target DN: 192.168.202.12, Balancer: 
192.168.202.13) :

Target DN's log before jira (this is wrong because 192.168.202.13 is the 
balancer, not source DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Target DN's log after jira ( this is right, 192.168.202.11 is the real source 
DN) :
{code:java}
2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
/192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Correspondingly, the source DN's log is:
{code:java}
2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
/192.168.202.12:33486{code}
This is a minor log correction, no need for unit test.


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after target DN moves a block from proxy DN, it prints 
> a log following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading, the right pattern should be:
> *Moved BLOCK from PROXY DN*
> An example (Source & Proxy DN: 192.168.202.11, Target DN: 192.168.202.12, 
> Balancer: 192.168.202.13) :
> Target DN's log before jira (this is wrong because 192.168.202.13 is the 
> balancer, not the proxy DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Target DN's log after jira ( this is right, 192.168.202.11 is the real proxy 
> DN) :
> {code:java}
> 2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
> /192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Correspondingly, the proxy DN's log is:
> {code:java}
> 2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
> /192.168.202.12:33486{code}
> This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Summary: Correct target DN's log while balancing.  (was: Correct target 
DN's log during balancing.)

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after target DN moves a block from source DN, it 
> prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading, the right pattern should be:
> *Moved BLOCK from SOURCE DN*
> An example (Source DN: 192.168.202.11, Target DN: 192.168.202.12, Balancer: 
> 192.168.202.13) :
> Target DN's log before jira (this is wrong because 192.168.202.13 is the 
> balancer, not source DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Target DN's log after jira ( this is right, 192.168.202.11 is the real source 
> DN) :
> {code:java}
> 2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
> /192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Correspondingly, the source DN's log is:
> {code:java}
> 2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
> /192.168.202.12:33486{code}
> This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log during balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Summary: Correct target DN's log during balancing.  (was: Correct target 
DN's log while balancing.)

> Correct target DN's log during balancing.
> -
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after target DN moves a block from source DN, it 
> prints a log following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading, the right pattern should be:
> *Moved BLOCK from SOURCE DN*
> An example (Source DN: 192.168.202.11, Target DN: 192.168.202.12, Balancer: 
> 192.168.202.13) :
> Target DN's log before jira (this is wrong because 192.168.202.13 is the 
> balancer, not source DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Target DN's log after jira ( this is right, 192.168.202.11 is the real source 
> DN) :
> {code:java}
> 2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
> /192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Correspondingly, the source DN's log is:
> {code:java}
> 2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
> /192.168.202.12:33486{code}
> This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN moves a block from the source DN, it prints 
a log message following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading; the right pattern should be:

*Moved BLOCK from SOURCE DN*

An example (Source DN: 192.168.202.11, Target DN: 192.168.202.12, Balancer: 
192.168.202.13) :

Target DN's log before jira (this is wrong because 192.168.202.13 is the 
balancer, not source DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Target DN's log after jira ( this is right, 192.168.202.11 is the real source 
DN) :
{code:java}
2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
/192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Correspondingly, the source DN's log is:
{code:java}
2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
/192.168.202.12:33486{code}
This is a minor log correction, no need for unit test.

  was:
During HDFS balancing, after target DN moves a block from source DN, it prints 
a log following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading, the right pattern should be:

*Moved BLOCK from SOURCE DN*

An example :

Target DN's log before jira (this is wrong because 192.168.202.13 is the 
balancer, not source DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Target DN's log after jira ( this is right, 192.168.202.11 is the real source 
DN) :
{code:java}
2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
/192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Correspondingly, the source DN's log is:
{code:java}
2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
/192.168.202.12:33486{code}
This is a minor log correction, no need for unit test.


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN moves a block from the source DN, it 
> prints a log message following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading; the right pattern should be:
> *Moved BLOCK from SOURCE DN*
> An example (Source DN: 192.168.202.11, Target DN: 192.168.202.12, Balancer: 
> 192.168.202.13) :
> Target DN's log before jira (this is wrong because 192.168.202.13 is the 
> balancer, not source DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Target DN's log after jira ( this is right, 192.168.202.11 is the real source 
> DN) :
> {code:java}
> 2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
> /192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Correspondingly, the source DN's log is:
> {code:java}
> 2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
> /192.168.202.12:33486{code}
> This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Attachment: HDFS-15027.000.patch

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN moves a block from the source DN, it 
> prints a log message following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading; the right pattern should be:
> *Moved BLOCK from SOURCE DN*
> An example :
> Target DN's log before jira (this is wrong because 192.168.202.13 is the 
> balancer, not source DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Target DN's log after Jira ( this is right, 192.168.202.11 is the real source 
> DN) :
> {code:java}
> 2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
> /192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Correspondingly, the Source DN's log is:
> {code:java}
> 2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
> /192.168.202.12:33486{code}
> This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN moves a block from the source DN, it prints 
a log message following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading; the right pattern should be:

*Moved BLOCK from SOURCE DN*

An example :

Target DN's log before jira (this is wrong because 192.168.202.13 is the 
balancer, not source DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Target DN's log after jira ( this is right, 192.168.202.11 is the real source 
DN) :
{code:java}
2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
/192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Correspondingly, the source DN's log is:
{code:java}
2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
/192.168.202.12:33486{code}
This is a minor log correction, no need for unit test.

  was:
During HDFS balancing, after target DN moves a block from source DN, it prints 
a log following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading, the right pattern should be:

*Moved BLOCK from SOURCE DN*

An example :

Target DN's log before jira (this is wrong because 192.168.202.13 is the 
balancer, not source DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Target DN's log after Jira ( this is right, 192.168.202.11 is the real source 
DN) :
{code:java}
2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
/192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Correspondingly, the Source DN's log is:
{code:java}
2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
/192.168.202.12:33486{code}
This is a minor log correction, no need for unit test.


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN moves a block from the source DN, it 
> prints a log message following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading; the right pattern should be:
> *Moved BLOCK from SOURCE DN*
> An example :
> Target DN's log before jira (this is wrong because 192.168.202.13 is the 
> balancer, not source DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Target DN's log after jira ( this is right, 192.168.202.11 is the real source 
> DN) :
> {code:java}
> 2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
> /192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Correspondingly, the source DN's log is:
> {code:java}
> 2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
> /192.168.202.12:33486{code}
> This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN moves a block from the source DN, it prints 
a log message following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading; the right pattern should be:

*Moved BLOCK from SOURCE DN*

An example :

Target DN's log before jira (this is wrong because 192.168.202.13 is the 
balancer, not source DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Target DN's log after Jira ( this is right, 192.168.202.11 is the real source 
DN) :
{code:java}
2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
/192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
Correspondingly, the Source DN's log is:
{code:java}
2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
/192.168.202.12:33486{code}
This is a minor log correction, no need for unit test.

  was:
During HDFS balancing, after target DN moves a block from source DN, it prints 
a log following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading, the right pattern should be:

*Moved BLOCK from SOURCE DN*

An example :

before jira (this is wrong because 192.168.202.13 is the balancer, not source 
DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
after Jira ( this is right, 192.168.202.11 is the real source DN) :
{code:java}
2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
 This is a minor log correction, no need for unit test.


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> During HDFS balancing, after the target DN moves a block from the source DN, it 
> prints a log message following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading; the right pattern should be:
> *Moved BLOCK from SOURCE DN*
> An example :
> Target DN's log before jira (this is wrong because 192.168.202.13 is the 
> balancer, not source DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Target DN's log after Jira ( this is right, 192.168.202.11 is the real source 
> DN) :
> {code:java}
> 2019-12-02 20:37:54,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 from 
> /192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> Correspondingly, the Source DN's log is:
> {code:java}
> 2019-12-02 20:37:54,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Copied BP-1426342230-192.168.202.11-1575277482603:blk_1073742034_1210 to 
> /192.168.202.12:33486{code}
> This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Attachment: (was: HDFS-15027.000.patch)

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> During HDFS balancing, after the target DN moves a block from the source DN, it 
> prints a log message following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading; the right pattern should be:
> *Moved BLOCK from SOURCE DN*
> An example :
> before jira (this is wrong because 192.168.202.13 is the balancer, not source 
> DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> after Jira ( this is right, 192.168.202.11 is the real source DN) :
> {code:java}
> 2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
> 192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
>  This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN moves a block from the source DN, it prints 
a log message following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading; the right pattern should be:

*Moved BLOCK from SOURCE DN*

An example :

before jira (this is wrong because 192.168.202.13 is the balancer, not source 
DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
after Jira ( this is right, 192.168.202.11 is the real source DN) :
{code:java}
2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
 This is a minor log correction, no need for unit test.

  was:
During HDFS balancing, after target DN moves a block from source DN, it prints 
a log following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading, the right pattern should be:

*Moved BLOCK from SOURCE DN*

An example :

before jira (this is wrong because 192.168.202.13 is the balancer, not source 
DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
after jira :
{code:java}
2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
 This is a minor log correction, no need for unit test.


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer  mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN moves a block from the source DN, it 
> prints a log message following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading; the right pattern should be:
> *Moved BLOCK from SOURCE DN*
> An example :
> before jira (this is wrong because 192.168.202.13 is the balancer, not source 
> DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> after Jira ( this is right, 192.168.202.11 is the real source DN) :
> {code:java}
> 2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
> 192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
>  This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Attachment: (was: HDFS-15027.000.patch)

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN moves a block from the source DN, it 
> prints a log message following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading; the right pattern should be:
> *Moved BLOCK from SOURCE DN*
> An example :
> before jira (this is wrong because 192.168.202.13 is the balancer, not source 
> DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> after jira :
> {code:java}
> 2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
> 192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
>  This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Attachment: HDFS-15027.000.patch
Status: Patch Available  (was: Open)

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN moves a block from the source DN, it 
> prints a log message following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading; the right pattern should be:
> *Moved BLOCK from SOURCE DN*
> An example :
> before jira (this is wrong because 192.168.202.13 is the balancer, not source 
> DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> after jira :
> {code:java}
> 2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
> 192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
>  This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Attachment: HDFS-15027.000.patch

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-15027.000.patch
>
>
> During HDFS balancing, after the target DN moves a block from the source DN, it 
> prints a log message following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading; the right pattern should be:
> *Moved BLOCK from SOURCE DN*
> An example :
> before jira (this is wrong because 192.168.202.13 is the balancer, not source 
> DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> after jira :
> {code:java}
> 2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
> 192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
>  This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Affects Version/s: (was: 3.1.3)

> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> During HDFS balancing, after the target DN moves a block from the source DN, it 
> prints a log message following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading; the right pattern should be:
> *Moved BLOCK from SOURCE DN*
> An example :
> before jira (this is wrong because 192.168.202.13 is the balancer, not source 
> DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> after jira :
> {code:java}
> 2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
> 192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
>  This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN moves a block from the source DN, it prints 
a log message following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading; the right pattern should be:

*Moved BLOCK from SOURCE DN*

An example :

before jira (this is wrong because 192.168.202.13 is the balancer, not source 
DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
after jira :
{code:java}
2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
 This is a minor log correction, no need for unit test.

  was:
During HDFS balancing, after target DN moves a block from source DN, it prints 
a log following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading, the right pattern should be:

*Moved BLOCK from SOURCE DN*

An example :

before jira (this is wrong because 192.168.202.13 is the balancer, not source 
DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
after jira :
{code:java}
2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
 

This is a minor log correction, no need for unit test.


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1, 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> During HDFS balancing, after the target DN moves a block from the source DN, it 
> prints a log message following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading; the right pattern should be:
> *Moved BLOCK from SOURCE DN*
> An example :
> before jira (this is wrong because 192.168.202.13 is the balancer, not source 
> DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> after jira :
> {code:java}
> 2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
> 192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
>  This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN moves a block from the source DN, it prints 
a log message following the pattern below:

*Moved BLOCK from BALANCER*

But this is wrong and misleading; the right pattern should be:

*Moved BLOCK from SOURCE DN*

An example :

before jira (this is wrong because 192.168.202.13 is the balancer, not source 
DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
after jira :
{code:java}
2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
 

This is a minor log correction, no need for unit test.

  was:
During HDFS balancing, after target DN moves a block from source DN, it prints 
a log following the pattern below:

*Moved BLOCK from BALANCER*

 

But this is wrong and misleading, the right pattern should be:

*Moved BLOCK from SOURCE DN*

 

An example :

before jira (this is wrong because 192.168.202.13 is the balancer, not source 
DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
after jira :
{code:java}
2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
 

This is a minor log correction, no need for unit test.


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1, 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> During HDFS balancing, after the target DN moves a block from the source DN, it 
> prints a log message following the pattern below:
> *Moved BLOCK from BALANCER*
> But this is wrong and misleading; the right pattern should be:
> *Moved BLOCK from SOURCE DN*
> An example :
> before jira (this is wrong because 192.168.202.13 is the balancer, not source 
> DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> after jira :
> {code:java}
> 2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
> 192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
>  
> This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
During HDFS balancing, after the target DN moves a block from the source DN, it prints 
a log message following the pattern below:

*Moved BLOCK from BALANCER*

 

But this is wrong and misleading; the right pattern should be:

*Moved BLOCK from SOURCE DN*

 

An example :

before jira (this is wrong because 192.168.202.13 is the balancer, not source 
DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
after jira :
{code:java}
2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
 

This is a minor log correction, no need for unit test.

  was:
While HDFS balancing, after target DN moves a block from source DN, it prints a 
log following pattern below:

*Moved BLOCK from BALANCER*

 

This is wrong and misleading, the right pattern should be:

*Moved BLOCK from SOURCE DN*

 

An example :

before jira (this is wrong because 192.168.202.13 is the balancer, not source 
DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
after jira :
{code:java}
2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
 

This is a minor log correction, no need for unit test.


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1, 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> During HDFS balancing, after the target DN moves a block from the source DN, it 
> prints a log message following the pattern below:
> *Moved BLOCK from BALANCER*
>  
> But this is wrong and misleading; the right pattern should be:
> *Moved BLOCK from SOURCE DN*
>  
> An example :
> before jira (this is wrong because 192.168.202.13 is the balancer, not source 
> DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> after jira :
> {code:java}
> 2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
> 192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
>  
> This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-15027:
--
Description: 
While balancing HDFS, after the target DN moves a block from the source DN, it prints a 
log message following the pattern below:

*Moved BLOCK from BALANCER*

 

This is wrong and misleading; the right pattern should be:

*Moved BLOCK from SOURCE DN*

 

An example :

before jira (this is wrong because 192.168.202.13 is the balancer, not source 
DN) :
{code:java}
2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
/192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
after jira :
{code:java}
2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
 

This is a minor log correction, no need for unit test.

  was:While HDFS balancing, after target DN moves a block from source DN, it 
prints log follow this pattern:


> Correct target DN's log while balancing.
> 
>
> Key: HDFS-15027
> URL: https://issues.apache.org/jira/browse/HDFS-15027
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Affects Versions: 3.2.1, 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> While balancing HDFS, after the target DN moves a block from the source DN, it prints 
> a log message following the pattern below:
> *Moved BLOCK from BALANCER*
>  
> This is wrong and misleading; the right pattern should be:
> *Moved BLOCK from SOURCE DN*
>  
> An example :
> before jira (this is wrong because 192.168.202.13 is the balancer, not source 
> DN) :
> {code:java}
> 2019-12-02 17:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from 
> /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
> after jira :
> {code:java}
> 2019-12-02 19:44:39,875 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741956_1132 from 
> 192.168.202.11:9866, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code}
>  
> This is a minor log correction, no need for unit test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15027) Correct target DN's log while balancing.

2019-12-02 Thread Xudong Cao (Jira)
Xudong Cao created HDFS-15027:
-

 Summary: Correct target DN's log while balancing.
 Key: HDFS-15027
 URL: https://issues.apache.org/jira/browse/HDFS-15027
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer & mover
Affects Versions: 3.1.3, 3.2.1
Reporter: Xudong Cao
Assignee: Xudong Cao


While balancing HDFS, after the target DN moves a block from the source DN, it prints 
a log message following this pattern:



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-25 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982041#comment-16982041
 ] 

Xudong Cao edited comment on HDFS-14963 at 11/26/19 1:34 AM:
-

cc [~csun] At the beginning, the main goal of this jira was to solve the 
problem of the numerous unnecessary failover logs printed on the client side; 
saving failover cost is just an extra benefit.

Although there is now another, separate jira for the log printing problem 
(HDFS-14969), I think that once this jira is merged, the client will gain 
two capabilities at the same time:
 1. Very few failover logs are printed.
 2. Failover costs are reduced.

In fact, I feel HDFS-14969 can even be considered solved once the patch is 
merged.


was (Author: xudongcao):
cc [~csun] At the beginning, the main goal of this jira was to solve the 
problem of the numerous unnecessary failover logs printing in the client side,  
and saving failover costs is just an extra benefit.

Although now there is another separated jira to solve the log printing problem 
(HDFS-14969 ), but I think after this jira was merged, the client will obtain 
two capabilities at the same time:
1. Very few failover logs are printed.
2. Saving failover costs.

In fact, I feel HDFS-14969 can even be considered as solved after the patch is 
merged.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> with the 1st NameNode, simply polls, and finally determines the current Active 
> NameNode.
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of frequent creation of 
> clients.
>  # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
> then a client starts rpc with the 1st NN, it will be silent when failover 
> from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd 
> NN, it prints some unnecessary logs, in some scenarios, these logs will be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: in client machine, for every 
> hdfs cluster, caching its current Active NameNode index in a separate cache 
> file named by its uri. *Note these cache files are shared by all hdfs client 
> processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine cache file directory is /tmp, then:
>  # the ns1 cluster related cache file is /tmp/ns1
>  # the ns2 cluster related cache file is /tmp/ns2
> And then:
>  #  When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly makes 
> an RPC call to the right ANN.
>  #  After each failover, the client needs to write the latest Active 
> NameNode index to the corresponding cache file based on the target hdfs uri.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-25 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982041#comment-16982041
 ] 

Xudong Cao edited comment on HDFS-14963 at 11/26/19 1:34 AM:
-

cc [~csun] At the beginning, the main goal of this jira was to solve the 
problem of the numerous unnecessary failover logs printed on the client side; 
saving failover cost is just an extra benefit.

Although there is now another, separate jira for the log printing problem 
(HDFS-14969), I think that once this jira is merged, the client will gain 
two capabilities at the same time:
 1. Very few failover logs are printed.
 2. Failover costs are reduced.

In fact, I feel HDFS-14969 can even be considered solved once this patch is 
merged.


was (Author: xudongcao):
cc [~csun] At the beginning, the main goal of this jira was to solve the 
problem of the numerous unnecessary failover logs printing in the client side,  
and saving failover costs is just an extra benefit.

Although now there is another separated jira to solve the log printing problem 
(HDFS-14969 ), but I think after this jira is merged, the client will obtain 
two capabilities at the same time:
 1. Very few failover logs are printed.
 2. Saving failover costs.

In fact, I feel HDFS-14969 can even be considered as solved after the patch is 
merged.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> with the 1st NameNode, simply polls, and finally determines the current Active 
> NameNode.
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of frequent creation of 
> clients.
>  # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
> then a client starts rpc with the 1st NN, it will be silent when failover 
> from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd 
> NN, it prints some unnecessary logs, in some scenarios, these logs will be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: in client machine, for every 
> hdfs cluster, caching its current Active NameNode index in a separate cache 
> file named by its uri. *Note these cache files are shared by all hdfs client 
> processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine cache file directory is /tmp, then:
>  # the ns1 cluster related cache file is /tmp/ns1
>  # the ns2 cluster related cache file is /tmp/ns2
> And then:
>  #  When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly makes 
> an RPC call to the right ANN.
>  #  After each failover, the client needs to write the latest Active 
> NameNode index to the corresponding cache file based on the target hdfs uri.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-25 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16982041#comment-16982041
 ] 

Xudong Cao commented on HDFS-14963:
---

cc [~csun] At the beginning, the main goal of this jira was to solve the 
problem of the numerous unnecessary failover logs printed on the client side; 
saving failover cost is just an extra benefit.

Although there is now another, separate jira for the log printing problem 
(HDFS-14969), I think that once this jira is merged, the client will gain 
two capabilities at the same time:
1. Very few failover logs are printed.
2. Failover costs are reduced.

In fact, I feel HDFS-14969 can even be considered solved once the patch is 
merged.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> with the 1st NameNode, simply polls, and finally determines the current Active 
> NameNode.
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of frequent creation of 
> clients.
>  # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
> then a client starts rpc with the 1st NN, it will be silent when failover 
> from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd 
> NN, it prints some unnecessary logs, in some scenarios, these logs will be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: in client machine, for every 
> hdfs cluster, caching its current Active NameNode index in a separate cache 
> file named by its uri. *Note these cache files are shared by all hdfs client 
> processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine cache file directory is /tmp, then:
>  # the ns1 cluster related cache file is /tmp/ns1
>  # the ns2 cluster related cache file is /tmp/ns2
> And then:
>  #  When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly makes 
> an RPC call to the right ANN.
>  #  After each failover, the client needs to write the latest Active 
> NameNode index to the corresponding cache file based on the target hdfs uri.
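As a rough illustration of the proposed mechanism, here is a small self-contained sketch (a hypothetical helper class, not actual HDFS client code) of reading and writing the per-nameservice cache file described above:
{code:java}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch of the proposed cache: one small file per nameservice
// (e.g. /tmp/ns1) holding the index of the last known Active NameNode.
public class ActiveNnIndexCache {
  private final Path cacheFile;

  public ActiveNnIndexCache(String cacheDir, String nameservice) {
    this.cacheFile = Paths.get(cacheDir, nameservice);
  }

  /** Read the cached Active NN index; fall back to 0 (the 1st NN) if the file is missing or corrupt. */
  public int read() {
    try {
      String s = new String(Files.readAllBytes(cacheFile), StandardCharsets.UTF_8).trim();
      return Integer.parseInt(s);
    } catch (IOException | NumberFormatException e) {
      return 0;
    }
  }

  /** Record the index of the NameNode the client just failed over to (best effort). */
  public void write(int activeIndex) {
    try {
      Files.write(cacheFile, Integer.toString(activeIndex).getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      // A stale or missing cache only costs extra failovers, so ignore.
    }
  }
}
{code}
A client would call read() before its first RPC and write() after each successful failover; since the file is shared by all client processes on the machine, a real implementation would also need an atomic write (e.g. write-then-rename) to handle concurrent writers.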



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-22 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Description: 
In a multi-NameNode scenario, a new HDFS client always begins its RPC calls with the 
1st NameNode, simply polls, and finally determines the current Active NameNode. 

This brings at least two problems:
 # Extra failover consumption, especially in the case of frequent creation of 
clients.
 # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
then a client starts rpc with the 1st NN, it will be silent when failover from 
the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd NN, it 
prints some unnecessary logs, in some scenarios, these logs will be very 
numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution for this problem: in client machine, for every hdfs 
cluster, caching its current Active NameNode index in a separate cache file 
named by its uri. *Note these cache files are shared by all hdfs client 
processes on this machine*.

For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
machine cache file directory is /tmp, then:
 # the ns1 cluster related cache file is /tmp/ns1
 # the ns2 cluster related cache file is /tmp/ns2

And then:
 #  When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target hdfs uri, and then directly makes 
an RPC call to the right ANN.
 #  After each failover, the client needs to write the latest Active 
NameNode index to the corresponding cache file based on the target hdfs uri.

  was:
In multi-NameNodes scenery, a new hdfs client always begins a rpc call from the 
1st namenode, simply polls, and finally determines the current Active namenode. 

This brings at least two problems:
 # Extra failover consumption, especially in the case of frequent creation of 
clients.
 # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
then a client starts rpc with the 1st NN, it will be silent when failover from 
the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd NN, it 
prints some unnecessary logs, in some scenarios, these logs will be very 
numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution for this problem: in client machine, for every hdfs 
cluster, caching its current Active NameNode index in a separate cache file 
named by its uri. *Note these cache files are shared by all hdfs client 
processes on this machine*.

For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
machine cache file directory is /tmp, then:
 # the ns1 cluster related cache file is /tmp/ns1
 # the ns2 cluster related cache file is /tmp/ns2

And then:
 #  When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target hdfs uri, and then directly make 
an rpc call toward the right ANN.
 #  After each time client failovers, it need to write the latest Active 
NameNode index to the corresponding cache file based on the target hdfs uri.

 4


> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> from the 1st NameNode and simply polls until it finally determines the current 
> Active NameNode. 
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of 

[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-22 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Attachment: (was: HDFS-14963.001.patch)

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> from the 1st NameNode and simply polls until it finally determines the current 
> Active NameNode. 
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts RPC with the 1st NN: it stays silent when failing over 
> from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 
> 3rd NN it prints some unnecessary logs; in some scenarios these logs can be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for every 
> HDFS cluster, cache its current Active NameNode index in a separate cache file 
> named after its URI. *Note that these cache files are shared by all HDFS 
> client processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp; then:
>  # the cache file for the ns1 cluster is /tmp/ns1
>  # the cache file for the ns2 cluster is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target HDFS URI, and then makes its RPC 
> calls directly against the right ANN.
>  # After each failover, the client needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target HDFS URI.
>  4



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-22 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Attachment: (was: HDFS-14963.000.patch)

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> from the 1st NameNode and simply polls until it finally determines the current 
> Active NameNode. 
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts RPC with the 1st NN: it stays silent when failing over 
> from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 
> 3rd NN it prints some unnecessary logs; in some scenarios these logs can be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for every 
> HDFS cluster, cache its current Active NameNode index in a separate cache file 
> named after its URI. *Note that these cache files are shared by all HDFS 
> client processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp; then:
>  # the cache file for the ns1 cluster is /tmp/ns1
>  # the cache file for the ns2 cluster is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target HDFS URI, and then makes its RPC 
> calls directly against the right ANN.
>  # After each failover, the client needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target HDFS URI.
>  4



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-22 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Description: 
In a multi-NameNode scenario, a new HDFS client always begins its RPC calls from 
the 1st NameNode and simply polls until it finally determines the current Active 
NameNode.

This brings at least two problems:
 # Extra failover overhead, especially when clients are created frequently.
 # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, and 
a client starts RPC with the 1st NN: it stays silent when failing over from the 
1st NN to the 2nd NN, but when failing over from the 2nd NN to the 3rd NN it 
prints some unnecessary logs; in some scenarios these logs can be very numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution for this problem: on the client machine, for every 
HDFS cluster, cache its current Active NameNode index in a separate cache file 
named after its URI. *Note that these cache files are shared by all HDFS client 
processes on this machine*.

For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
machine's cache file directory is /tmp; then:
 # the cache file for the ns1 cluster is /tmp/ns1
 # the cache file for the ns2 cluster is /tmp/ns2

And then:
 # When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target HDFS URI, and then makes its RPC 
calls directly against the right ANN.
 # After each failover, the client needs to write the latest Active NameNode 
index to the corresponding cache file based on the target HDFS URI.

 4

  was:
In multi-NameNodes scenery, a new hdfs client always begins a rpc call from the 
1st namenode, simply polls, and finally determines the current Active namenode. 

This brings at least two problems:
 # Extra failover consumption, especially in the case of frequent creation of 
clients.
 # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
then a client starts rpc with the 1st NN, it will be silent when failover from 
the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd NN, it 
prints some unnecessary logs, in some scenarios, these logs will be very 
numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution for this problem: in client machine, for every hdfs 
cluster, caching its current Active NameNode index in a separate cache file 
named by its uri. *Note these cache files are shared by all hdfs client 
processes on this machine*.

For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
machine cache file directory is /tmp, then:
 # the ns1 cluster related cache file is /tmp/ns1
 # the ns2 cluster related cache file is /tmp/ns2

And then:
 #  When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target hdfs uri, and then directly make 
an rpc call toward the right ANN.
 #  After each time client failovers, it need to write the latest Active 
NameNode index to the corresponding cache file based on the target hdfs uri.

 


> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>  Labels: multi-sbnn
> Attachments: HDFS-14963.000.patch, HDFS-14963.001.patch
>
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> from the 1st NameNode and simply polls until it finally determines the current 
> Active NameNode. 
> This brings at least two 

[jira] [Comment Edited] (HDFS-14969) Fix HDFS client unnecessary failover log printing

2019-11-12 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972962#comment-16972962
 ] 

Xudong Cao edited comment on HDFS-14969 at 11/13/19 2:34 AM:
-

cc [~xkrogen] [~vagarychen] [~shv] [~weichiu] I feel it's not good to remove 
the entire log. The more appropriate way is to make the logic aware of how many 
NNs are configured. We may need to add a new method to the FailoverProxyProvider 
interface, such as getProxiesCount(), and implement it in all subclasses. Then 
we can compare the current failover count against the total number of NNs in 
RetryInvocationHandler to decide whether to print the failover log (a rough 
sketch follows below). What do you think?

However, once HDFS-14963 is merged, I feel this problem will be greatly 
alleviated.
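
For clarity, a rough sketch of what that check could look like. The interface 
and class below are illustrative only; the proposed getProxiesCount() does not 
exist in FailoverProxyProvider today, and the real change would live inside 
RetryInvocationHandler rather than in a standalone class:
{code:java}
// Illustrative sketch of the proposal, not existing Hadoop code.
interface ProxyCountAware {
  // Proposed: each FailoverProxyProvider subclass reports how many NNs it manages.
  int getProxiesCount();
}

class FailoverLogPolicy {
  private final int proxiesCount;

  FailoverLogPolicy(ProxyCountAware provider) {
    this.proxiesCount = provider.getProxiesCount();
  }

  // Stay silent while a fresh client is still walking the configured NN list
  // for the first time; only once every NN has been tried does the failover
  // log carry useful information.
  boolean shouldLogFailover(int failoverCount) {
    return failoverCount >= proxiesCount;
  }
}
{code}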


was (Author: xudongcao):
cc [~xkrogen] [~vagarychen]  [~shv] [~weichiu] I feel it's not good to remove 
the entire log. The more appropriate way is to update the logic to be aware of 
how many NNs are configured. We may need to add a new method to the 
FailoverProxyProvider interface such as getProxiesCount() , and then implement 
it in all subclasses. What do you think?

However, after the HDFS-14963 is merged in the future, I feel that this problem 
will be greatly alleviated.

> Fix HDFS client unnecessary failover log printing
> -
>
> Key: HDFS-14969
> URL: https://issues.apache.org/jira/browse/HDFS-14969
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> In a multi-NameNode scenario, suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts RPC with the 1st NN: it stays silent when failing over 
> from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 
> 3rd NN it prints some unnecessary logs; in some scenarios these logs can be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14969) Fix HDFS client unnecessary failover log printing

2019-11-12 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972962#comment-16972962
 ] 

Xudong Cao commented on HDFS-14969:
---

cc [~xkrogen] [~vagarychen] [~shv] [~weichiu] I feel it's not good to remove 
the entire log. The more appropriate way is to make the logic aware of how many 
NNs are configured. We may need to add a new method to the FailoverProxyProvider 
interface, such as getProxiesCount(), and then implement it in all subclasses. 
What do you think?

However, once HDFS-14963 is merged, I feel this problem will be greatly 
alleviated.

> Fix HDFS client unnecessary failover log printing
> -
>
> Key: HDFS-14969
> URL: https://issues.apache.org/jira/browse/HDFS-14969
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> In a multi-NameNode scenario, suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts RPC with the 1st NN: it stays silent when failing over 
> from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 
> 3rd NN it prints some unnecessary logs; in some scenarios these logs can be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-08 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795
 ] 

Xudong Cao edited comment on HDFS-14963 at 11/8/19 11:57 AM:
-

cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For ease of review, I have uploaded an additional patch besides the GitHub PR 
(they are exactly the same patch). Based on this patch:
 # The cache directory is configurable via a newly introduced item, 
"dfs.client.failover.cache-active.dir"; its default value is ${java.io.tmpdir}, 
which is /tmp on Linux.
 # Writing/reading a cache file is done under file lock protection, and we use 
tryLock() instead of lock(), so in a high-concurrency scenario reading/writing 
the cache file will not become a bottleneck. If tryLock() fails while reading, 
we just fall back to what we have today: simply return index 0. If tryLock() 
fails while writing, the writer simply returns and continues. In fact, I think 
both of these situations should be very rare (see the sketch after this list).
 # All cache files' modes are manually set to "666", meaning every process can 
read/write them.
 # This cache mechanism is robust: regardless of whether the cache file was 
accidentally deleted or its content was maliciously modified, readActiveCache() 
always returns a legal index, and writeActiveCache() will automatically rebuild 
the cache file on the next failover in ConfiguredFailoverProxyProvider.
 # We do already have dfs.client.failover.random.order; in fact, I have used it 
in the unit test. ZKFC does know which NN is active right now, but it does not 
expose an RPC interface that lets us query it, and I think an RPC call is much 
more expensive than reading/writing local files.
 # cc [~xkrogen], I will tackle the logging issue discussed in (2) in a 
separate JIRA.
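
For reference, a minimal sketch of the read path under a non-blocking shared 
lock. The readActiveCache() name matches the discussion above, but the exact 
signature and the surrounding helper code here are assumptions for illustration, 
not the patch itself:
{code:java}
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;

// Sketch only: read the cached Active NN index under tryLock(); any failure
// (lock not acquired, missing file, garbage content) falls back to index 0.
public final class CachedIndexReader {
  public static int readActiveCache(String cacheFilePath, int nnCount) {
    try (RandomAccessFile raf = new RandomAccessFile(cacheFilePath, "r");
         FileChannel ch = raf.getChannel();
         FileLock lock = ch.tryLock(0, Long.MAX_VALUE, true)) { // shared, non-blocking
      if (lock == null) {
        return 0;                // another process holds the lock: behave as today
      }
      byte[] buf = new byte[16];
      int n = raf.read(buf);
      int idx = Integer.parseInt(
          new String(buf, 0, Math.max(n, 0), StandardCharsets.UTF_8).trim());
      return (idx >= 0 && idx < nnCount) ? idx : 0;   // reject illegal content
    } catch (IOException | RuntimeException e) {
      return 0;                  // missing or corrupt cache file
    }
  }
}
{code}
The write path would be symmetric: open the file with "rw", take an exclusive 
tryLock(), write the index, and set the file mode to 666 (for example via 
Files.setPosixFilePermissions) so every client process on the machine can 
update it; if the lock cannot be taken, the writer just returns and the cache 
is rebuilt on a later failover.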


was (Author: xudongcao):
cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides 
github PR (they are exactly a same patch). Based on this patch:
 # The cache directory is configurable by a newly introduced item 
"dfs.client.failover.cache-active.dir",  its default value is 
${java.io.tmpdir}, which is /tmp on Linux platform.
 # Writing/Reading a cache file is under file lock protection, and we use 
trylock() instead of lock(), so in a high-concurrency scenario, reading/writing 
cache file will not become the bottleneck. if trylock() failed while reading, 
it just fall back to what we have today: simply return an index 0. And if 
trylock() failed while writing, it simply returns and continues. In fact, I 
think both these situations should be very rare.
 # All cache files' mode are manually set to  "666", meaning every process can 
read/write them.
 # This cache mechanism is robust, regardless of whether the cache file was 
accidentally deleted or the content was maliciously modified, the 
readActiveCache() always returns a legal index, and writeActiveCache() will 
automatically rebuild the cache file on next failover. Of course in all 
abnormal situations there will be a WARN log.
 # We surely have dfs.client.failover.random.order, actually I have used it in 
the unit test. Zkfc does know who is active NN right now, but it does not have 
an rpc interface allowing us to get it.  and I think an rpc call is much more 
expensive than reading/writing local files.
 # cc [~xkrogen] , I will then tacle the logging issue discussed in (2) in a 
separate JIRA.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch, HDFS-14963.001.patch
>
>
> In multi-NameNodes scenery, a new hdfs client always begins a rpc call from 
> the 1st namenode, simply polls, and finally determines the current Active 
> namenode. 
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of frequent creation of 
> clients.
>  # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
> then a client starts rpc with the 1st NN, it will be silent when failover 
> from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd 
> NN, it prints some unnecessary logs, in some scenarios, these logs will be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> 

[jira] [Commented] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-08 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969979#comment-16969979
 ] 

Xudong Cao commented on HDFS-14963:
---

The failed unit tests have nothing to do with this JIRA; the same patch has 
passed the precommit tests in the GitHub PR.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch, HDFS-14963.001.patch
>
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> from the 1st NameNode and simply polls until it finally determines the current 
> Active NameNode. 
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts RPC with the 1st NN: it stays silent when failing over 
> from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 
> 3rd NN it prints some unnecessary logs; in some scenarios these logs can be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for every 
> HDFS cluster, cache its current Active NameNode index in a separate cache file 
> named after its URI. *Note that these cache files are shared by all HDFS 
> client processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp; then:
>  # the cache file for the ns1 cluster is /tmp/ns1
>  # the cache file for the ns2 cluster is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target HDFS URI, and then makes its RPC 
> calls directly against the right ANN.
>  # After each failover, the client needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target HDFS URI.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14969) Fix HDFS client unnecessary failover log printing

2019-11-07 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14969:
--
Description: 
In a multi-NameNode scenario, suppose there are 3 NNs and the 3rd is the ANN, 
and a client starts RPC with the 1st NN: it stays silent when failing over from 
the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 3rd NN it 
prints some unnecessary logs; in some scenarios these logs can be very numerous:
{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}

> Fix HDFS client unnecessary failover log printing
> -
>
> Key: HDFS-14969
> URL: https://issues.apache.org/jira/browse/HDFS-14969
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> In a multi-NameNode scenario, suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts RPC with the 1st NN: it stays silent when failing over 
> from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 
> 3rd NN it prints some unnecessary logs; in some scenarios these logs can be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14969) Fix HDFS client unnecessary failover log printing

2019-11-07 Thread Xudong Cao (Jira)
Xudong Cao created HDFS-14969:
-

 Summary: Fix HDFS client unnecessary failover log printing
 Key: HDFS-14969
 URL: https://issues.apache.org/jira/browse/HDFS-14969
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 3.1.3
Reporter: Xudong Cao
Assignee: Xudong Cao






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795
 ] 

Xudong Cao edited comment on HDFS-14963 at 11/8/19 6:16 AM:


cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides 
github PR (they are exactly a same patch). Based on this patch:
 # The cache directory is configurable by a newly introduced item 
"dfs.client.failover.cache-active.dir",  its default value is 
${java.io.tmpdir}, which is /tmp on Linux platform.
 # Writing/Reading a cache file is under file lock protection, and we use 
trylock() instead of lock(), so in a high-concurrency scenario, reading/writing 
cache file will not become the bottleneck. if trylock() failed while reading, 
it just fall back to what we have today: simply return an index 0. And if 
trylock() failed while writing, it simply returns and continues. In fact, I 
think both these situations should be very rare.
 # All cache files' mode are manually set to  "666", meaning every process can 
read/write them.
 # This cache mechanism is robust, regardless of whether the cache file was 
accidentally deleted or the content was maliciously modified, the 
readActiveCache() always returns a legal index, and writeActiveCache() will 
automatically rebuild the cache file on next failover. Of course in all 
abnormal situations there will be a WARN log.
 # We surely have dfs.client.failover.random.order, actually I have used it in 
the unit test. Zkfc does know who is active NN right now, but it does not have 
an rpc interface allowing us to get it.  and I think an rpc call is much more 
expensive than reading/writing local files.
 # cc [~xkrogen] , I will then tacle the logging issue discussed in (2) in a 
separate JIRA.


was (Author: xudongcao):
cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides 
github PR (they are exactly a same patch). Based on this patch:
 # The cache directory is configurable by a newly introduced item 
"dfs.client.failover.cache-active.dir",  its default value is 
${java.io.tmpdir}, which is /tmp on Linux platform.
 # Writing/Reading a cache file is under file lock protection, and we use 
trylock() instead of lock(), so in a high-concurrency scenario, reading/writing 
cache file will not become the bottleneck. if trylock() failed while reading, 
it just fall back to what we have today: simply return an index 0. And if 
trylock() failed while writing, it simply returns and continues. In fact, I 
think both these situations should be very rare.
 # All cache files' mode are manually set to  "666", meaning every process can 
read/write them.
 # This cache mechanism is robust, regardless of whether the cache file was 
accidentally deleted or the content was maliciously modified, the 
readActiveCache() always returns a legal index, and writeActiveCache() will 
automatically rebuild the cache file on next failover. Of course in all 
abnormal situations there will be a WARN log.
 # We surely have dfs.client.failover.random.order, actually I have used it in 
the unit test, Zkfc does know who is active NN right now, but it does not have 
an rpc interface allowing us to get it.  and I think an rpc call is much more 
expensive than reading/writing local files.
 # cc [~xkrogen] , I will then tacle the logging issue discussed in (2) in a 
separate JIRA.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch, HDFS-14963.001.patch
>
>
> In multi-NameNodes scenery, a new hdfs client always begins a rpc call from 
> the 1st namenode, simply polls, and finally determines the current Active 
> namenode. 
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of frequent creation of 
> clients.
>  # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
> then a client starts rpc with the 1st NN, it will be silent when failover 
> from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd 
> NN, it prints some unnecessary logs, in some scenarios, these logs will be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state 

[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Attachment: HDFS-14963.001.patch

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch, HDFS-14963.001.patch
>
>
> In a multi-NameNode scenario, a new HDFS client always begins its RPC calls 
> from the 1st NameNode and simply polls until it finally determines the current 
> Active NameNode. 
> This brings at least two problems:
>  # Extra failover overhead, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, 
> and a client starts RPC with the 1st NN: it stays silent when failing over 
> from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 
> 3rd NN it prints some unnecessary logs; in some scenarios these logs can be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution for this problem: on the client machine, for every 
> HDFS cluster, cache its current Active NameNode index in a separate cache file 
> named after its URI. *Note that these cache files are shared by all HDFS 
> client processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine's cache file directory is /tmp; then:
>  # the cache file for the ns1 cluster is /tmp/ns1
>  # the cache file for the ns2 cluster is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target HDFS URI, and then makes its RPC 
> calls directly against the right ANN.
>  # After each failover, the client needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target HDFS URI.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795
 ] 

Xudong Cao edited comment on HDFS-14963 at 11/8/19 3:34 AM:


cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides 
github PR (they are exactly a same patch). Based on this patch:
 # The cache directory is configurable by a newly introduced item 
"dfs.client.failover.cache-active.dir",  its default value is 
${java.io.tmpdir}, which is /tmp on Linux platform.
 # Writing/Reading a cache file is under file lock protection, and we use 
trylock() instead of lock(), so in a high-concurrency scenario, reading/writing 
cache file will not become the bottleneck. if trylock() failed while reading, 
it just fall back to what we have today: simply return an index 0. And if 
trylock() failed while writing, it simply returns and continues. In fact, I 
think both these situations should be very rare.
 # All cache files' mode are manually set to  "666", meaning every process can 
read/write them.
 # This cache mechanism is robust, regardless of whether the cache file was 
accidentally deleted or the content was maliciously modified, the 
readActiveCache() always returns a legal index, and writeActiveCache() will 
automatically rebuild the cache file on next failover. Of course in all 
abnormal situations there will be a WARN log.
 # We surely have dfs.client.failover.random.order, actually I have used it in 
the unit test, Zkfc does know who is active NN right now, but it does not have 
an rpc interface allowing us to get it.  and I think an rpc call is much more 
expensive than reading/writing local files.
 # cc [~xkrogen] , I will then tacle the logging issue discussed in (2) in a 
separate JIRA.


was (Author: xudongcao):
cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides 
github PR (they are exactly a same patch). Based on this patch:
 # The cache directory is configurable by a newly introduced item 
"dfs.client.failover.cache-active.dir",  its default value is 
${java.io.tmpdir}, which is /tmp on Linux platform.
 # Writing/Reading a cache file is under file lock protection, and we use 
trylock() instead of lock(), so in a high-concurrency scenario, reading/writing 
cache file will not become the bottleneck. if trylock() failed while reading, 
it just fall back to what we have today: simply return an index 0. And if 
trylock() failed while writing, it simply returns and continues. In fact, I 
think both these situations should be very rare.
 # All cache files' mode are manually set to  "666", meaning every process can 
read/write them.
 # This cache mechanism is robust, regardless of whether the cache file was 
accidentally deleted or the content was maliciously modified, the 
readActiveCache() always returns a legal index, and writeActiveCache() will 
automatically rebuild the cache file on next failover.
 # We surely have dfs.client.failover.random.order, actually I have used it in 
the unit test, Zkfc does know who is active NN right now, but it does not have 
an rpc interface allowing us to get it.  and I think an rpc call is much more 
expensive than reading/writing local files.
 # cc [~xkrogen] , I will then tacle the logging issue discussed in (2) in a 
separate JIRA.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch
>
>
> In multi-NameNodes scenery, a new hdfs client always begins a rpc call from 
> the 1st namenode, simply polls, and finally determines the current Active 
> namenode. 
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of frequent creation of 
> clients.
>  # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
> then a client starts rpc with the 1st NN, it will be silent when failover 
> from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd 
> NN, it prints some unnecessary logs, in some scenarios, these logs will be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> 

[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795
 ] 

Xudong Cao edited comment on HDFS-14963 at 11/8/19 3:28 AM:


cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides 
github PR (they are exactly a same patch). Based on this patch:
 # The cache directory is configurable by a newly introduced item 
"dfs.client.failover.cache-active.dir",  its default value is 
${java.io.tmpdir}, which is /tmp on Linux platform.
 # Writing/Reading a cache file is under file lock protection, and we use 
trylock() instead of lock(), so in a high-concurrency scenario, reading/writing 
cache file will not become the bottleneck. if trylock() failed while reading, 
it just fall back to what we have today: simply return an index 0. And if 
trylock() failed while writing, it simply returns and continues. In fact, I 
think both these situations should be very rare.
 # All cache files' mode are manually set to  "666", meaning every process can 
read/write them.
 # This cache mechanism is robust, regardless of whether the cache file was 
accidentally deleted or the content was maliciously modified, the 
readActiveCache() always returns a legal index, and writeActiveCache() will 
automatically rebuild the cache file on next failover.
 # We surely have dfs.client.failover.random.order, actually I have used it in 
the unit test, Zkfc does know who is active NN right now, but it does not have 
an rpc interface allowing us to get it.  and I think an rpc call is much more 
expensive than reading/writing local files.
 # cc [~xkrogen] , I will then tacle the logging issue discussed in (2) in a 
separate JIRA.


was (Author: xudongcao):
cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides 
github PR (they are exactly a same patch). Based on this patch:
 # The cache directory is configurable by a newly introduced item 
"dfs.client.failover.cache-active.dir",  its default value is 
${java.io.tmpdir}, which is /tmp on Linux platform.
 # Writing/Reading a cache file is under file lock protection, and we use 
trylock() instead of lock(), so in a high-concurrency scenario, reading/writing 
cache file will not become the bottleneck. if trylock() failed while reading, 
it just fall back to what we have today: simply return a index of 0. And if 
trylock() failed while writing, it simply returns and continues. In fact, I 
think both these situations should be very rare.
 # All cache files' mode are manually set to  "666", meaning every process can 
read/write them.
 # This cache mechanism is robust, regardless of whether the cache file was 
accidentally deleted or the content was maliciously modified, the 
readActiveCache() always returns a legal index, and writeActiveCache() will 
automatically rebuild the cache file on next failover.
 # We surely have dfs.client.failover.random.order, actually I have used it in 
the unit test, Zkfc does know who is active NN right now, but it does not have 
an rpc interface allowing us to get it.  and I think an rpc call is much more 
expensive than reading/writing local files.
 # cc [~xkrogen] , I will then tacle the logging issue discussed in (2) in a 
separate JIRA.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch
>
>
> In multi-NameNodes scenery, a new hdfs client always begins a rpc call from 
> the 1st namenode, simply polls, and finally determines the current Active 
> namenode. 
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of frequent creation of 
> clients.
>  # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
> then a client starts rpc with the 1st NN, it will be silent when failover 
> from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd 
> NN, it prints some unnecessary logs, in some scenarios, these logs will be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> 

[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795
 ] 

Xudong Cao edited comment on HDFS-14963 at 11/8/19 3:27 AM:


cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides 
github PR (they are exactly a same patch). Based on this patch:
 # The cache directory is configurable by a newly introduced item 
"dfs.client.failover.cache-active.dir",  its default value is 
${java.io.tmpdir}, which is /tmp on Linux platform.
 # Writing/Reading a cache file is under file lock protection, and we use 
trylock() instead of lock(), so in a high-concurrency scenario, reading/writing 
cache file will not become the bottleneck. if trylock() failed while reading, 
it just fall back to what we have today: simply return a index of 0. And if 
trylock() failed while writing, it simply returns and continues. In fact, I 
think both these situations should be very rare.
 # All cache files' mode are manually set to  "666", meaning every process can 
read/write them.
 # This cache mechanism is robust, regardless of whether the cache file was 
accidentally deleted or the content was maliciously modified, the 
readActiveCache() always returns a legal index, and writeActiveCache() will 
automatically rebuild the cache file on next failover.
 # We surely have dfs.client.failover.random.order, actually I have used it in 
the unit test, Zkfc does know who is active NN right now, but it does not have 
an rpc interface allowing us to get it.  and I think an rpc call is much more 
expensive than reading/writing local files.
 # cc [~xkrogen] , I will then tacle the logging issue discussed in (2) in a 
separate JIRA.


was (Author: xudongcao):
cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides 
github PR (they are exactly a same patch). Based on this patch:
 # The cache directory is configurable by a newly introduced item 
"dfs.client.failover.cache-active.dir",  its default value is 
${java.io.tmpdir}, which is /tmp on Linux platform.
 # Writing/Reading a cache file is under file lock protection, and we use 
trylock() instead of lock(), so in a high-concurrency scenario, reading/writing 
cache file will not become the bottleneck. if trylock() failed while reading, 
it just fall back to what we have today: simply return a index of 0. And if 
trylock() failed while writing, it simply returns and continues. In fact, I 
think both these situations should be very rare.
 # All cache files' mode are manually set to  "666", meaning every process can 
read/write them.
 # This cache mechanism is robust, regardless of whether the cache file was 
accidentally deleted or the content was maliciously modified, the 
readActiveCache() always returns a legal index, and writeActiveCache() will 
automatically rebuild the cache file on next failover.
 # We surely have dfs.client.failover.random.order, actually I have used it in 
the unit test, Zkfc does know who is active NN right now, but it does not have 
an rpc interface allowing us to get it.  and I think an rpc call is much more 
expensive than reading/writing local files.
 # cc [~elgoiri], I will then tacle the logging issue discussed in (2) in a 
separate JIRA.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch
>
>
> In multi-NameNodes scenery, a new hdfs client always begins a rpc call from 
> the 1st namenode, simply polls, and finally determines the current Active 
> namenode. 
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of frequent creation of 
> clients.
>  # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
> then a client starts rpc with the 1st NN, it will be silent when failover 
> from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd 
> NN, it prints some unnecessary logs, in some scenarios, these logs will be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> 

[jira] [Comment Edited] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795
 ] 

Xudong Cao edited comment on HDFS-14963 at 11/8/19 3:26 AM:


cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides 
github PR (they are exactly a same patch). Based on this patch:
 # The cache directory is configurable by a newly introduced item 
"dfs.client.failover.cache-active.dir",  its default value is 
${java.io.tmpdir}, which is /tmp on Linux platform.
 # Writing/Reading a cache file is under file lock protection, and we use 
trylock() instead of lock(), so in a high-concurrency scenario, reading/writing 
cache file will not become the bottleneck. if trylock() failed while reading, 
it just fall back to what we have today: simply return a index of 0. And if 
trylock() failed while writing, it simply returns and continues. In fact, I 
think both these situations should be very rare.
 # All cache files' mode are manually set to  "666", meaning every process can 
read/write them.
 # This cache mechanism is robust, regardless of whether the cache file was 
accidentally deleted or the content was maliciously modified, the 
readActiveCache() always returns a legal index, and writeActiveCache() will 
automatically rebuild the cache file on next failover.
 # We surely have dfs.client.failover.random.order, actually I have used it in 
the unit test, Zkfc does know who is active NN right now, but it does not have 
an rpc interface allowing us to get it.  and I think an rpc call is much more 
expensive than reading/writing local files.
 # cc [~elgoiri], I will then tacle the logging issue discussed in (2) in a 
separate JIRA.


was (Author: xudongcao):
cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For the convenience of reading, I have uploaded an additional patch besides 
github PR (they are exactly a same patch). Based on this patch:
 # The cache directory is configurable by a newly introduced item 
"dfs.client.failover.cache-active.dir",  its default value is 
${java.io.tmpdir}, which is /tmp on Linux platform.
 # Writing/Reading a cache file is under file lock protection, and we use 
trylock() instead of lock(), so in a high-concurrency scenario, reading/writing 
cache file will not become the bottleneck. if trylock() failed while reading, 
it just fall back to what we have today: begin with 1st NN. And if trylock() 
failed while writing, it simply returns and continues. In fact, I think both 
these situations should be very rare.
 # All cache files' mode are manually set to  "666", meaning every process can 
read/write them.
 # This cache mechanism is robust, regardless of whether the cache file was 
accidentally deleted or the content was maliciously modified, the 
readActiveCache() always returns a legal index, and writeActiveCache() will 
automatically rebuild the cache file on next failover.
 # We surely have dfs.client.failover.random.order, actually I have used it in 
the unit test, Zkfc does know who is active NN right now, but it does not have 
an rpc interface allowing us to get it.  and I think an rpc call is much more 
expensive than reading/writing local files.
 # cc [~elgoiri], I will then tacle the logging issue discussed in (2) in a 
separate JIRA.

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch
>
>
> In multi-NameNodes scenery, a new hdfs client always begins a rpc call from 
> the 1st namenode, simply polls, and finally determines the current Active 
> namenode. 
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of frequent creation of 
> clients.
>  # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
> then a client starts rpc with the 1st NN, it will be silent when failover 
> from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd 
> NN, it prints some unnecessary logs, in some scenarios, these logs will be 
> very numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 

[jira] [Commented] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969795#comment-16969795
 ] 

Xudong Cao commented on HDFS-14963:
---

cc [~shv] [~elgoiri] [~vagarychen] [~weichiu] Thank you all for your attention. 
For ease of reading, I have uploaded an additional patch besides the GitHub PR 
(they are exactly the same patch). Based on this patch:
 # The cache directory is configurable via a newly introduced item 
"dfs.client.failover.cache-active.dir"; its default value is ${java.io.tmpdir}, 
which is /tmp on Linux.
 # Reading/writing a cache file is protected by a file lock, and we use tryLock() 
instead of lock(), so in a high-concurrency scenario reading/writing the cache 
file will not become a bottleneck. If tryLock() fails while reading, the client 
simply falls back to what we have today: start with the 1st NN. If tryLock() fails 
while writing, it simply returns and continues. I think both situations should be 
very rare (see the sketch after this list).
 # All cache files' modes are explicitly set to "666", so every process can 
read/write them.
 # This cache mechanism is robust: whether the cache file is accidentally deleted 
or its content is maliciously modified, readActiveCache() always returns a legal 
index, and writeActiveCache() automatically rebuilds the cache file on the next 
failover.
 # We do have dfs.client.failover.random.order; in fact I have used it in the unit 
test. ZKFC does know which NN is currently active, but it does not expose an rpc 
interface that lets us query it, and I think an rpc call is much more expensive 
than reading/writing a local file.
 # cc [~elgoiri], I will tackle the logging issue discussed in (2) in a separate 
JIRA.
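
For readers skimming the archive, below is a minimal, hypothetical sketch of the 
tryLock()-based read/write and the "666" file mode described in items 2 and 3 
above. The method names readActiveCache()/writeActiveCache() come from this 
comment, but the bodies are illustrative only and are not taken from the attached 
patch:

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.nio.file.attribute.PosixFilePermissions;

public class ActiveNNIndexCache {
  private final Path cacheFile;

  public ActiveNNIndexCache(Path cacheFile) {
    this.cacheFile = cacheFile;
  }

  /** Returns the cached index, or 0 (the 1st NN) when the lock is busy or the file is unusable. */
  public int readActiveCache(int nnCount) {
    try (FileChannel ch = FileChannel.open(cacheFile, StandardOpenOption.READ);
         FileLock lock = ch.tryLock(0, Long.MAX_VALUE, true)) { // shared, non-blocking
      if (lock == null) {
        return 0; // lock busy: fall back to today's behaviour and start with the 1st NN
      }
      ByteBuffer buf = ByteBuffer.allocate(16);
      ch.read(buf);
      int idx = Integer.parseInt(
          new String(buf.array(), 0, buf.position(), StandardCharsets.UTF_8).trim());
      return (idx >= 0 && idx < nnCount) ? idx : 0; // always return a legal index
    } catch (IOException | NumberFormatException e) {
      return 0; // missing or corrupted cache file: behave as if no cache existed
    }
  }

  /** Best-effort write: silently skips when the exclusive lock cannot be acquired. */
  public void writeActiveCache(int activeIndex) {
    try (FileChannel ch = FileChannel.open(cacheFile,
             StandardOpenOption.CREATE, StandardOpenOption.WRITE);
         FileLock lock = ch.tryLock()) { // exclusive, non-blocking
      if (lock == null) {
        return; // another process is writing: simply return and continue
      }
      ch.truncate(0);
      ch.write(ByteBuffer.wrap(
          Integer.toString(activeIndex).getBytes(StandardCharsets.UTF_8)));
      // Mode "666" so every client process on this machine can read/write the file.
      Files.setPosixFilePermissions(cacheFile, PosixFilePermissions.fromString("rw-rw-rw-"));
    } catch (IOException | UnsupportedOperationException e) {
      // best effort only; the cache is rebuilt on the next failover
    }
  }
}
{code}

A caller would resolve cacheFile from the directory configured by the proposed 
dfs.client.failover.cache-active.dir key (default ${java.io.tmpdir}) plus the 
cluster's nameservice, e.g. /tmp/ns1.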

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch
>
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls from 
> the 1st namenode, simply polling until it finally determines the current Active 
> namenode.
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, and 
> a client starts rpc with the 1st NN: it is silent when failing over from the 
> 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 3rd NN it 
> prints some unnecessary logs, and in some scenarios these logs can be very 
> numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution to this problem: on the client machine, for every 
> hdfs cluster, cache its current Active NameNode index in a separate cache file 
> named after its uri. *Note these cache files are shared by all hdfs client 
> processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine cache file directory is /tmp, then:
>  # the ns1 cluster related cache file is /tmp/ns1
>  # the ns2 cluster related cache file is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly makes 
> an rpc call to the right ANN.
>  # After each client failover, it needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Attachment: HDFS-14963.000.patch

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch
>
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls from 
> the 1st namenode, simply polling until it finally determines the current Active 
> namenode.
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, and 
> a client starts rpc with the 1st NN: it is silent when failing over from the 
> 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 3rd NN it 
> prints some unnecessary logs, and in some scenarios these logs can be very 
> numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution to this problem: on the client machine, for every 
> hdfs cluster, cache its current Active NameNode index in a separate cache file 
> named after its uri. *Note these cache files are shared by all hdfs client 
> processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine cache file directory is /tmp, then:
>  # the ns1 cluster related cache file is /tmp/ns1
>  # the ns2 cluster related cache file is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly makes 
> an rpc call to the right ANN.
>  # After each client failover, it needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Attachment: (was: HDFS-14963.000.patch)

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls from 
> the 1st namenode, simply polling until it finally determines the current Active 
> namenode.
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, and 
> a client starts rpc with the 1st NN: it is silent when failing over from the 
> 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 3rd NN it 
> prints some unnecessary logs, and in some scenarios these logs can be very 
> numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution to this problem: on the client machine, for every 
> hdfs cluster, cache its current Active NameNode index in a separate cache file 
> named after its uri. *Note these cache files are shared by all hdfs client 
> processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine cache file directory is /tmp, then:
>  # the ns1 cluster related cache file is /tmp/ns1
>  # the ns2 cluster related cache file is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly makes 
> an rpc call to the right ANN.
>  # After each client failover, it needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-07 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Attachment: HDFS-14963.000.patch

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
> Attachments: HDFS-14963.000.patch
>
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls from 
> the 1st namenode, simply polling until it finally determines the current Active 
> namenode.
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, and 
> a client starts rpc with the 1st NN: it is silent when failing over from the 
> 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 3rd NN it 
> prints some unnecessary logs, and in some scenarios these logs can be very 
> numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution to this problem: on the client machine, for every 
> hdfs cluster, cache its current Active NameNode index in a separate cache file 
> named after its uri. *Note these cache files are shared by all hdfs client 
> processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine cache file directory is /tmp, then:
>  # the ns1 cluster related cache file is /tmp/ns1
>  # the ns2 cluster related cache file is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly makes 
> an rpc call to the right ANN.
>  # After each client failover, it needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-06 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Status: Patch Available  (was: Open)

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls from 
> the 1st namenode, simply polling until it finally determines the current Active 
> namenode.
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, and 
> a client starts rpc with the 1st NN: it is silent when failing over from the 
> 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 3rd NN it 
> prints some unnecessary logs, and in some scenarios these logs can be very 
> numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution to this problem: on the client machine, for every 
> hdfs cluster, cache its current Active NameNode index in a separate cache file 
> named after its uri. *Note these cache files are shared by all hdfs client 
> processes on this machine*.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine cache file directory is /tmp, then:
>  # the ns1 cluster related cache file is /tmp/ns1
>  # the ns2 cluster related cache file is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly makes 
> an rpc call to the right ANN.
>  # After each client failover, it needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-06 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Description: 
In a multi-NameNode scenario, a new hdfs client always begins its rpc calls from 
the 1st namenode, simply polling until it finally determines the current Active 
namenode.

This brings at least two problems:
 # Extra failover cost, especially when clients are created frequently.
 # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, and a 
client starts rpc with the 1st NN: it is silent when failing over from the 1st NN 
to the 2nd NN, but when failing over from the 2nd NN to the 3rd NN it prints some 
unnecessary logs, and in some scenarios these logs can be very numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution to this problem: on the client machine, for every hdfs 
cluster, cache its current Active NameNode index in a separate cache file named 
after its uri. *Note these cache files are shared by all hdfs client processes on 
this machine*.

For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client machine 
cache file directory is /tmp, then:
 # the ns1 cluster related cache file is /tmp/ns1
 # the ns2 cluster related cache file is /tmp/ns2

And then:
 # When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target hdfs uri, and then directly makes an 
rpc call to the right ANN.
 # After each client failover, it needs to write the latest Active NameNode index 
to the corresponding cache file based on the target hdfs uri (see the sketch below).
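
As an aside for readers of the archive, here is a small illustrative sketch (not 
taken from the attached patch) of how a client-side helper could map a cluster 
uri to its per-machine cache file; the configuration key is the one proposed in 
the comments on this issue and is not an existing Hadoop setting:

{code:java}
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.apache.hadoop.conf.Configuration;

public class ActiveNNIndexCacheLocator {

  // Proposed in this JIRA's discussion; not part of released Hadoop.
  static final String CACHE_DIR_KEY = "dfs.client.failover.cache-active.dir";

  /**
   * Maps a cluster uri such as hdfs://ns1 to its shared cache file,
   * e.g. /tmp/ns1 when the cache directory is left at its default.
   */
  public static Path cacheFileFor(URI clusterUri, Configuration conf) {
    String dir = conf.get(CACHE_DIR_KEY, System.getProperty("java.io.tmpdir"));
    // The authority of hdfs://ns1 is "ns1", so the file is <dir>/ns1.
    return Paths.get(dir, clusterUri.getAuthority());
  }
}
{code}

On start-up the failover proxy provider would read the cached index from this 
file and try that NameNode first; after every failover it would write the new 
index back, so all hdfs client processes on the machine share the hint.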

 

  was:
In multi-NameNodes scenery, a new hdfs client always begins a rpc call from the 
1st namenode, simply polls, and finally determines the current Active namenode. 

This brings at least two problems:
 # Extra failover consumption, especially in the case of frequent creation of 
clients.
 # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
then a client starts rpc with the 1st NN, it will be silent when failover from 
the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd NN, it 
prints some unnecessary logs, in some scenarios, these logs will be very 
numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution for this problem: in client machine, for every hdfs 
cluster, caching its current Active NameNode index in a separate cache file 
named by its uri.

For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
machine cache file directory is /tmp, then:
 # the ns1 cluster related cache file is /tmp/ns1
 # the ns2 cluster related cache file is /tmp/ns2

And then:
 #  When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target hdfs uri, and then directly make 
an rpc call toward the right ANN.
 #  After each time client failovers, it need to write the latest Active 
NameNode index to the corresponding cache file based on the target hdfs uri.

 


> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls from 
> the 1st namenode, simply polling until it finally determines the current Active 
> namenode.
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of frequent creation of 
> clients.
>  # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
> 

[jira] [Updated] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.

2019-11-06 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Summary: Add HDFS Client machine caching active namenode index mechanism.  
(was: Add DFS Client caching active namenode mechanism.)

> Add HDFS Client machine caching active namenode index mechanism.
> 
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls from 
> the 1st namenode, simply polling until it finally determines the current Active 
> namenode.
> This brings at least two problems:
>  # Extra failover cost, especially when clients are created frequently.
>  # Unnecessary log printing. Suppose there are 3 NNs and the 3rd is the ANN, and 
> a client starts rpc with the 1st NN: it is silent when failing over from the 
> 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 3rd NN it 
> prints some unnecessary logs, and in some scenarios these logs can be very 
> numerous:
> {code:java}
> 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>  Operation category READ is not supported in state standby. Visit 
> https://s.apache.org/sbnn-error
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
>  ...{code}
> We can introduce a solution to this problem: on the client machine, for every 
> hdfs cluster, cache its current Active NameNode index in a separate cache file 
> named after its uri.
> For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
> machine cache file directory is /tmp, then:
>  # the ns1 cluster related cache file is /tmp/ns1
>  # the ns2 cluster related cache file is /tmp/ns2
> And then:
>  # When a client starts, it reads the current Active NameNode index from the 
> corresponding cache file based on the target hdfs uri, and then directly makes 
> an rpc call to the right ANN.
>  # After each client failover, it needs to write the latest Active NameNode 
> index to the corresponding cache file based on the target hdfs uri.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14963) Add DFS Client caching active namenode mechanism.

2019-11-06 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Description: 
In multi-NameNodes scenery, a new hdfs client always begins a rpc call from the 
1st namenode, simply polls, and finally determines the current Active namenode. 

This brings at least two problems:
 # Extra failover consumption, especially in the case of frequent creation of 
clients.
 # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
then a client starts rpc with the 1st NN, it will be silent when failover from 
the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd NN, it 
prints some unnecessary logs, in some scenarios, these logs will be very 
numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution for this problem: in client machine, for every hdfs 
cluster, caching its current Active NameNode index in a separate cache file 
named by its uri.

For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
machine cache file directory is /tmp, then:
 # the ns1 cluster related cache file is /tmp/ns1
 # the ns2 cluster related cache file is /tmp/ns2

And then:
 #  When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target hdfs uri, and then directly make 
an rpc call toward the right ANN.
 #  After each time client failovers, it need to write the latest Active 
NameNode index to the corresponding cache file based on the target hdfs uri.

 

  was:
In multi-NameNodes scenery, a new hdfs client always begins a rpc call from the 
1st namenode, simply polls, and finally determines the current Active namenode. 

This brings at least two problems:
 # Extra failover consumption, especially in the case of frequent creation of 
clients.
 # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
then a client starts rpc with the 1st NN, it will be silent when failover from 
the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd NN, it 
prints some unnecessary logs, in some scenarios, these logs will be very 
numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution for this problem: in client machine, for every hdfs 
cluster, caching its current Active NameNode index in a separate cache file.

For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
machine cache file directory is /tmp, then:
 # the ns1 cluster related cache file is /tmp/ns1
 # the ns2 cluster related cache file is /tmp/ns2

And then:
 #  When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target hdfs uri, and then directly make 
an rpc call toward the right ANN.
 #  After each time client failovers, it need to write the latest Active 
NameNode index to the corresponding cache file based on the target hdfs uri.

 


> Add DFS Client caching active namenode mechanism.
> -
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls from 
> the 1st namenode, simply polling until it finally determines the current Active 
> namenode.
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of frequent creation of 
> clients.
>  # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
> then a client starts rpc with the 1st NN, it will be silent when failover 
> from the 1st NN to the 2nd NN, but when failover from 

[jira] [Updated] (HDFS-14963) Add DFS Client caching active namenode mechanism.

2019-11-06 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Description: 
In multi-NameNodes scenery, a new hdfs client always begins a rpc call from the 
1st namenode, simply polls, and finally determines the current Active namenode. 

This brings at least two problems:
 # Extra failover consumption, especially in the case of frequent creation of 
clients.
 # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
then a client starts rpc with the 1st NN, it will be silent when failover from 
the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd NN, it 
prints some unnecessary logs, in some scenarios, these logs will be very 
numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution for this problem: in client machine, for every hdfs 
cluster, caching its current Active NameNode index in a separate cache file.

For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client 
machine cache file directory is /tmp, then:
 # the ns1 cluster related cache file is /tmp/ns1
 # the ns2 cluster related cache file is /tmp/ns2

And then:
 #  When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target hdfs uri, and then directly make 
an rpc call toward the right ANN.
 #  After each time client failovers, it need to write the latest Active 
NameNode index to the corresponding cache file based on the target hdfs uri.

 

  was:
In multi-NameNodes scenery, a new hdfs client always begins a rpc call from the 
1st namenode, simply polls, and finally determines the current Active namenode. 

This brings at least two problems:
 # Extra failover consumption, especially in the case of frequent creation of 
clients.
 # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
then a client starts rpc with the 1st NN, it will be silent when failover from 
the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd NN, it 
prints some unnecessary logs, in some scenarios, these logs will be very 
numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution for this problem: in client machine, for every hdfs 
cluster, caching its current Active NameNode index in a separate cache file, so:
 #  When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target hdfs uri, and then directly make 
an rpc call toward the right ANN.
 #  After each time client failovers, it need to write the latest Active 
NameNode index to the corresponding cache file based on the target hdfs uri.

Suppose there are hdfs://ns1 and hdfs://ns2, and the client own cache file 
directory is /tmp, then:
 # the ns1 cluster related cache file is /tmp/ns1
 # the ns2 cluster related cache file is /tmp/ns2


> Add DFS Client caching active namenode mechanism.
> -
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls from 
> the 1st namenode, simply polling until it finally determines the current Active 
> namenode.
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of frequent creation of 
> clients.
>  # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
> then a client starts rpc with the 1st NN, it will be silent when failover 
> from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd 
> NN, it prints some 

[jira] [Updated] (HDFS-14963) Add DFS Client caching active namenode mechanism.

2019-11-06 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Description: 
In multi-NameNodes scenery, a new hdfs client always begins a rpc call from the 
1st namenode, simply polls, and finally determines the current Active namenode. 

This brings at least two problems:
 # Extra failover consumption, especially in the case of frequent creation of 
clients.
 # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
then a client starts rpc with the 1st NN, it will be silent when failover from 
the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd NN, it 
prints some unnecessary logs, in some scenarios, these logs will be very 
numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution for this problem: in client machine, for every hdfs 
cluster, caching its current Active NameNode index in a separate cache file, so:
 #  When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target hdfs uri, and then directly make 
an rpc call toward the right ANN.
 #  After each time client failovers, it need to write the latest Active 
NameNode index to the corresponding cache file based on the target hdfs uri.

Suppose there are hdfs://ns1 and hdfs://ns2, and the client own cache file 
directory is /tmp, then:
 # the ns1 cluster related cache file is /tmp/ns1
 # the ns2 cluster related cache file is /tmp/ns2

  was:
In multi-NameNodes scenery, a new hdfs client always begins a rpc call from the 
1st namenode, simply polls, and finally determines the current Active namenode. 

This brings at least two problems:
 # Extra failover consumption, especially in the case of frequent creation of 
clients.
 # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
then a client starts rpc with the 1st NN, it will be silent when failover from 
the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd NN, it 
prints some unnecessary logs, in some scenarios, these logs will be very 
numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution for this problem: in client side, for every hdfs 
cluster, caching its current Active NameNode index in a separate cache file, so:
 #  When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target hdfs uri, and then directly make 
an rpc call toward the right ANN.
 #  After each time client failovers, it need to write the latest Active 
NameNode index to the corresponding cache file based on the target hdfs uri.

Suppose there are hdfs://ns1 and hdfs://ns2, and the client own cache file 
directory is /tmp, then:
 # the ns1 cluster related cache file is /tmp/ns1
 # the ns2 cluster related cache file is /tmp/ns2


> Add DFS Client caching active namenode mechanism.
> -
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls from 
> the 1st namenode, simply polling until it finally determines the current Active 
> namenode.
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of frequent creation of 
> clients.
>  # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
> then a client starts rpc with the 1st NN, it will be silent when failover 
> from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd 
> NN, it prints some unnecessary logs, in some 

[jira] [Updated] (HDFS-14963) Add DFS Client caching active namenode mechanism.

2019-11-06 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Description: 
In multi-NameNodes scenery, a new hdfs client always begins a rpc call from the 
1st namenode, simply polls, and finally determines the current Active namenode. 

This brings at least two problems:
 # Extra failover consumption, especially in the case of frequent creation of 
clients.
 # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
then a client starts rpc with the 1st NN, it will be silent when failover from 
the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd NN, it 
prints some unnecessary logs, in some scenarios, these logs will be very 
numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution for this problem: in client side, for every hdfs 
cluster, caching its current Active NameNode index in a separate cache file, so:
 #  When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target hdfs uri, and then directly make 
an rpc call toward the right ANN.
 #  After each time client failovers, it need to write the latest Active 
NameNode index to the corresponding cache file based on the target hdfs uri.

Suppose there are hdfs://ns1 and hdfs://ns2, and the client own cache file 
directory is /tmp, then:
 # the ns1 cluster related cache file is /tmp/ns1
 # the ns2 cluster related cache file is /tmp/ns2

  was:
In multi-NameNodes scenery, a new hdfs client always begins a rpc call from the 
1st namenode, simply polls, and finally determines the current Active namenode. 

This brings at least two problems:
 # Extra failover consumption, especially in the case of frequent startup of 
new client processes.
 #  Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
then a client starts rpc with the 1st NN, it will be silent when failover from 
the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd NN, it 
prints some unnecessary logs, in some scenarios, these logs will be very 
numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution for this problem: in client side, for every hdfs 
cluster, caching its current Active NameNode index in a separate cache file, so:
 #  When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target hdfs uri, and then directly make 
an rpc call toward the right ANN.
 #  After each time client failovers, it need to write the latest Active 
NameNode index to the corresponding cache file based on the target hdfs uri.

Suppose there are hdfs://ns1 and hdfs://ns2, and the client own cache file 
directory is /tmp, then:
 # the ns1 cluster related cache file is /tmp/ns1
 # the ns2 cluster related cache file is /tmp/ns2


> Add DFS Client caching active namenode mechanism.
> -
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls from 
> the 1st namenode, simply polling until it finally determines the current Active 
> namenode.
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of frequent creation of 
> clients.
>  # Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
> then a client starts rpc with the 1st NN, it will be silent when failover 
> from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd 
> NN, it prints some unnecessary logs, in 

[jira] [Updated] (HDFS-14963) Add DFS Client caching active namenode mechanism.

2019-11-06 Thread Xudong Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14963:
--
Description: 
In multi-NameNodes scenery, a new hdfs client always begins a rpc call from the 
1st namenode, simply polls, and finally determines the current Active namenode. 

This brings at least two problems:
 # Extra failover consumption, especially in the case of frequent startup of 
new client processes.
 #  Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
then a client starts rpc with the 1st NN, it will be silent when failover from 
the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd NN, it 
prints some unnecessary logs, in some scenarios, these logs will be very 
numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution for this problem: in client side, for every hdfs 
cluster, caching its current Active NameNode index in a separate cache file, so:
 #  When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target hdfs uri, and then directly make 
an rpc call toward the right ANN.
 #  After each time client failovers, it need to write the latest Active 
NameNode index to the corresponding cache file based on the target hdfs uri.

Suppose there are hdfs://ns1 and hdfs://ns2, and the client own cache file 
directory is /tmp, then:
 # the ns1 cluster related cache file is /tmp/ns1
 # the ns2 cluster related cache file is /tmp/ns2

  was:
In multi-NameNodes scenery, hdfs client always begins a rpc call from the 1st 
namenode, simply polls, and finally determines the current Active namenode. 

This brings at least two problems:
 # Extra failover consumption, especially in the case of frequent startup of 
new client processes.
 #  Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
then a client starts rpc with the 1st NN, it will be silent when failover from 
the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd NN, it 
prints some unnecessary logs, in some scenarios, these logs will be very 
numerous:

{code:java}
2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): 
Operation category READ is not supported in state standby. Visit 
https://s.apache.org/sbnn-error
 at 
org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459)
 ...{code}
We can introduce a solution for this problem: in client side, for every hdfs 
cluster, caching its current Active NameNode index in a separate cache file, so:
 #  When a client starts, it reads the current Active NameNode index from the 
corresponding cache file based on the target hdfs uri, and then directly make 
an rpc call toward the right ANN.
 #  After each time client failovers, it need to write the latest Active 
NameNode index to the corresponding cache file based on the target hdfs uri.

Suppose there are hdfs://ns1 and hdfs://ns2, and the client own cache file 
directory is /tmp, then:
 # the ns1 cluster related cache file is /tmp/ns1
 # the ns2 cluster related cache file is /tmp/ns2


> Add DFS Client caching active namenode mechanism.
> -
>
> Key: HDFS-14963
> URL: https://issues.apache.org/jira/browse/HDFS-14963
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.1.3
>Reporter: Xudong Cao
>Assignee: Xudong Cao
>Priority: Minor
>
> In a multi-NameNode scenario, a new hdfs client always begins its rpc calls from 
> the 1st namenode, simply polling until it finally determines the current Active 
> namenode.
> This brings at least two problems:
>  # Extra failover consumption, especially in the case of frequent startup of 
> new client processes.
>  #  Unnecessary log printing, suppose there are 3 NNs and the 3rd is ANN, and 
> then a client starts rpc with the 1st NN, it will be silent when failover 
> from the 1st NN to the 2nd NN, but when failover from the 2nd NN to the 3rd 
> NN, it prints some 
