[jira] [Commented] (HDFS-9665) Clients are easily affected by standby namenode

2016-01-19 Thread He Tianyi (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-9665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107854#comment-15107854 ]

He Tianyi commented on HDFS-9665:
-

Also, I observed that {{doTailEdits}} may acquire the write lock on the standby 
NameNode and hold it for several seconds in my cluster.
This causes the first request from a client to block as well.
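
The pattern, roughly, is that the tailer thread holds the namesystem write lock for the 
whole batch of edits while the first client RPC waits behind it. The sketch below 
illustrates this with a plain {{ReentrantReadWriteLock}} and made-up thread names; it is 
not the actual {{EditLogTailer}}/{{FSNamesystem}} code, just the locking shape.

{code:java}
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class TailerLockSketch {
    // Stand-in for the namesystem lock; all names here are illustrative only.
    private static final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock(true);

    public static void main(String[] args) throws InterruptedException {
        // "Tailer" thread: holds the write lock while a large batch of edits is applied.
        Thread tailer = new Thread(() -> {
            fsLock.writeLock().lock();
            try {
                Thread.sleep(5000); // simulate several seconds of edit replay
            } catch (InterruptedException ignored) {
            } finally {
                fsLock.writeLock().unlock();
            }
        });

        // "RPC handler" thread: the first client read blocks until the write lock is released.
        Thread handler = new Thread(() -> {
            long start = System.nanoTime();
            fsLock.readLock().lock();
            try {
                long waitedMs = (System.nanoTime() - start) / 1_000_000;
                System.out.println("first read served after waiting " + waitedMs + " ms");
            } finally {
                fsLock.readLock().unlock();
            }
        });

        tailer.start();
        Thread.sleep(100); // let the tailer grab the write lock first
        handler.start();
        tailer.join();
        handler.join();
    }
}
{code}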

> Clients are easily affected by standby namenode
> ---
>
> Key: HDFS-9665
> URL: https://issues.apache.org/jira/browse/HDFS-9665
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: hdfs-client, namenode
>Affects Versions: 2.6.0
>Reporter: He Tianyi
>Assignee: He Tianyi
>Priority: Minor
>
> In my case, while the standby NameNode is restarting, there is a chance that a 
> {{hadoop fs}} command hangs until either the IPC timeout is reached or a 
> {{StandbyException}} is received, and only then fails over to the active NameNode.
> Normally, the duration of this 'hung stage' depends on 
> {{min(timeout_configuration, rpc_queue_time)}}; a sketch of the relevant client-side 
> settings follows this quote. However, the RPC queue on the standby NameNode is usually 
> filled with block reports during this period, so client requests can't get processed 
> quickly.
> I wish to get rid of this by one of the following approaches:
> a) distinguish priorities in the RPC queue (at the risk of causing starvation)
> b) speculate on the first request: send it to both NameNodes and take the one valid 
> response
> c) make the client aware of the HA state (probably by accessing ZK, at the risk of 
> causing a performance issue)
> Any suggestions or comments?
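
For reference, here is a minimal sketch of the client-side knobs that the 'hung stage' 
above depends on, expressed with {{org.apache.hadoop.conf.Configuration}}. The 
nameservice id, NameNode ids, hostnames, and timeout values are placeholders chosen for 
illustration, not recommended settings.

{code:java}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class HaClientTimeoutSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Hypothetical HA nameservice layout ("mycluster" and the nn1/nn2 hosts are placeholders).
        conf.set("dfs.nameservices", "mycluster");
        conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
        conf.set("dfs.namenode.rpc-address.mycluster.nn1", "nn1.example.com:8020");
        conf.set("dfs.namenode.rpc-address.mycluster.nn2", "nn2.example.com:8020");
        conf.set("dfs.client.failover.proxy.provider.mycluster",
            "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

        // Knobs bounding the 'hung stage': an overall RPC timeout so a call stuck behind
        // block reports in the standby's queue fails over sooner, plus the failover retries.
        conf.setInt("ipc.client.rpc-timeout.ms", 15000);
        conf.setInt("dfs.client.failover.max.attempts", 15);
        conf.setInt("dfs.client.failover.sleep.base.millis", 500);

        try (FileSystem fs = FileSystem.get(URI.create("hdfs://mycluster"), conf)) {
            System.out.println("home directory: " + fs.getHomeDirectory());
        }
    }
}
{code}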





[jira] [Commented] (HDFS-9665) Clients are easily affected by standby namenode

2016-01-19 Thread He Tianyi (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-9665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107896#comment-15107896 ]

He Tianyi commented on HDFS-9665:
-

Thanks. That patch worked well.

Closing the issue.



[jira] [Commented] (HDFS-9665) Clients are easily affected by standby namenode

2016-01-19 Thread Jing Zhao (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-9665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107858#comment-15107858 ]

Jing Zhao commented on HDFS-9665:
-

[~He Tianyi], maybe you can try to use {{RequestHedgingProxyProvider}} from 
HDFS-7858?
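
For anyone reaching this thread later: switching to the hedging provider is a 
client-side change of the failover proxy provider for the nameservice. A minimal 
sketch, assuming a nameservice named {{mycluster}} (the id and URI are placeholders):

{code:java}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HedgedClientSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Send the first call to both NameNodes concurrently and keep the first
        // successful (non-standby) response; "mycluster" is a placeholder nameservice id.
        conf.set("dfs.client.failover.proxy.provider.mycluster",
            "org.apache.hadoop.hdfs.server.namenode.ha.RequestHedgingProxyProvider");

        try (FileSystem fs = FileSystem.get(URI.create("hdfs://mycluster"), conf)) {
            // The first RPC no longer waits on a standby whose queue is full of block reports.
            System.out.println("root exists: " + fs.exists(new Path("/")));
        }
    }
}
{code}

The same property can of course be set in hdfs-site.xml instead of in code.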




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)