[jira] [Commented] (HDFS-6089) Standby NN while transitioning to active throws a connection refused error when the prior active NN process is suspended

Jing Zhao (JIRA) Tue, 18 Mar 2014 14:47:33 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939848#comment-13939848
 ]


Jing Zhao commented on HDFS-6089:
---------------------------------

Thanks for the response, Andrew. 

bq. If we add a time threshold (like the tailer), we want to avoid the reverse 
problem: a lot of small segments accumulating in the absence of a standby.

Could you please explain how the current strategy avoids this issue?
For the auto-roller in the ANN, I think it should still decide whether to roll 
based on the number of edits. However, we should reduce its sleep interval from 
5 minutes to something smaller (e.g., 2 minutes), so that it checks the edit 
count every 2 minutes and rolls the edit log if necessary. Would this address 
your concern? Or am I missing something here?
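
To make the proposal concrete, here is a minimal sketch of a count-based check 
that runs on a short interval. The class and method names (AutoRollerSketch, 
logEdits, checkAndRoll) and the threshold value are hypothetical and are not 
taken from the NameNode code or from the attached patches. The periodic timer 
is simulated by calling the check directly.

```java
/**
 * Hypothetical sketch of a count-based edit log auto-roller.
 * Not the actual NameNode implementation; all names are illustrative.
 */
class AutoRollerSketch {
    private final long editsThreshold;   // assumed threshold, in number of edits
    private long editsSinceLastRoll = 0;
    private long segmentsRolled = 0;

    AutoRollerSketch(long editsThreshold) {
        this.editsThreshold = editsThreshold;
    }

    /** Record that n edits were written to the current segment. */
    synchronized void logEdits(long n) {
        editsSinceLastRoll += n;
    }

    /**
     * One periodic check (e.g. every 2 minutes instead of every 5).
     * Rolls only when enough edits have accumulated, so an idle
     * interval never produces a new small segment.
     */
    synchronized boolean checkAndRoll() {
        if (editsSinceLastRoll < editsThreshold) {
            return false;
        }
        editsSinceLastRoll = 0;
        segmentsRolled++;
        return true;
    }

    synchronized long segmentsRolled() {
        return segmentsRolled;
    }

    public static void main(String[] args) {
        AutoRollerSketch roller = new AutoRollerSketch(1000);
        // Edits arriving between three consecutive checks.
        long[] editsPerInterval = {0, 400, 700};
        for (long edits : editsPerInterval) {
            roller.logEdits(edits);
            System.out.println("new edits=" + edits
                + " rolled=" + roller.checkAndRoll());
        }
    }
}
```

Because the decision depends only on the edit count, shortening the interval 
changes how soon a large segment is noticed, not how many segments are created 
while the namespace is idle.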

> Standby NN while transitioning to active throws a connection refused error 
> when the prior active NN process is suspended
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6089
>                 URL: https://issues.apache.org/jira/browse/HDFS-6089
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.4.0
>            Reporter: Arpit Gupta
>            Assignee: Jing Zhao
>         Attachments: HDFS-6089.000.patch, HDFS-6089.001.patch
>
>
> The following scenario was tested:
> * Determine Active NN and suspend the process (kill -19)
> * Wait about 60s to let the standby transition to active
> * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to 
> active.
> What was noticed was that sometimes the call to get the service state of nn2 
> got a socket timeout exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
