[
https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939848#comment-13939848
]
Jing Zhao commented on HDFS-6089:
---------------------------------
Thanks for the response, Andrew.
bq. If we add a time threshold (like the tailer), we want to avoid the reverse
problem: a lot of small segments accumulating in the absence of a standby.
Could you please explain how we avoid this issue with the current strategy?
For the autoroller in the active NN (ANN), I think it should still decide
whether to roll based on the number of edits. However, we should reduce its
sleep interval from 5min to a smaller value (e.g., 2min), so that it checks the
number of edits every 2min and rolls the edit log if necessary. Would this
address your concern? Or am I missing something here?
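
To make the proposal concrete, here is a minimal sketch of the check described
above. It is an illustration only, not the actual NameNode code: the class name,
the threshold, and the two callbacks are hypothetical stand-ins for whatever
the ANN uses to count edits in the open segment and to roll the edit log. The
point is that the decision stays count-based, and only the check interval
changes.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a count-based edit log autoroller. */
public class EditLogAutoRollerSketch {
    // Roll once the open segment holds more than this many edits (assumed value).
    private final long editThreshold;
    // Assumed hook: returns the number of edits in the currently open segment.
    private final LongSupplier openSegmentEdits;
    // Assumed hook: finalizes the open segment and starts a new one.
    private final Runnable rollAction;

    public EditLogAutoRollerSketch(long editThreshold,
                                   LongSupplier openSegmentEdits,
                                   Runnable rollAction) {
        this.editThreshold = editThreshold;
        this.openSegmentEdits = openSegmentEdits;
        this.rollAction = rollAction;
    }

    /** One periodic check: roll only if the edit count exceeds the threshold. */
    public boolean checkOnce() {
        if (openSegmentEdits.getAsLong() > editThreshold) {
            rollAction.run();
            return true;
        }
        // Few edits: do nothing, so an idle NN does not pile up small segments.
        return false;
    }

    /** Runs checkOnce() every checkIntervalSeconds (e.g., 120 instead of 300). */
    public ScheduledExecutorService start(long checkIntervalSeconds) {
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(
            () -> { checkOnce(); },
            checkIntervalSeconds, checkIntervalSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

With this shape, shortening the interval only affects how quickly a large
segment is noticed. A check that finds few edits is a no-op, so it should not by
itself create extra small segments.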
> Standby NN while transitioning to active throws a connection refused error
> when the prior active NN process is suspended
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-6089
> URL: https://issues.apache.org/jira/browse/HDFS-6089
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: ha
> Affects Versions: 2.4.0
> Reporter: Arpit Gupta
> Assignee: Jing Zhao
> Attachments: HDFS-6089.000.patch, HDFS-6089.001.patch
>
>
> The following scenario was tested:
> * Determine Active NN and suspend the process (kill -19)
> * Wait about 60s to let the standby transition to active
> * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to
> active.
> What was noticed was that sometimes the call to get the service state of nn2
> got a socket timeout exception.