[ https://issues.apache.org/jira/browse/HDFS-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939848#comment-13939848 ]
Jing Zhao commented on HDFS-6089:
---------------------------------

Thanks for the response, Andrew.

bq. If we add a time threshold (like the tailer), we want to avoid the reverse problem: a lot of small segments accumulating in the absence of a standby.

Could you please explain how we avoid this issue with the current strategy? For the autoroller in the ANN, I guess it should still decide whether to roll based on the number of edits. However, we should reduce its sleep interval from 5 min to a smaller value (e.g., 2 min), so that it checks the number of edits every 2 min and rolls the edits if necessary. Would this address your concern, or am I missing something here?

> Standby NN while transitioning to active throws a connection refused error
> when the prior active NN process is suspended
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6089
>                 URL: https://issues.apache.org/jira/browse/HDFS-6089
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.4.0
>            Reporter: Arpit Gupta
>            Assignee: Jing Zhao
>         Attachments: HDFS-6089.000.patch, HDFS-6089.001.patch
>
>
> The following scenario was tested:
> * Determine the Active NN and suspend the process (kill -19)
> * Wait about 60s to let the standby transition to active
> * Get the service state for nn1 and nn2 and make sure nn2 has transitioned to
> active.
> What was noticed is that sometimes the call to get the service state of nn2 got
> a socket timeout exception.

--
This message was sent by Atlassian JIRA
(v6.2#6252)