[ https://issues.apache.org/jira/browse/HDFS-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443791#comment-13443791 ]
Vinay commented on HDFS-1490:
-----------------------------

{quote}Why not introduce a new config which defaults to something like 1 minute?{quote}
Ok, agreed. I will introduce a new config for this.

{quote}In the test case, shouldn't you somehow notify the servlet to exit? Currently it waits on itself, but nothing notifies it.{quote}
That wait was added only to make the client call time out. Ideally, it will be interrupted when the server is stopped. Anyway, I will add a timeout for that as well.

Thanks, Todd, for the comments. I will post a new patch shortly.

> TransferFSImage should timeout
> ------------------------------
>
>                 Key: HDFS-1490
>                 URL: https://issues.apache.org/jira/browse/HDFS-1490
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>            Reporter: Dmytro Molkov
>            Assignee: Dmytro Molkov
>            Priority: Minor
>         Attachments: HDFS-1490.patch, HDFS-1490.patch
>
>
> Sometimes, when the primary crashes during image transfer, the secondary namenode
> hangs forever trying to read the image from the HTTP connection.
> It would be great to set timeouts on the connection so that, if something like that
> happens, there is no need to restart the secondary itself.
> In our case, restarting components is handled by a set of scripts, and since
> the Secondary process is still running, it just stays hung until we get
> an alarm saying that checkpointing is not happening.
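
For illustration only, here is a minimal sketch of the kind of timeout discussed above. It is not taken from the attached patch. The class name, the `/image` path, and the `DEFAULT_TIMEOUT_MS` constant are hypothetical, and the one-minute value simply follows the suggestion in the review comment. The sketch uses only `java.net.HttpURLConnection` and simulates a hung primary with a local socket that never sends a response.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.net.URL;

class ImageTransferTimeoutSketch {
    // Hypothetical default of one minute, as suggested in the review comment.
    static final int DEFAULT_TIMEOUT_MS = 60_000;

    /** Opens an HTTP connection with both connect and read timeouts applied. */
    static HttpURLConnection openWithTimeout(URL url, int timeoutMs) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(timeoutMs); // bound the time spent establishing the connection
        conn.setReadTimeout(timeoutMs);    // bound each blocking read on the response
        return conn;
    }

    /** Returns true if a read from a server that never responds times out. */
    static boolean readTimesOut(int timeoutMs) throws IOException {
        // The listening socket completes the TCP handshake but never sends a
        // response, which stands in for a primary that hangs mid-transfer.
        try (ServerSocket silent = new ServerSocket(0)) {
            // "/image" is a placeholder path, not the real servlet path.
            URL url = URI.create("http://127.0.0.1:" + silent.getLocalPort() + "/image").toURL();
            HttpURLConnection conn = openWithTimeout(url, timeoutMs);
            try (InputStream in = conn.getInputStream()) {
                in.read();
                return false; // a response arrived, so no timeout occurred
            } catch (SocketTimeoutException e) {
                return true;  // the read gave up instead of blocking forever
            } finally {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // A short timeout keeps the demonstration quick.
        System.out.println("timed out: " + readTimesOut(500));
    }
}
```

Without the `setReadTimeout` call, the read in this sketch would block until the socket is closed, which matches the hang described in the issue.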