Todd Lipcon created HDFS-3894:
---------------------------------

             Summary: QJM: testRecoverAfterDoubleFailures can be flaky due to IPC client caching
                 Key: HDFS-3894
                 URL: https://issues.apache.org/jira/browse/HDFS-3894
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: test
    Affects Versions: QuorumJournalManager (HDFS-3077)
            Reporter: Todd Lipcon
            Assignee: Todd Lipcon

TestQJMWithFaults.testRecoverAfterDoubleFailures fails very occasionally. Looking into it, the issue seems to be that, by random chance, an IPC server port can be reused between two different iterations of the test loop. The client then picks up and reuses the cached IPC connection to the old server. However, the old server was shut down and a new one started on the same port, so the cached connection is stale (i.e. disconnected). This causes the new client to get an EOF when it sends the "format()" call.
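
The failure mode described above can be illustrated with a toy model. This is only a sketch, not Hadoop's IPC client code: FakeServer, FakeConnection, CachingClient, run_iterations, and the port number 12345 are all hypothetical names invented for the illustration, and it assumes the client caches connections keyed by address only. Evicting the cache entry between iterations is shown as one possible remedy, not necessarily the fix adopted for this issue.

```python
class FakeServer:
    """Hypothetical stand-in for an IPC server bound to a port."""

    def __init__(self, port):
        self.port = port
        self.running = True

    def shutdown(self):
        self.running = False


class FakeConnection:
    """Hypothetical connection tied to the server instance it was opened against."""

    def __init__(self, server):
        self.server = server

    def call(self, method):
        # A connection to a server that has since shut down is stale.
        if not self.server.running:
            raise EOFError(f"{method}(): stale connection to port {self.server.port}")
        return f"{method} ok"


class CachingClient:
    """Hypothetical client that caches connections keyed only by (host, port)."""

    def __init__(self):
        self._cache = {}

    def connect(self, server):
        key = ("localhost", server.port)
        if key not in self._cache:
            self._cache[key] = FakeConnection(server)
        # On a cache hit, this may be a connection to an older server instance.
        return self._cache[key]

    def evict(self, port):
        self._cache.pop(("localhost", port), None)


def run_iterations(client, evict_between, port=12345):
    """Run two test-loop iterations that happen to land on the same port."""
    results = []
    for _ in range(2):
        server = FakeServer(port)
        try:
            results.append(client.connect(server).call("format"))
        except EOFError:
            results.append("EOF")
        server.shutdown()
        if evict_between:
            client.evict(port)
    return results


if __name__ == "__main__":
    print(run_iterations(CachingClient(), evict_between=False))
    print(run_iterations(CachingClient(), evict_between=True))
```

Without eviction, the second iteration picks up the cached connection to the first (now shut down) server and the simulated format() call fails with EOF. With eviction, the second iteration opens a fresh connection and succeeds.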