[ https://issues.apache.org/jira/browse/KAFKA-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128653#comment-14128653 ]

Chris Cope edited comment on KAFKA-1501 at 9/10/14 4:10 PM:
------------------------------------------------------------

I agree, [~absingh]. I'm running some more tests, and I think the best way to handle this unlikely event is to catch it specifically, rerun the entire test class *one* time, and note this in the test log. This bug does not affect the core Kafka code; it is simply exposed here because Kafka has such great unit tests, and we just happen to run them A LOT for our purposes. I'm proposing this solution instead of hunting down and fixing the underlying issue in choosePorts(), which, judging by other projects, does seem like a decent implementation. The probability of a test class failing twice in a row should be very low (<0.0001%), so fewer than 1% of `./gradlew test` runs should see any test class fail. Is this approach sound?

  was (Author: copester):
    I agree, [~absingh]. I'm running some more tests and I think the best way to handle this unlikely event is to catch is specifically, and then have it rerun the entire test class *one* time, and noting this in the test log. This bug does not affect the core Kafka code, and is simple exposed here because Kafka has such great unit tests, and we just happen to run them A LOT of our purposes. I'm proposing this solution instead of hunting and fixing the underlying issue in choosePorts(), which when looking around at other projects does seem like a decent implementation. The probability of a test class failing twice in a row should be very low (<0.0001%) and should result in any test class failure less than 1% of the time `./gradlew test` is run. Is this approach sound?
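
To make the rerun-once proposal concrete, here is a minimal Java sketch. It is an illustration only, not Kafka's test code and not a Gradle or JUnit API. The names `RetryOnce`, `isPortInUse` and `runWithOneRetry` are hypothetical, and the sketch assumes the bind failure surfaces as a runtime exception with a `java.net.BindException` somewhere in its cause chain.

```java
import java.net.BindException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class RetryOnce {

    /** True if t, or any exception in its cause chain, is a BindException. */
    public static boolean isPortInUse(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof BindException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs body once. If it fails with a port-in-use error, records the
     * event in log and runs body exactly one more time. Any other failure,
     * or a failure on the second attempt, propagates to the caller.
     */
    public static <T> T runWithOneRetry(Supplier<T> body, List<String> log) {
        try {
            return body.get();
        } catch (RuntimeException e) {
            if (!isPortInUse(e)) {
                throw e; // unrelated failure: do not mask it
            }
            log.add("Port already in use, rerunning once: " + e.getMessage());
            return body.get();
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        int[] attempts = {0};
        // Simulated test body: the first attempt hits a bind failure.
        String result = runWithOneRetry(() -> {
            if (attempts[0]++ == 0) {
                throw new RuntimeException("bind failed",
                        new BindException("Address already in use"));
            }
            return "passed";
        }, log);
        System.out.println(result + " after " + attempts[0] + " attempts");
        System.out.println(log);
    }
}
```

If a single test class fails with probability p and the two attempts are independent, both fail with probability p^2, so the <0.0001% figure above corresponds to p below 0.1%.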

> transient unit tests failures due to port already in use
> --------------------------------------------------------
>
>                 Key: KAFKA-1501
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1501
>             Project: Kafka
>          Issue Type: Improvement
>          Components: core
>            Reporter: Jun Rao
>              Labels: newbie
>
> Saw the following transient failures.
> kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckOne FAILED
>     kafka.common.KafkaException: Socket server failed to bind to localhost:59909: Address already in use.
>         at kafka.network.Acceptor.openServerSocket(SocketServer.scala:195)
>         at kafka.network.Acceptor.<init>(SocketServer.scala:141)
>         at kafka.network.SocketServer.startup(SocketServer.scala:68)
>         at kafka.server.KafkaServer.startup(KafkaServer.scala:95)
>         at kafka.utils.TestUtils$.createServer(TestUtils.scala:123)
>         at kafka.api.ProducerFailureHandlingTest.setUp(ProducerFailureHandlingTest.scala:68)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)