[jira] [Comment Edited] (KAFKA-1501) transient unit tests failures due to port already in use

2015-04-04 Thread Gwen Shapira (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14396022#comment-14396022
 ] 

Gwen Shapira edited comment on KAFKA-1501 at 4/5/15 1:04 AM:
-

Not to take away from this awesome improvement, but is there a reason 
ProducerFailureHandlingTest.testBrokerFailure() was removed?

(I mean, other than the fact that it consistently fails or hangs after 
this patch... auto-merge left the test in and I found out the hard way. 
I admit I failed to find out why producers hang on lack of memory after this 
patch is applied, and I'm a bit concerned.)


was (Author: gwenshap):
Not to take from this awesome improvement, but any reason 
ProducerFailureHandlingTest.testBrokerFailure() was removed?

> transient unit tests failures due to port already in use
> 
>
> Key: KAFKA-1501
> URL: https://issues.apache.org/jira/browse/KAFKA-1501
> Project: Kafka
> Issue Type: Improvement
> Components: core
> Reporter: Jun Rao
> Assignee: Ewen Cheslack-Postava
> Labels: newbie
> Attachments: KAFKA-1501-choosePorts.patch, KAFKA-1501.patch, 
> KAFKA-1501.patch, KAFKA-1501.patch, KAFKA-1501.patch, 
> KAFKA-1501_2015-03-09_11:41:07.patch, KAFKA-1501_2015-03-25_00:44:50.patch, 
> test-100.out, test-100.out, test-27.out, test-29.out, test-32.out, 
> test-35.out, test-38.out, test-4.out, test-42.out, test-45.out, test-46.out, 
> test-51.out, test-55.out, test-58.out, test-59.out, test-60.out, test-69.out, 
> test-72.out, test-74.out, test-76.out, test-84.out, test-87.out, test-91.out, 
> test-92.out
>
>
> Saw the following transient failures.
> kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckOne FAILED
> kafka.common.KafkaException: Socket server failed to bind to 
> localhost:59909: Address already in use.
> at kafka.network.Acceptor.openServerSocket(SocketServer.scala:195)
> at kafka.network.Acceptor.<init>(SocketServer.scala:141)
> at kafka.network.SocketServer.startup(SocketServer.scala:68)
> at kafka.server.KafkaServer.startup(KafkaServer.scala:95)
> at kafka.utils.TestUtils$.createServer(TestUtils.scala:123)
> at kafka.api.ProducerFailureHandlingTest.setUp(ProducerFailureHandlingTest.scala:68)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (KAFKA-1501) transient unit tests failures due to port already in use

2014-12-05 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14236147#comment-14236147
 ] 

Jay Kreps edited comment on KAFKA-1501 at 12/5/14 9:29 PM:
---

Yeah, the Stack Overflow question I linked to previously indicates that as of 
Java 7 the only really reliable way to check is to try to create a socket and 
see if that works. That should be a super non-invasive change too--just 
updating the implementation of choosePorts.

http://stackoverflow.com/questions/434718/sockets-discover-port-availability-using-java

They give a checkAvailable method something like the below. So the new approach 
would be to choose a random port in some range, and then check that it is 
available.

{code}
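import java.io.IOException;
import java.net.Socket;

// Treats a port as available when nothing accepts a connection on it.
private static boolean checkAvailable(int port) {
    Socket s = null;
    try {
        // A successful connect means something is already listening.
        s = new Socket("localhost", port);
        return false;
    } catch (IOException e) {
        // Connection refused (or another I/O failure): assume the port is free.
        return true;
    } finally {
        if (s != null) {
            try {
                s.close();
            } catch (IOException e) {
                throw new RuntimeException("You should handle this error.", e);
            }
        }
    }
}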
{code}
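
A minimal sketch of the "pick a random port in some range, then check it" idea, for illustration only. It is not the actual choosePorts implementation: the class name PortChooser, the 20000-60000 range, and the maxAttempts parameter are all assumptions. It also swaps in a different check from the one above: it tries to bind a ServerSocket on the loopback address, rather than connecting as a client.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper; the name and the port range are assumptions, not Kafka code.
final class PortChooser {
    static final int MIN_PORT = 20000;
    static final int MAX_PORT = 60000;

    // True if a listening socket can be bound to the port on loopback right now.
    static boolean canBind(int port) {
        try (ServerSocket s = new ServerSocket(port, 1, InetAddress.getLoopbackAddress())) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Picks `count` distinct random ports that were bindable when checked.
    static List<Integer> choosePorts(int count, int maxAttempts) {
        List<Integer> ports = new ArrayList<>();
        for (int i = 0; i < maxAttempts && ports.size() < count; i++) {
            int candidate = ThreadLocalRandom.current().nextInt(MIN_PORT, MAX_PORT + 1);
            if (!ports.contains(candidate) && canBind(candidate)) {
                ports.add(candidate);
            }
        }
        if (ports.size() < count) {
            throw new IllegalStateException("Could not find " + count + " free ports");
        }
        return ports;
    }

    public static void main(String[] args) {
        System.out.println("Chosen ports: " + choosePorts(3, 100));
    }
}
```

Note that any check-then-use scheme still leaves a window in which another process can grab the port before the broker binds to it. Another option, where the test can learn the port after startup, is to bind to port 0 and let the OS assign a free ephemeral port.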


was (Author: jkreps):
Yeah that Stack Overflow article I linked to previously indicates that as of 
Java 7 the only really reliable way to check is to try to create a socket and 
see if that works. That should be a super non-invasive change too--just 
updating the implementation for choosePorts.

http://stackoverflow.com/questions/434718/sockets-discover-port-availability-using-java

They give a checkAvailable method something like the below. So the new approach 
would be to choose a random port in some range, and then check that it is 
available.

{code}
private static boolean checkAvailable(int port) {
    Socket s = null;
    try {
        s = new Socket("localhost", port);
        return false;
    } catch (IOException e) {
        System.out.println("--Port " + port + " is available");
        return true;
    } finally {
        if (s != null) {
            try {
                s.close();
            } catch (IOException e) {
                throw new RuntimeException("You should handle this error.", e);
            }
        }
    }
}
{code}

> transient unit tests failures due to port already in use
> 
>
> Key: KAFKA-1501
> URL: https://issues.apache.org/jira/browse/KAFKA-1501
> Project: Kafka
> Issue Type: Improvement
> Components: core
> Reporter: Jun Rao
> Assignee: Guozhang Wang
> Labels: newbie
> Attachments: KAFKA-1501-choosePorts.patch, KAFKA-1501.patch, 
> KAFKA-1501.patch, KAFKA-1501.patch, test-100.out, test-100.out, test-27.out, 
> test-29.out, test-32.out, test-35.out, test-38.out, test-4.out, test-42.out, 
> test-45.out, test-46.out, test-51.out, test-55.out, test-58.out, test-59.out, 
> test-60.out, test-69.out, test-72.out, test-74.out, test-76.out, test-84.out, 
> test-87.out, test-91.out, test-92.out
>
>
> Saw the following transient failures.
> kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckOne FAILED
> kafka.common.KafkaException: Socket server failed to bind to 
> localhost:59909: Address already in use.
> at kafka.network.Acceptor.openServerSocket(SocketServer.scala:195)
> at kafka.network.Acceptor.<init>(SocketServer.scala:141)
> at kafka.network.SocketServer.startup(SocketServer.scala:68)
> at kafka.server.KafkaServer.startup(KafkaServer.scala:95)
> at kafka.utils.TestUtils$.createServer(TestUtils.scala:123)
> at kafka.api.ProducerFailureHandlingTest.setUp(ProducerFailureHandlingTest.scala:68)





[jira] [Comment Edited] (KAFKA-1501) transient unit tests failures due to port already in use

2014-09-10 Thread Chris Cope (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14128653#comment-14128653
 ] 

Chris Cope edited comment on KAFKA-1501 at 9/10/14 4:10 PM:


I agree, [~absingh]. I'm running some more tests, and I think the best way to 
handle this unlikely event is to catch it specifically, rerun the entire test 
class *one* time, and note the rerun in the test log. This bug 
does not affect the core Kafka code; it is simply exposed here because Kafka 
has such great unit tests, and we just happen to run them A LOT for our 
purposes. I'm proposing this solution instead of hunting down and fixing the 
underlying issue in choosePorts(), which, compared with what other projects do, 
does seem like a decent implementation.
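
A rough sketch of the "catch it specifically, rerun once, log it" idea, under stated assumptions. RetryOnce, runWithOneRetry, and isPortInUse are hypothetical names, not part of Kafka or of any test framework, and a real version would have to hook into however the build runs a test class. The failure is recognized either as a java.net.BindException or by the "Address already in use" text that appears in the reported KafkaException message.

```java
import java.net.BindException;
import java.util.concurrent.Callable;

// Hypothetical wrapper; names are illustrative only.
final class RetryOnce {

    // Walks the cause chain looking for the port-collision signature.
    static boolean isPortInUse(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof BindException) {
                return true;
            }
            String msg = c.getMessage();
            if (msg != null && msg.contains("Address already in use")) {
                return true;
            }
        }
        return false;
    }

    // Runs body; on a port collision, logs it and reruns exactly one more time.
    // Any other failure, or a second failure of any kind, propagates unchanged.
    static <T> T runWithOneRetry(String name, Callable<T> body) throws Exception {
        try {
            return body.call();
        } catch (Exception first) {
            if (!isPortInUse(first)) {
                throw first;
            }
            System.err.println("[retry] " + name + " hit 'Address already in use'; rerunning once");
            return body.call();
        }
    }

    public static void main(String[] args) throws Exception {
        final int[] attempts = {0};
        String result = runWithOneRetry("DemoTest", () -> {
            if (attempts[0]++ == 0) {
                throw new BindException("Address already in use");
            }
            return "passed";
        });
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

Limiting the retry to this one failure signature keeps genuine test failures visible, since they are never rerun.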

The probability of a test class failing twice in a row should be very low 
(<0.0001%), so any test class failure should occur in less than 1% of 
`./gradlew test` runs.
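
The arithmetic behind these figures can be checked with a small sketch. The inputs are assumptions, not measurements: a per-class collision probability of 0.1% (chosen because 0.001 squared is 0.0001%, the bound quoted above), independence between the first run and the rerun and between classes, and a hypothetical count of 500 test classes.

```java
// Hypothetical back-of-the-envelope check; all inputs are assumed values.
final class RetryOdds {

    // Probability that one class fails on both the first run and the rerun.
    static double failTwice(double p) {
        return p * p;
    }

    // Probability that at least one of n classes fails twice in a single build.
    static double anyFailsTwice(double p, int n) {
        return 1.0 - Math.pow(1.0 - failTwice(p), n);
    }

    public static void main(String[] args) {
        double p = 0.001;  // assumed per-class failure probability (0.1%)
        int classes = 500; // assumed number of test classes
        System.out.printf("one class fails twice: %.6f%%%n", 100 * failTwice(p));
        System.out.printf("any class fails twice: %.4f%%%n", 100 * anyFailsTwice(p, classes));
    }
}
```

Under these assumptions the per-build figure is roughly 0.05%, and it stays below 1% even at 10,000 test classes. If collisions are correlated (for example, when many builds share one machine), the independence assumption breaks down and the real rate could be higher.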

Is this approach sound?



was (Author: copester):
I agree, [~absingh]. I'm running some more tests and I think the best way to 
handle this unlikely event is to catch is specifically, and then have it rerun 
the entire test class *one* time, and noting this in the test log. This bug 
does not affect the core Kafka code, and is simple exposed here because Kafka 
has such great unit tests, and we just happen to run them A LOT of our 
purposes. I'm proposing this solution instead of hunting and fixing the 
underlying issue in choosePorts(), which when looking around at other projects 
does seem like a decent implementation.

The probability of a test class failing twice in a row should be very low 
(<0.0001%) and should result in any test class failure less than 1% of the time 
`./gradlew test` is run.

Is this approach sound?


> transient unit tests failures due to port already in use
> 
>
> Key: KAFKA-1501
> URL: https://issues.apache.org/jira/browse/KAFKA-1501
> Project: Kafka
> Issue Type: Improvement
> Components: core
> Reporter: Jun Rao
> Labels: newbie
>
> Saw the following transient failures.
> kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckOne FAILED
> kafka.common.KafkaException: Socket server failed to bind to 
> localhost:59909: Address already in use.
> at kafka.network.Acceptor.openServerSocket(SocketServer.scala:195)
> at kafka.network.Acceptor.<init>(SocketServer.scala:141)
> at kafka.network.SocketServer.startup(SocketServer.scala:68)
> at kafka.server.KafkaServer.startup(KafkaServer.scala:95)
> at kafka.utils.TestUtils$.createServer(TestUtils.scala:123)
> at kafka.api.ProducerFailureHandlingTest.setUp(ProducerFailureHandlingTest.scala:68)


