[
https://issues.apache.org/jira/browse/KAFKA-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
barney updated KAFKA-1832:
--------------------------
Description:
h3.How to reproduce the problem:
*producer configuration:
**producer.type=async
**metadata.broker.list=not.existed.com:9092
Make sure the host 'not.existed.com' is not resolvable via the DNS server or /etc/hosts;
*send messages continuously using the above producer
After a while this causes '*java.net.SocketException: Too many open files*'; you can also use '*lsof -p $pid|wc -l*' to watch the open-file count grow over time until it reaches the system limit (check it with 'ulimit -n').
h3.Problem cause:
{code:title=kafka.network.BlockingChannel|borderStyle=solid}
channel.connect(new InetSocketAddress(host, port))
{code}
this line throws '*java.nio.channels.UnresolvedAddressException*' when the broker host cannot be resolved, and at that point the field '*connected*' is still false;
In *kafka.producer.SyncProducer*, 'disconnect()' therefore never invokes 'blockingChannel.disconnect()', because 'blockingChannel.isConnected' is false, so the FileDescriptor is created but never closed;
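The leak can be demonstrated outside Kafka with plain Java NIO (a minimal sketch, not the Kafka code itself; the hostname is a placeholder using the reserved '.invalid' TLD, which is guaranteed not to resolve):

```java
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;
import java.nio.channels.UnresolvedAddressException;

public class LeakDemo {
    public static void main(String[] args) throws Exception {
        // SocketChannel.open() allocates the file descriptor up front
        SocketChannel ch = SocketChannel.open();
        // ".invalid" is reserved (RFC 2606), so resolution always fails
        InetSocketAddress addr = new InetSocketAddress("no-such-host.invalid", 9092);
        System.out.println("unresolved=" + addr.isUnresolved());
        try {
            ch.connect(addr); // throws before Net.connect() ever runs
        } catch (UnresolvedAddressException e) {
            // the channel was never marked connected, yet its descriptor is still open
            System.out.println("openAfterThrow=" + ch.isOpen());
        }
        ch.close();
    }
}
```

Because 'isOpen()' is still true after the throw, any caller that skips 'close()' when 'connected' is false leaks one descriptor per attempt.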
h3.More:
When the broker address is a non-existent IP (for example: metadata.broker.list=1.1.1.1:9092) instead of a non-existent hostname, the problem does not appear;
In SocketChannelImpl.connect(), 'Net.checkAddress()' is outside the try-catch block while 'Net.connect()' is inside it, which makes the difference;
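The distinction is easy to confirm from the JDK side: a literal IP is accepted without any DNS lookup, while an unresolvable hostname leaves the InetSocketAddress unresolved, which is what sends connect() down the unguarded 'Net.checkAddress()' path (sketch; both addresses are placeholders):

```java
import java.net.InetSocketAddress;

public class ResolveCheck {
    public static void main(String[] args) {
        // A literal IP needs no DNS lookup, so the address is always "resolved"
        InetSocketAddress byIp = new InetSocketAddress("1.1.1.1", 9092);
        // A hostname that DNS cannot resolve stays unresolved
        InetSocketAddress byHost = new InetSocketAddress("no-such-host.invalid", 9092);
        System.out.println("ipUnresolved=" + byIp.isUnresolved());
        System.out.println("hostUnresolved=" + byHost.isUnresolved());
    }
}
```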
h3.Temporary Solution:
{code:title=kafka.network.BlockingChannel|borderStyle=solid}
try {
  channel.connect(new InetSocketAddress(host, port))
} catch {
  case e: UnresolvedAddressException =>
    // release the descriptor before propagating the error
    disconnect()
    throw e
}
{code}
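The same guard can be sketched in plain Java (a hypothetical helper, not the actual patch; 'connect' here plays the role of BlockingChannel.connect and 'ch.close()' the role of 'disconnect()'):

```java
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;
import java.nio.channels.UnresolvedAddressException;

public class GuardedConnect {
    // Close the channel before re-throwing so its descriptor is released
    static SocketChannel connect(String host, int port) throws Exception {
        SocketChannel ch = SocketChannel.open();
        try {
            ch.connect(new InetSocketAddress(host, port));
            return ch;
        } catch (UnresolvedAddressException e) {
            ch.close(); // the equivalent of disconnect() in the fix above
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        // With the guard, repeated failures no longer accumulate descriptors
        int caught = 0;
        for (int i = 0; i < 10; i++) {
            try {
                connect("no-such-host.invalid", 9092);
            } catch (UnresolvedAddressException e) {
                caught++;
            }
        }
        System.out.println("caught=" + caught);
    }
}
```

Every attempt still fails, but 'lsof -p $pid|wc -l' should now stay flat instead of climbing.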
> Async Producer will cause 'java.net.SocketException: Too many open files'
> when broker host does not exist
> ---------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-1832
> URL: https://issues.apache.org/jira/browse/KAFKA-1832
> Project: Kafka
> Issue Type: Bug
> Components: producer
> Affects Versions: 0.8.1, 0.8.1.1
> Environment: linux
> Reporter: barney
> Assignee: Jun Rao
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)