[ https://issues.apache.org/jira/browse/KAFKA-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

barney updated KAFKA-1832:
--------------------------
    Description: 
*How to reproduce the problem:*
1) Configure the producer:
1.1) producer.type=async
1.2) metadata.broker.list=not.existed.com:9092
Make sure the host 'not.existed.com' does not resolve, either through the DNS
server or through /etc/hosts.
2) Send a large number of messages continuously with this producer.
After a while it fails with 'java.net.SocketException: Too many open files'.
You can also run 'lsof -p $pid | wc -l' to watch the number of open files grow
over time until it reaches the system limit (shown by 'ulimit -n').
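
As an in-process alternative to 'lsof', the descriptor count can probably also
be sampled from inside the producer JVM. This is a minimal sketch, assuming a
Unix JVM that exposes com.sun.management.UnixOperatingSystemMXBean; the object
name FdCount is made up for illustration and is not part of Kafka:

```scala
import java.lang.management.ManagementFactory
import com.sun.management.UnixOperatingSystemMXBean

// Hypothetical helper, not part of Kafka.
object FdCount {
  // Returns (open descriptors, per-process limit), or None when the JVM
  // does not expose the Unix-specific bean (for example on Windows).
  def openAndMax(): Option[(Long, Long)] =
    ManagementFactory.getOperatingSystemMXBean match {
      case os: UnixOperatingSystemMXBean =>
        Some((os.getOpenFileDescriptorCount(), os.getMaxFileDescriptorCount()))
      case _ => None
    }
}
```

Logging FdCount.openAndMax() periodically while the producer is sending should
show the first number climbing toward the second if the leak is present.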

*Problem cause:*
{code:title=kafka.network.BlockingChannel|borderStyle=solid} 
channel.connect(new InetSocketAddress(host, port))
{code}
This line throws 'java.nio.channels.UnresolvedAddressException' when the broker
host cannot be resolved, and at that point the field 'connected' is still false.
In kafka.producer.SyncProducer, 'disconnect()' does not invoke
'blockingChannel.disconnect()' because 'blockingChannel.isConnected' is false,
so the file descriptor is created but never closed.
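
The behaviour described here can be observed without Kafka. This is a minimal
sketch using only java.nio. It assumes that
InetSocketAddress.createUnresolved() is an acceptable stand-in for a host name
that fails DNS resolution, which keeps the example independent of the local
resolver. The object and method names are hypothetical:

```scala
import java.net.InetSocketAddress
import java.nio.channels.{SocketChannel, UnresolvedAddressException}

// Hypothetical demo, not Kafka code.
object UnresolvedConnectDemo {
  // Returns (exception was thrown, channel was still open afterwards).
  def attempt(host: String, port: Int): (Boolean, Boolean) = {
    val channel = SocketChannel.open() // the socket descriptor is allocated here
    try {
      val threw =
        try {
          // Assumption: createUnresolved() mimics a failed DNS lookup.
          channel.connect(InetSocketAddress.createUnresolved(host, port))
          false
        } catch {
          case _: UnresolvedAddressException => true
        }
      (threw, channel.isOpen)
    } finally {
      channel.close() // the demo cleans up; the leaking code path never does
    }
  }
}
```

A caller that only closes the channel when 'connected' is true would leave
every such channel open, which matches the growing descriptor count.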

*More:*
When the broker is a non-existent IP address (for example,
metadata.broker.list=1.1.1.1:9092) rather than a non-existent host, the problem
does not appear.
In SocketChannelImpl.connect(), 'Net.checkAddress()' is outside the try-catch
block but 'Net.connect()' is inside it; that is what makes the difference.
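
One way to see why the two cases take different paths is to compare the
addresses themselves. An IP literal needs no DNS lookup, so it is never
"unresolved" and should pass the address check; a failure then happens later,
inside the connect call. In this sketch, createUnresolved() again stands in
for a failed lookup, and the object name is hypothetical:

```scala
import java.net.InetSocketAddress

// Hypothetical illustration, not Kafka code.
object AddressKinds {
  // An IP literal is parsed directly, with no DNS lookup involved.
  val ipLiteral: InetSocketAddress = new InetSocketAddress("1.1.1.1", 9092)

  // Assumption: this models what a failed lookup of a host name produces.
  val unresolvedHost: InetSocketAddress =
    InetSocketAddress.createUnresolved("not.existed.com", 9092)
}
```

Here AddressKinds.ipLiteral.isUnresolved is false and
AddressKinds.unresolvedHost.isUnresolved is true.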

*Temporary Solution:*
{code:title=kafka.network.BlockingChannel|borderStyle=solid} 
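try {
  channel.connect(new InetSocketAddress(host, port))
} catch {
  case e: UnresolvedAddressException =>
    // 'connected' is still false here, so close the channel explicitly
    // to release its file descriptor before rethrowing
    disconnect()
    throw e
}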
{code}
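
To try the workaround outside Kafka, the same pattern can be put into a small
stand-alone class. This is only a sketch under stated assumptions:
SketchChannel is a hypothetical, heavily simplified stand-in for
kafka.network.BlockingChannel, and it takes the address as a parameter so an
unresolved one can be supplied directly.

```scala
import java.net.InetSocketAddress
import java.nio.channels.{SocketChannel, UnresolvedAddressException}

// Hypothetical simplified stand-in for kafka.network.BlockingChannel.
class SketchChannel(address: InetSocketAddress) {
  private var channel: SocketChannel = null
  private var connected = false

  def isConnected: Boolean = connected
  def isOpen: Boolean = channel != null && channel.isOpen

  def connect(): Unit = {
    channel = SocketChannel.open()
    try {
      channel.connect(address)
      connected = true
    } catch {
      case e: UnresolvedAddressException =>
        disconnect() // release the descriptor even though connect never succeeded
        throw e
    }
  }

  // Closes the channel whenever one exists, regardless of 'connected'.
  def disconnect(): Unit = {
    if (channel != null) {
      channel.close()
      channel = null
    }
    connected = false
  }
}
```

With an unresolved address, connect() still throws, but afterwards isOpen is
false, so repeated retries should no longer accumulate descriptors.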

> Async Producer will cause 'java.net.SocketException: Too many open files' 
> when broker host does not exist
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1832
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1832
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer 
>    Affects Versions: 0.8.1, 0.8.1.1
>         Environment: linux
>            Reporter: barney
>            Assignee: Jun Rao



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
