[ https://issues.apache.org/jira/browse/KAFKA-1832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
barney updated KAFKA-1832:
--------------------------
Description:
*How to reproduce the problem:*
1) producer configuration:
1.1) producer.type=async
1.2) metadata.broker.list=not.existed.com:9092
Make sure the host 'not.existed.com' does not exist in the DNS server or in /etc/hosts;
2) send a lot of messages continuously using the above producer
After a while this causes 'java.net.SocketException: Too many open files'; alternatively, run 'lsof -p $pid|wc -l' to watch the count of open files grow over time until it reaches the system limit (check with 'ulimit -n').

*Problem cause:*
{code:title=kafka.network.BlockingChannel|borderStyle=solid}
channel.connect(new InetSocketAddress(host, port))
{code}
This line throws 'java.nio.channels.UnresolvedAddressException' when the broker host does not resolve, and at that point the field 'connected' is still false.
In kafka.producer.SyncProducer, 'disconnect()' will not invoke 'blockingChannel.disconnect()' because 'blockingChannel.isConnected' is false, which means the FileDescriptor is created but never closed.

*More:*
When the broker is a non-existent IP (for example: metadata.broker.list=1.1.1.1:9092) instead of a non-existent host, the problem does not appear.
In SocketChannelImpl.connect(), 'Net.checkAddress()' is not inside the try-catch block but 'Net.connect()' is; that makes the difference.

*Temporary Solution:*
{code:title=kafka.network.BlockingChannel|borderStyle=solid}
try {
  channel.connect(new InetSocketAddress(host, port))
} catch {
  case e: UnresolvedAddressException =>
    disconnect()
    throw e
}
{code}

> Async Producer will cause 'java.net.SocketException: Too many open files' when broker host does not exist
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1832
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1832
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer
>    Affects Versions: 0.8.1, 0.8.1.1
>         Environment: linux
>            Reporter: barney
>            Assignee: Jun Rao
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
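The leak mechanism described above can be sketched with a minimal, standalone Java NIO snippet (hypothetical class and method names; this is not Kafka code): SocketChannel.open() allocates the file descriptor before connect() ever runs, and connect() on an unresolved address throws UnresolvedAddressException while isConnected() is still false, so any cleanup guarded by isConnected() silently skips the close().

```java
import java.net.InetSocketAddress;
import java.nio.channels.SocketChannel;
import java.nio.channels.UnresolvedAddressException;

public class UnresolvedLeakSketch {
    // Mirrors the buggy path: on UnresolvedAddressException the channel is
    // left open, because cleanup is guarded by isConnected(), which is false.
    static boolean leakyConnect(String host, int port) throws Exception {
        SocketChannel channel = SocketChannel.open(); // FD allocated here, before connect()
        try {
            channel.connect(new InetSocketAddress(host, port));
        } catch (UnresolvedAddressException e) {
            if (channel.isConnected()) { // never true on this path
                channel.close();
            }
        }
        return channel.isOpen(); // true -> the descriptor leaked
    }

    // Mirrors the proposed fix: close unconditionally on the failure path,
    // as the patched BlockingChannel does by calling disconnect() before rethrowing.
    static boolean fixedConnect(String host, int port) throws Exception {
        SocketChannel channel = SocketChannel.open();
        try {
            channel.connect(new InetSocketAddress(host, port));
        } catch (UnresolvedAddressException e) {
            channel.close();
        }
        return channel.isOpen();
    }

    public static void main(String[] args) throws Exception {
        // '.invalid' is a reserved TLD, so name resolution is guaranteed to fail
        System.out.println("leaky leaves FD open: " + leakyConnect("broker.invalid", 9092));
        System.out.println("fixed leaves FD open: " + fixedConnect("broker.invalid", 9092));
    }
}
```

Run in a loop (as the async producer effectively does on every retry), the leaky variant accumulates one open descriptor per attempt, which is exactly the 'Too many open files' growth visible via 'lsof'.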