[ 
https://issues.apache.org/jira/browse/KAFKA-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257805#comment-14257805
 ] 

Bhavesh Mistry edited comment on KAFKA-1642 at 12/24/14 7:07 PM:
-----------------------------------------------------------------

[~ewencp],

Thanks for the patch. You may close this issue. The only thing I have not 
tested is the rare case where a single broker is out of file descriptors and 
the producer, under heavy load, requests more connections to the same broker. 
According to the code, it will mark the node state as disconnected, and I am 
not sure whether data will still be sent via the already live connection.  

Another comment is that no WARN or ERROR message is logged when a 
connection fails. Can we please change the log level for the following code to 
WARN? In production environments people set the log level to WARN or ERROR, so 
there will be no visibility if there is a connection issue.  

{code}
    /**
     * Initiate a connection to the given node
     */
    private void initiateConnect(Node node, long now) {
        try {
            log.debug("Initiating connection to node {} at {}:{}.", node.id(), node.host(), node.port());
            this.connectionStates.connecting(node.id(), now);
            selector.connect(node.id(), new InetSocketAddress(node.host(), node.port()), this.socketSendBuffer, this.socketReceiveBuffer);
        } catch (IOException e) {
            /* attempt failed, we'll try again after the backoff */
            connectionStates.disconnected(node.id());
            /* maybe the problem is our metadata, update it */
            metadata.requestUpdate();
            log.debug("Error connecting to node {} at {}:{}:", node.id(), node.host(), node.port(), e);
        }
    }
{code}
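
To illustrate the log-level request above, here is a minimal, self-contained sketch of a connect attempt whose failure path logs at WARN instead of DEBUG. It is not the NetworkClient code: ConnectLogSketch and the Connector interface are hypothetical stand-ins for the selector, and it uses java.util.logging (where WARN corresponds to Level.WARNING) only so that it runs without the client's logging dependency.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.logging.Level;
import java.util.logging.Logger;

class ConnectLogSketch {
    static final Logger log = Logger.getLogger("ConnectLogSketch");

    /** Hypothetical stand-in for the selector's connect call. */
    interface Connector {
        void connect(int nodeId, InetSocketAddress address) throws IOException;
    }

    /**
     * Try to start a connection. Returns true if the attempt was started,
     * false if it failed; a failure is logged at WARNING so it stays visible
     * when the configured level is WARN or ERROR.
     */
    static boolean initiateConnect(Connector connector, int nodeId, String host, int port) {
        try {
            // Routine progress message: stays at debug level (FINE here).
            log.fine("Initiating connection to node " + nodeId + " at " + host + ":" + port + ".");
            // createUnresolved avoids a DNS lookup in this sketch.
            connector.connect(nodeId, InetSocketAddress.createUnresolved(host, port));
            return true;
        } catch (IOException e) {
            // The proposed change: report the failure at WARN rather than DEBUG.
            log.log(Level.WARNING, "Error connecting to node " + nodeId + " at " + host + ":" + port + ":", e);
            return false;
        }
    }

    public static void main(String[] args) {
        // Mimic a production configuration that only shows WARN and above.
        log.setLevel(Level.WARNING);
        Connector alwaysFails = (id, addr) -> { throw new IOException("connection refused"); };
        initiateConnect(alwaysFails, 1, "broker1", 9092);
    }
}
```

With the logger level set to WARNING, the failed attempt is still reported while the routine "Initiating connection" message is suppressed.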

Thanks for all your help!

Thanks,

Bhavesh  



> [Java New Producer Kafka Trunk] CPU Usage Spike to 100% when network 
> connection is lost
> ---------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1642
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1642
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer 
>    Affects Versions: 0.8.2
>            Reporter: Bhavesh Mistry
>            Assignee: Ewen Cheslack-Postava
>            Priority: Blocker
>             Fix For: 0.8.2
>
>         Attachments: 
> 0001-Initial-CPU-Hish-Usage-by-Kafka-FIX-and-Also-fix-CLO.patch, 
> KAFKA-1642.patch, KAFKA-1642.patch, KAFKA-1642_2014-10-20_17:33:57.patch, 
> KAFKA-1642_2014-10-23_16:19:41.patch
>
>
> I see my CPU spike to 100% when the network connection is lost for a while. 
> It seems the network I/O threads are very busy logging the following error 
> message. Is this expected behavior?
> 2014-09-17 14:06:16.830 [kafka-producer-network-thread] ERROR org.apache.kafka.clients.producer.internals.Sender - Uncaught error in kafka producer I/O thread: 
> java.lang.IllegalStateException: No entry found for node -2
> at org.apache.kafka.clients.ClusterConnectionStates.nodeState(ClusterConnectionStates.java:110)
> at org.apache.kafka.clients.ClusterConnectionStates.disconnected(ClusterConnectionStates.java:99)
> at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:394)
> at org.apache.kafka.clients.NetworkClient.maybeUpdateMetadata(NetworkClient.java:380)
> at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:174)
> at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:175)
> at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:115)
> at java.lang.Thread.run(Thread.java:744)
> Thanks,
> Bhavesh



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
