[ https://issues.apache.org/jira/browse/KAFKA-305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236291#comment-13236291 ]

Prashanth Menon commented on KAFKA-305:
---------------------------------------

I've uploaded a new patch incorporating the suggestions, but it's not ready for 
commit yet, just for another review.  A few notes:

1. BlockingChannel modified to meet suggestions.
2. SimpleConsumer uses BlockingChannel.
3. To test the BlockingChannel (in SyncProducer and the async producer), I bring 
up a regular server but shut down the request handler.  The socket remains open 
and accepts requests, queueing them in the request channel, but there are no 
handlers processing them.
4. The comments on the original testZKSendWithDeadBroker weren't entirely 
accurate.  I've modified it to actually test what the name suggests.
5. Though I wait for the broker to go down, testZKSendWithDeadBroker still 
unpredictably throws the "Broker already registered" exception.  Are you 
experiencing this locally?

I think there might be an issue with the BrokerPartitionInfo and ProducerPool 
classes.  ProducerPool never removes producers, even ones connected to a downed 
broker, so calls to getAnyProducer (used by BrokerPartitionInfo.updateInfo to 
update cached topic metadata) could return the same "bad" producer on 
consecutive calls when attempting to refresh the cache.  This could cause an 
entire send to fail even though some other broker may be able to service the 
topic metadata request.  We need to either remove "bad" producers, refresh the 
ProducerPool when brokers go down, or have BrokerPartitionInfo retry its 
updateInfo call a certain number of times.  Thoughts?
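
To make the retry option concrete, here is a purely hypothetical sketch (none 
of these names exist in the codebase): instead of failing the send after one 
getAnyProducer call, the metadata fetch could be attempted against several 
producers before giving up:

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of the "retry updateInfo" option: try the metadata
// request against up to maxRetries producers before giving up, rather than
// failing the whole send because one cached producer points at a dead broker.
public class MetadataRetry {

    public static <T> T fetchWithRetry(List<Supplier<T>> producers,
                                       int maxRetries) {
        RuntimeException last = null;
        int attempts = Math.min(maxRetries, producers.size());
        for (int i = 0; i < attempts; i++) {
            try {
                // Stand-in for sending a topic metadata request through
                // one producer from the pool.
                return producers.get(i).get();
            } catch (RuntimeException e) {
                last = e; // this producer's broker may be down; try another
            }
        }
        throw last != null ? last
                           : new RuntimeException("no producers available");
    }
}
```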
                
> SyncProducer does not correctly timeout
> ---------------------------------------
>
>                 Key: KAFKA-305
>                 URL: https://issues.apache.org/jira/browse/KAFKA-305
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.7, 0.8
>            Reporter: Prashanth Menon
>            Priority: Critical
>         Attachments: KAFKA-305-v1.patch, KAFKA-305-v2.patch
>
>
> So it turns out that using the channel in SyncProducer as we do to perform 
> blocking reads will not trigger socket timeouts (though we set them) and will 
> block forever, which is bad.  This bug identifies the issue: 
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4614802 and this article 
> presents a potential work-around: 
> http://stackoverflow.com/questions/2866557/timeout-for-socketchannel
> The work-around is a simple solution that involves creating a 
> separate ReadableByteChannel instance for timeout-enabled reads.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
