[jira] [Commented] (KAFKA-3656) Avoid stressing system more when already under stress

John Ky (JIRA) Thu, 05 May 2016 18:11:20 -0700

    [ 
https://issues.apache.org/jira/browse/KAFKA-3656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273436#comment-15273436
 ]


John Ky commented on KAFKA-3656:
--------------------------------

I spent a day writing reflection code to work around this issue. Can we have
this backported?

> Avoid stressing system more when already under stress
> -----------------------------------------------------
>
>                 Key: KAFKA-3656
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3656
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Alexey Raga
>            Assignee: Liquan Pei
>             Fix For: 0.10.1.0, 0.10.0.0
>
>
> I am working with Kafka Connect now and I am getting error messages like this:
> {code}
> [2016-05-04 03:11:28,226] ERROR Failed to flush 
> WorkerSourceTask{id=geo-connector-0}, timed out while waiting for producer to 
> flush outstanding messages, 151860 left ([FAILED toString()]) 
> (org.apache.kafka.connect.runtime.WorkerSourceTask:237)
> [2016-05-04 03:11:28,227] ERROR Failed to commit offsets for 
> WorkerSourceTask{id=geo-connector-0} 
> (org.apache.kafka.connect.runtime.SourceTaskOffsetCommitter:112)
> {code}
> I haven't figured out why Connect would pull so many records into memory
> when it clearly can't produce that fast, and I don't yet know why producing
> messages is slow.
> But the {{151860 left ([FAILED toString()])}} part is interesting, so I
> looked at the code and found this:
> {code}
> if (timeoutMs <= 0) {
>     log.error(
>             "Failed to flush {}, timed out while waiting for producer to flush outstanding "
>                     + "messages, {} left ({})",
>             this, outstandingMessages.size(), outstandingMessages);
>     finishFailedFlush();
>     return false;
> }
> {code}
> So when the connector is under stress and, assuming {{151860}} messages,
> under heavy memory pressure, the code chooses to take pretty much {{4 *
> 151860}} byte arrays and convert them to a Java string.
> This not only eats more memory and adds to GC, but is also useless for
> logging because the actual string, if it didn't fail, would look like:
> {code}
> (topic=lamington--geo-connector, partition=null, key=null, 
> value=[B@62c66f62=ProducerRecord(topic=lamington--geo-connector, 
> partition=null, key=null, value=[B@62c66f62, 
> ProducerRecord(topic=lamington--geo-connector, partition=null, key=null, .....
> {code}
> I think it is a bug and a string representation of the outstanding messages 
> should be removed from the log.
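
A minimal sketch of the change proposed in the description above, assuming the
only goal is to report how many records are still outstanding without
stringifying them. The class name, the helper flushTimeoutMessage, and the
List<byte[]> stand-in for the outstanding records are all hypothetical and are
not taken from the Kafka source; the real code logs through its logger rather
than building a string by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone illustration; not the actual WorkerSourceTask code.
class FlushTimeoutLogSketch {

    // Builds the error message from the task description and the count only.
    // The outstanding records themselves are never passed in, so no
    // toString() is ever invoked on them.
    static String flushTimeoutMessage(String task, int outstandingCount) {
        return String.format(
                "Failed to flush %s, timed out while waiting for producer to flush "
                        + "outstanding messages, %d left",
                task, outstandingCount);
    }

    public static void main(String[] args) {
        // Stand-in for the outstanding records (assumption: payloads are byte arrays).
        List<byte[]> outstandingMessages = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            outstandingMessages.add(new byte[] {(byte) i});
        }

        // Only size() is read, which does not walk or format the elements.
        System.out.println(flushTimeoutMessage(
                "WorkerSourceTask{id=geo-connector-0}", outstandingMessages.size()));
    }
}
```

With this shape the cost of the log line no longer grows with the number of
outstanding records, which is the property the description asks for.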



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
