[ https://issues.apache.org/jira/browse/KAFKA-3656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273436#comment-15273436 ]
John Ky commented on KAFKA-3656:
--------------------------------

I spent a day writing reflection code to work around this issue. Can we have this backported?

> Avoid stressing system more when already under stress
> -----------------------------------------------------
>
>                 Key: KAFKA-3656
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3656
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Alexey Raga
>            Assignee: Liquan Pei
>             Fix For: 0.10.1.0, 0.10.0.0
>
>
> I am working with Kafka Connect now and I am getting error messages like these:
> {code}
> [2016-05-04 03:11:28,226] ERROR Failed to flush WorkerSourceTask{id=geo-connector-0}, timed out while waiting for producer to flush outstanding messages, 151860 left ([FAILED toString()]) (org.apache.kafka.connect.runtime.WorkerSourceTask:237)
> [2016-05-04 03:11:28,227] ERROR Failed to commit offsets for WorkerSourceTask{id=geo-connector-0} (org.apache.kafka.connect.runtime.SourceTaskOffsetCommitter:112)
> {code}
> I haven't figured out why Connect would pull so many records into
> memory when it clearly can't produce that fast, and I don't yet know why
> producing messages is slow.
> But the {{151860 left ([FAILED toString()])}} part is interesting, so I
> looked at the code and found this:
> {code}
>                 if (timeoutMs <= 0) {
>                     log.error(
>                             "Failed to flush {}, timed out while waiting for producer to flush outstanding "
>                                     + "messages, {} left ({})", this, outstandingMessages.size(), outstandingMessages);
>                     finishFailedFlush();
>                     return false;
>                 }
> {code}
> So when the connector is under stress and, assuming {{151860}} messages,
> under heavy memory pressure, the code chooses to take pretty much
> {{4 * 151860}} byte arrays and convert them to a Java string.
> This not only eats more memory and adds to GC pressure, but is also useless
> for logging, because the actual string, if it didn't fail, would look like:
> {code}
> (topic=lamington--geo-connector, partition=null, key=null,
> value=[B@62c66f62=ProducerRecord(topic=lamington--geo-connector,
> partition=null, key=null, value=[B@62c66f62,
> ProducerRecord(topic=lamington--geo-connector, partition=null, key=null, .....
> {code}
> I think this is a bug, and the string representation of the outstanding
> messages should be removed from the log.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
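
For illustration only, here is a minimal sketch of the change the issue asks for: report how many messages are outstanding without ever formatting the collection itself. The class FlushTimeoutLogging and the helper flushTimeoutMessage are hypothetical names invented for this example and are not part of Kafka; the sketch also uses String.format instead of the logger from the quoted snippet so that it runs on its own. It is not the actual patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone example; not Kafka code.
public class FlushTimeoutLogging {

    // Builds the error message from the task name and the number of
    // outstanding messages only. The collection is never passed in, so
    // its toString() (and that of every record in it) is never invoked.
    static String flushTimeoutMessage(String taskId, int outstandingCount) {
        return String.format(
                "Failed to flush %s, timed out while waiting for producer to flush outstanding "
                        + "messages, %d left",
                taskId, outstandingCount);
    }

    public static void main(String[] args) {
        // Stand-in for the outstanding records; the count matches the report.
        List<byte[]> outstanding = new ArrayList<>();
        for (int i = 0; i < 151860; i++) {
            outstanding.add(new byte[0]);
        }

        // Only size() is read, which is cheap regardless of how many
        // records are queued.
        System.out.println(
                flushTimeoutMessage("WorkerSourceTask{id=geo-connector-0}", outstanding.size()));
    }
}
```

The point of the design is that the cost of producing the log line no longer grows with the number of queued records, which matters most exactly when the task is already short on memory.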