[ https://issues.apache.org/jira/browse/KAFKA-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17066361#comment-17066361 ]

ASF GitHub Bot commented on KAFKA-9700:
---------------------------------------

ijuma commented on pull request #8285: KAFKA-9700: Fix negative estimatedCompressionRatio issue
URL: https://github.com/apache/kafka/pull/8285
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Negative estimatedCompressionRatio leads to misjudgment about whether there is
> room
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-9700
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9700
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>            Reporter: jiamei xie
>            Priority: Major
>
> * When I ran the following command:
> bin/kafka-producer-perf-test.sh --topic test --num-records 50000000
> --throughput -1 --record-size 5000 --producer-props
> bootstrap.servers=server04:9092 acks=1 buffer.memory=67108864
> batch.size=65536 compression.type=zstd
> there was the following warning:
> [2020-03-06 17:36:50,216] WARN [Producer clientId=producer-1] Got error 
> produce response in correlation id 3261 on topic-partition test-1, splitting 
> and retrying (2147483647 attempts left). Error: MESSAGE_TOO_LARGE 
> (org.apache.kafka.clients.producer.internals.Sender)
> * The batch size (65536) is smaller than max.message.bytes (1048588), so the
> batch.size setting is not the root cause.
> * I added some logging in CompressionRatioEstimator.updateEstimation and found
> negative currentEstimation values. The following is updateEstimation with the
> logging I added:
> public static float updateEstimation(String topic, CompressionType type, float observedRatio) {
>     float[] compressionRatioForTopic = getAndCreateEstimationIfAbsent(topic);
>     float currentEstimation = compressionRatioForTopic[type.id];
>     synchronized (compressionRatioForTopic) {
>         if (observedRatio > currentEstimation) {
>             compressionRatioForTopic[type.id] =
>                 Math.max(currentEstimation + COMPRESSION_RATIO_DETERIORATE_STEP, observedRatio);
>         } else if (observedRatio < currentEstimation) {
>             compressionRatioForTopic[type.id] = currentEstimation - COMPRESSION_RATIO_IMPROVING_STEP;
>             log.warn("####currentEstimation is {}, COMPRESSION_RATIO_IMPROVING_STEP is {}, compressionRatioForTopic[type.id] is {}, type.id is {}",
>                 currentEstimation, COMPRESSION_RATIO_IMPROVING_STEP, compressionRatioForTopic[type.id], type.id);
>         }
>     }
>     return compressionRatioForTopic[type.id];
> }
> The observedRatio is smaller than COMPRESSION_RATIO_IMPROVING_STEP in some
> cases, so currentEstimation - COMPRESSION_RATIO_IMPROVING_STEP can drop below
> zero. I think the else if block should therefore be changed to:
> else if (observedRatio < currentEstimation) {
>     compressionRatioForTopic[type.id] =
>         Math.max(currentEstimation - COMPRESSION_RATIO_IMPROVING_STEP, observedRatio);
> }
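
For illustration, below is a minimal standalone sketch of the clamp proposed
above (this is not the actual CompressionRatioEstimator code; the class name,
step value, and ratios are assumptions chosen to mimic the perf-test scenario).
Once the estimate has converged close to a very small observed ratio, the
unclamped subtraction overshoots below zero; a negative estimate makes the
producer misjudge whether there is room left in a batch, which matches the
MESSAGE_TOO_LARGE warnings above. The Math.max clamp floors the estimate at the
observed ratio instead.

public class EstimationClampSketch {

    // Assumed step size, for illustration only; the real constant lives in
    // org.apache.kafka.clients.producer.internals.CompressionRatioEstimator.
    static final float COMPRESSION_RATIO_IMPROVING_STEP = 0.005f;

    // Existing behaviour: subtract the step with no lower bound.
    static float improveUnclamped(float currentEstimation) {
        return currentEstimation - COMPRESSION_RATIO_IMPROVING_STEP;
    }

    // Proposed behaviour: never drop below the ratio that was actually observed.
    static float improveClamped(float currentEstimation, float observedRatio) {
        return Math.max(currentEstimation - COMPRESSION_RATIO_IMPROVING_STEP, observedRatio);
    }

    public static void main(String[] args) {
        float currentEstimation = 0.004f; // estimate already close to the true ratio
        float observedRatio = 0.001f;     // e.g. zstd on highly compressible payloads

        // Unclamped: 0.004 - 0.005 ~= -0.001, i.e. a negative compression ratio.
        System.out.println("unclamped: " + improveUnclamped(currentEstimation));
        // Clamped: max(-0.001, 0.001) = 0.001, floored at the observed ratio.
        System.out.println("clamped:   " + improveClamped(currentEstimation, observedRatio));
    }
}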



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
