[GitHub] [druid] FrankChen021 commented on issue #11061: How to improving ingestion performance

GitBox Thu, 01 Apr 2021 08:32:44 -0700


FrankChen021 commented on issue #11061:
URL: https://github.com/apache/druid/issues/11061#issuecomment-811988987



   > the num of partitions is 270.
   
   It's better to check whether your messages are evenly distributed among 
these partitions. If not, some peons will process more messages than others, 
there might be some lags for these peons.
   
   > The maxBytesInMemory is about 300Mb, is it not make full use of heap?
   
   In your spec, this parameter is 0, which will be default to 1/6 of JVM max 
heap size during running. Persistence happens when either `maxRowsInMemory` or 
`maxBytesInMemory` is reached.  But persistence is running in a separated 
thread, I don't think these two affect the ingestion speed.
   
   For Kafka, generally speaking, there're 2 parameters which I usually highly 
recommend to use to improve throughput 
   1. set `compression.type` to lz4 at producer side. This would greatly reduce 
network traffic without introducing extra CPU usage at producer and consumer 
side.
   2. set `fetch.min.bytes` to a large number(such as 4096) to tell brokers 
return more messages in a single fetch response to a consumer . This would 
greatly reduce network requests/responses between consumers and brokers. 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [druid] FrankChen021 commented on issue #11061: How to improving ingestion performance

Reply via email to