[ https://issues.apache.org/jira/browse/KUDU-1954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16367930#comment-16367930 ]

Todd Lipcon commented on KUDU-1954:
-----------------------------------

I think most of this has been improved in the last year:
bq. we don't schedule flushes until we are already in "backpressure" realm, so
we spend most of our time doing backpressure
- KUDU-1949 changed this so that flushes start being triggered at 60% memory
usage, while backpressure only starts at 80%.
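
A minimal sketch of that two-threshold scheme (hypothetical names and a simple
linear ramp, not Kudu's actual code): flush urgency starts rising at the
trigger threshold, so flushes get scheduled well before writers ever see
backpressure.

{code:cpp}
#include <algorithm>

// Hypothetical constants mirroring the KUDU-1949 behavior described above.
constexpr double kFlushTriggerFraction = 0.60;  // start scheduling flushes
constexpr double kBackpressureFraction = 0.80;  // start throttling writers

// Returns a flush urgency in [0, 1]: zero below the trigger threshold,
// ramping linearly toward the backpressure threshold so flushes gain
// scheduler priority before clients are throttled.
double FlushUrgency(double memory_used_fraction) {
  if (memory_used_fraction < kFlushTriggerFraction) {
    return 0.0;
  }
  return std::min(1.0, (memory_used_fraction - kFlushTriggerFraction) /
                       (kBackpressureFraction - kFlushTriggerFraction));
}
{code}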

bq. even if we configure N maintenance threads, we typically use only ~50% of
those threads due to the scheduling granularity
Commit 40aa4c3c271c9df20a17a1d353ce582ee3fda742 (in 1.4.0) changed the
maintenance manager to immediately schedule new work when a thread frees up.
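
Roughly, the shape of that change (a hypothetical sketch, not the real
MaintenanceManager code): a completing worker wakes the scheduler directly
instead of leaving the freed thread idle until the next fixed-interval poll.

{code:cpp}
#include <chrono>
#include <condition_variable>
#include <mutex>

class MaintenanceScheduler {
 public:
  // Called by a worker thread the moment its op completes.
  void NotifyOpFinished() {
    std::lock_guard<std::mutex> l(mu_);
    work_pending_ = true;
    cv_.notify_one();
  }

  // Scheduler loop: wakes on the periodic poll or, crucially, as soon as
  // a worker frees up, so a thread never sits idle for a full interval.
  void RunLoop() {
    std::unique_lock<std::mutex> l(mu_);
    for (;;) {
      cv_.wait_for(l, std::chrono::milliseconds(250),
                   [this] { return work_pending_; });
      work_pending_ = false;
      // ... pick the best-scoring op and hand it to the free thread ...
    }
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool work_pending_ = false;
};
{code}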

bq. when we do hit the "memory-pressure flush" threshold, all threads quickly
switch to flushing, which then brings us far beneath the threshold
bq. long-running compactions can temporarily starve flushes
bq. high volume of writes can starve compactions
These three are not yet addressed, though various improvements to flush and
compaction performance have made long-running operations less common.
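
The first of the three is easy to illustrate with a hypothetical scoring
function (not Kudu's actual heuristic): once a large memory-pressure bonus
kicks in, every flush outranks every compaction, so all free threads pick
flushes at once and overshoot well below the threshold.

{code:cpp}
// Hypothetical scoring illustrating the thundering-herd failure mode:
// a large uniform memory-pressure bonus means every free thread picks a
// flush, and nothing compacts until usage drops far below the line.
struct OpStats {
  bool is_flush;
  double base_score;  // e.g. WAL bytes retained, delta sizes, etc.
};

double Score(const OpStats& op, bool under_memory_pressure) {
  double score = op.base_score;
  if (under_memory_pressure && op.is_flush) {
    score += 1e9;  // dominates all compaction scores simultaneously
  }
  return score;
}
{code}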

> Improve maintenance manager behavior in heavy write workload
> ------------------------------------------------------------
>
>                 Key: KUDU-1954
>                 URL: https://issues.apache.org/jira/browse/KUDU-1954
>             Project: Kudu
>          Issue Type: Improvement
>          Components: perf, tserver
>    Affects Versions: 1.3.0
>            Reporter: Todd Lipcon
>            Priority: Major
>         Attachments: mm-trace.png
>
>
> During the investigation in [this 
> doc|https://docs.google.com/document/d/1U1IXS1XD2erZyq8_qG81A1gZaCeHcq2i0unea_eEf5c/edit]
>  I found a few maintenance-manager-related issues during heavy writes:
> - we don't schedule flushes until we are already in "backpressure" realm, so
> we spend most of our time doing backpressure
> - even if we configure N maintenance threads, we typically use only ~50% of
> those threads due to the scheduling granularity
> - when we do hit the "memory-pressure flush" threshold, all threads quickly 
> switch to flushing, which then brings us far beneath the threshold
> - long-running compactions can temporarily starve flushes
> - high volume of writes can starve compactions


