[ https://issues.apache.org/jira/browse/KUDU-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175080#comment-17175080 ]
Andrew Wong commented on KUDU-3180:
-----------------------------------

{quote}After we tuned -flush_threshold_secs to 1800 (was 3600 before), we could avoid OOM
{quote}
If the server is running low on memory during these times, wouldn't that put it into memory-pressure mode anyway? If you're on 1.11, that should already schedule a flush for the mem-store that anchors the most memory. And in 1.12, based on the screenshots, we would still schedule some flushes for some fairly large mem-stores. Additionally, I would have expected write requests to also be throttled, further slowing down the memory growth.

If there is no memory pressure despite there being OOMs, I wonder if this could be related to KUDU-3030.
{quote}Maybe could use max(memory_size, time_since_last_flush) to define perf improvement of a mem-store flush, so that both big mem-stores and long_lived mem-stores could be flushed in priority.
{quote}
Yeah, my biggest concern is that we don't regress KUDU-3002, since the perf score for {{time_since_last_flush}} is limited. If we go down this route, that may need to be adjusted.

> kudu don't always prefer to flush MRS/DMS that anchor more memory
> -----------------------------------------------------------------
>
>                 Key: KUDU-3180
>                 URL: https://issues.apache.org/jira/browse/KUDU-3180
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: YifanZhang
>            Priority: Major
>         Attachments: image-2020-08-04-20-26-53-749.png,
> image-2020-08-04-20-28-00-665.png
>
>
> The current time-based flush policy always gives a flush op a high score if we
> haven't flushed for the tablet in a long time, which may lead to starvation of
> ops that could free more memory.
> We set -flush_threshold_mb=32, -flush_threshold_secs=1800 in a cluster, and
> found that some small MRS/DMS flushes have a higher perf score than big MRS/DMS
> flushes and compactions, which does not seem reasonable.
> !image-2020-08-04-20-26-53-749.png|width=1424,height=317!
> !image-2020-08-04-20-28-00-665.png|width=1414,height=327!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
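
The scoring idea quoted in the comment, max(memory_size, time_since_last_flush), can be illustrated with a toy model. This is only a sketch and not Kudu's maintenance manager code: the names {{MemStore}}, {{flush_score}}, {{pick_flush}} and {{time_score_cap}} are hypothetical, and dividing each quantity by its threshold (the 32 MB and 1800 s values from the issue description) is an assumption made here so that memory and time become comparable. The cap on the time-based term stands in for the "limited" perf score mentioned in the comment.

```python
from dataclasses import dataclass


@dataclass
class MemStore:
    """Hypothetical stand-in for an MRS/DMS; not a Kudu type."""
    name: str
    anchored_mb: float        # memory anchored by the mem-store, in MB
    secs_since_flush: float   # seconds since the last flush


def flush_score(store, threshold_mb=32.0, threshold_secs=1800.0,
                time_score_cap=1.0):
    """Toy perf score: larger means the flush is more attractive.

    Assumption: each quantity is normalized by its threshold, and the
    time-based term is capped so an old but tiny mem-store cannot
    outrank one that anchors far more memory.
    """
    mem_score = store.anchored_mb / threshold_mb
    time_score = min(store.secs_since_flush / threshold_secs, time_score_cap)
    return max(mem_score, time_score)


def pick_flush(stores, **kwargs):
    """Return the mem-store with the highest toy score."""
    return max(stores, key=lambda s: flush_score(s, **kwargs))
```

Under these assumptions, a 2 MB mem-store that has not been flushed for 3600 s scores 1.0, while a 256 MB mem-store flushed 60 s ago scores 8.0, so the larger one is picked first and the long-lived one still gets a non-zero score. How the cap would have to change to avoid regressing KUDU-3002 is left open, as in the comment.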