[ https://issues.apache.org/jira/browse/KUDU-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16654712#comment-16654712 ]
Will Berkeley commented on KUDU-1400:
-------------------------------------

Todd implemented a configurable flushing time threshold ({{--flush_threshold_secs}}) in 8d026474be, a long time ago.

I've written a [design doc|https://docs.google.com/document/d/1yTfxt0_2p5EfIjCnjJCt3o-nB9xk-Kl2O8yKTA1LQrQ/edit#heading=h.5z0d0yyd9zfk] for improvements to compaction policy that should also help with this issue.

> Improve rowset compaction policy to consider merging small DRSs
> ---------------------------------------------------------------
>
>                 Key: KUDU-1400
>                 URL: https://issues.apache.org/jira/browse/KUDU-1400
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: Binglin Chang
>            Assignee: Will Berkeley
>            Priority: Major
>
> We see some small tables with a light write load generate lots of small
> DRSs (~1 MB). Since those DRSs do not overlap much, they never get the
> chance to be compacted, which leaves a lot of very small files/blocks. So:
> # The compaction solution value should consider the benefit of merging small DRSs.
> # Flushing the MRS every 2 minutes (whether small or large) seems suboptimal. Maybe
> flushing a small MRS should have "lower priority" than a rowset compaction with a
> higher solution value?

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)