[ 
https://issues.apache.org/jira/browse/KUDU-1400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16654712#comment-16654712
 ] 

Will Berkeley commented on KUDU-1400:
-------------------------------------

Todd implemented a configurable flushing time threshold 
({{--flush_threshold_secs}}) in 8d026474be, a long time ago.

I've written a [design doc|https://docs.google.com/document/d/1yTfxt0_2p5EfIjCnjJCt3o-nB9xk-Kl2O8yKTA1LQrQ/edit#heading=h.5z0d0yyd9zfk]
for improvements to compaction policy that should also help with this issue.

> Improve rowset compaction policy to consider merging small DRSs
> ---------------------------------------------------------------
>
>                 Key: KUDU-1400
>                 URL: https://issues.apache.org/jira/browse/KUDU-1400
>             Project: Kudu
>          Issue Type: Improvement
>            Reporter: Binglin Chang
>            Assignee: Will Berkeley
>            Priority: Major
>
> We see some small tables with a light write load generate lots of small 
> DRSs (~1MB). Since those DRSs do not overlap much, they don't get the chance 
> to be compacted, generating lots of very small files/blocks. So:
> # The compaction solution value should consider the benefits of merging small DRSs
> # Flushing the MRS (small or large) every 2 min seems suboptimal; maybe flushing 
> a small MRS should have "lower priority" than a rowset compaction with a higher 
> solution value?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
