Yep, that's right -- currently the only thing that reclaims space taken by deleted rows is a RowSet merge compaction. We haven't added any logic to trigger those based on the number of deleted rows in a RowSet; they are only triggered by logic that tries to merge RowSets with overlapping key ranges (see https://github.com/apache/kudu/blob/master/docs/design-docs/compaction-policy.md#intuition-behind-compaction-selection-policy and BudgetedCompactionPolicy::PickRowSets()).
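To make the distinction concrete, here is a toy sketch in Python. It is not Kudu's actual policy: the real selection in BudgetedCompactionPolicy::PickRowSets() is budgeted and more involved, and every name below (RowSet fields, pick_by_overlap, pick_by_deleted_ratio, the 0.5 threshold) is hypothetical. It only illustrates why a fully compacted, non-overlapping RowSet full of deleted rows is never picked by an overlap-driven trigger, while a deleted-row-ratio trigger (which does not exist yet) would pick it.

```python
from dataclasses import dataclass


@dataclass
class RowSet:
    # Hypothetical, simplified stand-in for a RowSet; not Kudu's real structure.
    name: str
    min_key: int
    max_key: int
    live_rows: int
    deleted_rows: int


def overlaps(a: RowSet, b: RowSet) -> bool:
    """True if the two key ranges [min_key, max_key] intersect."""
    return a.min_key <= b.max_key and b.min_key <= a.max_key


def pick_by_overlap(rowsets):
    """Toy overlap-driven trigger: pick rowsets that overlap some other rowset."""
    return [
        rs.name
        for rs in rowsets
        if any(rs is not other and overlaps(rs, other) for other in rowsets)
    ]


def pick_by_deleted_ratio(rowsets, threshold=0.5):
    """Hypothetical trigger: pick rowsets whose deleted fraction meets the threshold."""
    picked = []
    for rs in rowsets:
        total = rs.live_rows + rs.deleted_rows
        if total > 0 and rs.deleted_rows / total >= threshold:
            picked.append(rs.name)
    return picked


rowsets = [
    RowSet("a", 0, 100, live_rows=1000, deleted_rows=0),
    RowSet("b", 50, 150, live_rows=1000, deleted_rows=10),
    # Already compacted, no overlap with anything, but 90% of its rows are deleted.
    RowSet("c", 200, 300, live_rows=100, deleted_rows=900),
]

print(pick_by_overlap(rowsets))        # overlap trigger never considers "c"
print(pick_by_deleted_ratio(rowsets))  # a ratio trigger would select only "c"
```

In this toy setup the overlap trigger selects "a" and "b" and leaves "c" alone, which is the situation described above: the space held by deleted rows in "c" stays on disk until something else causes that RowSet to be compacted.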
The follow-up work to add a background task to permanently remove deleted rows is being tracked in https://issues.apache.org/jira/browse/KUDU-1979 (which I just filed).

Mike

On Mon, Apr 24, 2017 at 12:37 PM, Todd Lipcon <t...@cloudera.com> wrote:

> Mike can correct me if wrong, but I think the background task in 1.3 is
> only responsible for removing old deltas, and doesn't do anything to try to
> trigger compactions on rowsets with a high percentage of deleted _rows_.
>
> That's a separate bit of work that hasn't been started yet.
>
> -Todd
>
> On Sat, Apr 22, 2017 at 7:36 PM, Jason Heo <jason.heo....@gmail.com>
> wrote:
>
>> Hi David.
>>
>> Thank you for your reply.
>>
>> I'll try to upgrade to 1.3 this week.
>>
>> Regards,
>>
>> Jason
>>
>> 2017-04-23 2:06 GMT+09:00 <davidral...@gmail.com>:
>>
>>> Hi Jason
>>>
>>> In Kudu 1.2 if there are compactions happening, they will reclaim
>>> space. Unfortunately the conditions for this to happen don't always
>>> occur (if the portion of the keyspace where the deletions occurred
>>> stopped receiving writes and was already fully compacted cleanup is
>>> more unlikely)
>>> In Kudu 1.3 we added a background task to clean up old data even in
>>> the absence of compactions. Could you upgrade?
>>>
>>> Best
>>> David
>
> --
> Todd Lipcon
> Software Engineer, Cloudera