Yep, that's right -- currently the only thing that reclaims space taken by
deleted rows is a RowSet merge compaction. We haven't added any logic to
trigger those based on the number of deleted rows in a RowSet; they are
currently only triggered by logic which tries to merge RowSets with
overlapping key ranges (see
https://github.com/apache/kudu/blob/master/docs/design-docs/compaction-policy.md#intuition-behind-compaction-selection-policy
and BudgetedCompactionPolicy::PickRowSets()).
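
To make the gap concrete, here is a toy model in Python. It is only a
sketch and not Kudu's actual code: the RowSet fields and the
pick_overlapping() helper are made up for illustration, and the real policy
is budgeted and considerably more involved. It only shows why selection
driven purely by key-range overlap never looks at the deleted-row count.

```python
from dataclasses import dataclass


@dataclass
class RowSet:
    # Hypothetical, simplified stand-in for a Kudu RowSet.
    name: str
    min_key: int
    max_key: int
    live_rows: int
    deleted_rows: int


def overlaps(a: RowSet, b: RowSet) -> bool:
    """True if the closed key ranges [min_key, max_key] intersect."""
    return a.min_key <= b.max_key and b.min_key <= a.max_key


def pick_overlapping(rowsets):
    """Return names of rowsets whose key range overlaps some other rowset.

    Assumption: this stands in for overlap-driven selection. Note that
    deleted_rows is never consulted.
    """
    picked = []
    for i, rs in enumerate(rowsets):
        if any(overlaps(rs, other)
               for j, other in enumerate(rowsets) if i != j):
            picked.append(rs.name)
    return picked


rowsets = [
    RowSet("a", 0, 100, live_rows=1000, deleted_rows=0),
    RowSet("b", 50, 150, live_rows=1000, deleted_rows=0),
    # 99% deleted, but its key range overlaps nothing.
    RowSet("c", 200, 300, live_rows=10, deleted_rows=990),
]

selected = pick_overlapping(rowsets)
print(selected)
```

In this toy setup "c" is never selected even though almost all of its rows
are deleted, which is the situation described above.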

The follow-up work to add a background task to permanently remove deleted
rows is being tracked in https://issues.apache.org/jira/browse/KUDU-1979
(which I just filed).

Mike

On Mon, Apr 24, 2017 at 12:37 PM, Todd Lipcon <t...@cloudera.com> wrote:

> Mike can correct me if wrong, but I think the background task in 1.3 is
> only responsible for removing old deltas, and doesn't do anything to try to
> trigger compactions on rowsets with a high percentage of deleted _rows_.
>
> That's a separate bit of work that hasn't been started yet.
>
> -Todd
>
> On Sat, Apr 22, 2017 at 7:36 PM, Jason Heo <jason.heo....@gmail.com>
> wrote:
>
>> Hi David.
>>
>> Thank you for your reply.
>>
>> I'll try to upgrade to 1.3 this week.
>>
>> Regards,
>>
>> Jason
>>
>> 2017-04-23 2:06 GMT+09:00 <davidral...@gmail.com>:
>>
>>> Hi Jason
>>>
>>>   In Kudu 1.2 if there are compactions happening, they will reclaim
>>> space. Unfortunately the conditions for this to happen don't always
>>> occur (if the portion of the keyspace where the deletions occurred
>>> stopped receiving writes and was already fully compacted, cleanup is
>>> less likely).
>>>   In Kudu 1.3 we added a background task to clean up old data even in
>>> the absence of compactions. Could you upgrade?
>>>
>>> Best
>>> David
>>>
>>
>>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
