Hi all,

I'm writing regarding my enhancement proposal, #10876
<https://github.com/apache/druid/issues/10876>, and the subsequent PR, #10877
<https://github.com/apache/druid/pull/10877>. The issue and PR concern
which unused segments the Druid Coordinator is able to find and kill with
machine-generated kill tasks. Currently, only segments whose interval end
date is in the past (relative to the time the Coordinator is looking for
segments) can be killed automatically. My change also allows unused
segments to be killed when their interval end date is in the future
(relative to when the Coordinator searches for segments to kill).

My team has found that the existing functionality introduces waste in
deepstore and metastore when our users use Druid to build datasources
that span into the future. These datasources are then refreshed
iteratively as future projections change, resulting in unused segments due
to overshadowing (a common occurrence at my org). Before we applied my
proposed change internally, we had built up a lot of unused data in
deepstore and metastore. Since adopting this feature, we have been able to
keep our deepstore and metastore much cleaner. I think this would be a
great thing for others in the community to have access to, so they can
avoid similar data storage pain points.

Unfortunately, it has been quite some time since the PR was created, and
the only code review I've been able to land was from a non-committer
colleague of mine. I fear it may never be taken up without a little extra
push now that it is so far down the open PRs list. My hope is that bringing
up the topic on the dev list will catch the eye of a neutral party who may
want to give it a look.

I'm going to be able to spend a decent amount of time these next few weeks
reviewing open PRs in the Druid project, so I'm more than happy to set up a
"review for a review" type of agreement with someone who is also working on
a new change. Feel free to reach out directly via email or a comment on my
PR if you have something you are working to get reviewed.

Thank you,
Lucas Capistrant
