jasonk000 commented on PR #14642:
URL: https://github.com/apache/druid/pull/14642#issuecomment-1648440123

> Since we already have the interval, I wonder if we even need to fire multiple DELETE statements and if just a single one with the right WHERE clause on start, end, datasource and used would suffice. There are probably some nuances here such as the IntervalMode used while retrieving the unused segments, etc, but nothing that cannot be wired up.

This makes sense to me, though I'm not entirely clear on the segment rules around this. It would change the scope a bit: we currently take in an (unordered) Set, and at this layer it isn't obvious that we hold all the entries in the Interval, so a blanket interval delete might remove unintended files. For best results, I would think the API needs to change from `deleteSegments()` to `deleteSegmentsInInterval()`? (A rough sketch of what I mean is at the bottom of this comment.)

> If we already speed up segment nuke by batching the SQL Metadata deleteSegments, do we still need this?

Yes. Today, if we issue a task to "delete 10 million segments", it locks up the overlord/cluster. Even though S3 and SQL batching will make it much, much faster, that doesn't solve the cluster-stability problem; it just raises the segment count needed to lock things up. Holding this lock for too long brings most overlord-guided activity -- including ingestion -- to a halt, since ingestion tasks can't get segments allocated and are forced to wait. (See the batching sketch below for what I understand the batched path to be.)

> If automatic kill task (druid.coordinator.kill.on) is our paved path, should we encourage users to use that (and set the druid.coordinator.kill.maxSegments, etc) over manually submitting kill task? If user manually submit kill task, they would not have any guardrail that automatic kill task provide (such as too many segments in one kill task, killing interval that they did not intended, datasource whitelist/blacklist, etc).

I agree, though we can easily imagine a situation where a user doesn't realise the automatic kill task exists and needs to do a catch-up (me!), or where a change in requirements to reduce or change the storage configuration means a user no longer needs 24 months of data but only 12. In these situations, a user should be able to issue a task manually and have the system stay stable.

> Do we need the segment nuke action to return the segment

It could, but it doesn't at the moment. Would you use it in `KillTask` or just for logging?

> What happen to the caller of the segment nuke action (i.e. KillTask) if some batches succeeded and some batches failed? The number of failed/succeeded may not be enough and we may need the actual id of the segments right?

This is an unresolved problem today: the SQL delete happens inside a transaction which is committed _before_ the S3 deletes are issued. So if any files fail to delete from S3, the segments are already gone from SQL. I'm not sure there's a reconciliation process. (The ordering is sketched at the bottom of this comment.)

> If we do go ahead with this change, I think not having a new config is better.
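For illustration, here's a minimal sketch of what the interval-scoped delete could look like. This is plain JDBC against the default `druid_segments` schema rather than the actual JDBI wiring, `deleteSegmentsInInterval` is the hypothetical API from above, and it assumes "fully contained" interval semantics -- which is exactly the IntervalMode nuance that would need deciding:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical sketch, not the real Druid code path: delete every unused
// segment row fully contained in [start, end) with a single statement,
// instead of one DELETE per segment id.
public int deleteSegmentsInInterval(Connection conn, String dataSource, String start, String end)
    throws SQLException
{
  // start/end are stored as ISO-8601 strings in druid_segments, so
  // lexicographic comparison lines up with chronological order.
  // "end" is a reserved word in most SQL dialects, hence the quoting.
  final String sql = "DELETE FROM druid_segments"
                     + " WHERE dataSource = ? AND start >= ? AND \"end\" <= ? AND used = false";
  try (PreparedStatement stmt = conn.prepareStatement(sql)) {
    stmt.setString(1, dataSource);
    stmt.setString(2, start);
    stmt.setString(3, end);
    return stmt.executeUpdate(); // returns a row count only -- no segment ids
  }
}
```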
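And this is roughly what I understand "batching the SQL Metadata deleteSegments" to mean -- chunked IN-clause deletes rather than one statement per segment. Batch size, helper names, and the use of Guava's `Lists.partition` are all illustrative, not Druid's actual code:

```java
import com.google.common.collect.Lists;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

// Illustrative only: chunk the ids and issue one DELETE per chunk. Faster
// than row-at-a-time, but the overall kill still holds the task lock for
// the whole run, which is the stability problem described above.
public class BatchedSegmentDelete
{
  private static final int BATCH_SIZE = 100; // illustrative value

  public void deleteSegmentsBatched(Connection conn, List<String> segmentIds) throws SQLException
  {
    for (List<String> batch : Lists.partition(segmentIds, BATCH_SIZE)) {
      final String placeholders = String.join(",", Collections.nCopies(batch.size(), "?"));
      try (PreparedStatement stmt = conn.prepareStatement(
          "DELETE FROM druid_segments WHERE id IN (" + placeholders + ")")) {
        for (int i = 0; i < batch.size(); i++) {
          stmt.setString(i + 1, batch.get(i));
        }
        stmt.executeUpdate();
      }
    }
  }
}
```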
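Finally, the commit-ordering problem from the last answer, in simplified form. The class is a stand-in, not the literal `KillTask` internals, though the two interfaces it leans on are real Druid ones:

```java
import java.util.Set;
import org.apache.druid.indexing.overlord.IndexerMetadataStorageCoordinator;
import org.apache.druid.segment.loading.DataSegmentKiller;
import org.apache.druid.segment.loading.SegmentLoadingException;
import org.apache.druid.timeline.DataSegment;

// Simplified illustration of today's ordering -- not the actual implementation.
class KillOrderingSketch
{
  private final IndexerMetadataStorageCoordinator metadataCoordinator;
  private final DataSegmentKiller segmentKiller;

  KillOrderingSketch(IndexerMetadataStorageCoordinator coordinator, DataSegmentKiller killer)
  {
    this.metadataCoordinator = coordinator;
    this.segmentKiller = killer;
  }

  void kill(Set<DataSegment> segments) throws SegmentLoadingException
  {
    // Step 1: metadata rows are removed and the SQL transaction commits here...
    metadataCoordinator.deleteSegments(segments);

    // Step 2: ...only then are deep-storage files deleted. If any S3 delete
    // fails now, the metadata is already gone, leaving orphaned files with
    // no reconciliation process that I'm aware of.
    for (DataSegment segment : segments) {
      segmentKiller.kill(segment);
    }
  }
}
```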
> Since we already have the interval, I wonder if we even need to fire multiple DELETE statements and if just a single one with the right WHERE clause on start, end, datasource and used would suffice. There are probably some nuances here such as the IntervalMode used while retrieving the unused segments, etc, but nothing that cannot be wired up. This makes sense to me, though I'm not entirely clear on the segment rules around this. If we did this, it would change scope a bit: we take in a (unordered) Set, it's not obvious at this layer to me that we have all the entries in the Interval, so we might delete unintended files. I would think for best results the API needs to change from `deleteSegments()` to `deleteSegmentsInInterval()`? > If we already speed up segment nuke by batching the SQL Metadata deleteSegments, do we still need this? Yes, - today if we issue a task for "delete 10 million segments", it will locks up the overlord/cluster. Even though S3 and SQL batching will make it much much faster, it won't solve the cluster stability, just make the segment count to lock it up bigger. This this lock being held too long brings most overlord guided activity -- including ingestion -- to a halt since ingestion tasks can't get segments allocated and are forced to wait. > If automatic kill task (druid.coordinator.kill.on) is our paved path, should we encourage users to use that (and set the druid.coordinator.kill.maxSegments, etc) over manually submitting kill task? If user manually submit kill task, they would not have any guardrail that automatic kill task provide (such as too many segments in one kill task, killing interval that they did not intended, datasource whitelist/blacklist, etc). I agree, though, we can easily imagine a situation where a user doesn't realise automatic kill task and needs to do a catch-up (me!), or even a change in requirements to reduce or change storage configurations means a user no longer need 24mo but only need 12mo data. In these situations, a user should be able to issue a task and the system stay stable. > Do we need the segment nuke action to return the segment Could be, it doesn't at the moment. Would you use it in `KillTask` or just for logging? > What happen to the caller of the segment nuke action (i.e. KillTask) if some batches succeeded and some batches failed? The number of failed/succeeded may not be enough and we may need the actual id of the segments right? This is an unresolved problem today: the SQL delete happens inside a transaction which is committed _before_ the S3 deletes are issued. So, if any files fail to delete from S3, the segments are still gone from SQL. I'm not sure if there's a reconciliation process. > If we do go ahead with this change, I think not having a new config is better. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org