jasonk000 commented on PR #14642:
URL: https://github.com/apache/druid/pull/14642#issuecomment-1648440123

> Since we already have the interval, I wonder if we even need to fire multiple DELETE statements and if just a single one with the right WHERE clause on start, end, datasource and used would suffice. There are probably some nuances here such as the IntervalMode used while retrieving the unused segments, etc, but nothing that cannot be wired up.

This makes sense to me, though I'm not entirely clear on the segment rules around this. It would change the scope a bit: we currently take in an (unordered) Set, and at this layer it isn't obvious that we hold all the entries in the Interval, so a blanket interval delete might remove unintended files. For best results, I would think the API needs to change from `deleteSegments()` to `deleteSegmentsInInterval()`? (A rough sketch of what I mean is at the bottom of this comment.)

> If we already speed up segment nuke by batching the SQL Metadata deleteSegments, do we still need this?

Yes. Today, if we issue a task to "delete 10 million segments", it locks up the overlord/cluster. Even though S3 and SQL batching will make it much, much faster, that doesn't solve the cluster-stability problem; it just raises the segment count needed to lock things up. Holding this lock for too long brings most overlord-guided activity -- including ingestion -- to a halt, since ingestion tasks can't get segments allocated and are forced to wait. (See the batching sketch below for what I understand the batched path to be.)

> If automatic kill task (druid.coordinator.kill.on) is our paved path, should we encourage users to use that (and set the druid.coordinator.kill.maxSegments, etc) over manually submitting kill task? If user manually submit kill task, they would not have any guardrail that automatic kill task provide (such as too many segments in one kill task, killing interval that they did not intended, datasource whitelist/blacklist, etc).

I agree, though we can easily imagine a situation where a user doesn't realise the automatic kill task exists and needs to do a catch-up (me!), or where a change in requirements to reduce or change the storage configuration means a user no longer needs 24 months of data but only 12. In these situations, a user should be able to issue a task manually and have the system stay stable.

> Do we need the segment nuke action to return the segment

It could, but it doesn't at the moment. Would you use it in `KillTask` or just for logging?

> What happen to the caller of the segment nuke action (i.e. KillTask) if some batches succeeded and some batches failed? The number of failed/succeeded may not be enough and we may need the actual id of the segments right?

This is an unresolved problem today: the SQL delete happens inside a transaction which is committed _before_ the S3 deletes are issued. So if any files fail to delete from S3, the segments are already gone from SQL. I'm not sure there's a reconciliation process. (The ordering is sketched at the bottom of this comment.)

> If we do go ahead with this change, I think not having a new config is better.
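For illustration, here's a minimal sketch of what the interval-scoped delete could look like. This is plain JDBC against the default `druid_segments` schema rather than the actual JDBI wiring, `deleteSegmentsInInterval` is the hypothetical API from above, and it assumes "fully contained" interval semantics -- which is exactly the IntervalMode nuance that would need deciding:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical sketch, not the real Druid code path: delete every unused
// segment row fully contained in [start, end) with a single statement,
// instead of one DELETE per segment id.
public int deleteSegmentsInInterval(Connection conn, String dataSource, String start, String end)
    throws SQLException
{
  // start/end are stored as ISO-8601 strings in druid_segments, so
  // lexicographic comparison lines up with chronological order.
  // "end" is a reserved word in most SQL dialects, hence the quoting.
  final String sql = "DELETE FROM druid_segments"
                     + " WHERE dataSource = ? AND start >= ? AND \"end\" <= ? AND used = false";
  try (PreparedStatement stmt = conn.prepareStatement(sql)) {
    stmt.setString(1, dataSource);
    stmt.setString(2, start);
    stmt.setString(3, end);
    return stmt.executeUpdate(); // returns a row count only -- no segment ids
  }
}
```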
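And this is roughly what I understand "batching the SQL Metadata deleteSegments" to mean -- chunked IN-clause deletes rather than one statement per segment. Batch size, helper names, and the use of Guava's `Lists.partition` are all illustrative, not Druid's actual code:

```java
import com.google.common.collect.Lists;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

// Illustrative only: chunk the ids and issue one DELETE per chunk. Faster
// than row-at-a-time, but the overall kill still holds the task lock for
// the whole run, which is the stability problem described above.
public class BatchedSegmentDelete
{
  private static final int BATCH_SIZE = 100; // illustrative value

  public void deleteSegmentsBatched(Connection conn, List<String> segmentIds) throws SQLException
  {
    for (List<String> batch : Lists.partition(segmentIds, BATCH_SIZE)) {
      final String placeholders = String.join(",", Collections.nCopies(batch.size(), "?"));
      try (PreparedStatement stmt = conn.prepareStatement(
          "DELETE FROM druid_segments WHERE id IN (" + placeholders + ")")) {
        for (int i = 0; i < batch.size(); i++) {
          stmt.setString(i + 1, batch.get(i));
        }
        stmt.executeUpdate();
      }
    }
  }
}
```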
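Finally, the commit-ordering problem from the last answer, in simplified form. The class is a stand-in, not the literal `KillTask` internals, though the two interfaces it leans on are real Druid ones:

```java
import java.util.Set;
import org.apache.druid.indexing.overlord.IndexerMetadataStorageCoordinator;
import org.apache.druid.segment.loading.DataSegmentKiller;
import org.apache.druid.segment.loading.SegmentLoadingException;
import org.apache.druid.timeline.DataSegment;

// Simplified illustration of today's ordering -- not the actual implementation.
class KillOrderingSketch
{
  private final IndexerMetadataStorageCoordinator metadataCoordinator;
  private final DataSegmentKiller segmentKiller;

  KillOrderingSketch(IndexerMetadataStorageCoordinator coordinator, DataSegmentKiller killer)
  {
    this.metadataCoordinator = coordinator;
    this.segmentKiller = killer;
  }

  void kill(Set<DataSegment> segments) throws SegmentLoadingException
  {
    // Step 1: metadata rows are removed and the SQL transaction commits here...
    metadataCoordinator.deleteSegments(segments);

    // Step 2: ...only then are deep-storage files deleted. If any S3 delete
    // fails now, the metadata is already gone, leaving orphaned files with
    // no reconciliation process that I'm aware of.
    for (DataSegment segment : segments) {
      segmentKiller.kill(segment);
    }
  }
}
```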
> Since we already have the interval, I wonder if we even need to fire multiple DELETE statements and if just a single one with the right WHERE clause on start, end, datasource and used would suffice. There are probably some nuances here such as the IntervalMode used while retrieving the unused segments, etc, but nothing that cannot be wired up. This makes sense to me, though I'm not entirely clear on the segment rules around this. If we did this, it would change scope a bit: we take in a (unordered) Set, it's not obvious at this layer to me that we have all the entries in the Interval, so we might delete unintended files. I would think for best results the API needs to change from `deleteSegments()` to `deleteSegmentsInInterval()`? > If we already speed up segment nuke by batching the SQL Metadata deleteSegments, do we still need this? Yes, - today if we issue a task for "delete 10 million segments", it will locks up the overlord/cluster. Even though S3 and SQL batching will make it much much faster, it won't solve the cluster stability, just make the segment count to lock it up bigger. This this lock being held too long brings most overlord guided activity -- including ingestion -- to a halt since ingestion tasks can't get segments allocated and are forced to wait. > If automatic kill task (druid.coordinator.kill.on) is our paved path, should we encourage users to use that (and set the druid.coordinator.kill.maxSegments, etc) over manually submitting kill task? If user manually submit kill task, they would not have any guardrail that automatic kill task provide (such as too many segments in one kill task, killing interval that they did not intended, datasource whitelist/blacklist, etc). I agree, though, we can easily imagine a situation where a user doesn't realise automatic kill task and needs to do a catch-up (me!), or even a change in requirements to reduce or change storage configurations means a user no longer need 24mo but only need 12mo data. In these situations, a user should be able to issue a task and the system stay stable. > Do we need the segment nuke action to return the segment Could be, it doesn't at the moment. Would you use it in `KillTask` or just for logging? > What happen to the caller of the segment nuke action (i.e. KillTask) if some batches succeeded and some batches failed? The number of failed/succeeded may not be enough and we may need the actual id of the segments right? This is an unresolved problem today: the SQL delete happens inside a transaction which is committed _before_ the S3 deletes are issued. So, if any files fail to delete from S3, the segments are still gone from SQL. I'm not sure if there's a reconciliation process. > If we do go ahead with this change, I think not having a new config is better. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org