capistrant opened a new issue, #12526:
URL: https://github.com/apache/druid/issues/12526

   ### Motivation
   
   Druid currently provides 3 guard rails to the Coordinator when it comes to 
locating and killing unused segments.
   
   1. Enabling/Disabling automated segment killing
     * A binary toggle that determines whether automated killing by the coordinator can happen at all
     * Allows the cluster operator to decide if there should be any automated 
cleanup of unused segments at all
   2. The idea of "killable datasources"
     * A cluster operator can identify a subset of the overall datasource set as killable. To be killable means that a datasource's unused segments can be permanently killed
     * Allows the cluster operator to insulate datasources from automated 
cleanup if needed
   3. `druid.coordinator.kill.durationToRetain` and 
`druid.coordinator.kill.ignoreDurationToRetain`
     * Configuration used to create a protected date, `now - druid.coordinator.kill.durationToRetain`; any segment whose end date is after this date can't be deleted automatically
     * This interval can be ignored completely by setting 
`druid.coordinator.kill.ignoreDurationToRetain` to `true`
     * The main benefit of this configuration is that it allows the operator to keep unused data around in deep storage in case a load rule change matches these unused segments, making them usable again. This prevents end users from having to re-ingest the unused data, as they would if it had been automatically cleaned up.
     * Note that this includes unused overshadowed segments, which wouldn't be usable again even if the load rules changed (at least not easily, as far as I am aware)
   
   I am of the opinion that these aforementioned guard rails are inadequate in two areas:
   
   The most glaring omission is a buffer window between a segment being marked unused and being permanently deleted from Druid. As of now, if an unused segment is not protected by 1, 2, or 3 above, it is liable to be killed immediately after being marked unused, should the `KillUnusedSegments` duty run on the coordinator at that moment. The window between marking unused and killing is indeterminate, bounded only by the instant of marking and the configured period for how often the killing logic runs. From an operator's perspective, killing must therefore be treated as immediate, because it can be. This same idea is raised in https://github.com/apache/druid/issues/9889. The proposal is similar to a trash folder in HDFS: it exists to prevent user error from causing unwanted data loss.
   
   Another motivating factor is the retention of unused overshadowed segments within the durationToRetain interval. It is not clear to me that there is any straightforward mechanism for bringing an overshadowed segment back to life once it is marked unused. If that is true, these segments should be excluded from durationToRetain protection to prevent the buildup of data that cannot become used again, even if the load rule chain for the parent datasource is changed to include the segment's interval.
   
   ### Proposed changes
   
   Adding a `last_used` column to the druid_segments table. Whenever the boolean `used` column is updated, `last_used` is set to the current timestamp. The coordinator then uses this `last_used` column in conjunction with a new `druid.coordinator.kill.bufferPeriod` configuration to filter out segments when looking for segments to kill. The decision flow would now be: is kill enabled --> is the datasource killable --> is the segment's `last_used` date beyond the bufferPeriod --> does the segment end date pre-date the timestamp created by `now - druid.coordinator.kill.durationToRetain`
   
   Overshadowed segments would be exempted from the last decision point mentioned above:
   > does the segment end date pre-date the timestamp created by `now - druid.coordinator.kill.durationToRetain`
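   
   The full decision flow, including the overshadowed-segment exemption, could look something like the following sketch. The names, types, and structure here are illustrative assumptions for this proposal, not the actual coordinator implementation:
   
```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class Segment:
    datasource: str
    end: datetime            # segment interval end date
    last_used: datetime      # proposed new column: when `used` last changed
    overshadowed: bool

def is_killable(segment, *, kill_enabled, killable_datasources,
                buffer_period, duration_to_retain):
    now = datetime.now(timezone.utc)
    # 1. Is automated killing enabled at all?
    if not kill_enabled:
        return False
    # 2. Is the datasource in the killable set?
    if segment.datasource not in killable_datasources:
        return False
    # 3. Proposed: has the segment been unused for longer than bufferPeriod?
    if now - segment.last_used < buffer_period:
        return False
    # 4. durationToRetain, with the proposed exemption for overshadowed
    #    segments, which can't realistically become used again.
    if not segment.overshadowed and segment.end > now - duration_to_retain:
        return False
    return True
```
   
   Under this sketch, an overshadowed segment that has cleared the buffer period is killable even if its end date falls inside the durationToRetain window.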
   
   ### Rationale
   
   #### Alternative Implementation
   
   There is some discussion in https://github.com/apache/druid/pull/10877#discussion_r864088454 regarding the desired solution. An alternative to the schema change is embedding the `last_used` date in the payload stored per segment in the metadata store. The upside is that this removes the schema change required to upgrade an existing cluster. The downside is the need to extract the embedded date from the payload when evaluating unused segments for potential killing: the filtering can no longer be pushed down to the metadata query, but must instead live in Druid code.
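   
   To illustrate the trade-off: with the date embedded in the payload, every candidate row's blob must be deserialized and filtered in application code, whereas a dedicated column lets the same check run as a SQL predicate. A hedged sketch of the payload-side filtering (the `lastUsed` field name is hypothetical, not an existing Druid payload field):
   
```python
import json
from datetime import datetime, timedelta, timezone

def eligible_by_buffer_period(payload_json, buffer_period):
    """Check that must run in application code when last_used lives in
    the payload blob: deserialize, parse the timestamp, compare.

    With a dedicated last_used column, this whole check could instead be
    a SQL predicate such as `WHERE last_used <= :now - :bufferPeriod`.
    """
    payload = json.loads(payload_json)
    # "lastUsed" is a hypothetical field name for this proposal.
    last_used = datetime.fromisoformat(payload["lastUsed"])
    return datetime.now(timezone.utc) - last_used >= buffer_period

old_payload = json.dumps({"lastUsed": "2020-01-01T00:00:00+00:00"})
eligible_by_buffer_period(old_payload, timedelta(days=30))  # True
```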
   
   ### Operational impact
   
   This is a breaking change for upgrades. The coordinator will now expect a different druid_segments schema, meaning a cluster operator will need to update the schema and populate the new column for all existing rows before upgrading the coordinator. There should be no impediment to rolling downgrades functionality-wise; there will, however, be wasted storage in the metastore due to the then-unused column.
   
   To mitigate the disruption to existing clusters, we should provide scripts to alter the metastore for all supported metadata storage platforms. (Alternatively, we could allow Druid code to alter the schema on startup in the new version, but this would require DDL permissions for the metadata storage user.)
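   
   As a rough illustration of the migration step, shown here against SQLite only for portability (the real scripts would target each supported metadata store, and the column name and backfill policy are assumptions of this proposal):
   
```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Minimal stand-in for the existing druid_segments table.
cur.execute("CREATE TABLE druid_segments (id TEXT PRIMARY KEY, used INTEGER)")
cur.execute("INSERT INTO druid_segments VALUES ('seg1', 0), ('seg2', 1)")

# 1. Add the proposed column.
cur.execute("ALTER TABLE druid_segments ADD COLUMN last_used TEXT")

# 2. Backfill existing rows so the coordinator has a value to compare
#    against; using the migration time is the conservative choice, since
#    it delays killing by at most one bufferPeriod.
cur.execute("UPDATE druid_segments SET last_used = datetime('now')")
conn.commit()

rows = cur.execute("SELECT id, last_used FROM druid_segments").fetchall()
```
   
   Backfilling with the migration timestamp rather than NULL keeps the kill query simple and errs on the side of retaining data slightly longer.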
   
   ### Test plan
   
   The new and changed logic for finding segments to kill should be automatable by building on top of our existing integration test suite.
   
   The migration path will also need a testing plan, which can be nailed down after implementation.
   

