kfaraz commented on PR #19016:
URL: https://github.com/apache/druid/pull/19016#issuecomment-3989871583

   @GWphua , as @cecemei points out, there are some concerns with this 
implementation:
   
   - It works only with SEGMENT locking.
   - Segment locking is known to have issues and has already been deprecated 
and will be removed fairly soon.
   - Wouldn't work with concurrent append and replace.
   - Wouldn't work with auto-compaction and only manually submitted compaction 
tasks (I see that you have already mentioned thinking how to integrate it with 
auto-compaction).
   
   <hr>
   
   I would advise revising the implementation in this PR to make use of the 
REPLACE/APPEND (i.e. concurrent supported locks), and the upgrade mechanism 
similar to what is being done in #19059 (revision of #18996).
   Once #19059 has been merged, you can update your PR to use those building 
blocks (I think minimal changes would be needed since everything would already 
be in place).
   
   In short, the approach would be something like this:
   - Submit a compaction task (with auto-compaction or manual) with an input 
spec that specifies an interval as well as a set of "uncompacted" segment IDs
   - Ensure that the compaction task holds a `REPLACE` lock (set 
`useConcurrentLocks: true` in the task context). Do not use SEGMENT lock.
   - Identify all the segments in the interval. The segments that are already 
compacted should be marked to be "upgraded" once the task finishes
   - Launch compaction task for only the uncompacted segments
       - No need to update the `AbstractTask.findSegmentsToLock()` since we 
will always lock the whole interval
       - Just passing the correct segmentIds to the `DruidInputSource` should 
suffice
   - Upon completion, segments will be upgraded automatically and queries will 
start seeing new compacted data.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to