9aman opened a new pull request, #15002:
URL: https://github.com/apache/pinot/pull/15002
This commit implements a cleanup mechanism for pauseless tables that:
1. Takes error segments as input and finds the oldest error segment (lowest
sequence number) in each partition.
2. For each partition, deletes every segment whose sequence number is greater
than or equal to that of its oldest error segment, including the error
segment itself.
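The two steps above can be sketched in a few lines of Java. This is only an illustration of the selection rule, not the code in this PR: `SegmentRef`, `oldestErrorPerPartition`, and `segmentsToDelete` are hypothetical names, and a segment is reduced to a (partition, sequence number) pair. The sample data mirrors the Partition 1 and Partition 2 layouts in this description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PauselessCleanupSketch {
  /** Hypothetical stand-in for a segment: partition id plus sequence number. */
  public record SegmentRef(int partition, int sequence) {}

  /** Step 1: find the lowest error sequence number in each partition. */
  public static Map<Integer, Integer> oldestErrorPerPartition(List<SegmentRef> errorSegments) {
    Map<Integer, Integer> oldest = new HashMap<>();
    for (SegmentRef s : errorSegments) {
      // Keep the smaller of the stored value and this segment's sequence number.
      oldest.merge(s.partition(), s.sequence(), Integer::min);
    }
    return oldest;
  }

  /** Step 2: select every segment with sequence >= its partition's oldest error. */
  public static List<SegmentRef> segmentsToDelete(List<SegmentRef> all, List<SegmentRef> errors) {
    Map<Integer, Integer> oldest = oldestErrorPerPartition(errors);
    List<SegmentRef> toDelete = new ArrayList<>();
    for (SegmentRef s : all) {
      Integer cutoff = oldest.get(s.partition());
      // Partitions without any error segment have no cutoff and are left untouched.
      if (cutoff != null && s.sequence() >= cutoff) {
        toDelete.add(s);
      }
    }
    return toDelete;
  }

  public static void main(String[] args) {
    List<SegmentRef> all = new ArrayList<>();
    for (int seq = 0; seq <= 8; seq++) {
      all.add(new SegmentRef(1, seq)); // partition 1: seq 0..8
    }
    for (int seq = 0; seq <= 5; seq++) {
      all.add(new SegmentRef(2, seq)); // partition 2: seq 0..5
    }
    List<SegmentRef> errors = List.of(new SegmentRef(1, 5), new SegmentRef(2, 3));
    System.out.println(segmentsToDelete(all, errors));
  }
}
```

With this input the sketch selects seq 5 through 8 in partition 1 and seq 3 through 5 in partition 2, leaving everything below each cutoff in place.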
API: DELETE /resetPauselessTable/{tableName}
Flags: ?type=[OFFLINE|REALTIME]&retention=[period]&segments=[list]
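As a rough illustration of calling this endpoint, the sketch below only builds the DELETE request with the JDK's `java.net.http` API and does not send it. Several details are assumptions rather than facts from this PR: the controller address `http://localhost:9000`, the table and segment names, and the idea that `segments` takes a comma-separated list. The `retention` flag is left out because its semantics are not described here.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class ResetPauselessTableRequestSketch {
  /**
   * Builds (but does not send) a DELETE request for /resetPauselessTable/{tableName}.
   * Assumption: the segments flag is a comma-separated list of segment names.
   */
  public static HttpRequest buildRequest(
      String controllerBaseUrl, String tableName, String type, List<String> segments) {
    // Encode the joined list so the comma separator is safe inside a query string.
    String segmentsParam =
        URLEncoder.encode(String.join(",", segments), StandardCharsets.UTF_8);
    URI uri = URI.create(controllerBaseUrl + "/resetPauselessTable/" + tableName
        + "?type=" + type + "&segments=" + segmentsParam);
    return HttpRequest.newBuilder(uri).DELETE().build();
  }

  public static void main(String[] args) {
    // Hypothetical controller address, table name, and error segment names.
    HttpRequest request = buildRequest(
        "http://localhost:9000",
        "myTable",
        "REALTIME",
        List.of("myTable__1__5__20250101T0000Z", "myTable__2__3__20250101T0000Z"));
    System.out.println(request.method() + " " + request.uri());
  }
}
```

Sending the request would be a matter of passing it to `HttpClient.send`, which is omitted here so the sketch has no side effects on a real cluster.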
Timeline Visualization for Partition 1:
(Each █ represents a segment, 🔴 marks error segments)
seq=0  seq=1  seq=2  seq=3  seq=4  seq=5  seq=6  seq=7  seq=8
  ▼      ▼      ▼      ▼      ▼      ▼      ▼      ▼      ▼
  █      █      █      █      █      █      █      █      █
[Safe Zone - Will be preserved]   |  [Deletion Zone - Will be removed]
                                  |
                                  ▼
                           Error at seq=5
                           (oldest error)
Segment Map:
  0   1   2   3   4   5   6   7   8    <- Sequence Numbers
  █   █   █   █   █   █   █   █   █    <- Segments
  ✓   ✓   ✓   ✓   ✓   ✂️  ✂️  ✂️  ✂️   <- Action
                      🔴               <- Error
Timeline Visualization for Partition 2:
seq=0  seq=1  seq=2  seq=3  seq=4  seq=5
  ▼      ▼      ▼      ▼      ▼      ▼
  █      █      █      █      █      █
[Safe Zone]         |  [Deletion Zone - Will be removed]
                    |
                    ▼
             Error at seq=3
Segment Map:
  0   1   2   3   4   5    <- Sequence Numbers
  █   █   █   █   █   █    <- Segments
  ✓   ✓   ✓   ✂️  ✂️  ✂️   <- Action
              🔴           <- Error
Legend:
█ - Existing segment
🔴 - Error segment
✂️ - Will be deleted (including error segment)
✓ - Will be preserved
Key Points:
┌──────────────────────────────────────┐
│ • Everything >= error seq is deleted │
│ • Everything < error seq is kept     │
│ • Each partition handled separately  │
│ • Error segment itself is deleted    │
└──────────────────────────────────────┘