J-HowHuang opened a new pull request, #15953:
URL: https://github.com/apache/pinot/pull/15953
## Description
A risk of data loss was found for pauseless tables during rebalance when
`downtime=true` or `minAvailableReplicas=0`.
If a segment is being moved and has not yet been uploaded to deep store,
premature deletion could cause irrecoverable data loss.
This PR introduces pre-checks and warnings as a workaround to mitigate such
scenarios. When "pauseless ingestion" is enabled and the rebalance parameters
have `downtime=true` or `minAvailableReplicas=0`, the table rebalance
pre-check now runs additional safety checks and emits warnings to prevent
potential data loss.
### Key Changes
- **Enhance Pre-Check Logic:**
  - Add warnings in the pre-check item `"rebalanceConfigOptions"` if:
    - Replication is 1 for a pauseless table (the rebalance inevitably needs
      downtime, which risks data loss).
    - `downtime=true` or `minAvailableReplicas=0` is set for a pauseless table.
- **Testing:**
  - Add/extend tests in `TableRebalancerClusterStatelessTest` to cover the new
    warning scenarios and validate the pre-check status and messages for
    pauseless tables.
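
The warning conditions listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name, method names, and warning messages are hypothetical, and the handling of negative `minAvailableReplicas` values (treated here as "maximum replicas allowed to be unavailable") is an assumption about how the parameter is resolved.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the pauseless pre-check warnings; not the actual Pinot pre-checker. */
public class PauselessRebalancePreCheckSketch {

  /**
   * Resolves how many replicas are kept available during the rebalance.
   * Assumption: a negative value means "max replicas allowed to be unavailable".
   */
  static int effectiveMinAvailableReplicas(int minAvailableReplicas, int replication) {
    if (minAvailableReplicas >= 0) {
      return Math.min(minAvailableReplicas, replication);
    }
    return Math.max(replication + minAvailableReplicas, 0);
  }

  /** Collects the warnings for the "rebalanceConfigOptions" pre-check item (messages are illustrative). */
  static List<String> pauselessWarnings(boolean pauselessEnabled, int replication, boolean downtime,
      int minAvailableReplicas) {
    List<String> warnings = new ArrayList<>();
    if (!pauselessEnabled) {
      // The extra checks only apply to pauseless tables.
      return warnings;
    }
    if (replication == 1) {
      // With a single replica, moving a segment inevitably takes it offline.
      warnings.add("Replication is 1 for a pauseless table; rebalance needs downtime and may lose data");
    }
    if (downtime || effectiveMinAvailableReplicas(minAvailableReplicas, replication) == 0) {
      // No replica is guaranteed to stay up, so a segment not yet in deep store could be deleted.
      warnings.add("Downtime or minAvailableReplicas=0 on a pauseless table may cause data loss");
    }
    return warnings;
  }

  public static void main(String[] args) {
    // Example: pauseless table, replication 2, downtime enabled.
    pauselessWarnings(true, 2, true, 1).forEach(System.out::println);
  }
}
```

Returning a list of messages rather than a boolean keeps both conditions visible when they apply at the same time, for example a table with replication 1 that is also rebalanced with `downtime=true`.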
## Tests
### Case 1: Pauseless table with `RF=1 -> RF=1`, rebalanced from 2 servers to 1 server, `downtime=false`, `minAvailableReplicas=-1`

### Case 2: Pauseless table with `RF=1 -> RF=2`, rebalanced from 1 server to 2 servers, `downtime=true`

### Case 3: Pauseless table with `RF=1 -> RF=2`, rebalanced from 1 server to 2 servers, `minAvailableReplicas=-2`

### Case 4: Pauseless table with `RF=1 -> RF=2`, rebalanced from 1 server to 2 servers, `minAvailableReplicas=-1`

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]