tkhurana opened a new pull request, #2433:
URL: https://github.com/apache/phoenix/pull/2433

   ## Summary
   
   Rewrites the `ReplicationLog` writer rotation mechanism to be lock-free and 
asynchronous, improving throughput on the disruptor consumer hot path.
   
   - **Lock-free writer swap**: Replaced `ReentrantLock`-based synchronous 
`rotateLog()` with an `AtomicReference<LogFileWriter> pendingWriter` staging 
pattern. `LogRotationTask` creates new writers on a background thread and 
stages them in `pendingWriter`. The disruptor consumer thread atomically drains 
the staged writer inside `apply()` before each action, replays any unsynced 
appends, and continues.
   - **Async close**: Old writers are submitted to a dedicated `closeExecutor` 
for background close, keeping the hot path unblocked.
   - **On-demand size rotation**: Instead of checking size under a lock in 
`getWriter()`, `requestRotationIfNeeded()` is called post-action and submits a 
`LogRotationTask` to the executor if the size threshold is exceeded. An 
`AtomicBoolean rotationRequested` prevents duplicate submissions.
   - **Round-boundary aligned scheduling**: The rotation executor's 
`initialDelay` is computed via `computeInitialDelay()` to align with 
replication round boundaries. The rotation interval is now derived from 
`replicationRoundDurationSeconds` rather than a separate config key.
   - **Simplified retry logic in `apply()`**: First failure retries on the same 
writer (transient). Second failure triggers `requestRotation()` to create a new 
writer on the background thread. The retry sleep gives the background thread 
time to stage it, and the next attempt drains it.
   - **Removed `RotationReason` enum and per-reason metrics**: Eliminated 
`TIME`/`SIZE`/`ERROR` rotation reason tracking. Only total rotation count and 
failure count remain.
   - **`closed` field upgraded to `AtomicBoolean`** for safe concurrent access 
without locks.
   
   ### Test changes
   
   - Added `recreateLogGroup()`, `overrideConf()`, and `waitForRotationTick()` 
helpers to `ReplicationLogBaseTest`
   - `TestableLog` overrides `startRotationExecutor()` with a full-round 
initial delay to prevent flaky boundary-related failures
   - 8 new tests covering: mid-batch replay, undrained pending writer 
replacement, staged-writer pickup on retry, idle + lease recovery, replay 
failure retry, error-recovery rotation, on-demand rotation not suppressing 
scheduled ticks, round-boundary alignment
   - New `ReplicationLogTest` with unit tests for `computeInitialDelay`
   - Updated existing tests to use `forceRotation()` instead of removed 
`rotateLog()`
   
   ## Test plan
   
   - [ ] `mvn test -pl phoenix-core -Dtest=ReplicationLogTest`
   - [ ] `mvn test -pl phoenix-core -Dtest=ReplicationLogGroupTest`
   - [ ] `mvn test -pl phoenix-core -Dtest=ReplicationLogDiscoveryForwarderTest`
   - [ ] Verify no regressions in other replication tests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to