Tanuj Khurana created PHOENIX-7846:
--------------------------------------
Summary: Bound rotation replay cost for large commit batches
Key: PHOENIX-7846
URL: https://issues.apache.org/jira/browse/PHOENIX-7846
Project: Phoenix
Issue Type: Sub-task
Reporter: Tanuj Khurana
Assignee: Tanuj Khurana
Problem:
ReplicationLog maintains a currentBatch that accumulates every successful
append and is cleared only on an explicit sync() call. On writer rotation
mid-batch, replayCurrentBatch() re-appends every record in the batch onto the
new writer. For workloads with many appends between explicit syncs, the replay
cost scales linearly with the number of records appended since the last
explicit sync.
There is a pre-existing implicit durability point: LogFileFormatWriter.append()
checks the in-memory block size after each append and, when the block hits
maxBlockSize (default 1 MB), triggers an internal sync() that flushes the block
to HDFS. Records up to that point are durable. However, this information does
not propagate back to ReplicationLog.append(), so currentBatch keeps growing
past these durability points.
For example, with a 10k-record batch (1 KB records, 1 MB block size): blocks
fill every ~1000 records, but currentBatch grows to 10,000. Rotation at record
9,500 replays all 9,500 records, even though records 1–9,000 are already
durable in completed blocks on the old writer's file.
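The arithmetic in this example can be checked with a small sketch. The class
and method names below are hypothetical and not part of Phoenix; the sketch
assumes fixed-size records, decimal units (1 KB = 1,000 bytes), and no
per-record framing overhead, so the block boundary falls on exactly every
1,000th record.

```java
public class ReplayCostExample {
    /** Records that sit in completed (already flushed) blocks at rotation time. */
    static long durableRecords(long recordsAppended, long recordsPerBlock) {
        return (recordsAppended / recordsPerBlock) * recordsPerBlock;
    }

    /** Records in the last partial block, i.e. the ones that actually need replay. */
    static long recordsNeedingReplay(long recordsAppended, long recordsPerBlock) {
        return recordsAppended - durableRecords(recordsAppended, recordsPerBlock);
    }

    public static void main(String[] args) {
        long recordSizeBytes = 1_000L;       // assumed: 1 KB, decimal
        long maxBlockSizeBytes = 1_000_000L; // assumed: 1 MB, decimal
        long recordsPerBlock = maxBlockSizeBytes / recordSizeBytes;
        long rotationAt = 9_500L;

        System.out.println("records per block:     " + recordsPerBlock);
        System.out.println("replayed today:        " + rotationAt);
        System.out.println("already durable:       " + durableRecords(rotationAt, recordsPerBlock));
        System.out.println("actually needs replay: " + recordsNeedingReplay(rotationAt, recordsPerBlock));
    }
}
```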
Solution:
Change LogFile.Writer.append() to return a boolean indicating whether a
block-full sync occurred. Propagate this signal through LogFileFormatWriter →
LogFileWriter → ReplicationLog.append(). When the signal is true, clear
currentBatch, since all records up to this point are durable and do not need
replay.
After this change, replay on rotation is proportional to the last partial block
(bounded by maxBlockSize), not the full inter-sync window. Using the same
example: rotation at record 9,500 replays only ~500 records instead of 9,500.
Durability semantics are unchanged: this only leverages an existing durability
point that was previously not propagated.
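The proposed signal can be illustrated with a minimal, self-contained model.
This is a sketch only and not the Phoenix implementation: Writer, BlockWriter,
Log, and rotate() are hypothetical stand-ins for LogFile.Writer,
LogFileFormatWriter, ReplicationLog, and the rotation path, and the toy writer
only counts bytes instead of writing to HDFS.

```java
import java.util.ArrayList;
import java.util.List;

public class BoundedReplaySketch {
    /** Hypothetical stand-in for LogFile.Writer with the proposed boolean return. */
    interface Writer {
        /** Returns true if this append filled the block and triggered a sync. */
        boolean append(byte[] record);
    }

    /** Toy writer: "syncs" whenever the in-memory block reaches maxBlockSize bytes. */
    static final class BlockWriter implements Writer {
        final int maxBlockSize;
        int blockBytes = 0;
        int syncs = 0;

        BlockWriter(int maxBlockSize) {
            this.maxBlockSize = maxBlockSize;
        }

        @Override
        public boolean append(byte[] record) {
            blockBytes += record.length;
            if (blockBytes >= maxBlockSize) {
                blockBytes = 0; // block flushed; a real writer would write it out here
                syncs++;
                return true;
            }
            return false;
        }
    }

    /** Toy stand-in for ReplicationLog: tracks records that are not yet durable. */
    static final class Log {
        Writer writer;
        final List<byte[]> currentBatch = new ArrayList<>();

        Log(Writer writer) {
            this.writer = writer;
        }

        void append(byte[] record) {
            boolean blockSynced = writer.append(record);
            currentBatch.add(record);
            if (blockSynced) {
                // Everything appended so far, including this record, is durable.
                currentBatch.clear();
            }
        }

        /** Switches to a new writer and replays the non-durable tail; returns the replay count. */
        int rotate(Writer newWriter) {
            List<byte[]> pending = new ArrayList<>(currentBatch);
            currentBatch.clear();
            writer = newWriter;
            for (byte[] record : pending) {
                append(record);
            }
            return pending.size();
        }
    }

    public static void main(String[] args) {
        Log log = new Log(new BlockWriter(1_000_000));
        byte[] record = new byte[1_000];
        for (int i = 0; i < 9_500; i++) {
            log.append(record);
        }
        int replayed = log.rotate(new BlockWriter(1_000_000));
        System.out.println("records replayed on rotation: " + replayed);
    }
}
```

In this model the batch never holds more than one block's worth of records, so
the replay performed in rotate() is bounded by maxBlockSize regardless of how
many appends occur between explicit syncs.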