Tanuj Khurana created PHOENIX-7846:
--------------------------------------

             Summary: Bound rotation replay cost for large commit batches
                 Key: PHOENIX-7846
                 URL: https://issues.apache.org/jira/browse/PHOENIX-7846
             Project: Phoenix
          Issue Type: Sub-task
            Reporter: Tanuj Khurana
            Assignee: Tanuj Khurana


Problem:

ReplicationLog maintains a currentBatch which accumulates every successful
append and clears only on an explicit sync() call. On writer rotation
mid-batch, replayCurrentBatch() re-appends every record in the batch onto the
new writer. For workloads with many appends between explicit syncs, the replay
cost scales linearly with batch size.

There is a pre-existing implicit durability point: LogFileFormatWriter.append() 
checks the in-memory block size after each append and, when the block hits 
maxBlockSize (default 1 MB), triggers an internal sync() that flushes the block 
to HDFS. Records up to that point are durable. However, this information does 
not propagate back to ReplicationLog.append(), so currentBatch keeps growing 
past these durability points. 

For example, with a 10k-record batch (1 KB records, 1 MB block size): blocks
fill every ~1000 records, but currentBatch grows to 10,000. Rotation at record
9,500 replays all 9,500 records, even though records 1–9,000 are already
durable in completed blocks on the old writer's file.
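
The arithmetic behind this example can be checked with a few lines of Java. This is only an illustrative sketch: the class and method names are hypothetical, and it uses the round figure of ~1000 records per block from the example above, ignoring any per-record or per-block framing overhead.

```java
public class ReplayCostToday {

    // Records that already sit in completed (flushed) blocks, assuming a block
    // fills after exactly recordsPerBlock appends. Hypothetical helper.
    static long durableRecords(long appended, long recordsPerBlock) {
        return (appended / recordsPerBlock) * recordsPerBlock;
    }

    public static void main(String[] args) {
        long recordsPerBlock = 1_000; // ~1 MB block / ~1 KB record (approximation)
        long appended = 9_500;        // appends since the last explicit sync()

        long durable = durableRecords(appended, recordsPerBlock);

        // currentBatch is cleared only on explicit sync(), so all appends replay.
        System.out.println("replayed on rotation: " + appended);                  // 9500
        System.out.println("already durable:      " + durable);                   // 9000
        System.out.println("needing replay:       " + (appended - durable));      // 500
    }
}
```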
                                                                                
                                                                              
Solution:

Change LogFile.Writer.append() to return a boolean indicating whether a
block-full sync occurred. Propagate this signal through LogFileFormatWriter →
LogFileWriter → ReplicationLog.append(). When the signal is true, clear
currentBatch: all records up to this point are durable and do not need replay.

After this change, replay on rotation is proportional to the last partial block
(bounded by maxBlockSize), not the full inter-sync window. Using the same
example: rotation at record 9,500 replays only ~500 records instead of 9,500.

No change to durability semantics: this only leverages an existing durability
point that was previously not propagated.
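
A minimal, self-contained sketch of the proposed signal follows. It is not the Phoenix code: SketchWriter and SketchLog are hypothetical stand-ins for the writer chain and ReplicationLog, the flush to HDFS is reduced to resetting a byte counter, and records are assumed to be exactly 1000 bytes so that a 1,000,000-byte block fills every 1000 appends.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockSyncSignalSketch {

    /** Hypothetical stand-in for the writer: accumulates records into a block. */
    static class SketchWriter {
        private final int maxBlockSize;
        private int blockBytes = 0;

        SketchWriter(int maxBlockSize) {
            this.maxBlockSize = maxBlockSize;
        }

        /** Returns true if this append filled the block and triggered a sync. */
        boolean append(byte[] record) {
            blockBytes += record.length;
            if (blockBytes >= maxBlockSize) {
                blockBytes = 0; // stands in for flushing the block to storage
                return true;
            }
            return false;
        }
    }

    /** Hypothetical stand-in for ReplicationLog. */
    static class SketchLog {
        private final int maxBlockSize;
        private SketchWriter writer;
        private final List<byte[]> currentBatch = new ArrayList<>();

        SketchLog(int maxBlockSize) {
            this.maxBlockSize = maxBlockSize;
            this.writer = new SketchWriter(maxBlockSize);
        }

        void append(byte[] record) {
            boolean blockSynced = writer.append(record);
            currentBatch.add(record);
            if (blockSynced) {
                // Everything appended so far is durable; nothing left to replay.
                currentBatch.clear();
            }
        }

        /** Explicit sync: the batch is durable, so it no longer needs replay. */
        void sync() {
            currentBatch.clear();
        }

        /** Rotates to a fresh writer, replays the pending batch, returns its size. */
        int rotate() {
            List<byte[]> pending = new ArrayList<>(currentBatch);
            currentBatch.clear();
            writer = new SketchWriter(maxBlockSize);
            for (byte[] record : pending) {
                append(record);
            }
            return pending.size();
        }
    }

    public static void main(String[] args) {
        SketchLog log = new SketchLog(1_000_000);
        byte[] record = new byte[1_000];
        for (int i = 0; i < 9_500; i++) {
            log.append(record);
        }
        System.out.println("records replayed on rotation: " + log.rotate()); // 500
    }
}
```

In this sketch the batch is cleared after the record that filled the block has been added, so that record is treated as part of the durable block rather than carried into the next replay window.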



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
