Hari Krishna Dara created PHOENIX-7759:
------------------------------------------

             Summary: Preserve buffered mutations when batch size limit is 
exceeded
                 Key: PHOENIX-7759
                 URL: https://issues.apache.org/jira/browse/PHOENIX-7759
             Project: Phoenix
          Issue Type: New Feature
            Reporter: Hari Krishna Dara
            Assignee: Hari Krishna Dara


h3. Summary
When applications {{UPSERT}} multiple rows with deferred commit, mutations 
accumulate in client-side {{MutationState}}. This occurs when:
* Using {{executeUpdate()}} with {{autoCommit=false}}
* Using {{addBatch()}} with {{executeBatch()}} (regardless of {{autoCommit}} 
value)

Currently, when the configured limit ({{phoenix.mutate.maxSize}} or 
{{phoenix.mutate.maxSizeBytes}}) is reached, Phoenix clears all buffered 
mutations and throws an exception, causing data loss and requiring applications 
to restart batch processing from the beginning.

h3. Problem
Applications have no opportunity to commit partial progress when limits are 
reached. Workarounds like setting excessively large limits or implementing 
custom batching heuristics are either risky or inefficient.

h3. Solution
Introduce a new configuration property 
{{phoenix.mutate.preserveOnLimitExceeded}} (default: {{false}}) that, when 
enabled:
1. Performs a pre-check before joining mutations to detect whether the limits 
would be exceeded
2. Throws a new {{MutationLimitReachedException}} without clearing the existing 
buffered mutations
3. For {{executeBatch()}}, handles the above exception by trimming the batch to 
contain only unprocessed items and translating the exception into a new 
{{MutationLimitBatchException}} that captures the "processed count"

This allows applications to commit existing mutations and continue processing 
from where they left off, effectively providing the ability to "dynamically 
size" the batch.
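
A minimal sketch of the commit-and-continue pattern this would enable, assuming the pre-check semantics described above. {{SketchMutationBuffer}} is a toy in-memory stand-in for the client-side buffer, not Phoenix's {{MutationState}}, and the exception class is redefined locally only so the example compiles on its own. The row-count limit of 4 and all helper names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in for the proposed exception (assumption: the real class ships with Phoenix).
class MutationLimitReachedException extends Exception {
    MutationLimitReachedException(String message) { super(message); }
}

// Hypothetical toy buffer that models only the "pre-check, then preserve" behavior.
class SketchMutationBuffer {
    private final int maxSize;
    private final List<String> buffered = new ArrayList<>();
    private final List<String> committed = new ArrayList<>();
    int commitCount = 0;

    SketchMutationBuffer(int maxSize) {
        if (maxSize < 1) throw new IllegalArgumentException("maxSize must be >= 1");
        this.maxSize = maxSize;
    }

    // Pre-check: reject the row before it is joined, leaving the buffer untouched.
    void upsert(String row) throws MutationLimitReachedException {
        if (buffered.size() + 1 > maxSize) {
            throw new MutationLimitReachedException("limit of " + maxSize + " rows reached");
        }
        buffered.add(row);
    }

    // Flush everything buffered so far.
    void commit() {
        committed.addAll(buffered);
        buffered.clear();
        commitCount++;
    }

    int bufferedCount() { return buffered.size(); }
    List<String> committed() { return committed; }
}

class PreserveOnLimitSketch {
    // Application loop: on limit, commit what was preserved and retry the same row.
    static SketchMutationBuffer load(List<String> rows, int maxSize) {
        SketchMutationBuffer buf = new SketchMutationBuffer(maxSize);
        int i = 0;
        while (i < rows.size()) {
            try {
                buf.upsert(rows.get(i));
                i++; // advance only once the row is safely buffered
            } catch (MutationLimitReachedException e) {
                buf.commit(); // buffered rows survived the exception, so flush them
            }
        }
        if (buf.bufferedCount() > 0) buf.commit(); // flush the tail
        return buf;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) rows.add("row" + i);
        SketchMutationBuffer buf = load(rows, 4);
        System.out.println("commits=" + buf.commitCount + " rows=" + buf.committed().size());
    }
}
```

The loop never restarts from the beginning: each limit hit costs one extra commit, and the row that triggered it is retried against an empty buffer.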

h3. Backward Compatibility
The new behavior is opt-in. Default behavior (clear mutations on limit 
exceeded) is unchanged.
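
As a rough illustration of opting in, the property could be set alongside the existing limit in the client properties. This sketch assumes the new property is read from the same client-side configuration as {{phoenix.mutate.maxSize}}; the class name and the limit value are illustrative only.

```java
import java.util.Properties;

// Hypothetical helper that assembles client properties for an opted-in connection.
class PreserveOnLimitConfigSketch {
    static Properties clientProperties() {
        Properties props = new Properties();
        // Existing row-count limit; the value here is an arbitrary example.
        props.setProperty("phoenix.mutate.maxSize", "500000");
        // Proposed opt-in flag; leaving it unset keeps today's clear-and-throw behavior.
        props.setProperty("phoenix.mutate.preserveOnLimitExceeded", "true");
        return props;
    }
}
```

The resulting {{Properties}} object would then be passed to {{DriverManager.getConnection(url, props)}} when opening the Phoenix connection.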



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
