Re: [PR] PHOENIX-7759: Preserve buffered mutations when batch size limit is exceeded [phoenix]

via GitHub Tue, 10 Feb 2026 01:48:30 -0800


sanjeet006py commented on code in PR #2371:
URL: https://github.com/apache/phoenix/pull/2371#discussion_r2786844384



##########
phoenix-core-client/src/main/java/org/apache/phoenix/execute/MutationState.java:
##########
@@ -2406,36 +2444,56 @@ int[] getStatementIndexes() {
 
     /**
      * Join the newRow with the current row if it doesn't conflict with it. A 
regular upsert
-     * conflicts with a conditional upsert
-     * @return True if the rows were successfully joined else False
+     * conflicts with a conditional upsert.
+     * @param mutationState if non-null, checks limits before modification and 
throws
+     *                      MutationLimitReachedException if size increase 
would exceed limits
+     * @return the size change (can be 0, positive, or negative) if merged, or 
null if conflicting
      */
-    boolean join(RowMutationState newRow) {
+    Long join(RowMutationState newRow, MutationState mutationState) throws 
SQLException {
       if (isConflicting(newRow)) {
-        return false;
+        return null;
       }
-      // If we already have a row and the new row has an ON DUPLICATE KEY 
clause
-      // ignore the new values (as that's what the server will do).
+
+      // Pre-compute merged results (no side effects - these return new 
objects)
+      byte[] combinedOnDupKey =
+        PhoenixIndexBuilderHelper.combineOnDupKey(this.onDupKeyBytes, 
newRow.onDupKeyBytes);
+      int[] mergedIndexes = joinSortedIntArrays(statementIndexes, 
newRow.getStatementIndexes());
+
+      // Calculate column values size change
+      long colValuesSizeDiff = 0;
       if (newRow.onDupKeyBytes == null) {
-        // increment the column value size by the new row column value size
-        colValuesSize += newRow.colValuesSize;
+        colValuesSizeDiff = newRow.colValuesSize;
         for (Map.Entry<PColumn, byte[]> entry : 
newRow.columnValues.entrySet()) {
-          PColumn col = entry.getKey();
-          byte[] oldValue = columnValues.put(col, entry.getValue());
+          byte[] oldValue = columnValues.get(entry.getKey());
           if (oldValue != null) {
-            // decrement column value size by the size of all column values 
that were replaced
-            colValuesSize -= (col.getEstimatedSize() + oldValue.length);
+            colValuesSizeDiff -= (entry.getKey().getEstimatedSize() + 
oldValue.length);
           }
         }
       }
-      // Concatenate ON DUPLICATE KEY bytes to allow multiple
-      // increments of the same row in the same commit batch.
-      this.onDupKeyBytes =
-        PhoenixIndexBuilderHelper.combineOnDupKey(this.onDupKeyBytes, 
newRow.onDupKeyBytes);
+
+      // Total size change (can be negative)
+      long totalSizeDiff = colValuesSizeDiff

Review Comment:
   Seems like this also was an existing gap that the serialized bytes of ON 
DUPLICATE KEY was not accounted in mutation size along with accounting for 
merged indexes length. 
   I wonder if it can turn into backward incompatible change how size of same 
mutations can increase due to fixing of existing gaps. That means some previous 
use cases which were succeeding could fail now. 
   Should we tie up this new behaviour with this change i.e. only account for 
new size calculations when this feature is enabled?



##########
phoenix-core-client/src/main/java/org/apache/phoenix/execute/MutationState.java:
##########
@@ -600,11 +637,13 @@ private void joinMutationState(TableRef tableRef, 
MultiRowMutationState srcRows,
 
     if (!conflictingRows.isEmpty()) {
       addMutations(dstMutations, tableRef, conflictingRows);
+      // Update estimatedSize only after actual state change
+      estimatedSize += conflictingRowsTotalSize;

Review Comment:
   Was this an existing gap which will be fixed with your change as I don't see 
size of conflicting rows being accounted right now?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] PHOENIX-7759: Preserve buffered mutations when batch size limit is exceeded [phoenix]

Reply via email to