juliuszsompolski commented on code in PR #51091:
URL: https://github.com/apache/spark/pull/51091#discussion_r2151777936


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/MergeRowsExec.scala:
##########
@@ -233,6 +246,7 @@ case class MergeRowsExec(
         }
       }
 
+      longMetric("numTargetRowsCopied") += 1

Review Comment:
   @szehon-ho in delta-based merge, if the target row falls through all the 
match instructions, it is not copied and not rewritten at all. Note that on 
the next line after this you return `null`, so this row is **not** passed to 
`WriteDeltaExec` and hence not written. The reason that delta-based merge does 
not have this keepCarryOverRows catch-all fallback condition is exactly that 
it does not copy rows. So I think the only change needed in the previous 
commit was to delete this one line.
   
   Actually, the tests that you added to MergeIntoTableSuiteBase, when executed 
in the suites extending DeltaBasedMergeIntoTableSuiteBase, should be reporting 
`0` as the number of rows copied. I think the tests need to be split between 
DeltaBasedMergeIntoTableSuiteBase and GroupBasedMergeIntoTableSuite to account 
for that.
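   To illustrate the distinction being made here, a minimal sketch (not the 
actual Spark implementation; all names besides the classes mentioned above are 
illustrative) of how a carry-over target row is handled in the two merge 
strategies:

```scala
// Sketch only: models why group-based merge needs a carry-over copy
// (and increments numTargetRowsCopied) while delta-based merge simply
// drops the row, so its copied count must stay 0.
object MergeCarryOverSketch {
  final case class TargetRow(id: Int, matchedByInstruction: Boolean)

  // Returns Some(row) if the row should be emitted downstream to the
  // writer, None if it falls through (the "return null" case in the
  // review comment) and is not written at all.
  def processRow(row: TargetRow, keepCarryOverRows: Boolean): Option[TargetRow] =
    if (row.matchedByInstruction) {
      Some(row) // a match instruction produced an output row
    } else if (keepCarryOverRows) {
      // group-based merge: unmatched rows are copied verbatim, which is
      // where a numTargetRowsCopied-style metric would be incremented
      Some(row)
    } else {
      // delta-based merge: the row falls through and is never passed on
      None
    }

  def main(args: Array[String]): Unit = {
    val rows = Seq(TargetRow(1, matchedByInstruction = true),
                   TargetRow(2, matchedByInstruction = false))
    val groupBased = rows.flatMap(processRow(_, keepCarryOverRows = true))
    val deltaBased = rows.flatMap(processRow(_, keepCarryOverRows = false))
    println(groupBased.size) // 2: the unmatched row is copied
    println(deltaBased.size) // 1: the unmatched row is dropped, nothing copied
  }
}
```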



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to