juliuszsompolski commented on code in PR #51091:
URL: https://github.com/apache/spark/pull/51091#discussion_r2151777936
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/MergeRowsExec.scala:
##########
@@ -233,6 +246,7 @@ case class MergeRowsExec(
}
}
+ longMetric("numTargetRowsCopied") += 1
Review Comment:
@szehon-ho in Delta Based merge, if the target row falls through all the
match instructions, it is not copied and not rewritten at all. Note that on
the next line after this you return `null`, so this row is **not** passed to
`WriteDeltaExec` and hence not written. The reason Delta Based merge does
not have the keepCarryOverRows catch-all fallback condition is exactly that
it doesn't copy rows. So I think the only thing needed in the previous commit
was to delete this one line.
Actually, the tests you added to MergeIntoTableSuiteBase, when executed in
the suites extending DeltaBasedMergeIntoTableSuiteBase, should be showing `0`
as the number of rows copied. I think the tests need to be split between
DeltaBasedMergeIntoTableSuiteBase and GroupBasedMergeIntoTableSuite to account
for that.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]