Github user xwu0226 commented on the issue:

https://github.com/apache/spark/pull/16685

@ilganeli Thanks for replying to my comments! Please correct me if I am wrong. My understanding is that you assume the target table does not have or maintain any unique constraints, and that it is mostly created and maintained solely by the Spark application. Is that right?

If that is the assumption, I agree that plain INSERT and UPDATE may perform better than UPSERT. But if the target table already has a unique constraint, the comparison between INSERT/UPDATE and UPSERT/MERGE may be, as you said, a close horse race: in either case an index lookup and validation are required, and UPSERT/MERGE may only add a bit more `if/else` logic, depending on how the database system implements it. A benchmark of the two approaches would tell.
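To make the two write paths concrete, here is a minimal sketch, not the code in this PR. It assumes an in-memory SQLite database (version 3.24 or later for `ON CONFLICT ... DO UPDATE`) as a stand-in for the JDBC target; the table name `target` and all helper names are hypothetical. Both paths hit the same unique index on `id`, which is why their costs could end up close.

```python
import sqlite3


def make_table():
    # Hypothetical target table; the PRIMARY KEY acts as the unique constraint.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, value TEXT)")
    conn.executemany("INSERT INTO target VALUES (?, ?)", [(1, "old"), (2, "old")])
    return conn


def insert_then_update(conn, rows):
    # Path 1: try INSERT; on a unique-constraint violation, fall back to UPDATE.
    for key, value in rows:
        try:
            conn.execute("INSERT INTO target (id, value) VALUES (?, ?)", (key, value))
        except sqlite3.IntegrityError:
            conn.execute("UPDATE target SET value = ? WHERE id = ?", (value, key))


def upsert(conn, rows):
    # Path 2: a single UPSERT statement; the database resolves the conflict itself.
    conn.executemany(
        "INSERT INTO target (id, value) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET value = excluded.value",
        rows,
    )


def contents(conn):
    return conn.execute("SELECT id, value FROM target ORDER BY id").fetchall()


rows = [(2, "new"), (3, "new")]  # id 2 already exists, id 3 does not

a = make_table()
insert_then_update(a, rows)

b = make_table()
upsert(b, rows)
```

Both connections end up with the same rows, so the difference between the approaches is only in how the work is done. Timing the two functions over a large `rows` list against the real target database would be the benchmark suggested above.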