Re: [PR] [SPARK-46207][SQL] Support MergeInto in DataFrameWriterV2 [spark]

via GitHub Sun, 17 Dec 2023 18:25:22 -0800


huaxingao commented on code in PR #44119:
URL: https://github.com/apache/spark/pull/44119#discussion_r1429394708



##########
sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala:
##########
@@ -4129,6 +4129,36 @@ class Dataset[T] private[sql](
     new DataFrameWriterV2[T](table, this)
   }
 
+  /**
+   * Create a [[DataFrameWriterV2]] for MergeInto action.
+   *
+   * Scala Examples:
+   * {{{
+   *   spark.table("source")
+   *     .mergeInto("target")
+   *     .on($"source.id" === $"target.id")
+   *     .whenMatched($"salary" === 100)
+   *     .delete()
+   *     .whenNotMatched()
+   *     .insertAll()
+   *     .whenNotMatchedBySource($"salary" === 100)
+   *     .update(Map(
+   *       "salary" -> lit(200)
+   *     ))
+   *     .merge()
+   * }}}
+   *
+   * @since 4.0.0
+   */
+  def mergeInto(table: String): DataFrameWriterV2[T] = {
+    if (isStreaming) {

Review Comment:
   After introducing the new `MergeIntoWriter`, this signature became
   `def mergeInto(table: String): MergeIntoWriter[T]`.
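
To illustrate why a dedicated return type can help here, the following is a minimal plain-Scala sketch of a fluent merge builder. It is not the PR's code and does not use Spark: `MergeSpec`, `MergeBuilder`, `whenMatchedDelete`, `whenNotMatchedInsertAll`, and `MergeBuilderDemo` are hypothetical names invented for this example. The idea shown is only that merge-specific methods live on their own builder type instead of being added to a general-purpose writer.

```scala
// Hypothetical stand-ins for illustration; none of these are Spark classes.
final case class MergeSpec(
    table: String,
    condition: Option[String] = None,
    actions: Vector[String] = Vector.empty)

// A dedicated builder: merge-only methods are defined here, so a
// general-purpose writer type would not need to expose them.
final class MergeBuilder(val spec: MergeSpec) {
  // Record the merge condition and return a new builder.
  def on(condition: String): MergeBuilder =
    new MergeBuilder(spec.copy(condition = Some(condition)))

  // Append a "delete" action for matched rows.
  def whenMatchedDelete(): MergeBuilder =
    new MergeBuilder(spec.copy(actions = spec.actions :+ "delete"))

  // Append an "insertAll" action for unmatched rows.
  def whenNotMatchedInsertAll(): MergeBuilder =
    new MergeBuilder(spec.copy(actions = spec.actions :+ "insertAll"))

  // Finish the chain; fails if no condition was provided.
  def merge(): MergeSpec = {
    require(spec.condition.isDefined, "merge condition must be set with on(...)")
    spec
  }
}

object MergeBuilderDemo {
  // Entry point mirrors the shape of mergeInto(table) returning a merge-specific type.
  def mergeInto(table: String): MergeBuilder = new MergeBuilder(MergeSpec(table))

  def main(args: Array[String]): Unit = {
    val spec = mergeInto("target")
      .on("source.id = target.id")
      .whenMatchedDelete()
      .whenNotMatchedInsertAll()
      .merge()
    println(spec)
  }
}
```

In this sketch, a caller who starts with `mergeInto` can only chain merge-related calls, which is one plausible motivation for splitting the API out of `DataFrameWriterV2`; the actual `MergeIntoWriter` in the PR may be structured differently.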



##########
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriterV2.scala:
##########
@@ -167,6 +173,241 @@ final class DataFrameWriterV2[T] private[sql](table: String, ds: Dataset[T])
     runCommand(overwrite)
   }
 
+  /**
+   * Specifies the merge condition.
+   *
+   * Sets the condition, provided as a `String`, to be used for merging data. This condition
+   * is converted internally to a `Column` and used to determine how rows from the source
+   * DataFrame are matched with rows in the target table.
+   *
+   * @param condition a `String` representing the merge condition.
+   * @return the current `DataFrameWriterV2` instance with the specified merge condition set.
+   */
+  def on(condition: String): DataFrameWriterV2[T] = {

Review Comment:
   Added `MergeIntoWriter[T]`



##########
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriterV2.scala:
##########
@@ -167,6 +173,241 @@ final class DataFrameWriterV2[T] private[sql](table: String, ds: Dataset[T])
     runCommand(overwrite)
   }
 
+  /**
+   * Specifies the merge condition.
+   *
+   * Sets the condition, provided as a `String`, to be used for merging data. This condition
+   * is converted internally to a `Column` and used to determine how rows from the source
+   * DataFrame are matched with rows in the target table.
+   *
+   * @param condition a `String` representing the merge condition.

Review Comment:
   Removed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


