[ https://issues.apache.org/jira/browse/SPARK-36472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Denis Krivenko updated SPARK-36472: ----------------------------------- Description: Existing SQL syntax for *MEGRE* (see Delta Lake examples [here|https://docs.delta.io/latest/delta-update.html#upsert-into-a-table-using-merge] and [here|https://docs.microsoft.com/en-us/azure/databricks/spark/latest/spark-sql/language-manual/delta-merge-into]) could be improved by adding an alternative for {{<merge_condition>}} *Main assumption* In common cases target and source tables have the same column names used in {{<merge_condition>}} as merge keys, for example: {code:sql} ON target.key1 = source.key1 AND target.key2 = source.key2{code} It would be more convenient to use a syntax similar to: {code:sql} ON COLUMNS (key1, key2) -- or ON MATCHINGÂ (key1, key2) {code} The same approach is used for [JOIN|https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-join.html] where {{join_criteria}} syntax is {code:sql} ON boolean_expression | USING ( column_name [ , ... ] ) {code} *Improvement proposal* Syntax {code:sql} MERGE INTO target_table_identifier [AS target_alias] USING source_table_identifier [<time_travel_version>] [AS source_alias] ON { <merge_condition> | COLUMNS ( column_name [ , ... ] ) } [ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ] [ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ] [ WHEN NOT MATCHED [ AND <condition> ] THEN <not_matched_action> ] {code} Example {code:sql} MERGE INTO target USING source ON COLUMNS (key1, key2) WHEN MATCHED THEN UPDATE SET * WHEN NOT MATCHED THEN INSERT * {code} was: Existing SQL syntax for *MEGRE* (see Delta Lake examples [here|https://docs.delta.io/latest/delta-update.html#upsert-into-a-table-using-merge] and [here|https://docs.microsoft.com/en-us/azure/databricks/spark/latest/spark-sql/language-manual/delta-merge-into]) could be improved by adding an alternative for {{<merge_condition>}} *Main assumption* In common cases target and source tables have the same column names used in {{<merge_condition>}} as merge keys, for example: {code:sql} ON target.key1 = source.key1 AND target.key2 = source.key2{code} It would be more convenient to use a syntax similar to: {code:sql} ON COLUMNS (key1, key2) {code} The same approach is used for [JOIN|https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-join.html] where {{join_criteria}} syntax is {code:sql} ON boolean_expression | USING ( column_name [ , ... ] ) {code} *Improvement proposal* Syntax {code:sql} MERGE INTO target_table_identifier [AS target_alias] USING source_table_identifier [<time_travel_version>] [AS source_alias] ON { <merge_condition> | COLUMNS ( column_name [ , ... ] ) } [ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ] [ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ] [ WHEN NOT MATCHED [ AND <condition> ] THEN <not_matched_action> ] {code} Example {code:sql} MERGE INTO target USING source ON COLUMNS (key1, key2) WHEN MATCHED THEN UPDATE SET * WHEN NOT MATCHED THEN INSERT * {code} > Improve SQL syntax for MERGE > ---------------------------- > > Key: SPARK-36472 > URL: https://issues.apache.org/jira/browse/SPARK-36472 > Project: Spark > Issue Type: Improvement > Components: SQL > Affects Versions: 3.1.2 > Reporter: Denis Krivenko > Priority: Trivial > > Existing SQL syntax for *MEGRE* (see Delta Lake examples > [here|https://docs.delta.io/latest/delta-update.html#upsert-into-a-table-using-merge] > and > [here|https://docs.microsoft.com/en-us/azure/databricks/spark/latest/spark-sql/language-manual/delta-merge-into]) > could be improved by adding an alternative for {{<merge_condition>}} > *Main assumption* > In common cases target and source tables have the same column names used in > {{<merge_condition>}} as merge keys, for example: > {code:sql} > ON target.key1 = source.key1 AND target.key2 = source.key2{code} > It would be more convenient to use a syntax similar to: > {code:sql} > ON COLUMNS (key1, key2) > -- or > ON MATCHINGÂ (key1, key2) > {code} > The same approach is used for > [JOIN|https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-join.html] > where {{join_criteria}} syntax is > {code:sql} > ON boolean_expression | USING ( column_name [ , ... ] ) > {code} > *Improvement proposal* > Syntax > {code:sql} > MERGE INTO target_table_identifier [AS target_alias] > USING source_table_identifier [<time_travel_version>] [AS source_alias] > ON { <merge_condition> | COLUMNS ( column_name [ , ... ] ) } > [ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ] > [ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ] > [ WHEN NOT MATCHED [ AND <condition> ] THEN <not_matched_action> ] > {code} > Example > {code:sql} > MERGE INTO target > USING source > ON COLUMNS (key1, key2) > WHEN MATCHED THEN > UPDATE SET * > WHEN NOT MATCHED THEN > INSERT * > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org