Josh Rosen created SPARK-20845:
----------------------------------

             Summary: Support specification of column names in INSERT INTO
                 Key: SPARK-20845
                 URL: https://issues.apache.org/jira/browse/SPARK-20845
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.0.0
            Reporter: Josh Rosen


Some databases allow you to name the target columns of an INSERT INTO. For
example, in SQLite:

{code}
sqlite> CREATE TABLE twocolumn (x INT, y INT);
sqlite> INSERT INTO twocolumn(x, y) VALUES (44,51), (NULL,52), (42,53), (45,45);
sqlite> select * from twocolumn;
44|51
|52
42|53
45|45
{code}

I have a corpus of existing queries of this form which I would like to run on 
Spark SQL, so I think we should extend our dialect to support this syntax.

When implementing this, we should make sure to test the following behaviors and 
corner-cases:

- Number of columns specified is greater than or less than the number of columns in the table.
- Specification of repeated columns.
- Specification of columns which do not exist in the target table.
- Permuted column order instead of the default order declared in the table.
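As a starting point for the SQLite side of that comparison, here is a small sketch (not part of the ticket) that probes three of the corner cases above through Python's built-in sqlite3 bindings; the behavior for repeated columns is left to be checked against the SQLite shell directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE twocolumn (x INT, y INT)")

# Permuted column order: values bind to the named columns,
# not to the table's declared order.
conn.execute("INSERT INTO twocolumn(y, x) VALUES (1, 2)")
print(conn.execute("SELECT x, y FROM twocolumn").fetchall())  # [(2, 1)]

# Fewer columns than the table declares: the omitted column
# receives its default value (NULL here).
conn.execute("INSERT INTO twocolumn(x) VALUES (3)")
print(conn.execute("SELECT x, y FROM twocolumn WHERE x = 3").fetchall())  # [(3, None)]

# Naming a column that does not exist in the target table is an error.
try:
    conn.execute("INSERT INTO twocolumn(x, z) VALUES (1, 2)")
except sqlite3.OperationalError as e:
    print("error:", e)
```

Whatever Spark SQL decides for each case, the analyzer behavior (fill with NULL/default vs. raise AnalysisException) should be pinned down by tests of this shape.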

For each of these, we should check how SQLite behaves and should also compare 
against another database. It looks like T-SQL supports this; see 
https://technet.microsoft.com/en-us/library/dd776381(v=sql.105).aspx under the 
"Inserting data that is not in the same order as the table columns" header.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
