Jungtaek Lim created SPARK-31707:
------------------------------------

             Summary: Revert SPARK-30098 Use default datasource as provider for CREATE TABLE syntax
                 Key: SPARK-31707
                 URL: https://issues.apache.org/jira/browse/SPARK-31707
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Jungtaek Lim


We need to reconsider the behavior change introduced by SPARK-30098.
This issue is a placeholder to track the discussion and the final decision.

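With SPARK-30098, `CREATE TABLE` without an explicit provider silently changes its behavior: it now creates a table that uses the default datasource as provider rather than a Hive table.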

The following examples show how this change can break existing user data pipelines.
*Apache Spark 2.4.5*
{code}
spark-sql> CREATE TABLE t(a STRING);

spark-sql> LOAD DATA INPATH '/usr/local/spark/README.md' INTO TABLE t;

spark-sql> SELECT * FROM t LIMIT 1;
# Apache Spark
Time taken: 2.05 seconds, Fetched 1 row(s)
{code}

{code}
spark-sql> CREATE TABLE t(a CHAR(3));

spark-sql> INSERT INTO TABLE t SELECT 'a ';

spark-sql> SELECT a, length(a) FROM t;
a       3
{code}

*Apache Spark 3.0.0-preview2*
{code}
spark-sql> CREATE TABLE t(a STRING);

spark-sql> LOAD DATA INPATH '/usr/local/spark/README.md' INTO TABLE t;
Error in query: LOAD DATA is not supported for datasource tables: `default`.`t`;
{code}

{code}
spark-sql> CREATE TABLE t(a CHAR(3));

spark-sql> INSERT INTO TABLE t SELECT 'a ';

spark-sql> SELECT a, length(a) FROM t;
a       2
{code}


