Jungtaek Lim created SPARK-31707:
------------------------------------

             Summary: Revert SPARK-30098 Use default datasource as provider for CREATE TABLE syntax
                 Key: SPARK-31707
                 URL: https://issues.apache.org/jira/browse/SPARK-31707
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Jungtaek Lim

We need to consider the behavior change introduced by SPARK-30098. This issue is a placeholder to keep the discussion and the final decision.

The `CREATE TABLE` syntax changes its behavior silently. The following examples show how this breaks existing user data pipelines.

*Apache Spark 2.4.5*
{code}
spark-sql> CREATE TABLE t(a STRING);
spark-sql> LOAD DATA INPATH '/usr/local/spark/README.md' INTO TABLE t;
spark-sql> SELECT * FROM t LIMIT 1;
# Apache Spark
Time taken: 2.05 seconds, Fetched 1 row(s)
{code}

{code}
spark-sql> CREATE TABLE t(a CHAR(3));
spark-sql> INSERT INTO TABLE t SELECT 'a ';
spark-sql> SELECT a, length(a) FROM t;
a 3
{code}

*Apache Spark 3.0.0-preview2*
{code}
spark-sql> CREATE TABLE t(a STRING);
spark-sql> LOAD DATA INPATH '/usr/local/spark/README.md' INTO TABLE t;
Error in query: LOAD DATA is not supported for datasource tables: `default`.`t`;
{code}

{code}
spark-sql> CREATE TABLE t(a CHAR(3));
spark-sql> INSERT INTO TABLE t SELECT 'a ';
spark-sql> SELECT a, length(a) FROM t;
a 2
{code}