[ https://issues.apache.org/jira/browse/SPARK-29421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16948506#comment-16948506 ]
Lantao Jin edited comment on SPARK-29421 at 10/10/19 12:07 PM:
---------------------------------------------------------------

[~cloud_fan] Yes, Hive supports a similar command with {{STORED AS}}:
{code}
hive> CREATE TABLE tbl(a int) STORED AS TEXTFILE;
OK
Time taken: 0.726 seconds
hive> CREATE TABLE tbl2 LIKE tbl STORED AS PARQUET;
OK
Time taken: 0.294 seconds
{code}
Here I use {{USING}} to be compatible with the Spark command {{CREATE TABLE ... USING}}. Which option do you think would be better?

was (Author: cltlfcjin):
[~cloud_fan] Yes, Hive supports a similar command with {{STORED AS}}:
{code}
hive> CREATE TABLE tbl(a int) STORED AS TEXTFILE;
OK
Time taken: 0.726 seconds
hive> CREATE TABLE tbl2 LIKE tbl STORED AS PARQUET;
OK
Time taken: 0.294 seconds
{code}

> Add an opportunity to change the file format of command CREATE TABLE LIKE
> -------------------------------------------------------------------------
>
>                 Key: SPARK-29421
>                 URL: https://issues.apache.org/jira/browse/SPARK-29421
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Lantao Jin
>            Priority: Major
>
> Use the CREATE TABLE tb1 LIKE tb2 command to create an empty table tb1 based on
> the definition of table tb2. The most common use case is to create tb1 with the
> same schema as tb2. The inconvenience is that this command also copies
> the FileFormat from tb2; the input/output format and serde cannot be changed.
> The ability to change the file format is useful in scenarios such as
> upgrading a table from a low-performance file format to a high-performance
> one (parquet, orc).
> Here are two options to enhance it.
> Option1: Add a configuration {{spark.sql.createTableLike.fileformat}} whose
> default value is "none", which keeps the current behaviour:
> copying the file format from the source table.
> After running SET
> spark.sql.createTableLike.fileformat=parquet (or any other valid file format
> defined in {{HiveSerDe}}), {{CREATE TABLE ... LIKE}} will use the new file
> format.
> Option2: Add the syntax {{USING fileformat}} after {{CREATE TABLE ... LIKE}}. For
> example,
> {code}
> CREATE TABLE tb1 LIKE tb2 USING parquet;
> {code}
> If the USING clause is omitted, the current behaviour is kept:
> copying the file format from the source table.
> Both options preserve the current behaviour by default.
> We use Option1 with the parquet file format as an enhancement in our production
> thriftserver because we need to switch the file format for many existing SQL
> scripts without modifying them. For the community, however, Option2 could be
> treated as a new feature, since it requires users to write the additional
> USING clause.
> cc [~dongjoon] [~hyukjin.kwon] [~joshrosen] [~cloud_fan] [~yumwang]
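
To make the fallback behaviour of the two options concrete, here is a minimal sketch in plain Python. It is not Spark code: {{resolve_file_format}} and its parameters are hypothetical names used only for illustration. Two things are assumptions that the issue does not state: that an explicit USING clause would take precedence if both options were available at once, and that format names are compared case-insensitively.

```python
from typing import Optional

# Proposed default value of spark.sql.createTableLike.fileformat (Option1).
DEFAULT_CONF = "none"


def resolve_file_format(source_format: str,
                        using: Optional[str] = None,
                        conf_value: str = DEFAULT_CONF) -> str:
    """Pick the file format of the new table for CREATE TABLE ... LIKE.

    Hypothetical helper: it only models the behaviour described in the
    issue and does not correspond to any function in Spark.
    """
    # Option2: an explicit USING clause names the target format.
    # (Assumption: it wins over the configuration if both are present.)
    if using is not None:
        return using.lower()
    # Option1: a configuration value other than "none" overrides the source.
    if conf_value.lower() != DEFAULT_CONF:
        return conf_value.lower()
    # Neither is given: keep the current behaviour and copy the source format.
    return source_format.lower()


# Current behaviour: the format is copied from the source table.
print(resolve_file_format("textfile"))
# Option1: SET spark.sql.createTableLike.fileformat=parquet
print(resolve_file_format("textfile", conf_value="parquet"))
# Option2: CREATE TABLE tb1 LIKE tb2 USING parquet
print(resolve_file_format("textfile", using="parquet"))
```

With either option, leaving the new setting or clause out falls through to the last branch, which is why existing scripts would behave as they do today.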