[ https://issues.apache.org/jira/browse/SPARK-19587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tejas Patil updated SPARK-19587:
--------------------------------
    Description:
This came up in discussion at https://github.com/apache/spark/pull/16898#discussion_r100697138

Partition columns should not be allowed as sort columns. Logically it does not make sense: every row in a given partition has the same value for the partition column, so sorting on it has no effect.
{code}
df.write
  .format(source)
  .partitionBy("i")
  .bucketBy(8, "x")
  .sortBy("i")
  .saveAsTable("bucketed_table")
{code}

Hive rejects the equivalent table definition:
{code}
CREATE TABLE user_info_bucketed(user_id BIGINT)
PARTITIONED BY(ds STRING)
CLUSTERED BY(user_id)
SORTED BY (ds ASC)
INTO 8 BUCKETS;

FAILED: SemanticException [Error 10002]: Invalid column reference
Caused by: SemanticException: Invalid column reference
{code}

> Disallow when sort columns are part of partitioning columns
> -----------------------------------------------------------
>
>                 Key: SPARK-19587
>                 URL: https://issues.apache.org/jira/browse/SPARK-19587
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Tejas Patil
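
The check this issue asks for can be sketched as a small standalone validation. This is only an illustration under stated assumptions, not Spark's actual implementation: the object `BucketSpecCheck`, its method names, and the error message are hypothetical, and column names are compared case-insensitively by default on the assumption that the session is not case sensitive.

```scala
import java.util.Locale

// Hypothetical helper for illustration only; none of these names come from Spark.
object BucketSpecCheck {

  /** Returns the sort columns that also appear among the partition columns. */
  def overlappingSortColumns(
      partitionCols: Seq[String],
      sortCols: Seq[String],
      caseSensitive: Boolean = false): Seq[String] = {
    // Normalize names so that "I" and "i" match when comparison is case-insensitive.
    def norm(c: String): String =
      if (caseSensitive) c else c.toLowerCase(Locale.ROOT)
    val partitionSet = partitionCols.map(norm).toSet
    sortCols.filter(c => partitionSet.contains(norm(c)))
  }

  /** Throws IllegalArgumentException if any sort column is also a partition column. */
  def validate(partitionCols: Seq[String], sortCols: Seq[String]): Unit = {
    val overlap = overlappingSortColumns(partitionCols, sortCols)
    require(
      overlap.isEmpty,
      s"sortBy columns must not be partition columns: ${overlap.mkString(", ")}")
  }
}
```

With the write from the description, `BucketSpecCheck.validate(Seq("i"), Seq("i"))` would throw, while `BucketSpecCheck.validate(Seq("i"), Seq("x"))` would pass.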