Gengliang Wang created SPARK-42416:
--------------------------------------

             Summary: Dataset.show() should not resolve the analyzed logical plan again
                 Key: SPARK-42416
                 URL: https://issues.apache.org/jira/browse/SPARK-42416
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.4.0
            Reporter: Gengliang Wang

For the following query:
{code:java}
sql(
  """
    |CREATE TABLE app_open (
    |  uid STRING,
    |  st TIMESTAMP,
    |  ds INT
    |) USING parquet PARTITIONED BY (ds);
    |""".stripMargin)

sql(
  """
    |create or replace temporary view group_by_error as WITH new_app_open AS (
    |  SELECT
    |    ao.*
    |  FROM
    |    app_open ao
    |)
    |SELECT
    |    uid,
    |    20230208 AS ds
    |  FROM
    |    new_app_open
    |  GROUP BY
    |    1,
    |    2
    |""".stripMargin)

sql(
  """
    |select
    |  `uid`
    |from
    |  group_by_error
    |""".stripMargin).show()
{code}
Spark throws the following error:
{code:java}
[GROUP_BY_POS_OUT_OF_RANGE] GROUP BY position 20230208 is not in select list (valid range is [1, 2]).; line 9 pos 4
{code}
This happens because the logical plan is not marked as analyzed, so it is analyzed a second time. The analyzer rules that resolve aggregation/sort ordinals are not idempotent: once the ordinal 2 in the GROUP BY has been replaced by the select-list literal 20230208, a second pass treats that literal as an ordinal and finds it out of range.
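
The non-idempotence described above can be illustrated with a small toy model. This is only a sketch: the types `Expr`, `Col`, `IntLit`, `Agg` and the function `resolveOrdinals` are hypothetical names invented for illustration and are not Spark's Catalyst classes or analyzer rules. The sketch assumes a rule that replaces every integer literal in GROUP BY with the select-list entry at that position, and shows that applying it twice fails in the way the error message suggests.

```scala
// Toy model only: these types are hypothetical and are NOT Spark's Catalyst classes.
sealed trait Expr
final case class Col(name: String) extends Expr
final case class IntLit(value: Int) extends Expr

// A minimal "aggregate" node: a select list and a GROUP BY list.
final case class Agg(selectList: Seq[Expr], groupBy: Seq[Expr])

object OrdinalDemo {
  // Assumed rule: every integer literal in GROUP BY is read as a 1-based
  // position into the select list and replaced by the expression found there.
  def resolveOrdinals(plan: Agg): Either[String, Agg] = {
    val n = plan.selectList.size
    val resolved: Seq[Either[String, Expr]] = plan.groupBy.map {
      case IntLit(i) if i >= 1 && i <= n => Right(plan.selectList(i - 1))
      case IntLit(i) =>
        Left(s"GROUP BY position $i is not in select list (valid range is [1, $n])")
      case other => Right(other)
    }
    resolved.collectFirst { case Left(msg) => msg } match {
      case Some(msg) => Left(msg)
      case None      => Right(plan.copy(groupBy = resolved.collect { case Right(e) => e }))
    }
  }

  // SELECT uid, 20230208 AS ds ... GROUP BY 1, 2
  val plan: Agg = Agg(Seq(Col("uid"), IntLit(20230208)), Seq(IntLit(1), IntLit(2)))

  // First pass succeeds: GROUP BY becomes (uid, 20230208).
  val once: Either[String, Agg] = resolveOrdinals(plan)

  // Second pass reads the literal 20230208 as an ordinal and fails.
  val twice: Either[String, Agg] = once.flatMap(resolveOrdinals)

  def main(args: Array[String]): Unit = {
    println(once)
    println(twice)
  }
}
```

In this model the first application succeeds and the second returns a `Left` carrying an out-of-range message, which is why a rule of this shape must not be run again on a plan it has already rewritten.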