Gengliang Wang created SPARK-42416:
--------------------------------------

             Summary: Dataset.show() should not resolve the analyzed logical plan again
                 Key: SPARK-42416
                 URL: https://issues.apache.org/jira/browse/SPARK-42416
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.4.0
            Reporter: Gengliang Wang


For the following query:

 
{code:java}
      sql(
        """
          |CREATE TABLE app_open (
          |  uid STRING,
          |  st TIMESTAMP,
          |  ds INT
          |) USING parquet PARTITIONED BY (ds);
          |""".stripMargin)

      sql(
        """
          |create or replace temporary view group_by_error as WITH new_app_open AS (
          |  SELECT
          |    ao.*
          |  FROM
          |    app_open ao
          |)
          |SELECT
          |    uid,
          |    20230208 AS ds
          |  FROM
          |    new_app_open
          |  GROUP BY
          |    1,
          |    2
          |""".stripMargin)

      sql(
        """
          |select
          |  `uid`
          |from
          |  group_by_error
          |""".stripMargin).show()
{code}
Spark throws the following error:

{code:java}
[GROUP_BY_POS_OUT_OF_RANGE] GROUP BY position 20230208 is not in select list (valid range is [1, 2]).; line 9 pos 4
{code}
 

 

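This happens because the logical plan is not marked as analyzed, so
Dataset.show() analyzes it a second time. The analyzer rules that resolve
GROUP BY and ORDER BY ordinals are not idempotent: the first pass replaces
the ordinal 2 with the select-list expression it points to (the literal
20230208), and the second pass then reads that literal as an ordinal, which
is outside the valid range [1, 2].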
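
The non-idempotency can be illustrated with a small, self-contained sketch. This is a toy model only, not Spark's analyzer: the types Expr, Col, IntLit and the function resolveOrdinals are hypothetical names made up for this illustration. It assumes a rule that reads every integer literal in GROUP BY as a 1-based select-list position.

```scala
// Toy model only: these types and resolveOrdinals are hypothetical,
// not Spark's analyzer classes.
sealed trait Expr
final case class Col(name: String) extends Expr
final case class IntLit(value: Int) extends Expr

object OrdinalDemo {
  // One pass of ordinal resolution: every integer literal in GROUP BY is
  // read as a 1-based position into the select list and replaced by the
  // expression found there.
  def resolveOrdinals(
      select: Seq[Expr],
      groupBy: Seq[Expr]): Either[String, Seq[Expr]] = {
    val resolved: Seq[Either[String, Expr]] = groupBy.map {
      case IntLit(pos) if pos >= 1 && pos <= select.length =>
        Right(select(pos - 1))
      case IntLit(pos) =>
        Left(s"GROUP BY position $pos is not in select list " +
          s"(valid range is [1, ${select.length}])")
      case other =>
        Right(other)
    }
    val errors = resolved.collect { case Left(msg) => msg }
    if (errors.nonEmpty) Left(errors.head)
    else Right(resolved.collect { case Right(expr) => expr })
  }

  def main(args: Array[String]): Unit = {
    // Models: SELECT uid, 20230208 AS ds ... GROUP BY 1, 2
    val select = Seq(Col("uid"), IntLit(20230208))
    val firstPass = resolveOrdinals(select, Seq(IntLit(1), IntLit(2)))
    println(firstPass)
    // Feeding the already-resolved GROUP BY list through the same rule
    // again misreads the literal 20230208 as a position.
    val secondPass = firstPass.flatMap(resolveOrdinals(select, _))
    println(secondPass)
  }
}
```

In this sketch the first pass succeeds and the second pass fails with an out-of-range position, which mirrors the reported error. A rule of this shape is only safe if the plan it produced is never passed through it again.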



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

