Apologies. The issue appeared after we upgraded from Spark 3.1 to Spark 3.3. The same query runs fine on Spark 3.1.
Please disregard the Spark version mentioned in the earlier email subject.

Anup

Error trace:

query_result.explain(extended=True)
 File "…/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py" raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.sql.api.python.PythonSQLUtils.explainString.
: java.lang.IllegalStateException: You hit a query analyzer bug. Please report your query to Spark user mailing list.
	at org.apache.spark.sql.execution.SparkStrategies$Aggregation$.apply(SparkStrategies.scala:516)
	at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$1(QueryPlanner.scala:63)
	at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
	at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
	at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:72)
	at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$3(QueryPlanner.scala:78)
	at scala.collection.TraversableOnce$folder$1.apply(TraversableOnce.scala:196)
	at scala.collection.TraversableOnce$folder$1.apply(TraversableOnce.scala:194)
	at scala.collection.Iterator.foreach(Iterator.scala:943)
	at scala.collection.Iterator.foreach$(Iterator.scala:943)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
	at scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:199)
	at scala.collect...

From: "Sharma, Anup" <anu...@amazon.com>
Date: Tuesday, February 20, 2024 at 4:58 PM
To: "user@spark.apache.org" <user@spark.apache.org>
Cc: "Thinderu, Shalini" <thish...@amazon.com>
Subject: Spark 4.0 Query Analyzer Bug Report

Hi Spark team,

We ran into a dataframe issue after upgrading from spark 3.1 to 4.
Could you please let us know if this is already being looked at?

Thanks,
Anup