[jira] [Updated] (SPARK-42831) Show result expressions in AggregateExec

Wan Kun (Jira) Thu, 16 Mar 2023 19:58:07 -0700


     [ 
https://issues.apache.org/jira/browse/SPARK-42831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Wan Kun updated SPARK-42831:
----------------------------
    Description: 
If the result expressions in AggregateExec are non-empty, we should display 
them. Otherwise the plan is confusing, because some important expressions do 
not show up in the DAG.

For example, the plan for the query *SELECT sum(p) from values(cast(23.4 as 
decimal(7,2))) t(p)* is incomplete because the result expression 
*MakeDecimal(sum(UnscaledValue(p#0))#1L,17,2) AS sum(p)#2* is not displayed.

Before 

{code:java}
== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- HashAggregate(keys=[], functions=[sum(UnscaledValue(p#0))], output=[sum(p)#2])
   +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=11]
      +- HashAggregate(keys=[], functions=[partial_sum(UnscaledValue(p#0))], output=[sum#5L])
         +- LocalTableScan [p#0]
{code}


After

{code:java}
== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- HashAggregate(keys=[], functions=[sum(UnscaledValue(p#0))], results=[MakeDecimal(sum(UnscaledValue(p#0))#1L,17,2) AS sum(p)#2], output=[sum(p)#2])
   +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=38]
      +- HashAggregate(keys=[], functions=[partial_sum(UnscaledValue(p#0))], results=[sum#13L], output=[sum#13L])
         +- LocalTableScan [p#0]
{code}
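
To see why hiding the result expression is confusing, here is a minimal 
Python sketch (not Spark code) of the arithmetic the plan implies. It assumes 
that UnscaledValue exposes a decimal's digits as a long and that MakeDecimal 
rebuilds a decimal from that long with the given precision and scale. The 
helper names unscaled_value and make_decimal are hypothetical stand-ins, and 
the precision argument is accepted but not enforced here.

```python
from decimal import Decimal


def unscaled_value(d: Decimal, scale: int) -> int:
    """Hypothetical stand-in for UnscaledValue: the digits of d as an integer."""
    # Pad or round d to exactly `scale` fractional digits, e.g. 23.4 -> 23.40.
    fixed = d.quantize(Decimal(1).scaleb(-scale))
    # Shift the decimal point right by `scale` places, e.g. 23.40 -> 2340.
    return int(fixed.scaleb(scale))


def make_decimal(unscaled: int, precision: int, scale: int) -> Decimal:
    """Hypothetical stand-in for MakeDecimal: rebuild a decimal from its digits.

    `precision` is kept only to mirror the expression's arguments; this
    sketch does not check for overflow.
    """
    return Decimal(unscaled).scaleb(-scale)


# The single input row from the example query: cast(23.4 as decimal(7,2)).
rows = [Decimal("23.4")]

# What functions=[sum(UnscaledValue(p))] computes: a sum over plain integers.
long_sum = sum(unscaled_value(p, 2) for p in rows)

# What the result expression MakeDecimal(..., 17, 2) adds on top of that sum.
result = make_decimal(long_sum, 17, 2)

print(long_sum)  # 2340
print(result)    # 23.40
```

Without the results=[...] entry, the plan shows only the integer sum (2340 in 
this sketch) feeding an output named sum(p), and the step that turns it back 
into a decimal is invisible.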


  was:
If the result expressions in AggregateExec is non-empty, we should show them. 
Or we will be confused due to some important expressions did not showed in DAG.

For example, the plan of query SELECT sum(p) from values(cast(23.4 as 
decimal(7,2))) t(p)  was not correct because the result expression 
MakeDecimal(sum(UnscaledValue(p#0))#1L,17,2) AS sum(p)#2 was now showing

Before 

{code:java}
== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- HashAggregate(keys=[], functions=[sum(UnscaledValue(p#0))], output=[sum(p)#2])
   +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=11]
      +- HashAggregate(keys=[], functions=[partial_sum(UnscaledValue(p#0))], output=[sum#5L])
         +- LocalTableScan [p#0]
{code}


After

{code:java}
== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- HashAggregate(keys=[], functions=[sum(UnscaledValue(p#0))], results=[MakeDecimal(sum(UnscaledValue(p#0))#1L,17,2) AS sum(p)#2], output=[sum(p)#2])
   +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=38]
      +- HashAggregate(keys=[], functions=[partial_sum(UnscaledValue(p#0))], results=[sum#13L], output=[sum#13L])
         +- LocalTableScan [p#0]
{code}



> Show result expressions in AggregateExec
> ----------------------------------------
>
>                 Key: SPARK-42831
>                 URL: https://issues.apache.org/jira/browse/SPARK-42831
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Wan Kun
>            Priority: Minor
>
> If the result expressions in AggregateExec are non-empty, we should display 
> them. Otherwise the plan is confusing, because some important expressions do 
> not show up in the DAG.
> For example, the plan for the query *SELECT sum(p) from values(cast(23.4 as 
> decimal(7,2))) t(p)* is incomplete because the result expression 
> *MakeDecimal(sum(UnscaledValue(p#0))#1L,17,2) AS sum(p)#2* is not displayed.
> Before 
> {code:java}
> == Physical Plan ==
> AdaptiveSparkPlan isFinalPlan=false
> +- HashAggregate(keys=[], functions=[sum(UnscaledValue(p#0))], output=[sum(p)#2])
>    +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=11]
>       +- HashAggregate(keys=[], functions=[partial_sum(UnscaledValue(p#0))], output=[sum#5L])
>          +- LocalTableScan [p#0]
> {code}
> After
> {code:java}
> == Physical Plan ==
> AdaptiveSparkPlan isFinalPlan=false
> +- HashAggregate(keys=[], functions=[sum(UnscaledValue(p#0))], results=[MakeDecimal(sum(UnscaledValue(p#0))#1L,17,2) AS sum(p)#2], output=[sum(p)#2])
>    +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=38]
>       +- HashAggregate(keys=[], functions=[partial_sum(UnscaledValue(p#0))], results=[sum#13L], output=[sum#13L])
>          +- LocalTableScan [p#0]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
