Rui Wang created SPARK-38118:
--------------------------------

             Summary: MEAN(Boolean) in the HAVING clause should throw data 
mismatch error
                 Key: SPARK-38118
                 URL: https://issues.apache.org/jira/browse/SPARK-38118
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.3.0
            Reporter: Rui Wang


{code:sql}
with t as (select true c)
select t.c
from t
group by t.c
having mean(t.c) > 0
{code}
This query throws "Column 't.c' does not exist. Did you mean one of the 
following? [t.c]"

However, mean(boolean) is not a supported function signature, so the error 
should instead be "cannot resolve 'mean(t.c)' due to data type mismatch: function 
average requires numeric or interval types, not boolean"

 

This is because:
 # The mean(boolean) in HAVING is not marked as resolved by the 
{{ResolveFunctions}} rule.

 # Thus, in {{ResolveAggregationFunctions}}, the {{TempResolvedColumn}} wrapper 
in mean({{TempResolvedColumn}}(t.c)) cannot be removed (only a resolved 
aggregate can have its {{TempResolvedColumn}} removed).

 # Thus, when a later batch of rules is applied, the {{TempResolvedColumn}} is 
reverted and the expression becomes mean(`t.c`), so mean loses the resolution 
information about t.c.

 # Thus, at the last step, the analyzer can only report that t.c was not found.
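
The chain of steps 2 to 4 can be sketched with a small toy model. This is only an illustration under stated assumptions: the class names mirror Catalyst, but {{WrapperDemo}}, {{cleanUp}}, {{report}} and the string-typed {{dataType}} are hypothetical, and the real analyzer spreads this logic over several rules rather than one function.

{code:scala}
// Toy model only: names mirror Catalyst, but none of this is Spark's actual code.
sealed trait Expr
final case class UnresolvedAttribute(name: String) extends Expr
final case class ResolvedAttribute(name: String, dataType: String) extends Expr
final case class TempResolvedColumn(child: ResolvedAttribute) extends Expr
final case class Mean(child: Expr) extends Expr

object WrapperDemo {
  // Data type of an already resolved column, if there is one.
  def typeOf(e: Expr): Option[String] = e match {
    case ResolvedAttribute(_, t) => Some(t)
    case TempResolvedColumn(c)   => Some(c.dataType)
    case _                       => None
  }

  // Same shape as the default rule: children resolved AND input types accepted.
  def isResolved(e: Expr): Boolean = e match {
    case UnresolvedAttribute(_)  => false
    case ResolvedAttribute(_, _) => true
    case TempResolvedColumn(_)   => true
    case Mean(child)             => isResolved(child) && typeOf(child).exists(_ != "boolean")
  }

  // Steps 2 and 3: the wrapper is dropped only for a resolved aggregate;
  // otherwise the column is reverted to an unresolved name.
  def cleanUp(agg: Mean): Mean = agg.child match {
    case TempResolvedColumn(attr) if isResolved(agg) => Mean(attr)
    case TempResolvedColumn(attr)                    => Mean(UnresolvedAttribute(attr.name))
    case _                                           => agg
  }

  // Step 4: at the end only the unresolved name is visible, not the type problem.
  def report(agg: Mean): String = agg.child match {
    case UnresolvedAttribute(n) => s"Column '$n' does not exist"
    case _                      => "ok"
  }
}
{code}

In this model, {{WrapperDemo.report(WrapperDemo.cleanUp(Mean(TempResolvedColumn(ResolvedAttribute("t.c", "boolean")))))}} yields "Column 't.c' does not exist", while the same call with "int" yields "ok". The type mismatch is the root cause, yet it is no longer observable by the time the error is produced.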

 

mean(boolean) in HAVING is not marked as resolved by the {{ResolveFunctions}} 
rule because:
 # It uses the default {{Expression}} code for populating `resolved`:
{code:scala}
lazy val resolved: Boolean = childrenResolved && checkInputDataTypes().isSuccess
{code}
 # During analysis, mean(boolean) is mean(TempResolvedColumn(boolean)), 
thus childrenResolved is true.
 # However, checkInputDataTypes() returns a failure 
([Average.scala#L55|https://github.com/apache/spark/blob/74ebef243c18e7a8f32bf90ea75ab6afed9e3132/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala#L55]).
 # Thus Average's `resolved` is eventually false, which is correct in itself 
but leads to the wrong error message.
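
The interaction between childrenResolved and checkInputDataTypes() can be reproduced in isolation. The following is a minimal, self-contained sketch: every type in it is a simplified, hypothetical stand-in that only borrows Catalyst's names, and the failure message is copied from the expected error text above. It is not the real Spark code.

{code:scala}
// Simplified, hypothetical stand-ins for Catalyst classes; not the real Spark API.
sealed trait DataType
case object BooleanType extends DataType
case object DoubleType extends DataType

sealed trait TypeCheckResult { def isSuccess: Boolean }
case object TypeCheckSuccess extends TypeCheckResult { val isSuccess = true }
final case class TypeCheckFailure(message: String) extends TypeCheckResult {
  val isSuccess = false
}

trait Expression {
  def children: Seq[Expression]
  def dataType: DataType
  def checkInputDataTypes(): TypeCheckResult = TypeCheckSuccess
  def childrenResolved: Boolean = children.forall(_.resolved)
  // Same shape as the default quoted in the list above.
  lazy val resolved: Boolean = childrenResolved && checkInputDataTypes().isSuccess
}

final case class AttributeReference(name: String, dataType: DataType) extends Expression {
  val children: Seq[Expression] = Nil
}

// The wrapper counts as resolved and simply forwards the child's type.
final case class TempResolvedColumn(child: Expression) extends Expression {
  val children: Seq[Expression] = Seq(child)
  def dataType: DataType = child.dataType
}

final case class Average(child: Expression) extends Expression {
  val children: Seq[Expression] = Seq(child)
  val dataType: DataType = DoubleType
  // Only the boolean case is modelled here; other types are accepted.
  override def checkInputDataTypes(): TypeCheckResult = child.dataType match {
    case BooleanType =>
      TypeCheckFailure("function average requires numeric or interval types, not boolean")
    case _ => TypeCheckSuccess
  }
}

object ResolvedDemo {
  val avg: Average = Average(TempResolvedColumn(AttributeReference("t.c", BooleanType)))

  def main(args: Array[String]): Unit = {
    println(avg.childrenResolved)                // true
    println(avg.checkInputDataTypes().isSuccess) // false
    println(avg.resolved)                        // false
  }
}
{code}

The sketch shows that an unresolved aggregate does not necessarily have an unresolved child: here the only reason `resolved` is false is the type check, which is the information the final error message should carry.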

 

 


