Rui Wang created SPARK-38118:
--------------------------------

             Summary: MEAN(Boolean) in the HAVING clause should throw data mismatch error
                 Key: SPARK-38118
                 URL: https://issues.apache.org/jira/browse/SPARK-38118
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.3.0
            Reporter: Rui Wang
{code:sql}
with t as (select true c)
select t.c
from t
group by t.c
having mean(t.c) > 0
{code}

This query throws "Column 't.c' does not exist. Did you mean one of the following? [t.c]". However, mean(boolean) is not a supported function signature, so the error should instead be "cannot resolve 'mean(t.c)' due to data type mismatch: function average requires numeric or interval types, not boolean" (a standalone reproduction sketch is included at the end of this description).

This happens because:
# mean(boolean) in HAVING is not marked as resolved by the {{ResolveFunctions}} rule.
# Thus, in {{ResolveAggregationFunctions}}, the {{TempResolvedColumn}} wrapper in mean({{TempResolvedColumn}}(t.c)) cannot be removed (only a resolved aggregate can remove its {{TempResolvedColumn}}).
# Thus, when a later batch of rules is applied, the {{TempResolvedColumn}} is reverted and the expression becomes mean(`t.c`), so mean loses the information about t.c.
# Thus, at the last step, the analyzer can only report that t.c was not found.

mean(boolean) in HAVING is not marked as resolved by the {{ResolveFunctions}} rule because:
# It uses Expression's default {{resolved}} field:
{code:java}
lazy val resolved: Boolean = childrenResolved && checkInputDataTypes().isSuccess
{code}
# During analysis, mean(boolean) is mean(TempResolvedColumn(boolean)), so childrenResolved is true.
# However, checkInputDataTypes() fails for a boolean input ([Average.scala#L55|https://github.com/apache/spark/blob/74ebef243c18e7a8f32bf90ea75ab6afed9e3132/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Average.scala#L55]).
# Thus Average's {{resolved}} ends up false, but this leads to the wrong error message. A simplified sketch of these mechanics follows this list.
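For illustration, below is a simplified, self-contained Scala sketch of the resolution mechanics described above. These are not Spark's actual classes (the FakeMean and FakeTempResolvedColumn names are invented for this sketch); it only models how the wrapped column counts as resolved while the aggregate's type check rejects boolean, so the aggregate itself never becomes resolved.

{code:scala}
// Simplified stand-ins for Catalyst types; NOT Spark's real classes.
sealed trait DataType
case object BooleanType extends DataType
case object DoubleType extends DataType

trait Expr {
  def children: Seq[Expr]
  def dataType: DataType
  def checkInputDataTypes(): Either[String, Unit] = Right(())
  def childrenResolved: Boolean = children.forall(_.resolved)
  // Mirrors Expression's default: resolved only if the children resolve AND the type check passes.
  lazy val resolved: Boolean = childrenResolved && checkInputDataTypes().isRight
}

// Stand-in for TempResolvedColumn(t.c): already resolved, wrapping a boolean column.
case class FakeTempResolvedColumn(name: String, dataType: DataType) extends Expr {
  def children: Seq[Expr] = Nil
  override lazy val resolved: Boolean = true
}

// Stand-in for Average/mean: rejects non-numeric input, as Average.checkInputDataTypes does.
case class FakeMean(child: Expr) extends Expr {
  def children: Seq[Expr] = Seq(child)
  def dataType: DataType = DoubleType
  override def checkInputDataTypes(): Either[String, Unit] = child.dataType match {
    case BooleanType => Left("function average requires numeric or interval types, not boolean")
    case _           => Right(())
  }
}

object ResolvedDemo extends App {
  val mean = FakeMean(FakeTempResolvedColumn("t.c", BooleanType))
  println(mean.childrenResolved) // true  -- the wrapped column reports itself as resolved
  println(mean.resolved)         // false -- the type check fails, so the aggregate stays unresolved
}
{code}

Because {{resolved}} stays false here, the aggregate itself is the natural place to raise the data type mismatch; in the HAVING path the analyzer instead ends up reverting the {{TempResolvedColumn}} and reporting the missing column.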
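For reference, a minimal standalone reproduction sketch of the contrast (the object name and local SparkSession setup below are just for this sketch): outside of HAVING the analyzer already reports the expected data type mismatch for mean(boolean), while the HAVING variant surfaces the misleading "column does not exist" error.

{code:scala}
import org.apache.spark.sql.SparkSession

object MeanBooleanRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("SPARK-38118-repro").getOrCreate()

    // Expected behavior: mean(boolean) in the SELECT list fails with a data type mismatch.
    try {
      spark.sql("with t as (select true c) select mean(t.c) from t").collect()
    } catch {
      case e: Exception => println(s"SELECT list: ${e.getMessage}")
    }

    // Behavior described in this ticket: the same aggregate in HAVING reports
    // "Column 't.c' does not exist" instead of the data type mismatch.
    try {
      spark.sql("with t as (select true c) select t.c from t group by t.c having mean(t.c) > 0").collect()
    } catch {
      case e: Exception => println(s"HAVING clause: ${e.getMessage}")
    }

    spark.stop()
  }
}
{code}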