This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new 7120e6b88f23 [SPARK-45905][SQL] Least common type between decimal types should retain integral digits first

7120e6b88f23 is described below

commit 7120e6b88f2327ffb71c4bca14b10b15aeb26c32
Author: Wenchen Fan <cloud0...@gmail.com>
AuthorDate: Wed Nov 15 20:39:38 2023 +0800

    [SPARK-45905][SQL] Least common type between decimal types should retain integral digits first

    ### What changes were proposed in this pull request?

    This is kind of a followup of https://github.com/apache/spark/pull/20023 . It's simply wrong to cut the decimal precision to 38 if a wider decimal type exceeds the max precision, as that drops the integral digits and makes the decimal value very likely to overflow. In https://github.com/apache/spark/pull/20023 we fixed this issue for arithmetic operations, but many other operations suffer from the same issue: Union, binary comparison, IN subquery, create_array, coalesce, etc. This PR fixes all the remaining operators, without the min-scale limitation, which should be applied to division and multiplication only according to the SQL Server doc: https://learn.microsoft.com/en-us/sql/t-sql/data-types/precision-scale-and-length-transact-sql?view=sql-server-ver15

    ### Why are the changes needed?

    To produce a reasonable wider decimal type.

    ### Does this PR introduce _any_ user-facing change?

    Yes: the final data type of these operators changes if it is a decimal type whose precision exceeds the max precision and whose scale is not 0.

    ### How was this patch tested?

    Updated tests.

    ### Was this patch authored or co-authored using generative AI tooling?

    No

    Closes #43781 from cloud-fan/decimal.
Lead-authored-by: Wenchen Fan <cloud0...@gmail.com>
Co-authored-by: Wenchen Fan <wenc...@databricks.com>
Signed-off-by: Wenchen Fan <wenc...@databricks.com>
---
 docs/sql-ref-ansi-compliance.md                    |  19 ++
 .../org/apache/spark/sql/types/DecimalType.scala   |  12 ++
 .../sql/catalyst/analysis/DecimalPrecision.scala   |   7 +-
 .../org/apache/spark/sql/internal/SQLConf.scala    |  12 +-
 .../catalyst/analysis/AnsiTypeCoercionSuite.scala  |  10 +-
 .../sql/catalyst/analysis/TypeCoercionSuite.scala  |  18 +-
 .../typeCoercion/native/mapZipWith.sql.out         |  50 ++---
 .../results/typeCoercion/native/mapZipWith.sql.out |  42 +---
 .../approved-plans-v2_7/q36a.sf100/explain.txt     |   6 +-
 .../approved-plans-v2_7/q36a/explain.txt           |   6 +-
 .../tpcds-query-results/v2_7/q36a.sql.out          | 202 ++++++++++-----------
 .../apache/spark/sql/DataFrameFunctionsSuite.scala |  21 +--
 12 files changed, 191 insertions(+), 214 deletions(-)

diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md
index 4729db16d63f..90e65e5ce36e 100644
--- a/docs/sql-ref-ansi-compliance.md
+++ b/docs/sql-ref-ansi-compliance.md
@@ -240,6 +240,25 @@ The least common type resolution is used to:
 - Derive the result type for expressions such as the case expression.
 - Derive the element, key, or value types for array and map constructors.
 Special rules are applied if the least common type resolves to FLOAT. With float type values, if any of the types is INT, BIGINT, or DECIMAL the least common type is pushed to DOUBLE to avoid potential loss of digits.
+
+Decimal type is a bit more complicated here, as it's not a simple type but has parameters: precision and scale.
+A `decimal(precision, scale)` means the value can have at most `precision - scale` digits in the integral part and `scale` digits in the fractional part.
+A least common type between decimal types should have enough digits in both integral and fractional parts to represent all values.
+More precisely, the least common type between `decimal(p1, s1)` and `decimal(p2, s2)` has the scale of `max(s1, s2)` and precision of `max(s1, s2) + max(p1 - s1, p2 - s2)`.
+However, decimal types in Spark have a maximum precision: 38. If the final decimal type needs more precision, we must do truncation.
+Since the digits in the integral part are more significant, Spark truncates the digits in the fractional part first. For example, `decimal(48, 20)` will be reduced to `decimal(38, 10)`.
+
+Note, arithmetic operations have special rules to calculate the least common type for decimal inputs:
+
+| Operation | Result precision                         | Result scale        |
+|-----------|------------------------------------------|---------------------|
+| e1 + e2   | max(s1, s2) + max(p1 - s1, p2 - s2) + 1  | max(s1, s2)         |
+| e1 - e2   | max(s1, s2) + max(p1 - s1, p2 - s2) + 1  | max(s1, s2)         |
+| e1 * e2   | p1 + p2 + 1                              | s1 + s2             |
+| e1 / e2   | p1 - s1 + s2 + max(6, s1 + p2 + 1)       | max(6, s1 + p2 + 1) |
+| e1 % e2   | min(p1 - s1, p2 - s2) + max(s1, s2)      | max(s1, s2)         |
+
+The truncation rule is also different for arithmetic operations: they retain at least 6 digits in the fractional part, which means `scale` can only be reduced down to 6. Overflow may happen in this case.

 ```sql
 -- The coalesce function accepts any set of argument types as long as they share a least common type.
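For illustration only (an editorial sketch, not part of the patch), the widening-and-truncation rule described in the docs above can be written out in Python. `MAX_PRECISION = 38` mirrors Spark's `DecimalType.MAX_PRECISION`, and the expected results below match the types that appear in this patch's updated tests:

```python
MAX_PRECISION = 38  # mirrors Spark's DecimalType.MAX_PRECISION

def bounded_prefer_integral_digits(precision, scale):
    """Sketch of the new DecimalType.boundedPreferIntegralDigits."""
    if precision <= MAX_PRECISION:
        return (precision, scale)
    # Truncate fractional digits first; integral digits are more significant.
    diff = precision - MAX_PRECISION
    return (MAX_PRECISION, max(0, scale - diff))

def least_common_decimal(p1, s1, p2, s2):
    """Least common type between decimal(p1, s1) and decimal(p2, s2)."""
    scale = max(s1, s2)                 # keep all fractional digits if possible
    integral = max(p1 - s1, p2 - s2)    # keep all integral digits
    return bounded_prefer_integral_digits(integral + scale, scale)

print(bounded_prefer_integral_digits(48, 20))  # (38, 10), the doc's example
print(least_common_decimal(36, 0, 36, 35))     # (38, 2), as in TypeCoercionSuite
print(least_common_decimal(36, 35, 10, 0))     # (38, 28), decimal(36,35) vs INT
```

Note the arithmetic operations in the table keep their own precision/scale formulas and a minimum scale of 6; this sketch covers only the non-arithmetic least-common-type case.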
diff --git a/sql/api/src/main/scala/org/apache/spark/sql/types/DecimalType.scala b/sql/api/src/main/scala/org/apache/spark/sql/types/DecimalType.scala
index af545341dee7..9de34d0b3bc1 100644
--- a/sql/api/src/main/scala/org/apache/spark/sql/types/DecimalType.scala
+++ b/sql/api/src/main/scala/org/apache/spark/sql/types/DecimalType.scala
@@ -146,6 +146,18 @@ object DecimalType extends AbstractDataType {
     DecimalType(min(precision, MAX_PRECISION), min(scale, MAX_SCALE))
   }

+  private[sql] def boundedPreferIntegralDigits(precision: Int, scale: Int): DecimalType = {
+    if (precision <= MAX_PRECISION) {
+      DecimalType(precision, scale)
+    } else {
+      // If we have to reduce the precision, we should retain the digits in the integral part first,
+      // as they are more significant to the value. Here we reduce the scale as well to drop the
+      // digits in the fractional part.
+      val diff = precision - MAX_PRECISION
+      DecimalType(MAX_PRECISION, math.max(0, scale - diff))
+    }
+  }
+
   private[sql] def checkNegativeScale(scale: Int): Unit = {
     if (scale < 0 && !SqlApiConf.get.allowNegativeScaleOfDecimalEnabled) {
       throw DataTypeErrors.negativeScaleNotAllowedError(scale)
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala
index 09cf61a77955..d5e914b343d5 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala
@@ -20,6 +20,7 @@ package org.apache.spark.sql.catalyst.analysis
 import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.expressions.Literal._
 import org.apache.spark.sql.catalyst.types.DataTypeUtils
+import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types._

@@ -64,7 +65,11 @@ object DecimalPrecision extends TypeCoercionRule {
   def widerDecimalType(p1:
Int, s1: Int, p2: Int, s2: Int): DecimalType = {
     val scale = max(s1, s2)
     val range = max(p1 - s1, p2 - s2)
-    DecimalType.bounded(range + scale, scale)
+    if (conf.getConf(SQLConf.LEGACY_RETAIN_FRACTION_DIGITS_FIRST)) {
+      DecimalType.bounded(range + scale, scale)
+    } else {
+      DecimalType.boundedPreferIntegralDigits(range + scale, scale)
+    }
   }

   override def transform: PartialFunction[Expression, Expression] = {
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index c12c5a826ba1..2d67a8428d22 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -4550,6 +4550,16 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)

+  val LEGACY_RETAIN_FRACTION_DIGITS_FIRST =
+    buildConf("spark.sql.legacy.decimal.retainFractionDigitsOnTruncate")
+      .internal()
+      .doc("When set to true, we will try to retain the fraction digits first rather than " +
+        "integral digits as prior Spark 4.0, when getting a least common type between decimal " +
+        "types, and the result decimal precision exceeds the max precision.")
+      .version("4.0.0")
+      .booleanConf
+      .createWithDefault(false)
+
   /**
    * Holds information about keys that have been deprecated.
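To see what the legacy flag restores, here is a hypothetical side-by-side sketch in Python (not part of the patch) of the old `DecimalType.bounded` behavior, which `spark.sql.legacy.decimal.retainFractionDigitsOnTruncate` re-enables, against the new integral-digits-first rule:

```python
MAX_PRECISION = 38
MAX_SCALE = 38

def legacy_bounded(precision, scale):
    # Old DecimalType.bounded: cap precision and scale independently,
    # which silently drops integral digits and invites overflow.
    return (min(precision, MAX_PRECISION), min(scale, MAX_SCALE))

def bounded_prefer_integral_digits(precision, scale):
    # New rule: give up fractional digits to keep the integral part intact.
    if precision <= MAX_PRECISION:
        return (precision, scale)
    return (MAX_PRECISION, max(0, scale - (precision - MAX_PRECISION)))

# Widening decimal(36, 0) and decimal(36, 35) asks for decimal(71, 35):
print(legacy_bounded(71, 35))                  # (38, 35): only 3 integral digits survive
print(bounded_prefer_integral_digits(71, 35))  # (38, 2): all 36 integral digits survive
```

With the legacy rule, a 36-digit integer widened against decimal(36, 35) lands in decimal(38, 35) and overflows; the new rule yields decimal(38, 2), which the updated `TypeCoercionSuite` expectations in this patch reflect.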
* @@ -5434,7 +5444,7 @@ class SQLConf extends Serializable with Logging with SqlApiConf { } def legacyRaiseErrorWithoutErrorClass: Boolean = - getConf(SQLConf.LEGACY_RAISE_ERROR_WITHOUT_ERROR_CLASS) + getConf(SQLConf.LEGACY_RAISE_ERROR_WITHOUT_ERROR_CLASS) /** ********************** SQLConf functionality methods ************ */ diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercionSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercionSuite.scala index cc0f6046de00..38acb56ad1e0 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercionSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercionSuite.scala @@ -481,9 +481,9 @@ class AnsiTypeCoercionSuite extends TypeCoercionSuiteBase { :: Literal.create(null, DecimalType(22, 10)) :: Literal.create(null, DecimalType(38, 38)) :: Nil), - CreateArray(Literal.create(null, DecimalType(5, 3)).cast(DecimalType(38, 38)) - :: Literal.create(null, DecimalType(22, 10)).cast(DecimalType(38, 38)) - :: Literal.create(null, DecimalType(38, 38)) + CreateArray(Literal.create(null, DecimalType(5, 3)).cast(DecimalType(38, 26)) + :: Literal.create(null, DecimalType(22, 10)).cast(DecimalType(38, 26)) + :: Literal.create(null, DecimalType(38, 38)).cast(DecimalType(38, 26)) :: Nil)) } @@ -530,9 +530,9 @@ class AnsiTypeCoercionSuite extends TypeCoercionSuiteBase { :: Literal.create(null, DecimalType(38, 38)) :: Nil), CreateMap(Literal(1) - :: Literal.create(null, DecimalType(38, 0)).cast(DecimalType(38, 38)) + :: Literal.create(null, DecimalType(38, 0)) :: Literal(2) - :: Literal.create(null, DecimalType(38, 38)) + :: Literal.create(null, DecimalType(38, 38)).cast(DecimalType(38, 0)) :: Nil)) // type coercion for both map keys and values ruleTest(AnsiTypeCoercion.FunctionArgumentConversion, diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala index db86e7131446..330252d26dc5 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala @@ -782,7 +782,7 @@ class TypeCoercionSuite extends TypeCoercionSuiteBase { widenTestWithStringPromotion( ArrayType(DecimalType(36, 0), containsNull = false), ArrayType(DecimalType(36, 35), containsNull = false), - Some(ArrayType(DecimalType(38, 35), containsNull = true))) + Some(ArrayType(DecimalType(38, 2), containsNull = false))) // MapType widenTestWithStringPromotion( @@ -808,7 +808,7 @@ class TypeCoercionSuite extends TypeCoercionSuiteBase { widenTestWithStringPromotion( MapType(StringType, DecimalType(36, 0), valueContainsNull = false), MapType(StringType, DecimalType(36, 35), valueContainsNull = false), - Some(MapType(StringType, DecimalType(38, 35), valueContainsNull = true))) + Some(MapType(StringType, DecimalType(38, 2), valueContainsNull = false))) widenTestWithStringPromotion( MapType(IntegerType, StringType, valueContainsNull = false), MapType(DecimalType.IntDecimal, StringType, valueContainsNull = false), @@ -816,7 +816,7 @@ class TypeCoercionSuite extends TypeCoercionSuiteBase { widenTestWithStringPromotion( MapType(DecimalType(36, 0), StringType, valueContainsNull = false), MapType(DecimalType(36, 35), StringType, valueContainsNull = false), - None) + Some(MapType(DecimalType(38, 2), StringType, valueContainsNull = false))) // StructType widenTestWithStringPromotion( @@ -847,7 +847,7 @@ class TypeCoercionSuite extends TypeCoercionSuiteBase { widenTestWithStringPromotion( new StructType().add("num", DecimalType(36, 0), nullable = false), new StructType().add("num", DecimalType(36, 35), nullable = false), - Some(new 
StructType().add("num", DecimalType(38, 35), nullable = true))) + Some(new StructType().add("num", DecimalType(38, 2), nullable = false))) widenTestWithStringPromotion( new StructType().add("num", IntegerType), @@ -1046,9 +1046,9 @@ class TypeCoercionSuite extends TypeCoercionSuiteBase { :: Literal.create(null, DecimalType(22, 10)) :: Literal.create(null, DecimalType(38, 38)) :: Nil), - CreateArray(Literal.create(null, DecimalType(5, 3)).cast(DecimalType(38, 38)) - :: Literal.create(null, DecimalType(22, 10)).cast(DecimalType(38, 38)) - :: Literal.create(null, DecimalType(38, 38)) + CreateArray(Literal.create(null, DecimalType(5, 3)).cast(DecimalType(38, 26)) + :: Literal.create(null, DecimalType(22, 10)).cast(DecimalType(38, 26)) + :: Literal.create(null, DecimalType(38, 38)).cast(DecimalType(38, 26)) :: Nil)) } @@ -1095,9 +1095,9 @@ class TypeCoercionSuite extends TypeCoercionSuiteBase { :: Literal.create(null, DecimalType(38, 38)) :: Nil), CreateMap(Literal(1) - :: Literal.create(null, DecimalType(38, 0)).cast(DecimalType(38, 38)) + :: Literal.create(null, DecimalType(38, 0)) :: Literal(2) - :: Literal.create(null, DecimalType(38, 38)) + :: Literal.create(null, DecimalType(38, 38)).cast(DecimalType(38, 0)) :: Nil)) // type coercion for both map keys and values ruleTest(TypeCoercion.FunctionArgumentConversion, diff --git a/sql/core/src/test/resources/sql-tests/analyzer-results/typeCoercion/native/mapZipWith.sql.out b/sql/core/src/test/resources/sql-tests/analyzer-results/typeCoercion/native/mapZipWith.sql.out index bc8909f1c103..4b8d37d22548 100644 --- a/sql/core/src/test/resources/sql-tests/analyzer-results/typeCoercion/native/mapZipWith.sql.out +++ b/sql/core/src/test/resources/sql-tests/analyzer-results/typeCoercion/native/mapZipWith.sql.out @@ -128,24 +128,13 @@ Project [map_zip_with(double_map#x, cast(float_map#x as map<double,float>), lamb SELECT map_zip_with(decimal_map1, decimal_map2, (k, v1, v2) -> struct(k, v1, v2)) m FROM various_maps -- !query 
analysis -org.apache.spark.sql.catalyst.ExtendedAnalysisException -{ - "errorClass" : "DATATYPE_MISMATCH.MAP_ZIP_WITH_DIFF_TYPES", - "sqlState" : "42K09", - "messageParameters" : { - "functionName" : "`map_zip_with`", - "leftType" : "\"DECIMAL(36,0)\"", - "rightType" : "\"DECIMAL(36,35)\"", - "sqlExpr" : "\"map_zip_with(decimal_map1, decimal_map2, lambdafunction(struct(k, v1, v2), k, v1, v2))\"" - }, - "queryContext" : [ { - "objectType" : "", - "objectName" : "", - "startIndex" : 8, - "stopIndex" : 81, - "fragment" : "map_zip_with(decimal_map1, decimal_map2, (k, v1, v2) -> struct(k, v1, v2))" - } ] -} +Project [map_zip_with(cast(decimal_map1#x as map<decimal(38,2),decimal(36,0)>), cast(decimal_map2#x as map<decimal(38,2),decimal(36,35)>), lambdafunction(struct(k, lambda k#x, v1, lambda v1#x, v2, lambda v2#x), lambda k#x, lambda v1#x, lambda v2#x, false)) AS m#x] ++- SubqueryAlias various_maps + +- View (`various_maps`, [boolean_map#x,tinyint_map#x,smallint_map#x,int_map#x,bigint_map#x,decimal_map1#x,decimal_map2#x,double_map#x,float_map#x,date_map#x,timestamp_map#x,string_map1#x,string_map2#x,string_map3#x,string_map4#x,array_map1#x,array_map2#x,struct_map1#x,struct_map2#x]) + +- Project [cast(boolean_map#x as map<boolean,boolean>) AS boolean_map#x, cast(tinyint_map#x as map<tinyint,tinyint>) AS tinyint_map#x, cast(smallint_map#x as map<smallint,smallint>) AS smallint_map#x, cast(int_map#x as map<int,int>) AS int_map#x, cast(bigint_map#x as map<bigint,bigint>) AS bigint_map#x, cast(decimal_map1#x as map<decimal(36,0),decimal(36,0)>) AS decimal_map1#x, cast(decimal_map2#x as map<decimal(36,35),decimal(36,35)>) AS decimal_map2#x, cast(double_map#x as map<d [...] 
+ +- Project [boolean_map#x, tinyint_map#x, smallint_map#x, int_map#x, bigint_map#x, decimal_map1#x, decimal_map2#x, double_map#x, float_map#x, date_map#x, timestamp_map#x, string_map1#x, string_map2#x, string_map3#x, string_map4#x, array_map1#x, array_map2#x, struct_map1#x, struct_map2#x] + +- SubqueryAlias various_maps + +- LocalRelation [boolean_map#x, tinyint_map#x, smallint_map#x, int_map#x, bigint_map#x, decimal_map1#x, decimal_map2#x, double_map#x, float_map#x, date_map#x, timestamp_map#x, string_map1#x, string_map2#x, string_map3#x, string_map4#x, array_map1#x, array_map2#x, struct_map1#x, struct_map2#x] -- !query @@ -178,24 +167,13 @@ Project [map_zip_with(cast(decimal_map1#x as map<double,decimal(36,0)>), double_ SELECT map_zip_with(decimal_map2, int_map, (k, v1, v2) -> struct(k, v1, v2)) m FROM various_maps -- !query analysis -org.apache.spark.sql.catalyst.ExtendedAnalysisException -{ - "errorClass" : "DATATYPE_MISMATCH.MAP_ZIP_WITH_DIFF_TYPES", - "sqlState" : "42K09", - "messageParameters" : { - "functionName" : "`map_zip_with`", - "leftType" : "\"DECIMAL(36,35)\"", - "rightType" : "\"INT\"", - "sqlExpr" : "\"map_zip_with(decimal_map2, int_map, lambdafunction(struct(k, v1, v2), k, v1, v2))\"" - }, - "queryContext" : [ { - "objectType" : "", - "objectName" : "", - "startIndex" : 8, - "stopIndex" : 76, - "fragment" : "map_zip_with(decimal_map2, int_map, (k, v1, v2) -> struct(k, v1, v2))" - } ] -} +Project [map_zip_with(cast(decimal_map2#x as map<decimal(38,28),decimal(36,35)>), cast(int_map#x as map<decimal(38,28),int>), lambdafunction(struct(k, lambda k#x, v1, lambda v1#x, v2, lambda v2#x), lambda k#x, lambda v1#x, lambda v2#x, false)) AS m#x] ++- SubqueryAlias various_maps + +- View (`various_maps`, 
[boolean_map#x,tinyint_map#x,smallint_map#x,int_map#x,bigint_map#x,decimal_map1#x,decimal_map2#x,double_map#x,float_map#x,date_map#x,timestamp_map#x,string_map1#x,string_map2#x,string_map3#x,string_map4#x,array_map1#x,array_map2#x,struct_map1#x,struct_map2#x]) + +- Project [cast(boolean_map#x as map<boolean,boolean>) AS boolean_map#x, cast(tinyint_map#x as map<tinyint,tinyint>) AS tinyint_map#x, cast(smallint_map#x as map<smallint,smallint>) AS smallint_map#x, cast(int_map#x as map<int,int>) AS int_map#x, cast(bigint_map#x as map<bigint,bigint>) AS bigint_map#x, cast(decimal_map1#x as map<decimal(36,0),decimal(36,0)>) AS decimal_map1#x, cast(decimal_map2#x as map<decimal(36,35),decimal(36,35)>) AS decimal_map2#x, cast(double_map#x as map<d [...] + +- Project [boolean_map#x, tinyint_map#x, smallint_map#x, int_map#x, bigint_map#x, decimal_map1#x, decimal_map2#x, double_map#x, float_map#x, date_map#x, timestamp_map#x, string_map1#x, string_map2#x, string_map3#x, string_map4#x, array_map1#x, array_map2#x, struct_map1#x, struct_map2#x] + +- SubqueryAlias various_maps + +- LocalRelation [boolean_map#x, tinyint_map#x, smallint_map#x, int_map#x, bigint_map#x, decimal_map1#x, decimal_map2#x, double_map#x, float_map#x, date_map#x, timestamp_map#x, string_map1#x, string_map2#x, string_map3#x, string_map4#x, array_map1#x, array_map2#x, struct_map1#x, struct_map2#x] -- !query diff --git a/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/mapZipWith.sql.out b/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/mapZipWith.sql.out index 452603c2b1eb..5c00e8a5b63d 100644 --- a/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/mapZipWith.sql.out +++ b/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/mapZipWith.sql.out @@ -79,26 +79,9 @@ struct<m:map<double,struct<k:double,v1:double,v2:float>>> SELECT map_zip_with(decimal_map1, decimal_map2, (k, v1, v2) -> struct(k, v1, v2)) m FROM various_maps -- !query schema -struct<> 
+struct<m:map<decimal(38,2),struct<k:decimal(38,2),v1:decimal(36,0),v2:decimal(36,35)>>> -- !query output -org.apache.spark.sql.catalyst.ExtendedAnalysisException -{ - "errorClass" : "DATATYPE_MISMATCH.MAP_ZIP_WITH_DIFF_TYPES", - "sqlState" : "42K09", - "messageParameters" : { - "functionName" : "`map_zip_with`", - "leftType" : "\"DECIMAL(36,0)\"", - "rightType" : "\"DECIMAL(36,35)\"", - "sqlExpr" : "\"map_zip_with(decimal_map1, decimal_map2, lambdafunction(struct(k, v1, v2), k, v1, v2))\"" - }, - "queryContext" : [ { - "objectType" : "", - "objectName" : "", - "startIndex" : 8, - "stopIndex" : 81, - "fragment" : "map_zip_with(decimal_map1, decimal_map2, (k, v1, v2) -> struct(k, v1, v2))" - } ] -} +{9.22:{"k":9.22,"v1":null,"v2":9.22337203685477897945456575809789456},922337203685477897945456575809789456.00:{"k":922337203685477897945456575809789456.00,"v1":922337203685477897945456575809789456,"v2":null}} -- !query @@ -123,26 +106,9 @@ struct<m:map<double,struct<k:double,v1:decimal(36,0),v2:double>>> SELECT map_zip_with(decimal_map2, int_map, (k, v1, v2) -> struct(k, v1, v2)) m FROM various_maps -- !query schema -struct<> +struct<m:map<decimal(38,28),struct<k:decimal(38,28),v1:decimal(36,35),v2:int>>> -- !query output -org.apache.spark.sql.catalyst.ExtendedAnalysisException -{ - "errorClass" : "DATATYPE_MISMATCH.MAP_ZIP_WITH_DIFF_TYPES", - "sqlState" : "42K09", - "messageParameters" : { - "functionName" : "`map_zip_with`", - "leftType" : "\"DECIMAL(36,35)\"", - "rightType" : "\"INT\"", - "sqlExpr" : "\"map_zip_with(decimal_map2, int_map, lambdafunction(struct(k, v1, v2), k, v1, v2))\"" - }, - "queryContext" : [ { - "objectType" : "", - "objectName" : "", - "startIndex" : 8, - "stopIndex" : 76, - "fragment" : "map_zip_with(decimal_map2, int_map, (k, v1, v2) -> struct(k, v1, v2))" - } ] -} 
+{2.0000000000000000000000000000:{"k":2.0000000000000000000000000000,"v1":null,"v2":1},9.2233720368547789794545657581:{"k":9.2233720368547789794545657581,"v1":9.22337203685477897945456575809789456,"v2":null}} -- !query diff --git a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a.sf100/explain.txt b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a.sf100/explain.txt index 033ff1940d34..4ea7fcca2c1e 100644 --- a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a.sf100/explain.txt +++ b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a.sf100/explain.txt @@ -146,7 +146,7 @@ Input [4]: [i_category#12, i_class#11, sum#15, sum#16] Keys [2]: [i_category#12, i_class#11] Functions [2]: [sum(UnscaledValue(ss_net_profit#4)), sum(UnscaledValue(ss_ext_sales_price#3))] Aggregate Attributes [2]: [sum(UnscaledValue(ss_net_profit#4))#17, sum(UnscaledValue(ss_ext_sales_price#3))#18] -Results [6]: [cast((MakeDecimal(sum(UnscaledValue(ss_net_profit#4))#17,17,2) / MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#3))#18,17,2)) as decimal(38,20)) AS gross_margin#19, i_category#12, i_class#11, 0 AS t_category#20, 0 AS t_class#21, 0 AS lochierarchy#22] +Results [6]: [cast((MakeDecimal(sum(UnscaledValue(ss_net_profit#4))#17,17,2) / MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#3))#18,17,2)) as decimal(38,11)) AS gross_margin#19, i_category#12, i_class#11, 0 AS t_category#20, 0 AS t_class#21, 0 AS lochierarchy#22] (23) ReusedExchange [Reuses operator id: 21] Output [4]: [i_category#23, i_class#24, sum#25, sum#26] @@ -174,7 +174,7 @@ Input [5]: [i_category#23, sum#37, isEmpty#38, sum#39, isEmpty#40] Keys [1]: [i_category#23] Functions [2]: [sum(ss_net_profit#31), sum(ss_ext_sales_price#32)] Aggregate Attributes [2]: [sum(ss_net_profit#31)#41, sum(ss_ext_sales_price#32)#42] -Results [6]: [cast((sum(ss_net_profit#31)#41 / sum(ss_ext_sales_price#32)#42) as decimal(38,20)) AS gross_margin#43, 
i_category#23, null AS i_class#44, 0 AS t_category#45, 1 AS t_class#46, 1 AS lochierarchy#47] +Results [6]: [(sum(ss_net_profit#31)#41 / sum(ss_ext_sales_price#32)#42) AS gross_margin#43, i_category#23, null AS i_class#44, 0 AS t_category#45, 1 AS t_class#46, 1 AS lochierarchy#47] (28) ReusedExchange [Reuses operator id: 21] Output [4]: [i_category#48, i_class#49, sum#50, sum#51] @@ -202,7 +202,7 @@ Input [4]: [sum#60, isEmpty#61, sum#62, isEmpty#63] Keys: [] Functions [2]: [sum(ss_net_profit#54), sum(ss_ext_sales_price#55)] Aggregate Attributes [2]: [sum(ss_net_profit#54)#64, sum(ss_ext_sales_price#55)#65] -Results [6]: [cast((sum(ss_net_profit#54)#64 / sum(ss_ext_sales_price#55)#65) as decimal(38,20)) AS gross_margin#66, null AS i_category#67, null AS i_class#68, 1 AS t_category#69, 1 AS t_class#70, 2 AS lochierarchy#71] +Results [6]: [(sum(ss_net_profit#54)#64 / sum(ss_ext_sales_price#55)#65) AS gross_margin#66, null AS i_category#67, null AS i_class#68, 1 AS t_category#69, 1 AS t_class#70, 2 AS lochierarchy#71] (33) Union diff --git a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a/explain.txt b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a/explain.txt index 7c2e7d17af2b..0db1fa04fc6f 100644 --- a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a/explain.txt +++ b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a/explain.txt @@ -146,7 +146,7 @@ Input [4]: [i_category#10, i_class#9, sum#15, sum#16] Keys [2]: [i_category#10, i_class#9] Functions [2]: [sum(UnscaledValue(ss_net_profit#4)), sum(UnscaledValue(ss_ext_sales_price#3))] Aggregate Attributes [2]: [sum(UnscaledValue(ss_net_profit#4))#17, sum(UnscaledValue(ss_ext_sales_price#3))#18] -Results [6]: [cast((MakeDecimal(sum(UnscaledValue(ss_net_profit#4))#17,17,2) / MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#3))#18,17,2)) as decimal(38,20)) AS gross_margin#19, i_category#10, i_class#9, 0 AS t_category#20, 
0 AS t_class#21, 0 AS lochierarchy#22] +Results [6]: [cast((MakeDecimal(sum(UnscaledValue(ss_net_profit#4))#17,17,2) / MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#3))#18,17,2)) as decimal(38,11)) AS gross_margin#19, i_category#10, i_class#9, 0 AS t_category#20, 0 AS t_class#21, 0 AS lochierarchy#22] (23) ReusedExchange [Reuses operator id: 21] Output [4]: [i_category#23, i_class#24, sum#25, sum#26] @@ -174,7 +174,7 @@ Input [5]: [i_category#23, sum#37, isEmpty#38, sum#39, isEmpty#40] Keys [1]: [i_category#23] Functions [2]: [sum(ss_net_profit#31), sum(ss_ext_sales_price#32)] Aggregate Attributes [2]: [sum(ss_net_profit#31)#41, sum(ss_ext_sales_price#32)#42] -Results [6]: [cast((sum(ss_net_profit#31)#41 / sum(ss_ext_sales_price#32)#42) as decimal(38,20)) AS gross_margin#43, i_category#23, null AS i_class#44, 0 AS t_category#45, 1 AS t_class#46, 1 AS lochierarchy#47] +Results [6]: [(sum(ss_net_profit#31)#41 / sum(ss_ext_sales_price#32)#42) AS gross_margin#43, i_category#23, null AS i_class#44, 0 AS t_category#45, 1 AS t_class#46, 1 AS lochierarchy#47] (28) ReusedExchange [Reuses operator id: 21] Output [4]: [i_category#48, i_class#49, sum#50, sum#51] @@ -202,7 +202,7 @@ Input [4]: [sum#60, isEmpty#61, sum#62, isEmpty#63] Keys: [] Functions [2]: [sum(ss_net_profit#54), sum(ss_ext_sales_price#55)] Aggregate Attributes [2]: [sum(ss_net_profit#54)#64, sum(ss_ext_sales_price#55)#65] -Results [6]: [cast((sum(ss_net_profit#54)#64 / sum(ss_ext_sales_price#55)#65) as decimal(38,20)) AS gross_margin#66, null AS i_category#67, null AS i_class#68, 1 AS t_category#69, 1 AS t_class#70, 2 AS lochierarchy#71] +Results [6]: [(sum(ss_net_profit#54)#64 / sum(ss_ext_sales_price#55)#65) AS gross_margin#66, null AS i_category#67, null AS i_class#68, 1 AS t_category#69, 1 AS t_class#70, 2 AS lochierarchy#71] (33) Union diff --git a/sql/core/src/test/resources/tpcds-query-results/v2_7/q36a.sql.out b/sql/core/src/test/resources/tpcds-query-results/v2_7/q36a.sql.out index 
a74e6211b49a..a3e82c380dcc 100644 --- a/sql/core/src/test/resources/tpcds-query-results/v2_7/q36a.sql.out +++ b/sql/core/src/test/resources/tpcds-query-results/v2_7/q36a.sql.out @@ -1,105 +1,105 @@ -- Automatically generated by TPCDSQueryTestSuite -- !query schema -struct<gross_margin:decimal(38,20),i_category:string,i_class:string,lochierarchy:int,rank_within_parent:int> +struct<gross_margin:decimal(38,11),i_category:string,i_class:string,lochierarchy:int,rank_within_parent:int> -- !query output --0.43310777865000000000 NULL NULL 2 1 --0.44057752675000000000 Home NULL 1 1 --0.43759152110000000000 Music NULL 1 2 --0.43708103961000000000 NULL NULL 1 3 --0.43616253139000000000 Shoes NULL 1 4 --0.43567118609000000000 Children NULL 1 5 --0.43423932352000000000 Sports NULL 1 6 --0.43342977300000000000 Electronics NULL 1 7 --0.43243283121000000000 Women NULL 1 8 --0.43164166900000000000 Men NULL 1 9 --0.42516187690000000000 Books NULL 1 10 --0.42448713381000000000 Jewelry NULL 1 11 --0.73902664238792748962 NULL shirts 0 1 --0.61125804873635587486 NULL country 0 2 --0.53129803597069255822 NULL dresses 0 3 --0.51266635289382758517 NULL athletic 0 4 --0.45290387783638603924 NULL mens 0 5 --0.41288056661656330013 NULL accessories 0 6 --0.40784754677005682440 NULL NULL 0 7 --0.34254844860867375832 NULL baseball 0 8 --0.32511461675631534897 NULL infants 0 9 --0.44733955704648003493 Books computers 0 1 --0.44221358112622373783 Books home repair 0 2 --0.44131129175272951442 Books romance 0 3 --0.43954111564375046074 Books history 0 4 --0.43921337505389731821 Books mystery 0 5 --0.43904020269360481109 Books sports 0 6 --0.42821476999837619396 Books travel 0 7 --0.42609067296303848297 Books cooking 0 8 --0.42538995145338568328 Books fiction 0 9 --0.42446563616188232944 Books arts 0 10 --0.42424821311884350413 Books parenting 0 11 --0.41822014479424203008 Books reference 0 12 --0.41350839325516811781 Books business 0 13 --0.40935208137315013129 Books science 0 14 
--0.40159380735731858928 Books self-help 0 15 --0.36957884843305744526 Books entertainments 0 16 --0.44602461556731552282 Children school-uniforms 0 1 --0.44141106040000560852 Children toddlers 0 2 --0.43479886701046623711 Children infants 0 3 --0.41900662971936329442 Children newborn 0 4 --0.41526603781609697786 Children NULL 0 5 --0.45347482218635333366 Electronics personal 0 1 --0.44349670349829474271 Electronics stereo 0 2 --0.44262427232850112058 Electronics automotive 0 3 --0.44115886172705231970 Electronics portable 0 4 --0.43972786651639318010 Electronics memory 0 5 --0.43889275271590953040 Electronics scanners 0 6 --0.43879181695132886061 Electronics karoke 0 7 --0.43743655149948399284 Electronics dvd/vcr players 0 8 --0.43737666390514154910 Electronics cameras 0 9 --0.43390499017233926812 Electronics wireless 0 10 --0.43163869754114299547 Electronics audio 0 11 --0.42958938669780912634 Electronics camcorders 0 12 --0.42872845803629855724 Electronics musical 0 13 --0.42228240153396399656 Electronics televisions 0 14 --0.41893847772039275795 Electronics monitors 0 15 --0.39793878022746331540 Electronics disk drives 0 16 --0.49051156860507320113 Home NULL 0 1 --0.48431476750686752965 Home blinds/shades 0 2 --0.47545837941951440918 Home bathroom 0 3 --0.45726228921216284093 Home rugs 0 4 --0.45540507568891021759 Home furniture 0 5 --0.45303572267019508501 Home flatware 0 6 --0.44755542058111800358 Home tables 0 7 --0.44419847780930149402 Home wallpaper 0 8 --0.44092345226680695671 Home glassware 0 9 --0.43877591834074789745 Home decor 0 10 --0.43765482553654514822 Home accent 0 11 --0.43188199218974854630 Home bedding 0 12 --0.43107417904272222899 Home kids 0 13 --0.42474436355625900935 Home lighting 0 14 --0.41783311109052416746 Home curtains/drapes 0 15 --0.41767111806961188479 Home mattresses 0 16 --0.40562188698541221499 Home paint 0 17 --0.45165056505480816921 Jewelry jewelry boxes 0 1 --0.44372227804836590137 Jewelry estate 0 2 --0.44251815032563188894 
Jewelry gold 0 3 --0.43978127753996883542 Jewelry consignment 0 4 --0.43821750044359339153 Jewelry custom 0 5 --0.43439645036479672989 Jewelry bracelets 0 6 --0.43208398325687772942 Jewelry loose stones 0 7 --0.43060897375114375156 Jewelry diamonds 0 8 --0.42847505748860847066 Jewelry costume 0 9 --0.42667449062277843561 Jewelry rings 0 10 --0.41987969011585456826 Jewelry mens watch 0 11 --0.41624621972944533035 Jewelry semi-precious 0 12 --0.41148949162100715771 Jewelry womens watch 0 13 --0.39725668174847694299 Jewelry birdal 0 14 --0.39665274051903254057 Jewelry pendants 0 15 --0.38423525233438861010 Jewelry earings 0 16 --0.44464388887858793403 Men shirts 0 1 --0.43719860800637369827 Men accessories 0 2 --0.43164606665359630905 Men sports-apparel 0 3 --0.41530906677293519754 Men pants 0 4 --0.38332708894803499123 Men NULL 0 5 --0.47339698705534020269 Music NULL 0 1 --0.44193214675249008923 Music rock 0 2 --0.44008174913565459246 Music country 0 3 --0.43863444992223641373 Music pop 0 4 +-0.43310777865 NULL NULL 2 1 +-0.44057752675 Home NULL 1 1 +-0.43759152110 Music NULL 1 2 +-0.43708103961 NULL NULL 1 3 +-0.43616253139 Shoes NULL 1 4 +-0.43567118609 Children NULL 1 5 +-0.43423932352 Sports NULL 1 6 +-0.43342977300 Electronics NULL 1 7 +-0.43243283121 Women NULL 1 8 +-0.43164166900 Men NULL 1 9 +-0.42516187690 Books NULL 1 10 +-0.42448713381 Jewelry NULL 1 11 +-0.73902664239 NULL shirts 0 1 +-0.61125804874 NULL country 0 2 +-0.53129803597 NULL dresses 0 3 +-0.51266635289 NULL athletic 0 4 +-0.45290387784 NULL mens 0 5 +-0.41288056662 NULL accessories 0 6 +-0.40784754677 NULL NULL 0 7 +-0.34254844861 NULL baseball 0 8 +-0.32511461676 NULL infants 0 9 +-0.44733955705 Books computers 0 1 +-0.44221358113 Books home repair 0 2 +-0.44131129175 Books romance 0 3 +-0.43954111564 Books history 0 4 +-0.43921337505 Books mystery 0 5 +-0.43904020269 Books sports 0 6 +-0.42821477000 Books travel 0 7 +-0.42609067296 Books cooking 0 8 +-0.42538995145 Books fiction 0 9 
+-0.42446563616	Books	arts	0	10
+-0.42424821312	Books	parenting	0	11
+-0.41822014479	Books	reference	0	12
+-0.41350839326	Books	business	0	13
+-0.40935208137	Books	science	0	14
+-0.40159380736	Books	self-help	0	15
+-0.36957884843	Books	entertainments	0	16
+-0.44602461557	Children	school-uniforms	0	1
+-0.44141106040	Children	toddlers	0	2
+-0.43479886701	Children	infants	0	3
+-0.41900662972	Children	newborn	0	4
+-0.41526603782	Children	NULL	0	5
+-0.45347482219	Electronics	personal	0	1
+-0.44349670350	Electronics	stereo	0	2
+-0.44262427233	Electronics	automotive	0	3
+-0.44115886173	Electronics	portable	0	4
+-0.43972786652	Electronics	memory	0	5
+-0.43889275272	Electronics	scanners	0	6
+-0.43879181695	Electronics	karoke	0	7
+-0.43743655150	Electronics	dvd/vcr players	0	8
+-0.43737666391	Electronics	cameras	0	9
+-0.43390499017	Electronics	wireless	0	10
+-0.43163869754	Electronics	audio	0	11
+-0.42958938670	Electronics	camcorders	0	12
+-0.42872845804	Electronics	musical	0	13
+-0.42228240153	Electronics	televisions	0	14
+-0.41893847772	Electronics	monitors	0	15
+-0.39793878023	Electronics	disk drives	0	16
+-0.49051156861	Home	NULL	0	1
+-0.48431476751	Home	blinds/shades	0	2
+-0.47545837942	Home	bathroom	0	3
+-0.45726228921	Home	rugs	0	4
+-0.45540507569	Home	furniture	0	5
+-0.45303572267	Home	flatware	0	6
+-0.44755542058	Home	tables	0	7
+-0.44419847781	Home	wallpaper	0	8
+-0.44092345227	Home	glassware	0	9
+-0.43877591834	Home	decor	0	10
+-0.43765482554	Home	accent	0	11
+-0.43188199219	Home	bedding	0	12
+-0.43107417904	Home	kids	0	13
+-0.42474436356	Home	lighting	0	14
+-0.41783311109	Home	curtains/drapes	0	15
+-0.41767111807	Home	mattresses	0	16
+-0.40562188699	Home	paint	0	17
+-0.45165056505	Jewelry	jewelry boxes	0	1
+-0.44372227805	Jewelry	estate	0	2
+-0.44251815033	Jewelry	gold	0	3
+-0.43978127754	Jewelry	consignment	0	4
+-0.43821750044	Jewelry	custom	0	5
+-0.43439645036	Jewelry	bracelets	0	6
+-0.43208398326	Jewelry	loose stones	0	7
+-0.43060897375	Jewelry	diamonds	0	8
+-0.42847505749	Jewelry	costume	0	9
+-0.42667449062	Jewelry	rings	0	10
+-0.41987969012	Jewelry	mens watch	0	11
+-0.41624621973	Jewelry	semi-precious	0	12
+-0.41148949162	Jewelry	womens watch	0	13
+-0.39725668175	Jewelry	birdal	0	14
+-0.39665274052	Jewelry	pendants	0	15
+-0.38423525233	Jewelry	earings	0	16
+-0.44464388888	Men	shirts	0	1
+-0.43719860801	Men	accessories	0	2
+-0.43164606665	Men	sports-apparel	0	3
+-0.41530906677	Men	pants	0	4
+-0.38332708895	Men	NULL	0	5
+-0.47339698706	Music	NULL	0	1
+-0.44193214675	Music	rock	0	2
+-0.44008174914	Music	country	0	3
+-0.43863444992	Music	pop	0	4
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala
index 135ce834bfe5..bb024b79598e 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala
@@ -1529,23 +1529,10 @@ class DataFrameFunctionsSuite extends QueryTest with SharedSparkSession {
       Seq(Row(false))
     )
 
-    if (!conf.ansiEnabled) {
-      checkError(
-        exception = intercept[AnalysisException] {
-          OneRowRelation().selectExpr("array_contains(array(1), .01234567890123456790123456780)")
-        },
-        errorClass = "DATATYPE_MISMATCH.ARRAY_FUNCTION_DIFF_TYPES",
-        parameters = Map(
-          "sqlExpr" -> "\"array_contains(array(1), 0.01234567890123456790123456780)\"",
-          "functionName" -> "`array_contains`",
-          "dataType" -> "\"ARRAY\"",
-          "leftType" -> "\"ARRAY<INT>\"",
-          "rightType" -> "\"DECIMAL(38,29)\""
-        ),
-        queryContext = Array(ExpectedContext("", "", 0, 55,
-          "array_contains(array(1), .01234567890123456790123456780)"))
-      )
-    }
+    checkAnswer(
+      OneRowRelation().selectExpr("array_contains(array(1), .01234567890123456790123456780)"),
+      Seq(Row(false))
+    )
 
     checkError(
       exception = intercept[AnalysisException] {

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
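[Editor's note] The test change above follows from the widening rule this patch describes: INT (decimal(10, 0)) and the 29-digit fractional literal now widen to DECIMAL(38,28) instead of DECIMAL(38,29), so the integral digits survive and `array_contains` type-checks. A minimal Python sketch of that rule, assuming `widen_decimal` and `MAX_PRECISION` as illustrative names (not Spark's actual API), and ignoring the extra minimum-scale rule Spark applies to division and multiplication:

```python
# Maximum number of digits a Spark DECIMAL can hold.
MAX_PRECISION = 38

def widen_decimal(p1, s1, p2, s2):
    """Least common type of decimal(p1, s1) and decimal(p2, s2).

    Keep enough digits for the larger integral part and the larger
    fractional part. If that exceeds MAX_PRECISION, cut fractional
    digits rather than integral ones ("retain integral digits first");
    the pre-fix behavior capped precision at 38 while keeping the
    scale, silently dropping integral digits and inviting overflow.
    """
    integral = max(p1 - s1, p2 - s2)  # digits before the decimal point
    scale = max(s1, s2)               # digits after the decimal point
    if integral + scale > MAX_PRECISION:
        # Retain integral digits first: shrink the scale instead.
        scale = MAX_PRECISION - integral
    return integral + scale, scale

# INT is decimal(10, 0); the literal .01234567890123456790123456780
# is decimal(29, 29). Since 10 + 29 > 38, the scale is cut to 28.
print(widen_decimal(10, 0, 29, 29))  # (38, 28)
```

Note this sketch assumes the combined integral part itself fits in 38 digits, which always holds when both inputs are valid decimal types.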