ShuMing Li created SPARK-28729:
----------------------------------

             Summary: Comparison between DecimalType and StringType may lead to wrong results
                 Key: SPARK-28729
                 URL: https://issues.apache.org/jira/browse/SPARK-28729
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.3.0
            Reporter: ShuMing Li
{code:java}
desc test_table;
a       int     NULL
b       string  NULL
dt      string  NULL
hh      string  NULL
# Partition Information
# col_name      data_type       comment
dt      string  NULL
hh      string  NULL

select dt from test_table where dt=20190801002382000052000000017638;
20190801002382000052000000017638
20190801002382000052000000017638
20190801002382000052000000016558
{code}

In the SQL above, column `dt` is of string type. When users forget to quote the literal with '', Spark returns wrong results: the last row returned does not actually match the literal.

In the `TypeCoercion` class, both sides are cast to `DoubleType` when a DecimalType is compared with a StringType, which is unsafe: the cast can lose precision or truncate the value.

{code:java}
val findCommonTypeForBinaryComparison: (DataType, DataType) => Option[DataType] = {
  ...
  // There is no proper decimal type we can pick,
  // using double type is the best we can do.
  // See SPARK-22469 for details.
  case (n: DecimalType, s: StringType) => Some(DoubleType)
  case (s: StringType, n: DecimalType) => Some(DoubleType)
  ...
}
{code}

However, I cannot find a good solution to avoid this: maybe throw an exception when precision loss occurs, or add a config to avoid it?
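To make the precision loss concrete, here is a minimal sketch (plain Scala, outside Spark) showing that the two distinct `dt` values from the example above collapse to the same Double, so any comparison routed through DoubleType cannot tell them apart:

{code:java}
// A 64-bit Double carries only about 15-17 significant decimal digits,
// so these two 32-digit values, which differ only in their last digits,
// round to the same Double.
val matched    = "20190801002382000052000000017638".toDouble
val nonMatched = "20190801002382000052000000016558".toDouble

println(matched == nonMatched)  // true: both round to the same Double

// An exact decimal comparison distinguishes them correctly.
println(BigDecimal("20190801002382000052000000017638") ==
        BigDecimal("20190801002382000052000000016558"))  // false
{code}

This is also why quoting the literal (`dt = '20190801002382000052000000017638'`) returns only the truly matching rows: the comparison then stays string-to-string and never goes through the lossy Double cast.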