Abhishek Singh created SPARK-48652: -------------------------------------- Summary: Casting Issue in Spark SQL: String Column Compared to Integer Value Yields Empty Results Key: SPARK-48652 URL: https://issues.apache.org/jira/browse/SPARK-48652 Project: Spark Issue Type: Question Components: Spark Core, SQL Affects Versions: 3.3.2 Reporter: Abhishek Singh
In Spark SQL, comparing a string column to an integer value can lead to unexpected results due to implicit type casting. When a string column is compared to an integer, Spark attempts to cast the strings to integers, which fails for non-numeric strings, resulting in an empty result set. {code:java} case class Person(id: String, name: String) val personDF = Seq(Person("a", "amit"), Person("b", "abhishek")).toDF() personDF.createOrReplaceTempView("person_ddf") val sqlQuery = "SELECT * FROM person_ddf WHERE id <> -1" val resultDF = spark.sql(sqlQuery) resultDF.show() // Empty result due to type casting issue {code} Below is the logical and physical plan which I m getting {code:java} == Parsed Logical Plan == 'Project [*] +- 'Filter NOT ('id = -1) +- 'UnresolvedRelation [person_ddf], [], false == Analyzed Logical Plan == id: string, name: string Project [id#356, name#357] +- Filter NOT (cast(id#356 as int) = -1) +- SubqueryAlias person_ddf +- View (`person_ddf`, [id#356,name#357]) +- LocalRelation [id#356, name#357] == Optimized Logical Plan == LocalRelation <empty>, [id#356, name#357] == Physical Plan == LocalTableScan <empty>, [id#356, name#357] == Physical Plan == LocalTableScan (1) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org