Hi, I am using the *where* method of DataFrame to filter data. I am comparing an Integer field with a String literal, and this comparison returns the full table. I tested the same scenario with Hive and MySQL, and there the comparison returns no rows.

*Scenario:*
val sqlDf = df.where("f1 = 'abc'")
Here f1 is of type Integer.

*Input:* 14 15 16
*Output:* 14 15 16

*Logical and Physical Plan:*

== Parsed Logical Plan ==
'Filter ('f1 = abc)
+- Relation[f1#0] csv

== Analyzed Logical Plan ==
f1: int
Filter (cast(f1#0 as double) = cast(abc as double))
+- Relation[f1#0] csv

== Optimized Logical Plan ==
Filter (isnotnull(f1#0) && null)
+- Relation[f1#0] csv

== Physical Plan ==
*Project [f1#0]
+- *Filter isnotnull(f1#0)
   +- *Scan csv [f1#0] Format: CSV, InputPaths: file:/C:/Users/santlalg/IdeaProjects/SparkTestPoc/Int, PartitionFilters: [null], PushedFilters: [IsNotNull(f1)], ReadSchema: struct<f1:int>

In the *Optimized Logical Plan*, why is *cast(f1#0 as double) = cast(abc as double)* from the *Analyzed Logical Plan* replaced with /null/?

I am using the following dependency versions:
Spark-core : 2.0.2
Spark-sql : 2.0.2

In my scenario the comparison should evaluate to false, so that the DataFrame returns no rows. Can someone help me achieve this?

Thanks,
Santlal
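
P.S. To make the expectation above concrete, here is a minimal sketch in plain Scala (no Spark) of how I understand standard SQL null semantics for this filter. It is only a model under my own assumptions, not Spark's actual implementation: the names NullComparisonSketch, castToDouble, sqlEquals and where are hypothetical helpers, and None stands in for SQL NULL.

```scala
// Hypothetical model of SQL three-valued logic; NOT Spark's implementation.
object NullComparisonSketch {

  // Models CAST(string AS DOUBLE): a non-numeric string becomes NULL (None).
  def castToDouble(s: String): Option[Double] =
    scala.util.Try(s.trim.toDouble).toOption

  // Models SQL '=': the result is NULL (None) if either operand is NULL.
  def sqlEquals(a: Option[Double], b: Option[Double]): Option[Boolean] =
    for (x <- a; y <- b) yield x == y

  // Models WHERE: a row is kept only when the predicate is TRUE;
  // both FALSE and NULL drop the row.
  def where[A](rows: Seq[A])(pred: A => Option[Boolean]): Seq[A] =
    rows.filter(r => pred(r).contains(true))

  def main(args: Array[String]): Unit = {
    val f1 = Seq(14, 15, 16)            // the input column from the scenario
    val literal = castToDouble("abc")   // models cast(abc as double)
    val kept = where(f1)(v => sqlEquals(Some(v.toDouble), literal))
    println(s"cast('abc' as double) = $literal")
    println(s"rows kept = $kept")
  }
}
```

Under this model the predicate is NULL for every row, so no rows are kept. That matches what I see in Hive and MySQL, and it would also be consistent with the /null/ in the optimized plan if the cast of 'abc' is folded to null there. What I do not understand is why the rows still come back in Spark.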