Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/8391#discussion_r39345066 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala --- @@ -275,6 +275,10 @@ private[sql] class JDBCRDD( */ private def compileFilter(f: Filter): String = f match { case EqualTo(attr, value) => s"$attr = ${compileValue(value)}" + // Since the null-safe equality operator is not a standard SQL operator, + // This was written as using is-null and normal equality. + case EqualNullSafe(attr, value) => + s"($attr = ${compileValue(value)} OR ($attr IS NULL AND ${compileValue(value)} IS NULL))" --- End diff -- Yes, it looks so.. If SparkSQL creates the SQL (for datasources) like the below, it can be a problem.. ``` SELECT 1 <=> 1, NULL <=> NULL, 1 <=> NULL; -> 1, 1, 0 SELECT 1 = 1, NULL = NULL, 1 = NULL; -> 1, NULL, NULL ``` However, as I see the codes, I think the `compileFilter()` is only used to construct WHERE clause. I tested the expression at MySQL and they look ok. In details, I tested four cases at MySQL. 1. attr is a column, value is null. 2. attr is a column, value is not null. 3. attr is null, value is null. 4. attr is null, value is not null. - `CREATE` ```SQL CREATE TABLE TestTable (`id` int, `test` varchar(7)) ; ``` - `INSERT` ```SQL INSERT INTO TestTable (`id`, `test`) VALUES (1, null), (2, 'OpenAM'), (3, 'OpenDJ') ; ``` - `SELECT` ```SQL /*1. attr is a column, value is null. */ SELECT * FROM TestTable WHERE (test = NULL OR (test IS NULL AND NULL IS NULL)); SELECT * FROM TestTable WHERE test <=> NULL; ``` ```SQL /*2. attr is a column, value is not null.*/ SELECT * FROM TestTable WHERE (test = 'OpenAM' OR (test IS NULL AND 'OpenAM' IS NULL)); SELECT * FROM TestTable WHERE test <=> 'OpenAM'; ``` ```SQL /*3. attr is null, value is null.*/ SELECT * FROM TestTable WHERE (NULL = NULL OR (NULL IS NULL AND NULL IS NULL)); SELECT * FROM TestTable WHERE NULL <=> NULL; ``` ```SQL /*4. attr is null, value is not null.*/ SELECT * FROM TestTable WHERE (NULL = 'OpenAM' OR (NULL IS NULL AND 'OpenAM' IS NULL)); SELECT * FROM TestTable WHERE NULL <=> 'OpenAM'; ```
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org