Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8391#discussion_r39345066
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala ---
    @@ -275,6 +275,10 @@ private[sql] class JDBCRDD(
        */
       private def compileFilter(f: Filter): String = f match {
         case EqualTo(attr, value) => s"$attr = ${compileValue(value)}"
    +    // Since the null-safe equality operator is not a standard SQL operator,
    +    // This was written as using is-null and normal equality.
    +    case EqualNullSafe(attr, value) =>
    +      s"($attr = ${compileValue(value)} OR ($attr IS NULL AND ${compileValue(value)} IS NULL))"
    --- End diff --
    
    
    Yes, it looks that way.
    If Spark SQL generated SQL (for data sources) like the following, it could be a problem:
    ```
    SELECT 1 <=> 1, NULL <=> NULL, 1 <=> NULL;
            -> 1, 1, 0
    SELECT 1 = 1, NULL = NULL, 1 = NULL;
            -> 1, NULL, NULL
    ```
    
    However, from reading the code, I think `compileFilter()` is only used to construct the WHERE clause.
    I tested the expressions in MySQL and they look OK.
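    
    To illustrate why the WHERE-only usage matters, here is a minimal Scala sketch of SQL three-valued logic, with `None` standing in for NULL. It is not Spark code; `NullSafeSketch` and every name in it are hypothetical. The rewritten expression and `<=>` can differ in value (NULL versus false), but a WHERE clause keeps a row only when the predicate is TRUE, so both should keep the same rows.
    
    ```scala
    // Sketch only: models SQL three-valued logic with Option[Boolean],
    // where None stands for SQL NULL. All names here are hypothetical.
    object NullSafeSketch {
      type SqlBool = Option[Boolean]
    
      // SQL `a = b`: NULL if either operand is NULL.
      def sqlEq(a: Option[String], b: Option[String]): SqlBool =
        for (x <- a; y <- b) yield x == y
    
      // SQL `a IS NULL`: always TRUE or FALSE, never NULL.
      def isNull(a: Option[String]): SqlBool = Some(a.isEmpty)
    
      // SQL AND: FALSE dominates, then NULL.
      def sqlAnd(p: SqlBool, q: SqlBool): SqlBool = (p, q) match {
        case (Some(false), _) | (_, Some(false)) => Some(false)
        case (Some(true), Some(true))            => Some(true)
        case _                                   => None
      }
    
      // SQL OR: TRUE dominates, then NULL.
      def sqlOr(p: SqlBool, q: SqlBool): SqlBool = (p, q) match {
        case (Some(true), _) | (_, Some(true)) => Some(true)
        case (Some(false), Some(false))        => Some(false)
        case _                                 => None
      }
    
      // MySQL `a <=> b`: never NULL.
      def nullSafeEq(a: Option[String], b: Option[String]): SqlBool =
        Some(a == b)
    
      // The rewritten form from the diff: (a = b OR (a IS NULL AND b IS NULL)).
      def rewritten(a: Option[String], b: Option[String]): SqlBool =
        sqlOr(sqlEq(a, b), sqlAnd(isNull(a), isNull(b)))
    
      // A WHERE clause keeps a row only when the predicate is TRUE.
      def keptByWhere(p: SqlBool): Boolean = p.contains(true)
    }
    ```
    
    For example, `rewritten(Some("1"), None)` is `None` (NULL) while `nullSafeEq(Some("1"), None)` is `Some(false)`, yet `keptByWhere` is `false` for both.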
    
    
    In detail,
    I tested four cases in MySQL.
    
    1. attr is a column, value is null.
    
    2. attr is a column, value is not null.
    
    3. attr is null, value is null.
    
    4. attr is null, value is not null.
    
    - `CREATE`
    ```SQL
    CREATE TABLE TestTable
        (`id` int, `test` varchar(7))
    ;
    ```
    
    - `INSERT`
    ```SQL
    INSERT INTO TestTable
        (`id`, `test`)
        
    VALUES
        (1, null),
        (2, 'OpenAM'),
        (3, 'OpenDJ')
    ;
    ```
    
    - `SELECT`
    ```SQL
    /*1. attr is a column, value is null. */
    SELECT
      *
    FROM
      TestTable
    WHERE
      (test = NULL OR (test IS NULL AND NULL IS NULL));
      
    SELECT
      *
    FROM
      TestTable
    WHERE
      test <=> NULL;
    ```
    ```SQL
    /*2. attr is a column, value is not null.*/
    SELECT
      *
    FROM
      TestTable
    WHERE
      (test = 'OpenAM' OR (test IS NULL AND 'OpenAM' IS NULL));
    
    SELECT
      *
    FROM
      TestTable
    WHERE
      test <=> 'OpenAM';
    ```
    ```SQL
    /*3. attr is null, value is null.*/
    SELECT
      *
    FROM
      TestTable
    WHERE
      (NULL = NULL OR (NULL IS NULL AND NULL IS NULL));
    
    SELECT
      *
    FROM
      TestTable
    WHERE
      NULL <=> NULL;
    ```
    ```SQL
    /*4. attr is null, value is not null.*/
    SELECT
      *
    FROM
      TestTable
    WHERE
      (NULL = 'OpenAM' OR (NULL IS NULL AND 'OpenAM' IS NULL));
    
    SELECT
      *
    FROM
      TestTable
    WHERE
      NULL <=> 'OpenAM';
    ```
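    
    As a cross-check of the four cases, the sketch below evaluates both forms over the three rows of `TestTable` in plain Scala, assuming standard SQL three-valued logic (`None` stands for NULL). `FourCasesSketch`, `Row` and `select` are hypothetical names, and the results are what the model predicts, not captured MySQL output.
    
    ```scala
    // Sketch only: an in-memory stand-in for TestTable, not MySQL or Spark.
    object FourCasesSketch {
      type SqlBool = Option[Boolean] // None stands for SQL NULL
      final case class Row(id: Int, test: Option[String])
    
      // Same rows as the INSERT statement above.
      val table: Seq[Row] =
        Seq(Row(1, None), Row(2, Some("OpenAM")), Row(3, Some("OpenDJ")))
    
      // SQL `a = b`: NULL if either operand is NULL.
      private def sqlEq(a: Option[String], b: Option[String]): SqlBool =
        for (x <- a; y <- b) yield x == y
    
      // SQL OR: TRUE dominates, then NULL.
      private def sqlOr(p: SqlBool, q: SqlBool): SqlBool =
        if (p.contains(true) || q.contains(true)) Some(true)
        else if (p.contains(false) && q.contains(false)) Some(false)
        else None
    
      // (a = b OR (a IS NULL AND b IS NULL)); IS NULL never yields NULL,
      // so the inner AND can use a plain Boolean &&.
      def rewritten(a: Option[String], b: Option[String]): SqlBool =
        sqlOr(sqlEq(a, b), Some(a.isEmpty && b.isEmpty))
    
      // MySQL `a <=> b`.
      def nullSafeEq(a: Option[String], b: Option[String]): SqlBool =
        Some(a == b)
    
      // ids of the rows a WHERE clause would keep (predicate must be TRUE).
      def select(pred: Row => SqlBool): Seq[Int] =
        table.filter(r => pred(r).contains(true)).map(_.id)
    }
    ```
    
    In this model both forms select the same ids in every case: id 1 for case 1, id 2 for case 2, all three rows for case 3, and no rows for case 4.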

