cloud-fan commented on a change in pull request #25854: [SPARK-29145][SQL]Spark SQL cannot handle "NOT IN" condition when using "JOIN"
URL: https://github.com/apache/spark/pull/25854#discussion_r335293923
##########
File path: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala
##########
@@ -204,6 +204,30 @@ class SubquerySuite extends QueryTest with SharedSparkSession {
     }
   }
 
+  test("SPARK-29145: JOIN Condition use QueryList") {
+    withTempView("s1", "s2", "s3") {
+      Seq(1, 3, 5, 7, 9).toDF("id").createOrReplaceTempView("s1")
+      Seq(1, 3, 4, 6, 9).toDF("id").createOrReplaceTempView("s2")
+      Seq(3, 4, 6, 9).toDF("id").createOrReplaceTempView("s3")
+
+      checkAnswer(
+        sql("SELECT s1.id from s1 JOIN s2 ON s1.id = s2.id and s1.id IN (select 9)"),
+        Row(9) :: Nil)
+
+      checkAnswer(
+        sql("SELECT s1.id from s1 JOIN s2 ON s1.id = s2.id and s1.id NOT IN (select 9)"),
+        Row(1) :: Row(3) :: Nil)
+
+      checkAnswer(
+        sql("SELECT s1.id from s1 JOIN s2 ON s1.id = s2.id and s1.id IN (select id from s3)"),

Review comment:
For example, do we support a correlated subquery in the join condition, such as `SELECT s1.id from s1 JOIN s2 ON s1.id = s2.id and s1.id IN (select id from s3 where s3.id = s2.id)`?
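To make the correlated case concrete, here is a minimal sketch of the rows that query would be expected to return under ordinary SQL semantics. It is a plain-Scala model over the same data as the test and does not call Spark. The object name `CorrelatedInSketch` and the method `expectedIds` are hypothetical and exist only for illustration. It says nothing about whether Spark currently accepts the query, which is the open question.

```scala
// Plain-Scala model of the correlated query from the review comment (no Spark involved).
// The data is copied from the test in the diff; the names below are illustrative only.
object CorrelatedInSketch {
  val s1: Seq[Int] = Seq(1, 3, 5, 7, 9)
  val s2: Seq[Int] = Seq(1, 3, 4, 6, 9)
  val s3: Seq[Int] = Seq(3, 4, 6, 9)

  // SELECT s1.id FROM s1 JOIN s2
  //   ON s1.id = s2.id AND s1.id IN (SELECT id FROM s3 WHERE s3.id = s2.id)
  def expectedIds: Seq[Int] =
    for {
      a <- s1
      b <- s2
      if a == b                         // s1.id = s2.id
      if s3.filter(_ == b).contains(a)  // subquery is evaluated per s2 row (correlated)
    } yield a

  def main(args: Array[String]): Unit =
    println(expectedIds.mkString(", "))
}
```

If the query turns out to be supported, the matching assertion in the suite would presumably be a `checkAnswer` against `Row(3) :: Row(9) :: Nil`, since the data contains no NULLs that would change the `IN` result.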