Andy Lam created SPARK-46743:
--------------------------------

             Summary: Count bug introduced for scalar subquery when using TEMPORARY VIEW, as compared to using table
                 Key: SPARK-46743
                 URL: https://issues.apache.org/jira/browse/SPARK-46743
             Project: Spark
          Issue Type: Bug
          Components: Optimizer
    Affects Versions: 3.5.0
            Reporter: Andy Lam

Using a temporary view reproduces the COUNT bug: the correlated scalar subquery returns null instead of 0.

With a table:
{code:java}
scala> spark.sql("""CREATE TABLE outer_table USING parquet AS SELECT * FROM VALUES
     |     (1, 1),
     |     (2, 1),
     |     (3, 3),
     |     (6, 6),
     |     (7, 7),
     |     (9, 9) AS inner_table(a, b)""")
val res6: org.apache.spark.sql.DataFrame = []

scala> spark.sql("CREATE TABLE null_table USING parquet AS SELECT CAST(null AS int) AS a, CAST(null as int) AS b ;")
val res7: org.apache.spark.sql.DataFrame = []

scala> spark.sql("""SELECT ( SELECT COUNT(null_table.a) AS aggAlias FROM null_table WHERE null_table.a = outer_table.a) FROM outer_table""").collect()
val res8: Array[org.apache.spark.sql.Row] = Array([0], [0], [0], [0], [0], [0])
{code}

With a view:
{code:java}
spark.sql("CREATE TEMPORARY VIEW outer_view(a, b) AS VALUES (1, 1), (2, 1),(3, 3), (6, 6), (7, 7), (9, 9);")
spark.sql("CREATE TEMPORARY VIEW null_view(a, b) AS SELECT CAST(null AS int), CAST(null as int);")
spark.sql("""SELECT ( SELECT COUNT(null_view.a) AS aggAlias FROM null_view WHERE null_view.a = outer_view.a) FROM outer_view""").collect()
val res2: Array[org.apache.spark.sql.Row] = Array([null], [null], [null], [null], [null], [null])
{code}
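
For readers unfamiliar with the term, the "COUNT bug" usually refers to what can happen when a correlated COUNT subquery is rewritten as a left outer join against a pre-aggregated inner relation: outer rows with no match receive NULL where the correlated form would return 0. The sketch below models that pattern with plain Scala collections and the same data as the report. It is not Spark's optimizer code, the object and method names are hypothetical, and it is only an assumption that the temporary-view path fails in this particular way.

```scala
// Simplified model of the classic COUNT bug using plain Scala collections (no Spark).
// All names here (CountBugSketch, decorrelatedNaive, ...) are hypothetical.
object CountBugSketch {
  // Same data as the report: outer rows (a, b) and a single all-null inner row.
  val outerRows: Seq[(Int, Int)] = Seq((1, 1), (2, 1), (3, 3), (6, 6), (7, 7), (9, 9))
  val innerRows: Seq[(Option[Int], Option[Int])] = Seq((None, None))

  // Reference semantics: evaluate the correlated subquery once per outer row.
  // A NULL inner key never equals an outer key, and COUNT over zero rows is 0.
  def correlated: Seq[Option[Long]] =
    outerRows.map { case (a, _) =>
      Some(innerRows.count { case (ia, _) => ia.contains(a) }.toLong)
    }

  // Naive decorrelation: aggregate the inner side by join key, then left outer join.
  // Keys with no inner rows are absent from the aggregate, so the join yields
  // None (SQL NULL) instead of 0.
  def decorrelatedNaive: Seq[Option[Long]] = {
    val countsByKey: Map[Int, Long] =
      innerRows
        .collect { case (Some(ia), _) => ia }
        .groupBy(identity)
        .map { case (k, vs) => k -> vs.size.toLong }
    outerRows.map { case (a, _) => countsByKey.get(a) }
  }

  // Corrected decorrelation: substitute COUNT's value on empty input for a missing match.
  def decorrelatedFixed: Seq[Option[Long]] =
    decorrelatedNaive.map(c => Some(c.getOrElse(0L)))

  def main(args: Array[String]): Unit = {
    println(s"correlated:         $correlated")
    println(s"naive decorrelated: $decorrelatedNaive")
    println(s"fixed decorrelated: $decorrelatedFixed")
  }
}
```

In this model the naive rewrite produces an empty value for every outer row, mirroring the `[null]` results in the view transcript, while the corrected rewrite agrees with the per-row evaluation and the `[0]` results from the table transcript.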