[GitHub] spark pull request #14625: [SPARK-17045] [SQL] Build/move Join-related test ...
Github user gatorsmile closed the pull request at: https://github.com/apache/spark/pull/14625 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #14625: [SPARK-17045] [SQL] Build/move Join-related test ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14625#discussion_r75588865
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -245,6 +245,10 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
     (1 to 100).map(i => (i, i.toString)).toDF("key", "value").createOrReplaceTempView("testdata")
    +Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2))
--- End diff --
To be honest, it is hard to write test data, especially when we want very few rows in each data set.
[GitHub] spark pull request #14625: [SPARK-17045] [SQL] Build/move Join-related test ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14625#discussion_r75588856
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -245,6 +245,10 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
     (1 to 100).map(i => (i, i.toString)).toDF("key", "value").createOrReplaceTempView("testdata")
    +Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2))
--- End diff --
The major difference is the data: the tables have different data distributions. For example, `testData` does not have duplicate key values, but `testData2` has fewer rows and duplicate key values. `src1` contains nulls, but `src` does not. Your concern is valid. We should change the names; otherwise, it is hard to understand why each table exists.
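To illustrate why the data distribution matters for join tests, here is a minimal plain-Scala sketch. It is a model of inner equi-join semantics, not Spark code: the object `JoinModel`, the `Row` alias, and the three sample data sets are hypothetical stand-ins for tables without duplicate keys, with duplicate keys, and with null keys. A key of `None` plays the role of SQL NULL.

```scala
// Plain-Scala model of an inner equi-join; this is NOT Spark's implementation.
// A key of None stands in for SQL NULL, which never satisfies `a.key = b.key`.
object JoinModel {
  type Row = (Option[Int], String)

  // Hypothetical sample data, loosely mirroring the distributions discussed above.
  val distinctKeys: Seq[Row] = Seq((Some(1), "a"), (Some(2), "b"), (Some(3), "c"))
  val duplicateKeys: Seq[Row] = Seq((Some(1), "x"), (Some(1), "y"), (Some(2), "z"))
  val withNull: Seq[Row] = Seq((None, "n"), (Some(1), "p"))

  // Pair every left row with every right row that has an equal, non-null key.
  def innerJoin(left: Seq[Row], right: Seq[Row]): Seq[(Int, String, String)] =
    for {
      (leftKey, leftValue) <- left
      (rightKey, rightValue) <- right
      lk <- leftKey.toList   // drops rows whose key is None (NULL)
      rk <- rightKey.toList
      if lk == rk
    } yield (lk, leftValue, rightValue)

  def main(args: Array[String]): Unit = {
    // Distinct keys: each row matches exactly one row in a self-join.
    println(innerJoin(distinctKeys, distinctKeys).size)
    // Duplicate keys: matching rows multiply (2 x 2 for key 1, plus 1 for key 2).
    println(innerJoin(duplicateKeys, duplicateKeys).size)
    // Null keys: the None row matches nothing, not even itself.
    println(innerJoin(withNull, withNull).size)
  }
}
```

Under these assumptions, a data set with only distinct, non-null keys exercises none of the row-multiplication or null-handling paths, which is one reason to keep small tables with different distributions.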
[GitHub] spark pull request #14625: [SPARK-17045] [SQL] Build/move Join-related test ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14625#discussion_r75588814
--- Diff: sql/core/src/test/resources/sql-tests/inputs/join.sql ---
    @@ -0,0 +1,225 @@
    +-- join nested table expressions (auto_join0.q)
--- End diff --
:) That is to help reviewers see where the queries originated. If you think that is not needed, we can remove it.
[GitHub] spark pull request #14625: [SPARK-17045] [SQL] Build/move Join-related test ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14625#discussion_r75588146
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ---
    @@ -245,6 +245,10 @@ class SQLQueryTestSuite extends QueryTest with SharedSQLContext {
     (1 to 100).map(i => (i, i.toString)).toDF("key", "value").createOrReplaceTempView("testdata")
    +Seq((1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2))
--- End diff --
Previously we had three pre-loaded tables: `testdata`, `arraydata`, and `mapdata`, which are a key-value table, an array-type table, and a map-type table, respectively. For the new join tests, I think only `lowerCaseData`, `upperCaseData`, and `srcpart` make sense. Why can't we use `testdata` in place of `testData2`, `src`, and `src2`? They are all key-value tables.
[GitHub] spark pull request #14625: [SPARK-17045] [SQL] Build/move Join-related test ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14625#discussion_r75588118
--- Diff: sql/core/src/test/resources/sql-tests/inputs/join.sql ---
    @@ -0,0 +1,225 @@
    +-- join nested table expressions (auto_join0.q)
--- End diff --
Do we need to reference the Hive `.q` file? I think the Hive golden file tests will be removed eventually.
[GitHub] spark pull request #14625: [SPARK-17045] [SQL] Build/move Join-related test ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14625#discussion_r75215512
--- Diff: sql/core/src/test/resources/sql-tests/results/using-join.sql.out ---
    @@ -0,0 +1,132 @@
    +-- Automatically generated by SQLQueryTestSuite
    +-- Number of queries: 13
    +
    +
    +-- !query 0
    +create temporary view ut1 as select * from values
    +  ("r1c1", "r1c2", "t1r1c3"),
    +  ("r2c1", "r2c2", "t1r2c3"),
    +  ("r3c1x", "r3c2", "t1r3c3")
    +  as ut1(c1, c2, c3)
    +-- !query 0 schema
    +struct<>
    +-- !query 0 output
    +
    +
    +
    +-- !query 1
    +create temporary view ut2 as select * from values
    +  ("r1c1", "r1c2", "t2r1c3"),
    +  ("r2c1", "r2c2", "t2r2c3"),
    +  ("r3c1y", "r3c2", "t2r3c3")
    +  as ut2(c1, c2, c3)
    +-- !query 1 schema
    +struct<>
    +-- !query 1 output
    +
    +
    +
    +-- !query 2
    +create temporary view ut3 as select * from values
    +  (null, "r1c2", "t3r1c3"),
    +  ("r2c1", "r2c2", "t3r2c3"),
    +  ("r3c1y", "r3c2", "t3r3c3")
    +  as ut3(c1, c2, c3)
    +-- !query 2 schema
    +struct<>
    +-- !query 2 output
    +scala.MatchError
    +NullType (of class org.apache.spark.sql.types.NullType$)
--- End diff --
This failure is waiting for the PR: https://github.com/apache/spark/pull/14676
[GitHub] spark pull request #14625: [SPARK-17045] [SQL] Build/move Join-related test ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/14625#discussion_r74891950
--- Diff: sql/core/src/test/resources/test-data/kv1.json ---
    @@ -0,0 +1,5 @@
    +{"key":251,"value":"val_251"}
--- End diff --
Sure, will do it. Thanks!
[GitHub] spark pull request #14625: [SPARK-17045] [SQL] Build/move Join-related test ...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/14625#discussion_r74890820
--- Diff: sql/core/src/test/resources/test-data/kv1.json ---
    @@ -0,0 +1,5 @@
    +{"key":251,"value":"val_251"}
--- End diff --
We can inline the data in `.sql` files using the `values` syntax:
```
create temporary view data as select * from values
  (1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)
  as data(a, b);
```