[GitHub] spark issue #22362: [SPARK-25372][YARN][K8S] Deprecate and generalize keytab...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22362 **[Test build #96457 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96457/testReport)** for PR 22362 at commit [`59986ef`](https://github.com/apache/spark/commit/59986efebd9742fa114f4e48dbe38e48f66ebb38). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #22493: [SPARK-25485][TEST] Refactor UnsafeProjectionBenchmark t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22493 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96445/ Test PASSed.
[GitHub] spark issue #22493: [SPARK-25485][TEST] Refactor UnsafeProjectionBenchmark t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22493 Merged build finished. Test PASSed.
[GitHub] spark issue #22493: [SPARK-25485][TEST] Refactor UnsafeProjectionBenchmark t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22493 **[Test build #96445 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96445/testReport)** for PR 22493 at commit [`52d3f73`](https://github.com/apache/spark/commit/52d3f73f5f0f1a76d8d8a20e07543f99a70bb854). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22519: [SPARK-25505][SQL] The output order of grouping columns ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22519 **[Test build #96456 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96456/testReport)** for PR 22519 at commit [`edc261c`](https://github.com/apache/spark/commit/edc261cf4724708be67d61dbff2dc697d02db085).
[GitHub] spark issue #22519: [SPARK-25505][SQL] The output order of grouping columns ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22519 Merged build finished. Test PASSed.
[GitHub] spark issue #22519: [SPARK-25505][SQL] The output order of grouping columns ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22519 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3364/ Test PASSed.
[GitHub] spark issue #22494: [SPARK-25454][SQL] add a new config for picking minimum ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22494 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96446/ Test PASSed.
[GitHub] spark issue #22494: [SPARK-25454][SQL] add a new config for picking minimum ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22494 Merged build finished. Test PASSed.
[GitHub] spark issue #22494: [SPARK-25454][SQL] add a new config for picking minimum ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22494 **[Test build #96446 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96446/testReport)** for PR 22494 at commit [`b4fdd13`](https://github.com/apache/spark/commit/b4fdd1307059c7df7c386a96aad6bc17b593d9c5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22519: [SPARK-25505][SQL] The output order of grouping columns ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22519 **[Test build #96455 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96455/testReport)** for PR 22519 at commit [`d4e314a`](https://github.com/apache/spark/commit/d4e314a5335718fa0ffdb5ea0210096613a0bcf6).
[GitHub] spark issue #22519: [SPARK-25505][SQL] The output order of grouping columns ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22519 Merged build finished. Test PASSed.
[GitHub] spark issue #22519: [SPARK-25505][SQL] The output order of grouping columns ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22519 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3363/ Test PASSed.
[GitHub] spark pull request #22519: [SPARK-25505][SQL] The output order of grouping c...
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22519#discussion_r219624150 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -554,8 +554,10 @@ class Analyzer( Cast(value, pivotColumn.dataType, Some(conf.sessionLocalTimeZone)).eval(EmptyRow) } // Group-by expressions coming from SQL are implicit and need to be deduced. +val pivotColAndAggRefs = --- End diff -- Nice advice!
[GitHub] spark issue #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite IllegalA...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22461 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96434/ Test PASSed.
[GitHub] spark issue #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite IllegalA...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22461 Merged build finished. Test PASSed.
[GitHub] spark pull request #22519: [SPARK-25505][SQL] The output order of grouping c...
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22519#discussion_r219624067 --- Diff: sql/core/src/test/resources/sql-tests/inputs/pivot.sql --- @@ -287,3 +287,13 @@ PIVOT ( sum(earnings) FOR (course, m) IN (('dotNET', map('1', 1)), ('Java', map('2', 2))) ); + +-- grouping columns output in the same order as input +SELECT * FROM ( + SELECT course, earnings, "a" as a, "z" as z, "b" as b, "y" as y, "c" as c, "x" as x, "d" as d, "w" as w --- End diff -- It wouldn't hurt anyway I think...
[GitHub] spark issue #22461: [SPARK-25453][SQL][TEST] OracleIntegrationSuite IllegalA...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22461 **[Test build #96434 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96434/testReport)** for PR 22461 at commit [`f6274a5`](https://github.com/apache/spark/commit/f6274a50177e18be7b36d87913c44103f2fa02d2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22519: [SPARK-25505][SQL] The output order of grouping c...
Github user maryannxue commented on a diff in the pull request: https://github.com/apache/spark/pull/22519#discussion_r219623907 --- Diff: sql/core/src/test/resources/sql-tests/results/pivot.sql.out --- @@ -1,5 +1,5 @@ --- Automatically generated by SQLQueryTestSuite --- Number of queries: 31 +-- Automatically generated by SparkServiceSQLQueryTestSuite --- End diff -- Good catch. Do you have any idea how it has turned out this way?
[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3362/ Test PASSed.
[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21669 Merged build finished. Test PASSed.
[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21669 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/3362/
[GitHub] spark issue #22495: [SPARK-25486][TEST] Refactor SortBenchmark to use main m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22495 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96444/ Test PASSed.
[GitHub] spark issue #22495: [SPARK-25486][TEST] Refactor SortBenchmark to use main m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22495 Merged build finished. Test PASSed.
[GitHub] spark issue #22495: [SPARK-25486][TEST] Refactor SortBenchmark to use main m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22495 **[Test build #96444 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96444/testReport)** for PR 22495 at commit [`be2d1c0`](https://github.com/apache/spark/commit/be2d1c0e1b224386b2d3a5c43b6f2b1638604607). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21669 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/3362/
[GitHub] spark issue #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(number)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22419 Merged build finished. Test FAILed.
[GitHub] spark issue #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(number)
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22419 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96448/ Test FAILed.
[GitHub] spark issue #22419: [SPARK-23906][SQL] Add built-in UDF TRUNCATE(number)
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22419 **[Test build #96448 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96448/testReport)** for PR 22419 at commit [`479b31f`](https://github.com/apache/spark/commit/479b31fa046e8402f4f93cdbad5fe93ef1ea570f). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21718: [SPARK-24744][STRUCTRURED STREAMING] Set the SparkSessio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21718 Can one of the admins verify this patch?
[GitHub] spark pull request #22518: [SPARK-25482][SQL] ReuseSubquery can be useless w...
Github user peter-toth commented on a diff in the pull request: https://github.com/apache/spark/pull/22518#discussion_r219617722 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala --- @@ -166,7 +168,7 @@ case class ReuseSubquery(conf: SQLConf) extends Rule[SparkPlan] { val sameSchema = subqueries.getOrElseUpdate(sub.plan.schema, ArrayBuffer[SubqueryExec]()) val sameResult = sameSchema.find(_.sameResult(sub.plan)) if (sameResult.isDefined) { - sub.withNewPlan(sameResult.get) + sub.withNewPlan(sameResult.get).withNewExprId() --- End diff -- Can we avoid double copy()? Or is it cleaner this way?
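The "double copy()" question above can be illustrated with a small, self-contained Scala sketch. It assumes that each `withNew...` helper is implemented as its own `copy()` call, which the comment implies but the quoted diff does not show. `Node`, its fields, and `withNewPlanAndExprId` are hypothetical stand-ins, not Spark's classes or API.

```scala
object SingleCopySketch {
  // Hypothetical stand-in for a plan node; NOT Spark's actual class.
  final case class Node(name: String, plan: String, exprId: Long) {
    // Each helper allocates one new Node via copy().
    def withNewPlan(p: String): Node = copy(plan = p)
    def withNewExprId(id: Long): Node = copy(exprId = id)

    // Hypothetical combined helper: a single copy() updates both fields,
    // so no intermediate Node is created.
    def withNewPlanAndExprId(p: String, id: Long): Node =
      copy(plan = p, exprId = id)
  }

  val original: Node = Node("sub", "planA", 1L)
  // Chained form: two copy() calls, two allocations.
  val chained: Node = original.withNewPlan("planB").withNewExprId(2L)
  // Combined form: one copy() call, one allocation.
  val combined: Node = original.withNewPlanAndExprId("planB", 2L)
}
```

Both forms yield equal values; the chained form reads as two independent steps, while the combined form saves one short-lived allocation. Which is cleaner is the judgment call the reviewer is asking about.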
[GitHub] spark pull request #22518: [SPARK-25482][SQL] ReuseSubquery can be useless w...
Github user peter-toth commented on a diff in the pull request: https://github.com/apache/spark/pull/22518#discussion_r219616464 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala --- @@ -1268,4 +1269,16 @@ class SubquerySuite extends QueryTest with SharedSQLContext { assert(getNumSortsInQuery(query5) == 1) } } + + test("SPARK-25482: Reuse same Subquery in order to execute it only once") { +withTempView("t1", "t2", "t3") { --- End diff -- There is no need for "t3".
[GitHub] spark issue #21669: [SPARK-23257][K8S] Kerberos Support for Spark on K8S
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21669 **[Test build #96454 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96454/testReport)** for PR 21669 at commit [`78953e6`](https://github.com/apache/spark/commit/78953e65fc496b1f58adc06e9876ec712241912b).
[GitHub] spark pull request #21999: [WIP][SQL] Flattening nested structures
Github user MaxGekk closed the pull request at: https://github.com/apache/spark/pull/21999
[GitHub] spark issue #22366: [SPARK-25384][SQL] Removing of spark.sql.fromJsonForceNu...
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/22366 I am going to close the PR since I don't see any reasons so far to maintain it up to Spark 3.0.
[GitHub] spark pull request #22366: [SPARK-25384][SQL] Removing of spark.sql.fromJson...
Github user MaxGekk closed the pull request at: https://github.com/apache/spark/pull/22366
[GitHub] spark issue #22519: [SPARK-25505][SQL] The output order of grouping columns ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22519 Merged build finished. Test PASSed.
[GitHub] spark issue #22519: [SPARK-25505][SQL] The output order of grouping columns ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22519 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96436/ Test PASSed.
[GitHub] spark issue #22519: [SPARK-25505][SQL] The output order of grouping columns ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22519 **[Test build #96436 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96436/testReport)** for PR 22519 at commit [`bd416bd`](https://github.com/apache/spark/commit/bd416bd74ee77329b2527fffecd21f7f90090334). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22263 Merged build finished. Test PASSed.
[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22263 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96433/ Test PASSed.
[GitHub] spark issue #22263: [SPARK-25269][SQL] SQL interface support specify Storage...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22263 **[Test build #96433 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96433/testReport)** for PR 22263 at commit [`c3b6dfc`](https://github.com/apache/spark/commit/c3b6dfcee106c4d0e38975b1a31c3f3e97d2abc1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #22519: [SPARK-25505][SQL] The output order of grouping c...
Github user MaxGekk commented on a diff in the pull request: https://github.com/apache/spark/pull/22519#discussion_r219615082 --- Diff: sql/core/src/test/resources/sql-tests/results/pivot.sql.out --- @@ -1,5 +1,5 @@ --- Automatically generated by SQLQueryTestSuite --- Number of queries: 31 +-- Automatically generated by SparkServiceSQLQueryTestSuite --- End diff -- `SparkServiceSQLQueryTestSuite` -> `SQLQueryTestSuite`?
[GitHub] spark pull request #22519: [SPARK-25505][SQL] The output order of grouping c...
Github user MaxGekk commented on a diff in the pull request: https://github.com/apache/spark/pull/22519#discussion_r219614494 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -554,8 +554,10 @@ class Analyzer( Cast(value, pivotColumn.dataType, Some(conf.sessionLocalTimeZone)).eval(EmptyRow) } // Group-by expressions coming from SQL are implicit and need to be deduced. +val pivotColAndAggRefs = --- End diff -- `pivotColAndAggRefs` is used inside of `getOrElse` only. Could you move it there.
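The suggestion above (move a helper value into the `getOrElse` argument) can be sketched in plain Scala as follows. This is only an illustration under stated assumptions: `expensiveRefs`, `eager`, `scoped`, and the column names are hypothetical and are not taken from `Analyzer.scala`. The one library fact relied on is that `Option.getOrElse` takes its default by name, so the block is evaluated only when the option is empty.

```scala
object GetOrElseSketch {
  // Counts how often the helper runs; exists purely for illustration.
  var computeCount = 0

  // Hypothetical stand-in for the references to exclude from grouping columns.
  def expensiveRefs(): Set[String] = {
    computeCount += 1
    Set("course", "earnings")
  }

  // Before: the helper value is computed eagerly,
  // even when `explicit` is defined and the value is never used.
  def eager(explicit: Option[Seq[String]], all: Seq[String]): Seq[String] = {
    val refs = expensiveRefs()
    explicit.getOrElse(all.filterNot(refs.contains))
  }

  // After: the helper value lives inside the by-name argument of getOrElse,
  // so it is computed only when `explicit` is None, and its scope is narrower.
  def scoped(explicit: Option[Seq[String]], all: Seq[String]): Seq[String] =
    explicit.getOrElse {
      val refs = expensiveRefs()
      all.filterNot(refs.contains)
    }
}
```

Because `filterNot` on a `Seq` keeps the remaining elements in their input order, the fallback branch also preserves column order in this sketch.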
[GitHub] spark pull request #22519: [SPARK-25505][SQL] The output order of grouping c...
Github user MaxGekk commented on a diff in the pull request: https://github.com/apache/spark/pull/22519#discussion_r219614891 --- Diff: sql/core/src/test/resources/sql-tests/inputs/pivot.sql --- @@ -287,3 +287,13 @@ PIVOT ( sum(earnings) FOR (course, m) IN (('dotNET', map('1', 1)), ('Java', map('2', 2))) ); + +-- grouping columns output in the same order as input +SELECT * FROM ( + SELECT course, earnings, "a" as a, "z" as z, "b" as b, "y" as y, "c" as c, "x" as x, "d" as d, "w" as w --- End diff -- Is it necessary to have so many columns? Probably the test case could be shorter?
[GitHub] spark pull request #22510: [SPARK-25321][ML] Fix local LDA model constructor
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22510
[GitHub] spark issue #22492: [SPARK-25321][ML] Revert SPARK-14681 to avoid API breaki...
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/22492 @WeichenXu123 Please close this PR manually. Thanks!
[GitHub] spark issue #22510: [SPARK-25321][ML] Fix local LDA model constructor
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/22510 LGTM. Merged into master and branch 2.4. Thanks for checking compatibility with MLeap.
[GitHub] spark issue #22492: [SPARK-25321][ML] Revert SPARK-14681 to avoid API breaki...
Github user mengxr commented on the issue: https://github.com/apache/spark/pull/22492 LGTM. Merged into branch-2.4. @WeichenXu123 Next time please create dedicated JIRAs for each QA task PR. Thanks!
[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22288 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96432/ Test FAILed.
[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22288 Build finished. Test FAILed.
[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22288 **[Test build #96432 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96432/testReport)** for PR 22288 at commit [`ffbc9c3`](https://github.com/apache/spark/commit/ffbc9c32d14a0c82036defb90eb18167f93bad4d). * This patch **fails from timeout after a configured wait of \`300m\`**. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark issue #22490: [SPARK-25481][TEST] Refactor ColumnarBatchBenchmark to u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22490 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96435/ Test PASSed.
[GitHub] spark issue #22490: [SPARK-25481][TEST] Refactor ColumnarBatchBenchmark to u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22490 Merged build finished. Test PASSed.
[GitHub] spark issue #22490: [SPARK-25481][TEST] Refactor ColumnarBatchBenchmark to u...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22490 **[Test build #96435 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96435/testReport)** for PR 22490 at commit [`02ecf3f`](https://github.com/apache/spark/commit/02ecf3f6ba737332464021a4bbf7320b4d71bd70). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22429: [SPARK-25440][SQL] Dumping query execution info to a fil...
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/22429 @gatorsmile @rednaxelafx @HyukjinKwon @viirya @hvanhovell Could you review the PR, please.
[GitHub] spark issue #22495: [SPARK-25486][TEST] Refactor SortBenchmark to use main m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22495 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96441/ Test FAILed.
[GitHub] spark issue #22495: [SPARK-25486][TEST] Refactor SortBenchmark to use main m...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22495 Merged build finished. Test FAILed.
[GitHub] spark issue #22495: [SPARK-25486][TEST] Refactor SortBenchmark to use main m...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22495 **[Test build #96441 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96441/testReport)** for PR 22495 at commit [`3943a7f`](https://github.com/apache/spark/commit/3943a7f7b9cfa8f389c765ef4870323c4b40ab05). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22488: [SPARK-25479][TEST] Refactor DatasetBenchmark to use mai...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22488 Merged build finished. Test FAILed.
[GitHub] spark issue #22488: [SPARK-25479][TEST] Refactor DatasetBenchmark to use mai...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22488 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96440/ Test FAILed.
[GitHub] spark issue #22488: [SPARK-25479][TEST] Refactor DatasetBenchmark to use mai...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22488 **[Test build #96440 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96440/testReport)** for PR 22488 at commit [`71dfe03`](https://github.com/apache/spark/commit/71dfe03374466a780988a2d0ca3c6bc8cbdd11fd). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22288 Merged build finished. Test FAILed.
[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22288 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96443/ Test FAILed.
[GitHub] spark issue #22288: [SPARK-22148][SPARK-15815][Scheduler] Acquire new execut...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22288 **[Test build #96443 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96443/testReport)** for PR 22288 at commit [`4c88168`](https://github.com/apache/spark/commit/4c881680fdde32244030b54b44125ac217dacb0d). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #22467: [SPARK-25465][TEST] Refactor Parquet test suites in proj...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22467 Merged build finished. Test PASSed.
[GitHub] spark issue #22467: [SPARK-25465][TEST] Refactor Parquet test suites in proj...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22467 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96437/ Test PASSed.
[GitHub] spark issue #22467: [SPARK-25465][TEST] Refactor Parquet test suites in proj...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22467 **[Test build #96437 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96437/testReport)** for PR 22467 at commit [`813d19c`](https://github.com/apache/spark/commit/813d19c63477b82a76bdd0d1da73cf3cb1d38564).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/22379 @dongjoon-hyun @HyukjinKwon Could you review this PR, please?
[GitHub] spark issue #22295: [SPARK-25255][PYTHON]Add getActiveSession to SparkSessio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22295 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96451/ Test PASSed.
[GitHub] spark issue #22295: [SPARK-25255][PYTHON]Add getActiveSession to SparkSessio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22295 Merged build finished. Test PASSed.
[GitHub] spark issue #22295: [SPARK-25255][PYTHON]Add getActiveSession to SparkSessio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22295 **[Test build #96451 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96451/testReport)** for PR 22295 at commit [`d7be3bf`](https://github.com/apache/spark/commit/d7be3bfbdbbcd2d95885f26bef690b7a949ff5ed).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22518: [SPARK-25482][SQL] ReuseSubquery can be useless when the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22518 Merged build finished. Test PASSed.
[GitHub] spark issue #22518: [SPARK-25482][SQL] ReuseSubquery can be useless when the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22518 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96427/ Test PASSed.
[GitHub] spark issue #22518: [SPARK-25482][SQL] ReuseSubquery can be useless when the...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22518 **[Test build #96427 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96427/testReport)** for PR 22518 at commit [`36fa664`](https://github.com/apache/spark/commit/36fa664c6d251901270984115ff2ebfd1b665fca).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22518: [SPARK-25482][SQL] ReuseSubquery can be useless when the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22518 Merged build finished. Test PASSed.
[GitHub] spark issue #22518: [SPARK-25482][SQL] ReuseSubquery can be useless when the...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22518 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96428/ Test PASSed.
[GitHub] spark issue #22379: [SPARK-25393][SQL] Adding new function from_csv()
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22379 **[Test build #96453 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96453/testReport)** for PR 22379 at commit [`4bba75e`](https://github.com/apache/spark/commit/4bba75ef332204afffd08ce282b77e0f3c53cab0).
[GitHub] spark issue #22518: [SPARK-25482][SQL] ReuseSubquery can be useless when the...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22518 **[Test build #96428 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96428/testReport)** for PR 22518 at commit [`7c75067`](https://github.com/apache/spark/commit/7c75067767ed6935960d09c7915da86fea3553fa).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22516: [SPARK-25468]Highlight current page index in the history...
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22516
Hi @Adamyuanyuan, please describe the code changes in your PR description. I took a quick look and you have added
```
.paginate_button.active {
  outline: none;
  background-color: #2b2b2b;
}
```
Personally, I don't think the color looks good. Currently we change the background to this color when the mouse pointer is over the element. Maybe we can use a different color for this highlight to distinguish it from the hover state? Or we can change only the border color instead of the background.
[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22473#discussion_r219576996
    --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala ---
    @@ -83,6 +83,17 @@ package object config {
       private[spark] val EXECUTOR_CLASS_PATH =
         ConfigBuilder(SparkLauncher.EXECUTOR_EXTRA_CLASSPATH).stringConf.createOptional
    +  private[spark] val EXECUTOR_HEARTBEAT_DROP_ZERO_METRICS =
    +    ConfigBuilder("spark.executor.heartbeat.dropZeroMetrics").booleanConf.createWithDefault(true)
    +
    +  private[spark] val EXECUTOR_HEARTBEAT_INTERVAL =
    +    ConfigBuilder("spark.executor.heartbeatInterval")
    +      .timeConf(TimeUnit.MILLISECONDS)
    +      .createWithDefaultString("10s")
    +
    +  private[spark] val EXECUTOR_HEARTBEAT_MAX_FAILURES =
    +    ConfigBuilder("spark.executor.heartbeat.maxFailures").intConf.createWithDefault(60)
    --- End diff --
nit: call `internal()` to indicate that this is not a public config.
[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22473#discussion_r219575442
    --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
    @@ -799,15 +799,21 @@ private[spark] class Executor(
           if (taskRunner.task != null) {
             taskRunner.task.metrics.mergeShuffleReadMetrics()
             taskRunner.task.metrics.setJvmGCTime(curGCTime - taskRunner.startGCTime)
    -        accumUpdates += ((taskRunner.taskId, taskRunner.task.metrics.accumulators()))
    +        val accumulatorsToReport =
    +          if (conf.getBoolean(EXECUTOR_HEARTBEAT_DROP_ZERO_METRICS.key, true)) {
    +            taskRunner.task.metrics.accumulators().filterNot(_.isZero)
    +          } else {
    +            taskRunner.task.metrics.accumulators()
    +          }
    +        accumUpdates += ((taskRunner.taskId, accumulatorsToReport))
           }
         }
         val message = Heartbeat(executorId, accumUpdates.toArray, env.blockManager.blockManagerId, executorUpdates)
         try {
           val response = heartbeatReceiverRef.askSync[HeartbeatResponse](
    -          message, RpcTimeout(conf, "spark.executor.heartbeatInterval", "10s"))
    +          message, RpcTimeout(conf, EXECUTOR_HEARTBEAT_INTERVAL.key, "10s"))
    --- End diff --
Could you add a new `apply` method to `object RpcTimeout` to support `ConfigEntry`?
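Editor's sketch of the shape of the overload requested above. This is not Spark code: `TimeEntry`, `Conf`, and `Timeout` are hypothetical stand-ins for `ConfigEntry`, `SparkConf`, and `RpcTimeout`, and the real signatures may differ. The point is only that an `apply` taking the entry itself lets the call site drop both the key string and the repeated `"10s"` default.

```scala
import scala.concurrent.duration._

// Hypothetical stand-in for a typed time config entry (not Spark's ConfigEntry).
final case class TimeEntry(key: String, defaultMillis: Long)

// Hypothetical stand-in for a config store (not SparkConf).
final class Conf(settings: Map[String, String]) {
  // Typed read: use the stored value in milliseconds, else the entry's default.
  def getMillis(entry: TimeEntry): Long =
    settings.get(entry.key).map(_.toLong).getOrElse(entry.defaultMillis)
}

// Hypothetical stand-in for RpcTimeout: a duration plus the property it came from.
final case class Timeout(duration: FiniteDuration, prop: String)

object Timeout {
  // Extra factory that accepts the entry, so callers pass neither key nor default.
  def apply(conf: Conf, entry: TimeEntry): Timeout =
    Timeout(conf.getMillis(entry).millis, entry.key)
}
```

With such an overload, the call in the diff could shrink to something like `RpcTimeout(conf, EXECUTOR_HEARTBEAT_INTERVAL)`, assuming the real method were added with that shape.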
[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22473#discussion_r219574228
    --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
    @@ -160,7 +160,7 @@ private[spark] class Executor(
        * times, it should kill itself. The default value is 60. It means we will retry to send
        * heartbeats about 10 minutes because the heartbeat interval is 10s.
        */
    -  private val HEARTBEAT_MAX_FAILURES = conf.getInt("spark.executor.heartbeat.maxFailures", 60)
    +  private val HEARTBEAT_MAX_FAILURES = conf.getInt(EXECUTOR_HEARTBEAT_MAX_FAILURES.key, 60)
    --- End diff --
ditto
[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22473#discussion_r219577386
    --- Diff: core/src/test/scala/org/apache/spark/executor/ExecutorSuite.scala ---
    @@ -252,18 +253,121 @@ class ExecutorSuite extends SparkFunSuite with LocalSparkContext with MockitoSug
         }
       }
    +  test("Heartbeat should drop zero metrics") {
    +    heartbeatZeroMetricTest(true)
    +  }
    +
    +  test("Heartbeat should not drop zero metrics when the conf is set to false") {
    +    heartbeatZeroMetricTest(false)
    +  }
    +
    +  private def withHeartbeatExecutor(confs: (String, String)*)
    +      (f: (Executor, ArrayBuffer[Heartbeat]) => Unit): Unit = {
    +    val conf = new SparkConf
    +    confs.foreach { case (k, v) => conf.set(k, v) }
    +    val serializer = new JavaSerializer(conf)
    +    val env = createMockEnv(conf, serializer)
    +    val executor =
    +      new Executor("id", "localhost", SparkEnv.get, userClassPath = Nil, isLocal = true)
    +    val executorClass = classOf[Executor]
    +
    +    // Set ExecutorMetricType.values to be a minimal set to avoid get null exceptions
    +    val metricClass =
    +      Utils.classForName(classOf[org.apache.spark.metrics.ExecutorMetricType].getName() + "$")
    +    val metricTypeValues = metricClass.getDeclaredField("values")
    +    metricTypeValues.setAccessible(true)
    +    metricTypeValues.set(
    +      org.apache.spark.metrics.ExecutorMetricType,
    +      IndexedSeq(JVMHeapMemory, JVMOffHeapMemory))
    +
    +    // Save all heartbeats sent into an ArrayBuffer for verification
    +    val heartbeats = ArrayBuffer[Heartbeat]()
    +    val mockReceiver = mock[RpcEndpointRef]
    +    when(mockReceiver.askSync(any[Heartbeat], any[RpcTimeout])(any))
    +      .thenAnswer(new Answer[HeartbeatResponse] {
    +        override def answer(invocation: InvocationOnMock): HeartbeatResponse = {
    +          val args = invocation.getArguments()
    +          val mock = invocation.getMock
    +          heartbeats += args(0).asInstanceOf[Heartbeat]
    +          HeartbeatResponse(false)
    +        }
    +      })
    +    val receiverRef = executorClass.getDeclaredField("heartbeatReceiverRef")
    +    receiverRef.setAccessible(true)
    +    receiverRef.set(executor, mockReceiver)
    +
    +    f(executor, heartbeats)
    +  }
    +
    +  private def invokeReportHeartbeat(executor: Executor): Unit = {
    +    val method = classOf[Executor]
    +      .getDeclaredMethod("org$apache$spark$executor$Executor$$reportHeartBeat")
    +    method.setAccessible(true)
    +    method.invoke(executor)
    +  }
    +
    +  private def heartbeatZeroMetricTest(dropZeroMetrics: Boolean): Unit = {
    +    val c = "spark.executor.heartbeat.dropZeroMetrics" -> dropZeroMetrics.toString
    --- End diff --
nit: EXECUTOR_HEARTBEAT_DROP_ZERO_METRICS.key
[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22473#discussion_r219575944
    --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
    @@ -799,15 +799,21 @@ private[spark] class Executor(
           if (taskRunner.task != null) {
             taskRunner.task.metrics.mergeShuffleReadMetrics()
             taskRunner.task.metrics.setJvmGCTime(curGCTime - taskRunner.startGCTime)
    -        accumUpdates += ((taskRunner.taskId, taskRunner.task.metrics.accumulators()))
    +        val accumulatorsToReport =
    +          if (conf.getBoolean(EXECUTOR_HEARTBEAT_DROP_ZERO_METRICS.key, true)) {
    --- End diff --
nit: I would prefer to keep this config value in a field next to `HEARTBEAT_MAX_FAILURES`, to avoid looking it up in the config on every heartbeat.
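Editor's sketch of the pattern suggested above: read the flag once into a field instead of on every heartbeat. `CountingConf` and `HeartbeatReporter` are hypothetical names invented for this illustration, not Spark classes, and plain `Long` values stand in for accumulators. The lookup counter exists only to make the difference observable.

```scala
// Hypothetical config stand-in that counts how often it is queried.
final class CountingConf(settings: Map[String, String]) {
  var lookups = 0

  def getBoolean(key: String, default: Boolean): Boolean = {
    lookups += 1
    settings.get(key).map(_.toBoolean).getOrElse(default)
  }
}

// Hypothetical reporter: the flag is read once at construction time.
final class HeartbeatReporter(conf: CountingConf) {
  private val dropZero: Boolean =
    conf.getBoolean("spark.executor.heartbeat.dropZeroMetrics", true)

  // Called on every heartbeat; performs no config lookup.
  def report(values: Seq[Long]): Seq[Long] =
    if (dropZero) values.filterNot(_ == 0L) else values
}
```

However many times `report` runs, `lookups` stays at 1. The trade-off is that a change to the setting after construction is not picked up.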
[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22473#discussion_r219576946
    --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala ---
    @@ -83,6 +83,17 @@ package object config {
       private[spark] val EXECUTOR_CLASS_PATH =
         ConfigBuilder(SparkLauncher.EXECUTOR_EXTRA_CLASSPATH).stringConf.createOptional
    +  private[spark] val EXECUTOR_HEARTBEAT_DROP_ZERO_METRICS =
    +    ConfigBuilder("spark.executor.heartbeat.dropZeroMetrics").booleanConf.createWithDefault(true)
    --- End diff --
Also please call `internal()` to indicate that this is not a public config.
[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22473#discussion_r219574155
    --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
    @@ -149,7 +149,7 @@ private[spark] class Executor(
       // Executor for the heartbeat task.
       private val heartbeater = new Heartbeater(env.memoryManager, reportHeartBeat,
    -    "executor-heartbeater", conf.getTimeAsMs("spark.executor.heartbeatInterval", "10s"))
    +    "executor-heartbeater", conf.getTimeAsMs(EXECUTOR_HEARTBEAT_INTERVAL.key, "10s"))
    --- End diff --
nit: `conf.get(EXECUTOR_HEARTBEAT_INTERVAL)`. Could you search the whole code base and update them as well?
[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22473#discussion_r219573967
    --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
    @@ -120,7 +120,7 @@ private[spark] class Executor(
       }
       // Whether to load classes in user jars before those in Spark jars
    -  private val userClassPathFirst = conf.getBoolean("spark.executor.userClassPathFirst", false)
    +  private val userClassPathFirst = conf.getBoolean(EXECUTOR_USER_CLASS_PATH_FIRST.key, false)
    --- End diff --
nit: `conf.get(EXECUTOR_USER_CLASS_PATH_FIRST)`
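Editor's sketch of why the typed `conf.get(ENTRY)` form in the two nits above is preferable to `conf.getBoolean(ENTRY.key, default)`. `Entry`, `TypedConf`, and `Entries` are hypothetical stand-ins written for this illustration; Spark's own `ConfigEntry` and `SparkConf` are more elaborate. With a typed entry, the key, the default, and the parsing live in one place, so call sites cannot drift from the declared default.

```scala
// Hypothetical typed entry: key, default value, and parser declared together.
final case class Entry[T](key: String, default: T, parse: String => T)

// Hypothetical config store offering both access styles.
final class TypedConf(settings: Map[String, String]) {
  // String-keyed read: every call site has to repeat the default.
  def getBoolean(key: String, default: Boolean): Boolean =
    settings.get(key).map(_.toBoolean).getOrElse(default)

  // Typed read: key, default, and parser all come from the entry.
  def get[T](entry: Entry[T]): T =
    settings.get(entry.key).map(entry.parse).getOrElse(entry.default)
}

object Entries {
  // The default `false` mirrors the value shown in the diff above.
  val UserClassPathFirst: Entry[Boolean] =
    Entry[Boolean]("spark.executor.userClassPathFirst", false, (s: String) => s.toBoolean)
}
```

A call such as `conf.getBoolean(Entries.UserClassPathFirst.key, true)` would silently disagree with the declared default, whereas `conf.get(Entries.UserClassPathFirst)` cannot.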
[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22473#discussion_r219576657
    --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala ---
    @@ -83,6 +83,17 @@ package object config {
       private[spark] val EXECUTOR_CLASS_PATH =
         ConfigBuilder(SparkLauncher.EXECUTOR_EXTRA_CLASSPATH).stringConf.createOptional
    +  private[spark] val EXECUTOR_HEARTBEAT_DROP_ZERO_METRICS =
    +    ConfigBuilder("spark.executor.heartbeat.dropZeroMetrics").booleanConf.createWithDefault(true)
    --- End diff --
Maybe call it `spark.executor.heartbeat.dropZeroAccumulatorUpdates`? `externalAccums` may contain user accumulators and not all of them are metrics.
[GitHub] spark pull request #22473: [SPARK-25449][CORE] Heartbeat shouldn't include a...
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/22473#discussion_r219580690
    --- Diff: core/src/test/scala/org/apache/spark/executor/ExecutorSuite.scala ---
    @@ -252,18 +253,121 @@ class ExecutorSuite extends SparkFunSuite with LocalSparkContext with MockitoSug
         }
       }
    +  test("Heartbeat should drop zero metrics") {
    +    heartbeatZeroMetricTest(true)
    +  }
    +
    +  test("Heartbeat should not drop zero metrics when the conf is set to false") {
    +    heartbeatZeroMetricTest(false)
    +  }
    +
    +  private def withHeartbeatExecutor(confs: (String, String)*)
    +      (f: (Executor, ArrayBuffer[Heartbeat]) => Unit): Unit = {
    +    val conf = new SparkConf
    +    confs.foreach { case (k, v) => conf.set(k, v) }
    +    val serializer = new JavaSerializer(conf)
    +    val env = createMockEnv(conf, serializer)
    +    val executor =
    +      new Executor("id", "localhost", SparkEnv.get, userClassPath = Nil, isLocal = true)
    +    val executorClass = classOf[Executor]
    +
    +    // Set ExecutorMetricType.values to be a minimal set to avoid get null exceptions
    +    val metricClass =
    +      Utils.classForName(classOf[org.apache.spark.metrics.ExecutorMetricType].getName() + "$")
    +    val metricTypeValues = metricClass.getDeclaredField("values")
    +    metricTypeValues.setAccessible(true)
    +    metricTypeValues.set(
    +      org.apache.spark.metrics.ExecutorMetricType,
    +      IndexedSeq(JVMHeapMemory, JVMOffHeapMemory))
    +
    +    // Save all heartbeats sent into an ArrayBuffer for verification
    +    val heartbeats = ArrayBuffer[Heartbeat]()
    +    val mockReceiver = mock[RpcEndpointRef]
    +    when(mockReceiver.askSync(any[Heartbeat], any[RpcTimeout])(any))
    +      .thenAnswer(new Answer[HeartbeatResponse] {
    +        override def answer(invocation: InvocationOnMock): HeartbeatResponse = {
    +          val args = invocation.getArguments()
    +          val mock = invocation.getMock
    +          heartbeats += args(0).asInstanceOf[Heartbeat]
    +          HeartbeatResponse(false)
    +        }
    +      })
    +    val receiverRef = executorClass.getDeclaredField("heartbeatReceiverRef")
    +    receiverRef.setAccessible(true)
    +    receiverRef.set(executor, mockReceiver)
    +
    +    f(executor, heartbeats)
    +  }
    +
    +  private def invokeReportHeartbeat(executor: Executor): Unit = {
    --- End diff --
You can mix in `org.scalatest.PrivateMethodTester` to replace this method, such as
```
val reportHeartBeat = PrivateMethod[Long]('reportHeartBeat)
...
executor.invokePrivate(reportHeartBeat())
```
[GitHub] spark issue #22490: [SPARK-25481][TEST] Refactor ColumnarBatchBenchmark to u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22490 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96442/ Test FAILed.
[GitHub] spark issue #22490: [SPARK-25481][TEST] Refactor ColumnarBatchBenchmark to u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22490 Merged build finished. Test FAILed.
[GitHub] spark issue #22490: [SPARK-25481][TEST] Refactor ColumnarBatchBenchmark to u...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22490 **[Test build #96442 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96442/testReport)** for PR 22490 at commit [`fb1ab6a`](https://github.com/apache/spark/commit/fb1ab6a35769cfdf743f7c880524b2a102ad2c3c).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22444: [SPARK-25409][Core]Speed up Spark History loading via in...
Github user jianjianjiao commented on the issue: https://github.com/apache/spark/pull/22444 @squito Yes, you are correct. I was trying to make applications that are running during the scan get picked up more quickly. It turns out that SPARK-6951 has already done a great job of achieving this.
[GitHub] spark issue #22522: [SPARK-25510][TEST] Create new trait replace BenchmarkWi...
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/22522 cc @cloud-fan @gengliangwang @dongjoon-hyun
[GitHub] spark issue #22522: [SPARK-25510][TEST] Create new trait replace BenchmarkWi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22522 Merged build finished. Test PASSed.
[GitHub] spark issue #22522: [SPARK-25510][TEST] Create new trait replace BenchmarkWi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22522 **[Test build #96452 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/96452/testReport)** for PR 22522 at commit [`275cc6c`](https://github.com/apache/spark/commit/275cc6c5f8f106eb339c7ed01734e279a223705e).
[GitHub] spark issue #22522: [SPARK-25510][TEST] Create new trait replace BenchmarkWi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22522 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/3361/ Test PASSed.
[GitHub] spark issue #22494: [SPARK-25454][SQL] add a new config for picking minimum ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22494 Merged build finished. Test PASSed.
[GitHub] spark issue #22494: [SPARK-25454][SQL] add a new config for picking minimum ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22494 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/96429/ Test PASSed.