[GitHub] spark issue #20653: [SPARK-23459][SQL] Improve the error message when unknow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20653 Merged build finished. Test FAILed.
[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Merged build finished. Test FAILed.
[GitHub] spark issue #20653: [SPARK-23459][SQL] Improve the error message when unknow...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20653

**[Test build #87612 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87612/testReport)** for PR 20653 at commit [`b5c8d56`](https://github.com/apache/spark/commit/b5c8d562434354ed6e68f41b32c79eda60ae6d07).

* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788

**[Test build #87610 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87610/testReport)** for PR 19788 at commit [`583666b`](https://github.com/apache/spark/commit/583666bf5ae6102180851fa1cc3129cc344e919c).

* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20653: [SPARK-23459][SQL] Improve the error message when unknow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20653 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87612/ Test FAILed.
[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87610/ Test FAILed.
[GitHub] spark pull request #20649: [SPARK-23462][SQL] improve missing field error me...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20649#discussion_r169882987

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/StructTypeSuite.scala ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.types.StructType
+
+class StructTypeSuite extends SparkFunSuite {
--- End diff --

Could we put this file under `sql/catalyst/src/test/scala/org/apache/spark/sql/types` or move the test cases into `DataTypeSuite`?
[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788

@cloud-fan if encryption is enabled (`blockManager.serializerManager().encryptionEnabled() == true`), shall we also disable this feature?
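For context, the guard under discussion might look like the following minimal sketch; the helper name `shouldBatchFetch` and its parameter list are assumptions for illustration, not code from the PR:

```scala
import org.apache.spark.serializer.SerializerManager

object ShuffleBatchFetchSketch {
  // Batch-fetching contiguous partitions concatenates serialized streams, so it
  // is only safe when the serializer supports relocation and, per the question
  // above, presumably only when shuffle encryption is off.
  def shouldBatchFetch(
      serializerRelocatable: Boolean,
      serializerManager: SerializerManager): Boolean = {
    serializerRelocatable && !serializerManager.encryptionEnabled
  }
}
```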
[GitHub] spark pull request #19788: [SPARK-9853][Core] Optimize shuffle fetch of cont...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19788#discussion_r169898393

--- Diff: project/MimaExcludes.scala ---
@@ -1129,6 +1129,12 @@ object MimaExcludes {
     ProblemFilters.exclude[InheritedNewAbstractMethodProblem]("org.apache.spark.ml.param.shared.HasLoss.org$apache$spark$ml$param$shared$HasLoss$_setter_$loss_="),
     ProblemFilters.exclude[InheritedNewAbstractMethodProblem]("org.apache.spark.ml.param.shared.HasLoss.getLoss"),
     ProblemFilters.exclude[InheritedNewAbstractMethodProblem]("org.apache.spark.ml.param.shared.HasLoss.loss")
+  ) ++ Seq(
+    // [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs.
+    ProblemFilters.exclude[MissingTypesProblem]("org.apache.spark.FetchFailed$"),
--- End diff --

We can get rid of this if we don't touch `FetchFailed`.
[GitHub] spark pull request #19788: [SPARK-9853][Core] Optimize shuffle fetch of cont...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19788#discussion_r169899114

--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -295,8 +307,8 @@ private[spark] abstract class MapOutputTracker(conf: SparkConf) extends Logging
    * and the second item is a sequence of (shuffle block id, shuffle block size) tuples
    * describing the shuffle blocks that are stored at that block manager.
    */
-  def getMapSizesByExecutorId(shuffleId: Int, startPartition: Int, endPartition: Int)
-      : Seq[(BlockManagerId, Seq[(BlockId, Long)])]
+  def getMapSizesByExecutorId(shuffleId: Int, startPartition: Int, endPartition: Int,
+      serializer: Serializer = null): Seq[(BlockManagerId, Seq[(BlockId, Long)])]
--- End diff --

If possible, let's not introduce a default value here.
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/20651

@zsxwing SKIPPED stages were previously shown as PENDING. While I agree that your fix is needed, I think this one is needed too. Do you agree @vanzin?
[GitHub] spark pull request #19788: [SPARK-9853][Core] Optimize shuffle fetch of cont...
Github user yucai commented on a diff in the pull request: https://github.com/apache/spark/pull/19788#discussion_r169901683

--- Diff: project/MimaExcludes.scala ---
@@ -1129,6 +1129,12 @@ object MimaExcludes {
     ProblemFilters.exclude[InheritedNewAbstractMethodProblem]("org.apache.spark.ml.param.shared.HasLoss.org$apache$spark$ml$param$shared$HasLoss$_setter_$loss_="),
     ProblemFilters.exclude[InheritedNewAbstractMethodProblem]("org.apache.spark.ml.param.shared.HasLoss.getLoss"),
     ProblemFilters.exclude[InheritedNewAbstractMethodProblem]("org.apache.spark.ml.param.shared.HasLoss.loss")
+  ) ++ Seq(
+    // [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs.
+    ProblemFilters.exclude[MissingTypesProblem]("org.apache.spark.FetchFailed$"),
--- End diff --

I see, so I will keep `FetchFailed` as it was (no `numBlocks` is introduced). Is that OK?
[GitHub] spark pull request #19788: [SPARK-9853][Core] Optimize shuffle fetch of cont...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19788#discussion_r169902931

--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -295,8 +307,8 @@ private[spark] abstract class MapOutputTracker(conf: SparkConf) extends Logging
    * and the second item is a sequence of (shuffle block id, shuffle block size) tuples
    * describing the shuffle blocks that are stored at that block manager.
    */
-  def getMapSizesByExecutorId(shuffleId: Int, startPartition: Int, endPartition: Int)
-      : Seq[(BlockManagerId, Seq[(BlockId, Long)])]
+  def getMapSizesByExecutorId(shuffleId: Int, startPartition: Int, endPartition: Int,
+      serializer: Serializer = null): Seq[(BlockManagerId, Seq[(BlockId, Long)])]
--- End diff --

I'd like to pass in a `serializerRelocatable: Boolean` instead of a serializer.
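A sketch of what the suggested signature could look like with the Boolean flag (illustrative only; the enclosing trait name here is made up):

```scala
import org.apache.spark.storage.{BlockId, BlockManagerId}

private[spark] trait MapSizesByExecutorIdSketch {
  // No default value for the flag, per the earlier review comment; callers
  // compute it once, e.g. from
  // `dep.serializer.supportsRelocationOfSerializedObjects`.
  def getMapSizesByExecutorId(
      shuffleId: Int,
      startPartition: Int,
      endPartition: Int,
      serializerRelocatable: Boolean): Seq[(BlockManagerId, Seq[(BlockId, Long)])]
}
```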
[GitHub] spark issue #20659: [DNM] Try to update Hive to 2.3.2
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20659 Merged build finished. Test PASSed.
[GitHub] spark issue #20659: [DNM] Try to update Hive to 2.3.2
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20659 **[Test build #87613 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87613/testReport)** for PR 20659 at commit [`a5bb731`](https://github.com/apache/spark/commit/a5bb731985488892ef9bc8ec9bbcff2a218d0130).
[GitHub] spark issue #20659: [DNM] Try to update Hive to 2.3.2
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20659 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1005/ Test PASSed.
[GitHub] spark issue #20659: [DNM] Try to update Hive to 2.3.2
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20659 Merged build finished. Test FAILed.
[GitHub] spark issue #20659: [DNM] Try to update Hive to 2.3.2
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20659

**[Test build #87613 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87613/testReport)** for PR 20659 at commit [`a5bb731`](https://github.com/apache/spark/commit/a5bb731985488892ef9bc8ec9bbcff2a218d0130).

* This patch **fails to generate documentation**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20659: [DNM] Try to update Hive to 2.3.2
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20659 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87613/ Test FAILed.
[GitHub] spark issue #20653: [SPARK-23459][SQL] Improve the error message when unknow...
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20653 retest this please
[GitHub] spark issue #20653: [SPARK-23459][SQL] Improve the error message when unknow...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20653 **[Test build #87614 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87614/testReport)** for PR 20653 at commit [`b5c8d56`](https://github.com/apache/spark/commit/b5c8d562434354ed6e68f41b32c79eda60ae6d07).
[GitHub] spark issue #20653: [SPARK-23459][SQL] Improve the error message when unknow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20653 Merged build finished. Test PASSed.
[GitHub] spark issue #20653: [SPARK-23459][SQL] Improve the error message when unknow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20653 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1006/ Test PASSed.
[GitHub] spark pull request #20651: [SPARK-23475][UI] Show also skipped stages
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/20651#discussion_r169919844

--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllStagesPage.scala ---
@@ -172,6 +185,20 @@ private[ui] class AllStagesPage(parent: StagesTab) extends WebUIPage("") {
         {completedStagesTable.toNodeSeq}
       }
+    if (shouldShowSkippedStages) {
+      content ++=
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20651 Merged build finished. Test PASSed.
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20651 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1007/ Test PASSed.
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20651 **[Test build #87615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87615/testReport)** for PR 20651 at commit [`e04bf24`](https://github.com/apache/spark/commit/e04bf24ed0a712aa380de8dd929c7c1d4fdb2990).
[GitHub] spark issue #20659: [DNM] Try to update Hive to 2.3.2
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20659 **[Test build #87616 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87616/testReport)** for PR 20659 at commit [`80fd8a8`](https://github.com/apache/spark/commit/80fd8a8aa3c3e42cd99f164f80cfcc6f46e2f247).
[GitHub] spark issue #20659: [DNM] Try to update Hive to 2.3.2
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20659 Merged build finished. Test PASSed.
[GitHub] spark issue #20659: [DNM] Try to update Hive to 2.3.2
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20659 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1008/ Test PASSed.
[GitHub] spark issue #20648: [SPARK-23448][SQL] JSON parser should return partial row...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20648

From the code, it looks like there was an intention to produce partial results when parsing a document fails, and this patch delivers those partial results. But this should be considered a behavior change, and we should discuss whether it is acceptable.
[GitHub] spark issue #20653: [SPARK-23459][SQL] Improve the error message when unknow...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20653

**[Test build #87614 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87614/testReport)** for PR 20653 at commit [`b5c8d56`](https://github.com/apache/spark/commit/b5c8d562434354ed6e68f41b32c79eda60ae6d07).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20653: [SPARK-23459][SQL] Improve the error message when unknow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20653 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87614/ Test PASSed.
[GitHub] spark issue #20653: [SPARK-23459][SQL] Improve the error message when unknow...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20653 Merged build finished. Test PASSed.
[GitHub] spark issue #20659: [DNM] Try to update Hive to 2.3.2
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20659

**[Test build #87616 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87616/testReport)** for PR 20659 at commit [`80fd8a8`](https://github.com/apache/spark/commit/80fd8a8aa3c3e42cd99f164f80cfcc6f46e2f247).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20659: [DNM] Try to update Hive to 2.3.2
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20659 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87616/ Test FAILed.
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20651

**[Test build #87615 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87615/testReport)** for PR 20651 at commit [`e04bf24`](https://github.com/apache/spark/commit/e04bf24ed0a712aa380de8dd929c7c1d4fdb2990).

* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20659: [DNM] Try to update Hive to 2.3.2
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20659 Merged build finished. Test FAILed.
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20651 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87615/ Test FAILed.
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20651 Merged build finished. Test FAILed.
[GitHub] spark issue #20648: [SPARK-23448][SQL] JSON parser should return partial row...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20648

I was just double-checking the current status for both CSV and JSON:

Seems CSV fills up the partial results with an exception (which is caught by permissive mode with the corrupt record text) when the length of the schema is mismatched, whereas JSON just allows this case without the exception (and also without the corrupt record text).

And ... both CSV and JSON don't fill in the partial results when the type of a field is mismatched, but this change supports that case in JSON.

cc @cloud-fan too. I remember we had a short talk about these partial results before. Do you think we should produce partial results in both CSV and JSON for mismatched types? It looks like we are currently not doing this for either CSV or JSON when the types are mismatched, if I haven't missed something.
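A small self-contained repro one could run to observe the current behavior; the schema, records, and app name are invented for illustration, and the resulting rows depend on the Spark version:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._

object PermissiveJsonRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("repro").getOrCreate()
    import spark.implicits._

    val schema = new StructType()
      .add("a", IntegerType)
      .add("b", IntegerType)
      .add("_corrupt_record", StringType)

    // The second record has a type mismatch on "b": the question above is
    // whether we get a partial row (a = 3, b = null) or an all-null row plus
    // the corrupt record text.
    val input = Seq("""{"a": 1, "b": 2}""", """{"a": 3, "b": "oops"}""").toDS()
    spark.read
      .schema(schema)
      .option("mode", "PERMISSIVE")
      .option("columnNameOfCorruptRecord", "_corrupt_record")
      .json(input)
      .show(truncate = false)

    spark.stop()
  }
}
```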
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/20651 Jenkins, retest this please
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20651 **[Test build #87617 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87617/testReport)** for PR 20651 at commit [`e04bf24`](https://github.com/apache/spark/commit/e04bf24ed0a712aa380de8dd929c7c1d4fdb2990).
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20651 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1009/ Test PASSed.
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20651 Merged build finished. Test PASSed.
[GitHub] spark pull request #19788: [SPARK-9853][Core] Optimize shuffle fetch of cont...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19788#discussion_r169967506

--- Diff: project/MimaExcludes.scala ---
@@ -1129,6 +1129,12 @@ object MimaExcludes {
     ProblemFilters.exclude[InheritedNewAbstractMethodProblem]("org.apache.spark.ml.param.shared.HasLoss.org$apache$spark$ml$param$shared$HasLoss$_setter_$loss_="),
     ProblemFilters.exclude[InheritedNewAbstractMethodProblem]("org.apache.spark.ml.param.shared.HasLoss.getLoss"),
     ProblemFilters.exclude[InheritedNewAbstractMethodProblem]("org.apache.spark.ml.param.shared.HasLoss.loss")
+  ) ++ Seq(
+    // [SPARK-9853][Core] Optimize shuffle fetch of contiguous partition IDs.
+    ProblemFilters.exclude[MissingTypesProblem]("org.apache.spark.FetchFailed$"),
--- End diff --

yup
[GitHub] spark pull request #20659: [DNM] Try to update Hive to 2.3.2
Github user wangyum closed the pull request at: https://github.com/apache/spark/pull/20659
[GitHub] spark issue #20618: [SPARK-23329][SQL] Fix documentation of trigonometric fu...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/20618 Was the conclusion to add more changes to this PR, or to add them in another PR?
[GitHub] spark issue #20652: [SPARK-23476][CORE] Generate secret in local mode when a...
Github user squito commented on the issue: https://github.com/apache/spark/pull/20652 lgtm
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r169983494

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -58,7 +58,8 @@ public MemoryBlock allocate(long size) throws OutOfMemoryError {
       final long[] array = arrayReference.get();
       if (array != null) {
         assert (array.length * 8L >= size);
-        MemoryBlock memory = new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, size);
+        MemoryBlock memory =
+          new OnHeapMemoryBlock(array, Platform.LONG_ARRAY_OFFSET, size);
--- End diff --

`OnHeapMemoryBlock.fromArray`? we should hide `Platform.LONG_ARRAY_OFFSET` from the caller side.
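Such a factory might look like the following sketch, written in Scala for brevity even though the class under review is Java; `OnHeapMemoryBlock` is the class this PR introduces, and the factory object name is assumed:

```scala
import org.apache.spark.unsafe.Platform
import org.apache.spark.unsafe.memory.OnHeapMemoryBlock

// Callers hand over just the long[] and never see Platform.LONG_ARRAY_OFFSET,
// which stays an implementation detail of the on-heap block.
object OnHeapMemoryBlockFactory {
  def fromArray(array: Array[Long]): OnHeapMemoryBlock =
    new OnHeapMemoryBlock(array, Platform.LONG_ARRAY_OFFSET, array.length * 8L)
}
```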
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r169983548

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/HeapMemoryAllocator.java ---
@@ -70,7 +71,7 @@ public MemoryBlock allocate(long size) throws OutOfMemoryError {
       }
     }
     long[] array = new long[numWords];
-    MemoryBlock memory = new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, size);
+    MemoryBlock memory = new OnHeapMemoryBlock(array, Platform.LONG_ARRAY_OFFSET, size);
--- End diff --

ditto
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r169983885

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -22,10 +22,9 @@
 import org.apache.spark.unsafe.Platform;

 /**
- * A consecutive block of memory, starting at a {@link MemoryLocation} with a fixed size.
+ * A declaration of interfaces of MemoryBlock classes .
--- End diff --

`A representation of a consecutive memory block in Spark. It defines the common interfaces for memory accessing and mutating.`
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r169984899

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -45,38 +44,149 @@
    */
   public static final int FREED_IN_ALLOCATOR_PAGE_NUMBER = -3;

-  private final long length;
+  @Nullable
+  protected Object obj;
+
+  protected long offset;
+
+  protected long length;

   /**
    * Optional page number; used when this MemoryBlock represents a page allocated by a
-   * TaskMemoryManager. This field is public so that it can be modified by the TaskMemoryManager,
-   * which lives in a different package.
+   * TaskMemoryManager. This field can be updated using setPageNumber method so that
+   * this can be modified by the TaskMemoryManager, which lives in a different package.
    */
-  public int pageNumber = NO_PAGE_NUMBER;
+  private int pageNumber = NO_PAGE_NUMBER;

   public MemoryBlock(@Nullable Object obj, long offset, long length) {
-    super(obj, offset);
+    this.obj = obj;
+    this.offset = offset;
     this.length = length;
   }

+  public MemoryBlock() {
+    this(null, 0, 0);
+  }
+
+  public final Object getBaseObject() {
+    return obj;
+  }
+
+  public final long getBaseOffset() {
+    return offset;
+  }
+
+  public void resetObjAndOffset() {
+    this.obj = null;
+    this.offset = 0;
+  }
+
   /**
    * Returns the size of the memory block.
    */
-  public long size() {
+  public final long size() {
     return length;
   }

-  /**
-   * Creates a memory block pointing to the memory used by the long array.
-   */
-  public static MemoryBlock fromLongArray(final long[] array) {
-    return new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, array.length * 8L);
+  public final void setPageNumber(int pageNum) {
+    pageNumber = pageNum;
+  }
+
+  public final int getPageNumber() {
+    return pageNumber;
   }

   /**
    * Fills the memory block with the specified byte value.
    */
-  public void fill(byte value) {
+  public final void fill(byte value) {
     Platform.setMemory(obj, offset, length, value);
   }
+
+  /**
+   * Instantiate MemoryBlock for given object type with new offset
+   */
+  public final static MemoryBlock allocateFromObject(Object obj, long offset, long length) {
+    MemoryBlock mb = null;
+    if (obj instanceof byte[]) {
+      byte[] array = (byte[])obj;
+      mb = new ByteArrayMemoryBlock(array, offset, length);
+    } else if (obj instanceof long[]) {
+      long[] array = (long[])obj;
+      mb = new OnHeapMemoryBlock(array, offset, length);
+    } else if (obj == null) {
+      // we assume that to pass null pointer means off-heap
+      mb = new OffHeapMemoryBlock(offset, length);
+    } else {
+      throw new UnsupportedOperationException(obj.getClass() + " is not supported now");
+    }
+    return mb;
+  }
+
+  /**
+   * Instantiate the same type of MemoryBlock with new offset and size
+   */
+  public abstract MemoryBlock allocate(long offset, long size);
+
+
+  public abstract int getInt(long offset);
--- End diff --

What should we expect the offset to be? Ideally I think we should assume the beginning offset of this memory block is 0, and concrete implementations should adjust by adding `Platform.BYTE_ARRAY_OFFSET` or something accordingly.
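The proposed convention could look like this sketch for a byte-array-backed block; the class is invented for illustration and is not from the PR:

```scala
import org.apache.spark.unsafe.Platform

// Callers address the block starting at offset 0; the Platform base offset for
// byte arrays is added inside the implementation, not at the call site.
class ByteArrayBlockSketch(array: Array[Byte]) {
  def getInt(offset: Long): Int =
    Platform.getInt(array, Platform.BYTE_ARRAY_OFFSET + offset)

  def putInt(offset: Long, value: Int): Unit =
    Platform.putInt(array, Platform.BYTE_ARRAY_OFFSET + offset, value)
}
```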
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r169985377

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -45,38 +44,149 @@
    */
   public static final int FREED_IN_ALLOCATOR_PAGE_NUMBER = -3;

-  private final long length;
+  @Nullable
+  protected Object obj;
+
+  protected long offset;
+
+  protected long length;

   /**
    * Optional page number; used when this MemoryBlock represents a page allocated by a
-   * TaskMemoryManager. This field is public so that it can be modified by the TaskMemoryManager,
-   * which lives in a different package.
+   * TaskMemoryManager. This field can be updated using setPageNumber method so that
+   * this can be modified by the TaskMemoryManager, which lives in a different package.
    */
-  public int pageNumber = NO_PAGE_NUMBER;
+  private int pageNumber = NO_PAGE_NUMBER;

   public MemoryBlock(@Nullable Object obj, long offset, long length) {
-    super(obj, offset);
+    this.obj = obj;
+    this.offset = offset;
     this.length = length;
   }

+  public MemoryBlock() {
+    this(null, 0, 0);
+  }
+
+  public final Object getBaseObject() {
+    return obj;
+  }
+
+  public final long getBaseOffset() {
+    return offset;
+  }
+
+  public void resetObjAndOffset() {
+    this.obj = null;
+    this.offset = 0;
+  }
+
   /**
    * Returns the size of the memory block.
    */
-  public long size() {
+  public final long size() {
     return length;
   }

-  /**
-   * Creates a memory block pointing to the memory used by the long array.
-   */
-  public static MemoryBlock fromLongArray(final long[] array) {
-    return new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, array.length * 8L);
+  public final void setPageNumber(int pageNum) {
+    pageNumber = pageNum;
+  }
+
+  public final int getPageNumber() {
+    return pageNumber;
   }

   /**
    * Fills the memory block with the specified byte value.
    */
-  public void fill(byte value) {
+  public final void fill(byte value) {
     Platform.setMemory(obj, offset, length, value);
   }
+
+  /**
+   * Instantiate MemoryBlock for given object type with new offset
+   */
+  public final static MemoryBlock allocateFromObject(Object obj, long offset, long length) {
+    MemoryBlock mb = null;
+    if (obj instanceof byte[]) {
+      byte[] array = (byte[])obj;
+      mb = new ByteArrayMemoryBlock(array, offset, length);
+    } else if (obj instanceof long[]) {
+      long[] array = (long[])obj;
+      mb = new OnHeapMemoryBlock(array, offset, length);
+    } else if (obj == null) {
+      // we assume that to pass null pointer means off-heap
+      mb = new OffHeapMemoryBlock(offset, length);
+    } else {
+      throw new UnsupportedOperationException(obj.getClass() + " is not supported now");
+    }
+    return mb;
+  }
+
+  /**
+   * Instantiate the same type of MemoryBlock with new offset and size
+   */
+  public abstract MemoryBlock allocate(long offset, long size);
+
+
+  public abstract int getInt(long offset);
+
+  public abstract void putInt(long offset, int value);
+
+  public abstract boolean getBoolean(long offset);
+
+  public abstract void putBoolean(long offset, boolean value);
+
+  public abstract byte getByte(long offset);
+
+  public abstract void putByte(long offset, byte value);
+
+  public abstract short getShort(long offset);
+
+  public abstract void putShort(long offset, short value);
+
+  public abstract long getLong(long offset);
+
+  public abstract void putLong(long offset, long value);
+
+  public abstract float getFloat(long offset);
+
+  public abstract void putFloat(long offset, float value);
+
+  public abstract double getDouble(long offset);
+
+  public abstract void putDouble(long offset, double value);
+
+  public abstract Object getObjectVolatile(long offset);
--- End diff --

do we really need this?
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r169987267

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -45,38 +44,149 @@
    */
   public static final int FREED_IN_ALLOCATOR_PAGE_NUMBER = -3;

-  private final long length;
+  @Nullable
+  protected Object obj;
+
+  protected long offset;
+
+  protected long length;

   /**
    * Optional page number; used when this MemoryBlock represents a page allocated by a
-   * TaskMemoryManager. This field is public so that it can be modified by the TaskMemoryManager,
-   * which lives in a different package.
+   * TaskMemoryManager. This field can be updated using setPageNumber method so that
+   * this can be modified by the TaskMemoryManager, which lives in a different package.
    */
-  public int pageNumber = NO_PAGE_NUMBER;
+  private int pageNumber = NO_PAGE_NUMBER;

   public MemoryBlock(@Nullable Object obj, long offset, long length) {
-    super(obj, offset);
+    this.obj = obj;
+    this.offset = offset;
     this.length = length;
   }

+  public MemoryBlock() {
+    this(null, 0, 0);
+  }
+
+  public final Object getBaseObject() {
+    return obj;
+  }
+
+  public final long getBaseOffset() {
+    return offset;
+  }
+
+  public void resetObjAndOffset() {
+    this.obj = null;
+    this.offset = 0;
+  }
+
   /**
    * Returns the size of the memory block.
    */
-  public long size() {
+  public final long size() {
     return length;
   }

-  /**
-   * Creates a memory block pointing to the memory used by the long array.
-   */
-  public static MemoryBlock fromLongArray(final long[] array) {
-    return new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, array.length * 8L);
+  public final void setPageNumber(int pageNum) {
+    pageNumber = pageNum;
+  }
+
+  public final int getPageNumber() {
+    return pageNumber;
   }

   /**
    * Fills the memory block with the specified byte value.
    */
-  public void fill(byte value) {
+  public final void fill(byte value) {
     Platform.setMemory(obj, offset, length, value);
   }
+
+  /**
+   * Instantiate MemoryBlock for given object type with new offset
+   */
+  public final static MemoryBlock allocateFromObject(Object obj, long offset, long length) {
+    MemoryBlock mb = null;
+    if (obj instanceof byte[]) {
+      byte[] array = (byte[])obj;
+      mb = new ByteArrayMemoryBlock(array, offset, length);
+    } else if (obj instanceof long[]) {
+      long[] array = (long[])obj;
+      mb = new OnHeapMemoryBlock(array, offset, length);
+    } else if (obj == null) {
+      // we assume that to pass null pointer means off-heap
+      mb = new OffHeapMemoryBlock(offset, length);
+    } else {
+      throw new UnsupportedOperationException(obj.getClass() + " is not supported now");
+    }
+    return mb;
+  }
+
+  /**
+   * Instantiate the same type of MemoryBlock with new offset and size
+   */
+  public abstract MemoryBlock allocate(long offset, long size);
+
+
+  public abstract int getInt(long offset);
+
+  public abstract void putInt(long offset, int value);
+
+  public abstract boolean getBoolean(long offset);
+
+  public abstract void putBoolean(long offset, boolean value);
+
+  public abstract byte getByte(long offset);
+
+  public abstract void putByte(long offset, byte value);
+
+  public abstract short getShort(long offset);
+
+  public abstract void putShort(long offset, short value);
+
+  public abstract long getLong(long offset);
+
+  public abstract void putLong(long offset, long value);
+
+  public abstract float getFloat(long offset);
+
+  public abstract void putFloat(long offset, float value);
+
+  public abstract double getDouble(long offset);
+
+  public abstract void putDouble(long offset, double value);
+
+  public abstract Object getObjectVolatile(long offset);
+
+  public abstract void putObjectVolatile(long offset, Object value);
+
+  public static void copyMemory(
+      MemoryBlock src, long srcOffset, MemoryBlock dst, long dstOffset, long length) {
+    Platform.copyMemory(src.getBaseObject(), srcOffset, dst.getBaseObject(), dstOffset, length);
+  }
+
+  public abstract void copyFrom(byte[] src, long srcOffset, long dstOffset, long length);
--- End diff --

2 thoughts:
1. How much benefit can we get by making this method abstract? Is it as significant as other methods like
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r169987802

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -45,38 +44,149 @@
    */
   public static final int FREED_IN_ALLOCATOR_PAGE_NUMBER = -3;

-  private final long length;
+  @Nullable
+  protected Object obj;
+
+  protected long offset;
+
+  protected long length;

   /**
    * Optional page number; used when this MemoryBlock represents a page allocated by a
-   * TaskMemoryManager. This field is public so that it can be modified by the TaskMemoryManager,
-   * which lives in a different package.
+   * TaskMemoryManager. This field can be updated using setPageNumber method so that
+   * this can be modified by the TaskMemoryManager, which lives in a different package.
    */
-  public int pageNumber = NO_PAGE_NUMBER;
+  private int pageNumber = NO_PAGE_NUMBER;

   public MemoryBlock(@Nullable Object obj, long offset, long length) {
-    super(obj, offset);
+    this.obj = obj;
+    this.offset = offset;
     this.length = length;
   }

+  public MemoryBlock() {
+    this(null, 0, 0);
+  }
+
+  public final Object getBaseObject() {
+    return obj;
+  }
+
+  public final long getBaseOffset() {
+    return offset;
+  }
+
+  public void resetObjAndOffset() {
+    this.obj = null;
+    this.offset = 0;
+  }
+
   /**
    * Returns the size of the memory block.
    */
-  public long size() {
+  public final long size() {
     return length;
   }

-  /**
-   * Creates a memory block pointing to the memory used by the long array.
-   */
-  public static MemoryBlock fromLongArray(final long[] array) {
-    return new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, array.length * 8L);
+  public final void setPageNumber(int pageNum) {
+    pageNumber = pageNum;
+  }
+
+  public final int getPageNumber() {
+    return pageNumber;
   }

   /**
    * Fills the memory block with the specified byte value.
    */
-  public void fill(byte value) {
+  public final void fill(byte value) {
     Platform.setMemory(obj, offset, length, value);
   }
+
+  /**
+   * Instantiate MemoryBlock for given object type with new offset
+   */
+  public final static MemoryBlock allocateFromObject(Object obj, long offset, long length) {
+    MemoryBlock mb = null;
+    if (obj instanceof byte[]) {
+      byte[] array = (byte[])obj;
+      mb = new ByteArrayMemoryBlock(array, offset, length);
+    } else if (obj instanceof long[]) {
+      long[] array = (long[])obj;
+      mb = new OnHeapMemoryBlock(array, offset, length);
+    } else if (obj == null) {
+      // we assume that to pass null pointer means off-heap
+      mb = new OffHeapMemoryBlock(offset, length);
+    } else {
+      throw new UnsupportedOperationException(obj.getClass() + " is not supported now");
+    }
+    return mb;
+  }
+
+  /**
+   * Instantiate the same type of MemoryBlock with new offset and size
+   */
+  public abstract MemoryBlock allocate(long offset, long size);
+
+
+  public abstract int getInt(long offset);
--- End diff --

Looks like now we ask the caller side to add `MemoryBlock.baseOffset` themselves; I think we should not leave this to the caller side.
[GitHub] spark pull request #20647: [SPARK-23303][SQL] improve the explain result for...
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/20647#discussion_r169990652

--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala ---
@@ -107,17 +106,24 @@ case class DataSourceV2Relation(
   }

 /**
- * A specialization of DataSourceV2Relation with the streaming bit set to true. Otherwise identical
- * to the non-streaming relation.
+ * A specialization of [[DataSourceV2Relation]] with the streaming bit set to true.
+ *
+ * Note that, this plan has a mutable reader, so Spark won't apply operator push-down for this plan,
+ * to avoid making the plan mutable. We should consolidate this plan and [[DataSourceV2Relation]]
+ * after we figure out how to apply operator push-down for streaming data sources.
--- End diff --

We can also ask the implementations to clear out all the state after `createX` is called; then we don't need to add any new APIs. Anyway, there should be a detailed design doc for data source v2 operator push-down; it doesn't block this PR.
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r169995800

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -45,38 +44,149 @@
    */
   public static final int FREED_IN_ALLOCATOR_PAGE_NUMBER = -3;

-  private final long length;
+  @Nullable
+  protected Object obj;
+
+  protected long offset;
+
+  protected long length;

   /**
    * Optional page number; used when this MemoryBlock represents a page allocated by a
-   * TaskMemoryManager. This field is public so that it can be modified by the TaskMemoryManager,
-   * which lives in a different package.
+   * TaskMemoryManager. This field can be updated using setPageNumber method so that
+   * this can be modified by the TaskMemoryManager, which lives in a different package.
    */
-  public int pageNumber = NO_PAGE_NUMBER;
+  private int pageNumber = NO_PAGE_NUMBER;

   public MemoryBlock(@Nullable Object obj, long offset, long length) {
-    super(obj, offset);
+    this.obj = obj;
+    this.offset = offset;
     this.length = length;
   }

+  public MemoryBlock() {
+    this(null, 0, 0);
+  }
+
+  public final Object getBaseObject() {
+    return obj;
+  }
+
+  public final long getBaseOffset() {
+    return offset;
+  }
+
+  public void resetObjAndOffset() {
+    this.obj = null;
+    this.offset = 0;
+  }
+
   /**
    * Returns the size of the memory block.
    */
-  public long size() {
+  public final long size() {
     return length;
   }

-  /**
-   * Creates a memory block pointing to the memory used by the long array.
-   */
-  public static MemoryBlock fromLongArray(final long[] array) {
-    return new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, array.length * 8L);
+  public final void setPageNumber(int pageNum) {
+    pageNumber = pageNum;
+  }
+
+  public final int getPageNumber() {
+    return pageNumber;
   }

   /**
    * Fills the memory block with the specified byte value.
    */
-  public void fill(byte value) {
+  public final void fill(byte value) {
     Platform.setMemory(obj, offset, length, value);
   }
+
+  /**
+   * Instantiate MemoryBlock for given object type with new offset
+   */
+  public final static MemoryBlock allocateFromObject(Object obj, long offset, long length) {
+    MemoryBlock mb = null;
+    if (obj instanceof byte[]) {
+      byte[] array = (byte[])obj;
+      mb = new ByteArrayMemoryBlock(array, offset, length);
+    } else if (obj instanceof long[]) {
+      long[] array = (long[])obj;
+      mb = new OnHeapMemoryBlock(array, offset, length);
+    } else if (obj == null) {
+      // we assume that to pass null pointer means off-heap
+      mb = new OffHeapMemoryBlock(offset, length);
+    } else {
+      throw new UnsupportedOperationException(obj.getClass() + " is not supported now");
+    }
+    return mb;
+  }
+
+  /**
+   * Instantiate the same type of MemoryBlock with new offset and size
+   */
+  public abstract MemoryBlock allocate(long offset, long size);
+
+
+  public abstract int getInt(long offset);
+
+  public abstract void putInt(long offset, int value);
+
+  public abstract boolean getBoolean(long offset);
+
+  public abstract void putBoolean(long offset, boolean value);
+
+  public abstract byte getByte(long offset);
+
+  public abstract void putByte(long offset, byte value);
+
+  public abstract short getShort(long offset);
+
+  public abstract void putShort(long offset, short value);
+
+  public abstract long getLong(long offset);
+
+  public abstract void putLong(long offset, long value);
+
+  public abstract float getFloat(long offset);
+
+  public abstract void putFloat(long offset, float value);
+
+  public abstract double getDouble(long offset);
+
+  public abstract void putDouble(long offset, double value);
+
+  public abstract Object getObjectVolatile(long offset);
--- End diff --

Since Spark does not use `xxxObjectVolatile`, we can remove them.
[GitHub] spark pull request #20652: [SPARK-23476][CORE] Generate secret in local mode...
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/20652#discussion_r169996395

--- Diff: core/src/test/scala/org/apache/spark/SecurityManagerSuite.scala ---
@@ -440,23 +440,41 @@ class SecurityManagerSuite extends SparkFunSuite with ResetSystemProperties {
     assert(keyFromEnv === new SecurityManager(conf2).getSecretKey())
   }

-  test("secret key generation in yarn mode") {
-    val conf = new SparkConf()
-      .set(NETWORK_AUTH_ENABLED, true)
-      .set(SparkLauncher.SPARK_MASTER, "yarn")
-    val mgr = new SecurityManager(conf)
-
-    UserGroupInformation.createUserForTesting("authTest", Array()).doAs(
-      new PrivilegedExceptionAction[Unit]() {
-        override def run(): Unit = {
-          mgr.initializeAuth()
-          val creds = UserGroupInformation.getCurrentUser().getCredentials()
-          val secret = creds.getSecretKey(SecurityManager.SECRET_LOOKUP_KEY)
-          assert(secret != null)
-          assert(new String(secret, UTF_8) === mgr.getSecretKey())
+  test("secret key generation") {
+    Seq(
+      ("yarn", true),
+      ("local", true),
+      ("local[*]", true),
+      ("local[1,2]", true),
--- End diff --

nit: `local[1,2]` -> `local[1, 2]`
[GitHub] spark pull request #20660: [SPARK-23490][SQL]Check storage.locationUri with ...
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/20660

[SPARK-23490][SQL] Check storage.locationUri with existing table in CreateTable

## What changes were proposed in this pull request?

For CreateTable with Append mode, we should check whether `storage.locationUri` is the same as the existing table's in `PreprocessTableCreation`.

In the current code, if `storage.locationUri` is different from the existing table's, there is only a generic exception, `org.apache.spark.sql.AnalysisException: Table or view not found:`, which can be improved.

## How was this patch tested?

Unit test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gengliangwang/spark locationUri

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20660.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20660

commit d72f56d2cb4a4a3ba56f1013d0ddd21b145d299c
Author: Wang Gengliang
Date: 2018-02-22T15:35:28Z

    Check storage.locationUri with existing table in CreateTable
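A rough sketch of the kind of check described above; the helper name, rule placement, and message text are assumptions, not the actual change:

```scala
package org.apache.spark.sql.execution.datasources

import org.apache.spark.sql.AnalysisException
import org.apache.spark.sql.catalyst.catalog.CatalogTable

object LocationCheckSketch {
  // Compare the user-specified location against the existing table's location
  // before appending, and fail with a targeted message instead of the generic
  // "Table or view not found".
  def checkLocation(tableDesc: CatalogTable, existing: CatalogTable): Unit = {
    val specified = tableDesc.storage.locationUri
    val actual = existing.storage.locationUri
    if (specified.isDefined && specified != actual) {
      throw new AnalysisException(
        s"The location of the existing table ${existing.identifier.quotedString} is " +
          s"$actual. It doesn't match the specified location $specified.")
    }
  }
}
```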
[GitHub] spark issue #20660: [SPARK-23490][SQL]Check storage.locationUri with existin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20660 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1010/ Test PASSed.
[GitHub] spark issue #20660: [SPARK-23490][SQL]Check storage.locationUri with existin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20660 Merged build finished. Test PASSed.
[GitHub] spark issue #20660: [SPARK-23490][SQL]Check storage.locationUri with existin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20660 **[Test build #87618 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87618/testReport)** for PR 20660 at commit [`d72f56d`](https://github.com/apache/spark/commit/d72f56d2cb4a4a3ba56f1013d0ddd21b145d299c).
[GitHub] spark issue #20645: SPARK-23472: Add defaultJavaOptions for drivers and exec...
Github user squito commented on the issue: https://github.com/apache/spark/pull/20645

I agree it would be nicer to have this be a more general feature. I would prefer an approach that didn't require a different configuration name, as that is just more to document and for users to keep track of. An alternative would be to have another syntax for appending to an option, perhaps `++=` like Scala, e.g. `--conf spark.executor.extraJavaOptions++=...`. You'd still need to tag the ConfigBuilder with what separator to use to merge the values together.
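Purely to illustrate the proposal, since no `++=` config syntax exists in Spark today, a hypothetical appendable config entry tagged with its merge separator might resolve values like this:

```scala
// Hypothetical appendable config entry; "++=" values are merged onto the base
// value using the declared separator (a space, for JVM options).
final case class AppendableConf(key: String, separator: String = " ") {
  def resolve(base: Option[String], appended: Option[String]): Option[String] =
    (base, appended) match {
      case (Some(b), Some(a)) => Some(b + separator + a)
      case _                  => base.orElse(appended)
    }
}

// e.g. AppendableConf("spark.executor.extraJavaOptions")
//        .resolve(Some("-Xmx2g"), Some("-XX:+UseG1GC"))
//      == Some("-Xmx2g -XX:+UseG1GC"), matching the semantics of
//      `--conf spark.executor.extraJavaOptions++=-XX:+UseG1GC`.
```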
[GitHub] spark pull request #20652: [SPARK-23476][CORE] Generate secret in local mode...
Github user gaborgsomogyi commented on a diff in the pull request: https://github.com/apache/spark/pull/20652#discussion_r170003945

--- Diff: core/src/test/scala/org/apache/spark/SecurityManagerSuite.scala ---
@@ -440,23 +440,41 @@ class SecurityManagerSuite extends SparkFunSuite with ResetSystemProperties {
     assert(keyFromEnv === new SecurityManager(conf2).getSecretKey())
   }

-  test("secret key generation in yarn mode") {
-    val conf = new SparkConf()
-      .set(NETWORK_AUTH_ENABLED, true)
-      .set(SparkLauncher.SPARK_MASTER, "yarn")
-    val mgr = new SecurityManager(conf)
-
-    UserGroupInformation.createUserForTesting("authTest", Array()).doAs(
-      new PrivilegedExceptionAction[Unit]() {
-        override def run(): Unit = {
-          mgr.initializeAuth()
-          val creds = UserGroupInformation.getCurrentUser().getCredentials()
-          val secret = creds.getSecretKey(SecurityManager.SECRET_LOOKUP_KEY)
-          assert(secret != null)
-          assert(new String(secret, UTF_8) === mgr.getSecretKey())
+  test("secret key generation") {
+    Seq(
+      ("yarn", true),
+      ("local", true),
+      ("local[*]", true),
+      ("local[1,2]", true),
--- End diff --

Fixed.
[GitHub] spark issue #20652: [SPARK-23476][CORE] Generate secret in local mode when a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20652 **[Test build #87619 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87619/testReport)** for PR 20652 at commit [`e01feef`](https://github.com/apache/spark/commit/e01feeff30ed226d638055fed0238ef0e1c0c69e).
[GitHub] spark pull request #20658: [SPARK-23488][python] Add missing catalog methods...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20658#discussion_r170003647

--- Diff: python/pyspark/sql/catalog.py ---
@@ -137,6 +138,78 @@ def listColumns(self, tableName, dbName=None):
                 isBucket=jcolumn.isBucket()))
         return columns

+    @ignore_unicode_prefix
+    # TODO: @since() decorator?
--- End diff --

I think `@since(2.4)`.
[GitHub] spark pull request #20658: [SPARK-23488][python] Add missing catalog methods...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20658#discussion_r170003442

--- Diff: python/pyspark/sql/catalog.py ---
@@ -28,7 +28,7 @@
 Database = namedtuple("Database", "name description locationUri")
 Table = namedtuple("Table", "name database description tableType isTemporary")
 Column = namedtuple("Column", "name description dataType nullable isPartition isBucket")
-Function = namedtuple("Function", "name description className isTemporary")
+Function = namedtuple("Function", "name database description className isTemporary")
--- End diff --

Hm, wouldn't this break backward compatibility?
[GitHub] spark pull request #20653: [SPARK-23459][SQL] Improve the error message when...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20653#discussion_r170025695

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategySuite.scala ---
@@ -491,6 +491,22 @@ class FileSourceStrategySuite extends QueryTest with SharedSQLContext with Predi
     }
   }

+  test("SPARK-23459: Improve error message when specified unknown column in partition columns") {
--- End diff --

Move it to SaveLoadSuite?
[GitHub] spark pull request #20624: [SPARK-23445] ColumnStat refactoring
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20624#discussion_r170031873

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
@@ -387,6 +390,101 @@ case class CatalogStatistics(
   }
 }

+/**
+ * This class of statistics for a column is used in [[CatalogTable]] to interact with metastore.
+ */
+case class CatalogColumnStat(
+  distinctCount: Option[BigInt] = None,
+  min: Option[String] = None,
+  max: Option[String] = None,
+  nullCount: Option[BigInt] = None,
+  avgLen: Option[Long] = None,
+  maxLen: Option[Long] = None,
+  histogram: Option[Histogram] = None) {
+
+  /**
+   * Returns a map from string to string that can be used to serialize the column stats.
+   * The key is the name of the column and name of the field (e.g. "colName.distinctCount"),
+   * and the value is the string representation for the value.
+   * min/max values are stored as Strings. They can be deserialized using
+   * [[ColumnStat.fromExternalString]].
+   *
+   * As part of the protocol, the returned map always contains a key called "version".
+   * In the case min/max values are null (None), they won't appear in the map.
+   */
+  def toMap(colName: String): Map[String, String] = {
+    val map = new scala.collection.mutable.HashMap[String, String]
+    map.put(s"${colName}.${CatalogColumnStat.KEY_VERSION}", "1")
+    distinctCount.foreach { v =>
+      map.put(s"${colName}.${CatalogColumnStat.KEY_DISTINCT_COUNT}", v.toString)
+    }
+    nullCount.foreach { v =>
+      map.put(s"${colName}.${CatalogColumnStat.KEY_NULL_COUNT}", v.toString)
+    }
+    avgLen.foreach { v => map.put(s"${colName}.${CatalogColumnStat.KEY_AVG_LEN}", v.toString) }
+    maxLen.foreach { v => map.put(s"${colName}.${CatalogColumnStat.KEY_MAX_LEN}", v.toString) }
+    min.foreach { v => map.put(s"${colName}.${CatalogColumnStat.KEY_MIN_VALUE}", v) }
+    max.foreach { v => map.put(s"${colName}.${CatalogColumnStat.KEY_MAX_VALUE}", v) }
+    histogram.foreach { h =>
+      map.put(s"${colName}.${CatalogColumnStat.KEY_HISTOGRAM}", HistogramSerializer.serialize(h))
+    }
+    map.toMap
+  }
+
+  /** Convert [[CatalogColumnStat]] to [[ColumnStat]]. */
+  def toPlanStat(
+      colName: String,
+      dataType: DataType): ColumnStat =
+    ColumnStat(
+      distinctCount = distinctCount,
+      min = min.map(ColumnStat.fromExternalString(_, colName, dataType)),
+      max = max.map(ColumnStat.fromExternalString(_, colName, dataType)),
+      nullCount = nullCount,
+      avgLen = avgLen,
+      maxLen = maxLen,
+      histogram = histogram)
+}
+
+object CatalogColumnStat extends Logging {
+
+  // List of string keys used to serialize CatalogColumnStat
+  val KEY_VERSION = "version"
+  private val KEY_DISTINCT_COUNT = "distinctCount"
+  private val KEY_MIN_VALUE = "min"
+  private val KEY_MAX_VALUE = "max"
+  private val KEY_NULL_COUNT = "nullCount"
+  private val KEY_AVG_LEN = "avgLen"
+  private val KEY_MAX_LEN = "maxLen"
+  private val KEY_HISTOGRAM = "histogram"
+
+  /**
+   * Creates a [[CatalogColumnStat]] object from the given map.
+   * This is used to deserialize column stats from some external storage.
+   * The serialization side is defined in [[CatalogColumnStat.toMap]].
+   */
+  def fromMap(
+      table: String,
+      colName: String,
+      map: Map[String, String]): Option[CatalogColumnStat] = {
+
+    try {
+      Some(CatalogColumnStat(
+        distinctCount = map.get(s"${colName}.${KEY_DISTINCT_COUNT}").map(v => BigInt(v.toLong)),
--- End diff --

Do we have a migration issue here? The key is changed now. Can Spark 2.4 read the catalog props written by Spark 2.3?
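One common way to keep such renames readable across versions, sketched in Scala under hypothetical key names (this is not the PR's code), is to fall back to the old spelling whenever the renamed key is absent:

```scala
// Hypothetical sketch: read a stat under its new key, falling back to the
// old spelling written by earlier releases. Key names here are illustrative.
def getWithFallback(
    props: Map[String, String],
    colName: String,
    newKey: String,
    oldKey: String): Option[String] =
  props.get(s"$colName.$newKey").orElse(props.get(s"$colName.$oldKey"))

// e.g. a distinct count serialized by an older release under another key:
// getWithFallback(props, "col", "distinctCount", "ndv").map(v => BigInt(v.toLong))
```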
[GitHub] spark pull request #20624: [SPARK-23445] ColumnStat refactoring
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20624#discussion_r170028784

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala ---
@@ -387,6 +390,101 @@ case class CatalogStatistics(
   }
 }

+/**
+ * This class of statistics for a column is used in [[CatalogTable]] to interact with metastore.
+ */
+case class CatalogColumnStat(
+  distinctCount: Option[BigInt] = None,
+  min: Option[String] = None,
+  max: Option[String] = None,
+  nullCount: Option[BigInt] = None,
+  avgLen: Option[Long] = None,
+  maxLen: Option[Long] = None,
+  histogram: Option[Histogram] = None) {
--- End diff --

Nit: indents.
[GitHub] spark issue #20362: [Spark-22886][ML][TESTS] ML test for structured streamin...
Github user gaborgsomogyi commented on the issue: https://github.com/apache/spark/pull/20362 ping @srowen
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20651 **[Test build #87617 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87617/testReport)** for PR 20651 at commit [`e04bf24`](https://github.com/apache/spark/commit/e04bf24ed0a712aa380de8dd929c7c1d4fdb2990).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20651 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87617/ Test PASSed.
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20651 Merged build finished. Test PASSed.
[GitHub] spark issue #20362: [Spark-22886][ML][TESTS] ML test for structured streamin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20362 **[Test build #4126 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4126/testReport)** for PR 20362 at commit [`acfb092`](https://github.com/apache/spark/commit/acfb0920d0c5ae4601580cec633f2c3b87a2673f).
[GitHub] spark pull request #20362: [Spark-22886][ML][TESTS] ML test for structured s...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/20362#discussion_r170047180

--- Diff: mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala ---
@@ -693,7 +766,7 @@ class ALSSuite
     val data = ratings.toDF
     val model = new ALS().fit(data)
     Seq("nan", "NaN", "Nan", "drop", "DROP", "Drop").foreach { s =>
-      model.setColdStartStrategy(s).transform(data)
+      testTransformer[Rating[Int]](data, model.setColdStartStrategy(s), "prediction") { _ => }
--- End diff --

What's the no-op function at the end for -- just because it requires an argument?
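A self-contained sketch of why a trailing no-op can be required; the helper name and signature below are illustrative, not MLlib's actual `testTransformer`. The last parameter is a per-row check, so an empty body means "run the transform, assert nothing per row".

```scala
object NoOpCheckDemo extends App {
  // The check runs against every produced row; the helper requires one even
  // when the caller only cares that the transform itself succeeds.
  def checkEachRow[T](rows: Seq[T])(check: T => Unit): Unit = rows.foreach(check)

  val predictions = Seq(0.1f, Float.NaN, 0.3f)

  checkEachRow(predictions) { p => assert(!p.isInfinite) }  // a real per-row assertion
  checkEachRow(predictions) { _ => }                        // no-op: only exercise the pipeline
}
```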
[GitHub] spark pull request #20362: [Spark-22886][ML][TESTS] ML test for structured s...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/20362#discussion_r170046857

--- Diff: mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala ---
@@ -413,34 +411,36 @@ class ALSSuite
       .setSeed(0)
     val alpha = als.getAlpha
     val model = als.fit(training.toDF())
-    val predictions = model.transform(test.toDF()).select("rating", "prediction").rdd.map {
-      case Row(rating: Float, prediction: Float) =>
-        (rating.toDouble, prediction.toDouble)
+    testTransformerByGlobalCheckFunc[Rating[Int]](test.toDF(), model, "rating", "prediction") {
+      case rows: Seq[Row] =>
+        val predictions = rows.map(row => (row.getFloat(0).toDouble, row.getFloat(1).toDouble))
+
+        val rmse =
+          if (implicitPrefs) {
+            // TODO: Use a better (rank-based?) evaluation metric for implicit feedback.
+            // We limit the ratings and the predictions to interval [0, 1] and compute the
+            // weighted RMSE with the confidence scores as weights.
+            val (totalWeight, weightedSumSq) = predictions.map { case (rating, prediction) =>
+              val confidence = 1.0 + alpha * math.abs(rating)
+              val rating01 = math.max(math.min(rating, 1.0), 0.0)
+              val prediction01 = math.max(math.min(prediction, 1.0), 0.0)
+              val err = prediction01 - rating01
+              (confidence, confidence * err * err)
+            }.reduce[(Double, Double)] { case ((c0, e0), (c1, e1)) =>
+              (c0 + c1, e0 + e1)
+            }
+            math.sqrt(weightedSumSq / totalWeight)
+          } else {
+            val errorSquares = predictions.map { case (rating, prediction) =>
+              val err = rating - prediction
+              err * err
+            }
+            val mse = errorSquares.sum / errorSquares.length
+            math.sqrt(mse)
+          }
+        logInfo(s"Test RMSE is $rmse.")
+        assert(rmse < targetRMSE)
     }
-    val rmse =
--- End diff --

Is this change really just a move, or did something else change as well?
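For reference, the weighted RMSE the moved code computes in the implicit-feedback branch, with the confidence as weight and both ratings and predictions clipped to [0, 1]:

```latex
\mathrm{RMSE}_w = \sqrt{\frac{\sum_i c_i\,(\hat p_i - \hat r_i)^2}{\sum_i c_i}},
\qquad c_i = 1 + \alpha\,\lvert r_i\rvert,
\qquad \hat x = \min(\max(x, 0),\, 1)
```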
[GitHub] spark pull request #20362: [Spark-22886][ML][TESTS] ML test for structured s...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/20362#discussion_r170046788

--- Diff: mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala ---
@@ -586,6 +586,68 @@ class ALSSuite
     allModelParamSettings, checkModelData)
   }

+  private def checkNumericTypesALS(
--- End diff --

Is this mostly just moving the code from the test class, or did it change too?
[GitHub] spark issue #20622: [SPARK-23441][SS] Remove queryExecutionThread.interrupt(...
Github user jose-torres commented on the issue: https://github.com/apache/spark/pull/20622 The difference in ContinuousExecution is that the thread isn't doing any metadata work like looking for new batches - it's either running the Spark job or cleaning up after finishing it. In any case, this sounds like something we aren't plausibly going to solve here, so I'll update the PR to just resolve the flakiness in the way you suggest.
[GitHub] spark issue #10942: [SPARK-12850] [SQL] Support Bucket Pruning (Predicate Pu...
Github user lonehacker commented on the issue: https://github.com/apache/spark/pull/10942 @gatorsmile Can you confirm whether this feature is active in the current master? It seems this code was removed in "[SPARK-14535][SQL] Remove buildInternalScan from FileFormat".
[GitHub] spark issue #20645: SPARK-23472: Add defaultJavaOptions for drivers and exec...
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/20645 I like the `ConfigBuilder` approach. That would make this much more useful. I'll add an implementation like that. I think the append-option syntax would be confusing for users and administrators. We need to make sure it is not easy to overwrite the default options, and we can't expect users to know they need to append, or how to do it. Administrators could add `++=` options, but then we would be applying these in the wrong order to get the behavior we're after, which is bad because order matters in some cases. I like the two-property approach better because it is simpler and more predictable.
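A rough Scala sketch of the two-property approach described here; the `spark.executor.defaultJavaOptions` key follows the PR title, while the merge helper is an assumption about how defaults and user options would be concatenated (defaults first, so later user flags win where the JVM honors the last occurrence of a flag):

```scala
object JavaOptionsMergeDemo extends App {
  val conf = Map(
    "spark.executor.defaultJavaOptions" -> "-XX:+UseG1GC -verbose:gc",  // set by admins
    "spark.executor.extraJavaOptions"   -> "-XX:MaxGCPauseMillis=100"   // set by users
  )

  // Defaults first, user options appended: a fixed, predictable order.
  def mergedJavaOptions(conf: Map[String, String]): String =
    Seq(conf.get("spark.executor.defaultJavaOptions"),
        conf.get("spark.executor.extraJavaOptions")).flatten.mkString(" ")

  println(mergedJavaOptions(conf))
  // -XX:+UseG1GC -verbose:gc -XX:MaxGCPauseMillis=100
}
```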
[GitHub] spark issue #20622: [SPARK-23491][SS] Remove explicit job cancellation from ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20622 **[Test build #87620 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87620/testReport)** for PR 20622 at commit [`0e5e52f`](https://github.com/apache/spark/commit/0e5e52f2c6b934a372d098a0d7780da18d3f99e0).
[GitHub] spark issue #20660: [SPARK-23490][SQL]Check storage.locationUri with existin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20660 **[Test build #87618 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87618/testReport)** for PR 20660 at commit [`d72f56d`](https://github.com/apache/spark/commit/d72f56d2cb4a4a3ba56f1013d0ddd21b145d299c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #20660: [SPARK-23490][SQL]Check storage.locationUri with existin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20660 Merged build finished. Test PASSed.
[GitHub] spark issue #20660: [SPARK-23490][SQL]Check storage.locationUri with existin...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20660 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87618/ Test PASSed.
[GitHub] spark pull request #20651: [SPARK-23475][UI] Show also skipped stages
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/20651#discussion_r170057919

--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllStagesPage.scala ---
@@ -172,6 +185,20 @@ private[ui] class AllStagesPage(parent: StagesTab) extends WebUIPage("") {
       {completedStagesTable.toNodeSeq}
     }

+    if (shouldShowSkippedStages) {
+      content ++=
[GitHub] spark issue #20657: [SPARK-23361][yarn] Allow AM to restart after initial to...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/20657 retest this please
[GitHub] spark issue #20657: [SPARK-23361][yarn] Allow AM to restart after initial to...
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/20657 Known flaky (SPARK-23458).
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/20651 LGTM, merging to master / 2.3.
[GitHub] spark issue #20651: [SPARK-23475][UI] Show also skipped stages
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/20651 Merged to master, but there were conflicts in 2.3; please open a separate PR for the backport.
[GitHub] spark issue #20657: [SPARK-23361][yarn] Allow AM to restart after initial to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20657 Merged build finished. Test PASSed.
[GitHub] spark issue #20657: [SPARK-23361][yarn] Allow AM to restart after initial to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20657 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1011/ Test PASSed.
[GitHub] spark issue #20657: [SPARK-23361][yarn] Allow AM to restart after initial to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20657 **[Test build #87621 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87621/testReport)** for PR 20657 at commit [`2c3448d`](https://github.com/apache/spark/commit/2c3448dd3aa4071234a65a1c9317b1a3c4fe8d24).
[GitHub] spark pull request #20651: [SPARK-23475][UI] Show also skipped stages
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20651
[GitHub] spark issue #20362: [Spark-22886][ML][TESTS] ML test for structured streamin...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20362 **[Test build #4126 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4126/testReport)** for PR 20362 at commit [`acfb092`](https://github.com/apache/spark/commit/acfb0920d0c5ae4601580cec633f2c3b87a2673f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r170068231

--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/memory/MemoryBlock.java ---
@@ -45,38 +44,149 @@
    */
   public static final int FREED_IN_ALLOCATOR_PAGE_NUMBER = -3;

-  private final long length;
+  @Nullable
+  protected Object obj;
+
+  protected long offset;
+
+  protected long length;

   /**
    * Optional page number; used when this MemoryBlock represents a page allocated by a
-   * TaskMemoryManager. This field is public so that it can be modified by the TaskMemoryManager,
-   * which lives in a different package.
+   * TaskMemoryManager. This field can be updated using setPageNumber method so that
+   * this can be modified by the TaskMemoryManager, which lives in a different package.
    */
-  public int pageNumber = NO_PAGE_NUMBER;
+  private int pageNumber = NO_PAGE_NUMBER;

   public MemoryBlock(@Nullable Object obj, long offset, long length) {
-    super(obj, offset);
+    this.obj = obj;
+    this.offset = offset;
     this.length = length;
   }

+  public MemoryBlock() {
+    this(null, 0, 0);
+  }
+
+  public final Object getBaseObject() {
+    return obj;
+  }
+
+  public final long getBaseOffset() {
+    return offset;
+  }
+
+  public void resetObjAndOffset() {
+    this.obj = null;
+    this.offset = 0;
+  }
+
   /**
    * Returns the size of the memory block.
    */
-  public long size() {
+  public final long size() {
     return length;
   }

-  /**
-   * Creates a memory block pointing to the memory used by the long array.
-   */
-  public static MemoryBlock fromLongArray(final long[] array) {
-    return new MemoryBlock(array, Platform.LONG_ARRAY_OFFSET, array.length * 8L);
+  public final void setPageNumber(int pageNum) {
+    pageNumber = pageNum;
+  }
+
+  public final int getPageNumber() {
+    return pageNumber;
   }

   /**
    * Fills the memory block with the specified byte value.
    */
-  public void fill(byte value) {
+  public final void fill(byte value) {
     Platform.setMemory(obj, offset, length, value);
   }
+
+  /**
+   * Instantiate MemoryBlock for given object type with new offset
+   */
+  public final static MemoryBlock allocateFromObject(Object obj, long offset, long length) {
+    MemoryBlock mb = null;
+    if (obj instanceof byte[]) {
+      byte[] array = (byte[])obj;
+      mb = new ByteArrayMemoryBlock(array, offset, length);
+    } else if (obj instanceof long[]) {
+      long[] array = (long[])obj;
+      mb = new OnHeapMemoryBlock(array, offset, length);
+    } else if (obj == null) {
+      // we assume that to pass null pointer means off-heap
+      mb = new OffHeapMemoryBlock(offset, length);
+    } else {
+      throw new UnsupportedOperationException(obj.getClass() + " is not supported now");
+    }
+    return mb;
+  }
+
+  /**
+   * Instantiate the same type of MemoryBlock with new offset and size
+   */
+  public abstract MemoryBlock allocate(long offset, long size);
+
+
+  public abstract int getInt(long offset);
--- End diff --

Yeah, this is a part we should clean up. It is derived from the conventional usage. Let me think about this cleanup.
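A standalone Scala sketch of the dispatch pattern in the quoted `allocateFromObject`; the block types below are illustrative stand-ins for the real classes in `org.apache.spark.unsafe.memory`, not the PR's API:

```scala
sealed trait Block
case class ByteArrayBlock(array: Array[Byte], offset: Long, length: Long) extends Block
case class OnHeapBlock(array: Array[Long], offset: Long, length: Long) extends Block
case class OffHeapBlock(address: Long, length: Long) extends Block

// Mirrors the branching above: byte[] and long[] base objects stay on-heap,
// and a null base object is taken to mean directly addressed off-heap memory.
def allocateFromObject(obj: AnyRef, offset: Long, length: Long): Block = obj match {
  case bytes: Array[Byte] => ByteArrayBlock(bytes, offset, length)
  case longs: Array[Long] => OnHeapBlock(longs, offset, length)
  case null               => OffHeapBlock(offset, length)
  case other => throw new UnsupportedOperationException(s"${other.getClass} is not supported now")
}
```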
[GitHub] spark issue #20652: [SPARK-23476][CORE] Generate secret in local mode when a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20652 **[Test build #87619 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87619/testReport)** for PR 20652 at commit [`e01feef`](https://github.com/apache/spark/commit/e01feeff30ed226d638055fed0238ef0e1c0c69e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark issue #20652: [SPARK-23476][CORE] Generate secret in local mode when a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20652 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87619/ Test PASSed.
[GitHub] spark issue #20652: [SPARK-23476][CORE] Generate secret in local mode when a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20652 Merged build finished. Test PASSed.