[GitHub] [spark] AmplabJenkins removed a comment on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
AmplabJenkins removed a comment on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602418228 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120177/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
AmplabJenkins removed a comment on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602418223 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
AmplabJenkins commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602418223 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
SparkQA removed a comment on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602396898 **[Test build #120177 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120177/testReport)** for PR 27982 at commit [`cd968ff`](https://github.com/apache/spark/commit/cd968ffe90aef52e37acdb37d5fc6261143fb20c).
[GitHub] [spark] AmplabJenkins commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
AmplabJenkins commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602418228 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120177/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
SparkQA commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602417833 **[Test build #120177 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120177/testReport)** for PR 27982 at commit [`cd968ff`](https://github.com/apache/spark/commit/cd968ffe90aef52e37acdb37d5fc6261143fb20c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27984: [SPARK-31224][SQL] Add view support to SHOW CREATE TABLE
AmplabJenkins removed a comment on issue #27984: [SPARK-31224][SQL] Add view support to SHOW CREATE TABLE URL: https://github.com/apache/spark/pull/27984#issuecomment-602415068 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24893/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27984: [SPARK-31224][SQL] Add view support to SHOW CREATE TABLE
AmplabJenkins removed a comment on issue #27984: [SPARK-31224][SQL] Add view support to SHOW CREATE TABLE URL: https://github.com/apache/spark/pull/27984#issuecomment-602415062 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27984: [SPARK-31224][SQL] Add view support to SHOW CREATE TABLE
AmplabJenkins commented on issue #27984: [SPARK-31224][SQL] Add view support to SHOW CREATE TABLE URL: https://github.com/apache/spark/pull/27984#issuecomment-602415068 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24893/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27984: [SPARK-31224][SQL] Add view support to SHOW CREATE TABLE
AmplabJenkins commented on issue #27984: [SPARK-31224][SQL] Add view support to SHOW CREATE TABLE URL: https://github.com/apache/spark/pull/27984#issuecomment-602415062 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27984: [SPARK-31224][SQL] Add view support to SHOW CREATE TABLE
SparkQA commented on issue #27984: [SPARK-31224][SQL] Add view support to SHOW CREATE TABLE URL: https://github.com/apache/spark/pull/27984#issuecomment-602414660 **[Test build #120180 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120180/testReport)** for PR 27984 at commit [`b49db3b`](https://github.com/apache/spark/commit/b49db3b939e918680639b8ef30fb7fdf533e3578).
[GitHub] [spark] viirya opened a new pull request #27984: [SPARK-31224][SQL] Add view support to SHOW CREATE TABLE
viirya opened a new pull request #27984: [SPARK-31224][SQL] Add view support to SHOW CREATE TABLE
URL: https://github.com/apache/spark/pull/27984

### What changes were proposed in this pull request?
Currently the `SHOW CREATE TABLE` command does not support views, but `SHOW CREATE TABLE AS SERDE` does. Since the view syntax is the same in Hive DDL and Spark DDL, both commands should be able to support views. This patch adds view support to `SHOW CREATE TABLE`.

### Why are the changes needed?
To extend the view support of `SHOW CREATE TABLE`, so that users can use `SHOW CREATE TABLE` to show the Spark DDL for views.

### Does this PR introduce any user-facing change?
Yes. `SHOW CREATE TABLE` can be used to show the Spark DDL for views.

### How was this patch tested?
Unit tests.
[GitHub] [spark] SparkQA commented on issue #27617: [SPARK-30865][SQL] Refactor DateTimeUtils
SparkQA commented on issue #27617: [SPARK-30865][SQL] Refactor DateTimeUtils URL: https://github.com/apache/spark/pull/27617#issuecomment-602409929 **[Test build #120179 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120179/testReport)** for PR 27617 at commit [`8669e29`](https://github.com/apache/spark/commit/8669e294ac38592b41cc733b87c176151801b7a3).
[GitHub] [spark] AmplabJenkins removed a comment on issue #27617: [SPARK-30865][SQL] Refactor DateTimeUtils
AmplabJenkins removed a comment on issue #27617: [SPARK-30865][SQL] Refactor DateTimeUtils URL: https://github.com/apache/spark/pull/27617#issuecomment-602408297 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24892/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27617: [SPARK-30865][SQL] Refactor DateTimeUtils
AmplabJenkins removed a comment on issue #27617: [SPARK-30865][SQL] Refactor DateTimeUtils URL: https://github.com/apache/spark/pull/27617#issuecomment-602408288 Merged build finished. Test PASSed.
[GitHub] [spark] cloud-fan closed pull request #27974: [SPARK-31211][SQL] Fix rebasing of 29 February of Julian leap years
cloud-fan closed pull request #27974: [SPARK-31211][SQL] Fix rebasing of 29 February of Julian leap years URL: https://github.com/apache/spark/pull/27974
[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF URL: https://github.com/apache/spark/pull/27937#issuecomment-602408130 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120173/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27617: [SPARK-30865][SQL] Refactor DateTimeUtils
AmplabJenkins commented on issue #27617: [SPARK-30865][SQL] Refactor DateTimeUtils URL: https://github.com/apache/spark/pull/27617#issuecomment-602408297 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24892/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
AmplabJenkins removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF URL: https://github.com/apache/spark/pull/27937#issuecomment-602408123 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27617: [SPARK-30865][SQL] Refactor DateTimeUtils
AmplabJenkins commented on issue #27617: [SPARK-30865][SQL] Refactor DateTimeUtils URL: https://github.com/apache/spark/pull/27617#issuecomment-602408288 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF URL: https://github.com/apache/spark/pull/27937#issuecomment-602408123 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
AmplabJenkins commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF URL: https://github.com/apache/spark/pull/27937#issuecomment-602408130 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120173/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
SparkQA removed a comment on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF URL: https://github.com/apache/spark/pull/27937#issuecomment-602324100 **[Test build #120173 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120173/testReport)** for PR 27937 at commit [`8e82f3f`](https://github.com/apache/spark/commit/8e82f3f75a770fc9c6163a483f297eac38c30edd).
[GitHub] [spark] MaxGekk commented on issue #27617: [SPARK-30865][SQL] Refactor DateTimeUtils
MaxGekk commented on issue #27617: [SPARK-30865][SQL] Refactor DateTimeUtils URL: https://github.com/apache/spark/pull/27617#issuecomment-602407687 jenkins, retest this, please
[GitHub] [spark] SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF
SparkQA commented on issue #27937: [SPARK-30127][SQL] Support case class parameter for typed Scala UDF URL: https://github.com/apache/spark/pull/27937#issuecomment-602407564 **[Test build #120173 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120173/testReport)** for PR 27937 at commit [`8e82f3f`](https://github.com/apache/spark/commit/8e82f3f75a770fc9c6163a483f297eac38c30edd). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] MaxGekk commented on issue #27617: [SPARK-30865][SQL] Refactor DateTimeUtils
MaxGekk commented on issue #27617: [SPARK-30865][SQL] Refactor DateTimeUtils URL: https://github.com/apache/spark/pull/27617#issuecomment-602407299 The issue will be fixed by https://github.com/apache/spark/pull/27980. I am going to re-trigger the build and hope that the tests will not generate a year that is a leap year in the Julian calendar but not in the Gregorian calendar.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27983: [SPARK-15694][SQL][FOLLOW-UP] Implement ScriptTransformation in sql/core
AmplabJenkins removed a comment on issue #27983: [SPARK-15694][SQL][FOLLOW-UP] Implement ScriptTransformation in sql/core URL: https://github.com/apache/spark/pull/27983#issuecomment-602405596 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #27983: [SPARK-15694][SQL][FOLLOW-UP] Implement ScriptTransformation in sql/core
AmplabJenkins commented on issue #27983: [SPARK-15694][SQL][FOLLOW-UP] Implement ScriptTransformation in sql/core URL: https://github.com/apache/spark/pull/27983#issuecomment-602405926 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #27983: [SPARK-15694][SQL][FOLLOW-UP] Implement ScriptTransformation in sql/core
AmplabJenkins commented on issue #27983: [SPARK-15694][SQL][FOLLOW-UP] Implement ScriptTransformation in sql/core URL: https://github.com/apache/spark/pull/27983#issuecomment-602405596 Can one of the admins verify this patch?
[GitHub] [spark] AngersZhuuuu commented on issue #27983: [SPARK-15694][SQL][FOLLOW-UP] Implement ScriptTransformation in sql/core
AngersZhuuuu commented on issue #27983: [SPARK-15694][SQL][FOLLOW-UP] Implement ScriptTransformation in sql/core URL: https://github.com/apache/spark/pull/27983#issuecomment-602405307 @gatorsmile @tejasapatil @wangyum I have restarted implementing ScriptTransform in sql/core, rewriting the original PR against the current code. I hope for your review and suggestions.
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #27983: [SPARK-15694][SQL][FOLLOW-UP] Implement ScriptTransformation in sql/core (part 1)
AngersZhuuuu commented on a change in pull request #27983: [SPARK-15694][SQL][FOLLOW-UP] Implement ScriptTransformation in sql/core (part 1)
URL: https://github.com/apache/spark/pull/27983#discussion_r396228748

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/script/ScriptTrnasformationExecSuite.scala

@@ -0,0 +1,195 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.script
+
+import java.sql.{Date, Timestamp}
+
+import org.scalatest.Assertions._
+import org.scalatest.BeforeAndAfterEach
+import org.scalatest.exceptions.TestFailedException
+
+import org.apache.spark.{SparkException, TaskContext, TestUtils}
+import org.apache.spark.rdd.RDD
+import org.apache.spark.sql.Column
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference}
+import org.apache.spark.sql.catalyst.plans.physical.Partitioning
+import org.apache.spark.sql.execution.{SparkPlan, SparkPlanTest, UnaryExecNode}
+import org.apache.spark.sql.test.{SharedSparkSession, SQLTestUtils}
+import org.apache.spark.sql.types.StringType
+
+class ScriptTransformationSuite extends SparkPlanTest with SharedSparkSession
+  with BeforeAndAfterEach {
+  import testImplicits._
+
+  private val noSerdeIOSchema = new ScriptTransformIOSchema(
+    inputRowFormat = Seq.empty,
+    outputRowFormat = Seq.empty,
+    inputSerdeClass = None,
+    outputSerdeClass = None,
+    inputSerdeProps = Seq.empty,
+    outputSerdeProps = Seq.empty,
+    recordReaderClass = None,
+    recordWriterClass = None,
+    schemaLess = false
+  )
+
+  private var defaultUncaughtExceptionHandler: Thread.UncaughtExceptionHandler = _
+
+  private val uncaughtExceptionHandler = new TestUncaughtExceptionHandler
+
+  protected override def beforeAll(): Unit = {
+    super.beforeAll()
+    defaultUncaughtExceptionHandler = Thread.getDefaultUncaughtExceptionHandler
+    Thread.setDefaultUncaughtExceptionHandler(uncaughtExceptionHandler)
+  }
+
+  protected override def afterAll(): Unit = {
+    super.afterAll()
+    Thread.setDefaultUncaughtExceptionHandler(defaultUncaughtExceptionHandler)
+  }
+
+  override protected def afterEach(): Unit = {
+    super.afterEach()
+    uncaughtExceptionHandler.cleanStatus()
+  }
+
+  test("cat without SerDe") {
+    assume(TestUtils.testCommandAvailable("/bin/bash"))
+
+    val rowsDf = Seq("a", "b", "c").map(Tuple1.apply).toDF("a")
+    checkAnswer(
+      rowsDf,
+      (child: SparkPlan) => new ScriptTransformationExec(
+        input = Seq(rowsDf.col("a").expr),
+        script = "cat",
+        output = Seq(AttributeReference("a", StringType)()),
+        child = child,
+        ioschema = noSerdeIOSchema
+      ),
+      rowsDf.collect())
+    assert(uncaughtExceptionHandler.exception.isEmpty)
+  }
+
+  test("script transformation should not swallow errors from upstream operators (no serde)") {
+    assume(TestUtils.testCommandAvailable("/bin/bash"))
+
+    val rowsDf = Seq("a", "b", "c").map(Tuple1.apply).toDF("a")
+    val e = intercept[TestFailedException] {
+      checkAnswer(
+        rowsDf,
+        (child: SparkPlan) => new ScriptTransformationExec(
+          input = Seq(rowsDf.col("a").expr),
+          script = "cat",
+          output = Seq(AttributeReference("a", StringType)()),
+          child = ExceptionInjectingOperator(child),
+          ioschema = noSerdeIOSchema
+        ),
+        rowsDf.collect())
+    }
+    assert(e.getMessage().contains("intentional exception"))
+    // Before SPARK-25158, uncaughtExceptionHandler will catch IllegalArgumentException
+    assert(uncaughtExceptionHandler.exception.isEmpty)
+  }
+
+
+  test("SPARK-14400 script transformation should fail for bad script command") {
+    assume(TestUtils.testCommandAvailable("/bin/bash"))
+
+    val rowsDf = Seq("a", "b", "c").map(Tuple1.apply).toDF("a")
+
+    val e = intercept[SparkException] {
+      val plan =
+        new ScriptTransformationExec(
+          input = Seq(rowsDf.col("a").expr),
+          script = "some_non_existent_command",
+          output = Seq(AttributeReference("a", StringType)()),
+          child = rowsDf.queryExecutio
[GitHub] [spark] AngersZhuuuu opened a new pull request #27983: [SPARK-15694][SQL][FOLLOW-UP] Implement ScriptTransformation in sql/core (part 1)
AngersZh opened a new pull request #27983: [SPARK-15694][SQL][FOLLOW-UP] Implement ScriptTransformation in sql/core (part 1) URL: https://github.com/apache/spark/pull/27983 ### What changes were proposed in this pull request? * Renamed `hive/execution/ScriptTransformationExec` to `hive/execution/script/HiveScriptTransformationExec` * Added ScriptTransformationExec which would run script operator in SQL mode (without Hive). The output of script would be read as a string and column values are extracted by using a delimiter (default : tab character) * `ScriptTransformBase` has common code used across `ScriptTransformationExec` and `HiveScriptTransformationExec` * For thread writing data to script, ScriptTransformationWriterThread has the core logic. HiveScriptTransformationWriterThread extends that for Hive specific stuff. * `ScriptTransformationWriterThread` will be used for Spark SQL. It only supports writing data to script process by serializing column values as tab delimited string * `HiveScriptTransformationWriterThread` additionally supports Hive serde * Added a Strategy named Scripts which would emit ScriptTransformationExec in physical plans. This would be used in non-Hive mode. Todo List; - For Hive, by default only serde's must be used, and for without hive can't use serde - Cleanup past hacks that are observed (and people suggest / report) - support use transform with aggregation [SPARK-28227](https://issues.apache.org/jira/browse/SPARK-28227) - support array/map as transform's input [SPARK-22435](https://issues.apache.org/jira/browse/SPARK-22435) - Use code-gen projection to serialize rows to output stream() ### Why are the changes needed? Support run transform in SQL mode without hive ### Does this PR introduce any user-facing change? Yes ### How was this patch tested? Added UT This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
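The tab-delimited exchange described in the pull request above can be illustrated with a small sketch. This is plain Python rather than Spark code: the helper names `serialize_row` and `parse_output_line` are hypothetical, and NULL handling and escaping, which the description does not cover, are left out.

```python
from typing import List, Sequence

DELIMITER = "\t"  # default column delimiter named in the PR description


def serialize_row(row: Sequence[object], delimiter: str = DELIMITER) -> str:
    """Join column values into one delimited line to feed to the script's stdin."""
    return delimiter.join(str(value) for value in row) + "\n"


def parse_output_line(line: str, delimiter: str = DELIMITER) -> List[str]:
    """Split one line of script output back into string column values."""
    return line.rstrip("\n").split(delimiter)


# Round trip through an identity "script" (hypothetical example data).
line = serialize_row([1, "spark", 3.5])
columns = parse_output_line(line)
```

Every column comes back as a string, which matches the description that the script output is read as a string before column values are extracted.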
[GitHub] [spark] cloud-fan commented on issue #27974: [SPARK-31211][SQL] Fix rebasing of 29 February of Julian leap years
cloud-fan commented on issue #27974: [SPARK-31211][SQL] Fix rebasing of 29 February of Julian leap years URL: https://github.com/apache/spark/pull/27974#issuecomment-602403292 This makes me realize that we can't be fully compatible, and picking the closest valid date in Proleptic Gregorian calendar makes sense to me. Thanks, merging to master/3.0!
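The incompatibility mentioned here arises because the Julian calendar treats every fourth year as a leap year, while the Proleptic Gregorian calendar skips century years that are not divisible by 400, so a date such as 1000-02-29 exists only in the former. A minimal Python sketch of the "closest valid date" idea follows. Mapping the missing leap day to 1 March is an assumption made for illustration (the comment does not say which neighbouring day is chosen), and `closest_gregorian_date` is a hypothetical helper, not Spark's rebasing code.

```python
import calendar
from datetime import date


def is_julian_leap(year: int) -> bool:
    """Julian rule: every year divisible by 4 is a leap year."""
    return year % 4 == 0


def closest_gregorian_date(year: int, month: int, day: int) -> date:
    """Keep the Julian date's fields unless they name a day missing in Proleptic Gregorian."""
    # calendar.isleap applies the Gregorian rule (century years need % 400 == 0).
    if month == 2 and day == 29 and not calendar.isleap(year):
        return date(year, 3, 1)  # assumption: shift forward to the next valid day
    return date(year, month, day)


shifted = closest_gregorian_date(1000, 2, 29)  # Julian leap year, not Gregorian
kept = closest_gregorian_date(1600, 2, 29)     # leap year in both calendars
```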
[GitHub] [spark] AmplabJenkins removed a comment on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
AmplabJenkins removed a comment on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602399308 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24891/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
AmplabJenkins commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602399307 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
AmplabJenkins removed a comment on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602399307 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
AmplabJenkins commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602399308 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24891/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
SparkQA commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602398991 **[Test build #120178 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120178/testReport)** for PR 27982 at commit [`594b830`](https://github.com/apache/spark/commit/594b830450a15c67746a47cc37f4baa01354).
[GitHub] [spark] MaxGekk commented on issue #27980: [SPARK-31221][SQL] Rebase any date-times in conversions to/from Java types
MaxGekk commented on issue #27980: [SPARK-31221][SQL] Rebase any date-times in conversions to/from Java types URL: https://github.com/apache/spark/pull/27980#issuecomment-602398253 @cloud-fan @HyukjinKwon Please, review this PR.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
AmplabJenkins removed a comment on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602397229 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24890/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
AmplabJenkins removed a comment on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602397225 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
AmplabJenkins commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602397225 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
AmplabJenkins commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602397229 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24890/ Test PASSed.
[GitHub] [spark] zhengruifeng commented on a change in pull request #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
zhengruifeng commented on a change in pull request #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#discussion_r396222712

## File path: mllib/src/test/scala/org/apache/spark/ml/stat/ANOVATestSuite.scala ##

@@ -144,22 +144,30 @@ class ANOVATestSuite
   }
 
   test("test DataFrame with sparse vector") {
-    val df = spark.createDataFrame(Seq(
-      (3, Vectors.sparse(6, Array((0, 6.0), (1, 7.0), (3, 7.0), (4, 6.0)))),
-      (1, Vectors.sparse(6, Array((1, 9.0), (2, 6.0), (4, 5.0), (5, 9.0)))),
-      (3, Vectors.sparse(6, Array((1, 9.0), (2, 3.0), (4, 5.0), (5, 5.0)))),
-      (2, Vectors.dense(Array(0.0, 9.0, 8.0, 5.0, 6.0, 4.0))),
-      (2, Vectors.dense(Array(8.0, 9.0, 6.0, 5.0, 4.0, 4.0))),
-      (3, Vectors.dense(Array(8.0, 9.0, 6.0, 4.0, 0.0, 0.0)))
-    )).toDF("label", "features")
+    val data = Seq(
+      (3, Vectors.dense(Array(6.0, 7.0, 0.0, 7.0, 6.0, 0.0, 0.0))),
+      (1, Vectors.dense(Array(0.0, 9.0, 6.0, 0.0, 5.0, 9.0, 0.0))),
+      (3, Vectors.dense(Array(0.0, 9.0, 3.0, 0.0, 5.0, 5.0, 0.0))),
+      (2, Vectors.dense(Array(0.0, 9.0, 8.0, 5.0, 6.0, 4.0, 0.0))),
+      (2, Vectors.dense(Array(8.0, 9.0, 6.0, 5.0, 4.0, 4.0, 0.0))),
+      (3, Vectors.dense(Array(8.0, 9.0, 6.0, 4.0, 0.0, 0.0, 0.0))))
 
-    val anovaResult = ANOVATest.test(df, "features", "label")
-    val (pValues: Vector, fValues: Vector) =
-      anovaResult.select("pValues", "fValues")
-        .as[(Vector, Vector)].head()
-    assert(pValues ~== Vectors.dense(0.71554175, 0.71554175, 0.34278574, 0.45824059, 0.84633632,
-      0.15673368) relTol 1e-6)
-    assert(fValues ~== Vectors.dense(0.375, 0.375, 1.5625, 1.02364865, 0.17647059,
-      3.66) relTol 1e-6)
+    val df1 = spark.createDataFrame(data.map(t => (t._1, t._2.toDense)))
+      .toDF("label", "features")
+    val df2 = spark.createDataFrame(data.map(t => (t._1, t._2.toSparse)))
+      .toDF("label", "features")
+    val df3 = spark.createDataFrame(data.map(t => (t._1, t._2.compressed)))
+      .toDF("label", "features")
+
+    Seq(df1, df2, df3).foreach { df =>
+      val anovaResult = ANOVATest.test(df, "features", "label")
+      val (pValues: Vector, fValues: Vector) =
+        anovaResult.select("pValues", "fValues")
+          .as[(Vector, Vector)].head()
+      assert(pValues ~== Vectors.dense(0.71554175, 0.71554175, 0.34278574, 0.45824059, 0.84633632,
+        0.15673368, Double.NaN) relTol 1e-6)

Review comment:
For a column containing only zero values, sklearn also returns `nan`:

```python
X = np.zeros([3,5])
y = [1,2,3]
f_classif(X, y)
/home/zrf/Applications/anaconda3/lib/python3.7/site-packages/sklearn/feature_selection/_univariate_selection.py:110: RuntimeWarning: invalid value encountered in true_divide
  msw = sswn / float(dfwn)
Out[24]:
(array([nan, nan, nan, nan, nan]), array([nan, nan, nan, nan, nan]))
```
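To see why an all-zero column yields `nan`, here is a sketch of the textbook one-way ANOVA F-value in plain Python. It is not Spark's or sklearn's implementation: `anova_f_value` and `ieee_div` are hypothetical helpers, and `ieee_div` only mimics floating-point division, since Python raises an exception on division by zero instead of returning `nan`.

```python
import math
from collections import defaultdict


def ieee_div(num: float, den: float) -> float:
    """Divide like IEEE 754 floats: 0/0 gives nan, x/0 gives +/-inf."""
    if den == 0.0:
        if num == 0.0 or math.isnan(num):
            return math.nan
        return math.copysign(math.inf, num)
    return num / den


def anova_f_value(values, labels) -> float:
    """F = (between-class mean square) / (within-class mean square) for one feature."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(float(value))
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)
    ms_between = ieee_div(ss_between, k - 1)
    ms_within = ieee_div(ss_within, n - k)
    return ieee_div(ms_between, ms_within)


labels = [3, 1, 3, 2, 2, 3]                      # labels from the test above
first_column = [6.0, 0.0, 0.0, 0.0, 8.0, 8.0]    # feature 0 from the test above
f_first = anova_f_value(first_column, labels)    # close to 0.375
f_zeros = anova_f_value([0.0] * 6, labels)       # 0/0, so nan
```

For the all-zero column both mean squares are exactly zero, so the ratio is 0/0. In the sklearn session quoted above the within-class degrees of freedom are zero as well (three samples, three classes), which produces the same result one step earlier.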
[GitHub] [spark] SparkQA commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
SparkQA commented on issue #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982#issuecomment-602396898 **[Test build #120177 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120177/testReport)** for PR 27982 at commit [`cd968ff`](https://github.com/apache/spark/commit/cd968ffe90aef52e37acdb37d5fc6261143fb20c).
[GitHub] [spark] cloud-fan commented on a change in pull request #27972: [SPARK-31207][CORE] Ensure the total number of blocks to fetch equals to the sum of local/hostLocal/remote blocks
cloud-fan commented on a change in pull request #27972: [SPARK-31207][CORE] Ensure the total number of blocks to fetch equals to the sum of local/hostLocal/remote blocks URL: https://github.com/apache/spark/pull/27972#discussion_r396222718

## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ##

@@ -312,19 +310,57 @@ final class ShuffleBlockFetcherIterator(
         hostLocalBlocks ++= blocksForAddress.map(info => (info._1, info._3))
         hostLocalBlockBytes += mergedBlockInfos.map(_.size).sum
       } else {
-        numRemoteBlocks += blockInfos.size
         remoteBlockBytes += blockInfos.map(_._2).sum
         collectFetchRequests(address, blockInfos, collectedRemoteRequests)
       }
     }
+    val numRemoteBlocks = collectedRemoteRequests.map(_.blocks.size).sum
     val totalBytes = localBlockBytes + remoteBlockBytes + hostLocalBlockBytes
+    assert(numBlocksToFetch == localBlocks.size + hostLocalBlocks.size + numRemoteBlocks,
+      s"The number of non-empty blocks $numBlocksToFetch doesn't equal to the number of local " +
+        s"blocks ${localBlocks.size} + the number of host-local blocks ${hostLocalBlocks.size} " +
+        s"+ the number of remote blocks ${numRemoteBlocks}.")
     logInfo(s"Getting $numBlocksToFetch (${Utils.bytesToString(totalBytes)}) non-empty blocks " +
       s"including ${localBlocks.size} (${Utils.bytesToString(localBlockBytes)}) local and " +
       s"${hostLocalBlocks.size} (${Utils.bytesToString(hostLocalBlockBytes)}) " +
       s"host-local and $numRemoteBlocks (${Utils.bytesToString(remoteBlockBytes)}) remote blocks")
     collectedRemoteRequests
   }
 
+  def createFetchRequest(
+      blocks: Seq[FetchBlockInfo],
+      address: BlockManagerId,
+      curRequestSize: Long): FetchRequest = {
+    logDebug(s"Creating fetch request of $curRequestSize at $address " +
+      s"with ${blocks.size} blocks")
+    FetchRequest(address, blocks)
+  }
+
+  def createFetchRequests(
+      curBlocks: ArrayBuffer[FetchBlockInfo],

Review comment:
Does this parameter have to be an `ArrayBuffer` here? We don't mutate it in this method.
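The consistency check added in this hunk derives the remote block count from the collected requests and compares the grand total with the number of blocks to fetch. A rough Python sketch of that idea follows. The packing rule in `group_into_requests`, the `Block` shape, and both function names are assumptions made for illustration, not Spark's logic. The read-only `Sequence` parameter also echoes the review comment: a function that does not mutate its input need not ask for a mutable buffer.

```python
from typing import List, Sequence, Tuple

Block = Tuple[str, int]  # (block id, size in bytes); hypothetical shape


def group_into_requests(blocks: Sequence[Block], max_request_bytes: int) -> List[List[Block]]:
    """Pack blocks into requests, closing a request once it reaches max_request_bytes."""
    requests: List[List[Block]] = []
    current: List[Block] = []
    current_size = 0
    for block in blocks:
        current.append(block)
        current_size += block[1]
        if current_size >= max_request_bytes:
            requests.append(current)
            current, current_size = [], 0
    if current:
        requests.append(current)  # flush the final partial request
    return requests


def check_block_counts(num_blocks_to_fetch: int, local: Sequence[Block],
                       host_local: Sequence[Block],
                       remote_requests: Sequence[Sequence[Block]]) -> int:
    """Return the remote block count after asserting that no block was lost or duplicated."""
    num_remote = sum(len(request) for request in remote_requests)
    total = len(local) + len(host_local) + num_remote
    assert num_blocks_to_fetch == total, (
        f"expected {num_blocks_to_fetch} blocks, got {len(local)} local + "
        f"{len(host_local)} host-local + {num_remote} remote")
    return num_remote


local = [("a", 10)]
host_local = [("b", 20)]
remote = [("c", 30), ("d", 40), ("e", 50)]
requests = group_into_requests(remote, max_request_bytes=60)
num_remote = check_block_counts(5, local, host_local, requests)
```

Counting remote blocks from the grouped requests, rather than from the input list, is what lets the assertion catch a grouping step that drops or repeats a block.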
[GitHub] [spark] zhengruifeng opened a new pull request #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware
zhengruifeng opened a new pull request #27982: [SPARK-31222][ML] Make ANOVATest Sparsity-Aware URL: https://github.com/apache/spark/pull/27982

### What changes were proposed in this pull request?

When the input dataset is sparse, make `ANOVATest` process only the non-zero values.

### Why are the changes needed?

For performance.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Existing test suites.
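One way to process only the non-zero values is to accumulate per-class sums and sums of squares, to which zeros contribute nothing, and to take the per-class row counts from the labels alone. The sketch below shows this in plain Python for a single feature. It is an illustration under those assumptions, not the code in this pull request, and `sparse_anova_f` is a hypothetical name. It assumes more rows than classes and at least two classes.

```python
import math
from collections import defaultdict
from typing import Dict, Sequence


def sparse_anova_f(column: Dict[int, float], labels: Sequence[int]) -> float:
    """F-value for one feature given only its non-zero entries (row index -> value)."""
    n = len(labels)
    counts = defaultdict(int)
    for label in labels:              # counts come from the labels, zeros included
        counts[label] += 1
    sums = defaultdict(float)
    total_sum = 0.0
    total_sq = 0.0
    for row, value in column.items():  # zero entries are never visited
        sums[labels[row]] += value
        total_sum += value
        total_sq += value * value
    k = len(counts)
    correction = total_sum ** 2 / n
    ss_total = total_sq - correction
    ss_between = sum(sums[label] ** 2 / counts[label] for label in counts) - correction
    ss_within = ss_total - ss_between
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    if ms_within == 0.0:               # emulate floating-point 0/0 and x/0
        return math.nan if ms_between == 0.0 else math.inf
    return ms_between / ms_within


labels = [3, 1, 3, 2, 2, 3]
# Non-zero entries of the dense column [6, 0, 0, 0, 8, 8].
f_sparse = sparse_anova_f({0: 6.0, 4: 8.0, 5: 8.0}, labels)
f_empty = sparse_anova_f({}, labels)   # column with no non-zero entries
```

The work per feature is proportional to the number of non-zero entries plus the number of classes, which is where the performance gain on sparse input would come from.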
[GitHub] [spark] AmplabJenkins removed a comment on issue #27959: [SPARK-31190][SQL] ScalaReflection should erasure non user defined AnyVal type
AmplabJenkins removed a comment on issue #27959: [SPARK-31190][SQL] ScalaReflection should erasure non user defined AnyVal type URL: https://github.com/apache/spark/pull/27959#issuecomment-602394345 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120172/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27959: [SPARK-31190][SQL] ScalaReflection should erasure non user defined AnyVal type
AmplabJenkins removed a comment on issue #27959: [SPARK-31190][SQL] ScalaReflection should erasure non user defined AnyVal type URL: https://github.com/apache/spark/pull/27959#issuecomment-602394335 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27959: [SPARK-31190][SQL] ScalaReflection should erasure non user defined AnyVal type
AmplabJenkins commented on issue #27959: [SPARK-31190][SQL] ScalaReflection should erasure non user defined AnyVal type URL: https://github.com/apache/spark/pull/27959#issuecomment-602394345 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120172/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27959: [SPARK-31190][SQL] ScalaReflection should erasure non user defined AnyVal type
AmplabJenkins commented on issue #27959: [SPARK-31190][SQL] ScalaReflection should erasure non user defined AnyVal type URL: https://github.com/apache/spark/pull/27959#issuecomment-602394335 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27959: [SPARK-31190][SQL] ScalaReflection should erasure non user defined AnyVal type
SparkQA removed a comment on issue #27959: [SPARK-31190][SQL] ScalaReflection should erasure non user defined AnyVal type URL: https://github.com/apache/spark/pull/27959#issuecomment-602314748 **[Test build #120172 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120172/testReport)** for PR 27959 at commit [`ceb2ce6`](https://github.com/apache/spark/commit/ceb2ce6eb49a39ab82568522af369b4e5c3ecd13).
[GitHub] [spark] cloud-fan commented on issue #27977: [SPARK-31183][SQL][FOLLOWUP][3.0] Move rebase tests to `AvroSuite` and check the rebase flag out of function bodies
cloud-fan commented on issue #27977: [SPARK-31183][SQL][FOLLOWUP][3.0] Move rebase tests to `AvroSuite` and check the rebase flag out of function bodies URL: https://github.com/apache/spark/pull/27977#issuecomment-602393765 thanks!
[GitHub] [spark] SparkQA commented on issue #27959: [SPARK-31190][SQL] ScalaReflection should erasure non user defined AnyVal type
SparkQA commented on issue #27959: [SPARK-31190][SQL] ScalaReflection should erasure non user defined AnyVal type URL: https://github.com/apache/spark/pull/27959#issuecomment-602393783 **[Test build #120172 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120172/testReport)** for PR 27959 at commit [`ceb2ce6`](https://github.com/apache/spark/commit/ceb2ce6eb49a39ab82568522af369b4e5c3ecd13).

* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `// `case class Foo(i: Int) extends AnyVal` will return type `Int` instead of `Foo`.`
[GitHub] [spark] cloud-fan commented on issue #27977: [SPARK-31183][SQL][FOLLOWUP][3.0] Move rebase tests to `AvroSuite` and check the rebase flag out of function bodies
cloud-fan commented on issue #27977: [SPARK-31183][SQL][FOLLOWUP][3.0] Move rebase tests to `AvroSuite` and check the rebase flag out of function bodies URL: https://github.com/apache/spark/pull/27977#issuecomment-602393675 thanks, merging to 3.0!
[GitHub] [spark] cloud-fan removed a comment on issue #27977: [SPARK-31183][SQL][FOLLOWUP][3.0] Move rebase tests to `AvroSuite` and check the rebase flag out of function bodies
cloud-fan removed a comment on issue #27977: [SPARK-31183][SQL][FOLLOWUP][3.0] Move rebase tests to `AvroSuite` and check the rebase flag out of function bodies URL: https://github.com/apache/spark/pull/27977#issuecomment-602393675 thanks, merging to 3.0!
[GitHub] [spark] dongjoon-hyun commented on issue #27185: [SPARK-30494][SQL] Fix cached data leakage during replacing an existing view
dongjoon-hyun commented on issue #27185: [SPARK-30494][SQL] Fix cached data leakage during replacing an existing view URL: https://github.com/apache/spark/pull/27185#issuecomment-602392216 Hi, @LantaoJin . Could you make a backport PR against `branch-2.4`?
[GitHub] [spark] dongjoon-hyun closed pull request #27185: [SPARK-30494][SQL] Fix cached data leakage during replacing an existing view
dongjoon-hyun closed pull request #27185: [SPARK-30494][SQL] Fix cached data leakage during replacing an existing view URL: https://github.com/apache/spark/pull/27185
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27185: [SPARK-30494][SQL] Fix the leak of cached data when replace an existing temp view
dongjoon-hyun commented on a change in pull request #27185: [SPARK-30494][SQL] Fix the leak of cached data when replace an existing temp view URL: https://github.com/apache/spark/pull/27185#discussion_r396217365

## File path: sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala ##

@@ -1122,4 +1122,47 @@ class CachedTableSuite extends QueryTest with SQLTestUtils
       assert(!spark.catalog.isCached("t1"))
     }
   }
+
+  test("SPARK-30494 avoid duplicated cached RDD when replace an existing view") {
+    withTempView("tempView") {
+      spark.catalog.clearCache()
+      sql("create or replace temporary view tempView as select 1")
+      sql("cache table tempView")
+      assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1")).isDefined)
+      sql("create or replace temporary view tempView as select 1, 2")
+      assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1")).isEmpty)
+      sql("cache table tempView")
+      assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1, 2")).isDefined)
+      assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1")).isEmpty)
+    }
+
+    withGlobalTempView("tempGlobalTempView") {
+      spark.catalog.clearCache()
+      sql("create or replace global temporary view tempGlobalTempView as select 1")
+      sql("cache table global_temp.tempGlobalTempView")
+      assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1")).isDefined)
+      sql("create or replace global temporary view tempGlobalTempView as select 1, 2")
+      assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1")).isEmpty)
+      sql("cache table global_temp.tempGlobalTempView")
+      assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1, 2")).isDefined)
+      assert(spark.sharedState.cacheManager.lookupCachedData(sql("select 1")).isEmpty)
+    }
+
+    withView("view1") {
+      spark.catalog.clearCache()
+      sql("create or replace view view1 as select 1")
+      sql("cache table view1")
+      sql("create or replace view view1 as select 1, 2")
+      sql("cache table view1")
+      // the cached plan of persisted view likes below,
+      // so we cannot use the same assertion of temp view.
+      // SubqueryAlias
+      //    |
+      //    + View
+      //        |
+      //        + Project[1 AS 1]
+      spark.sharedState.cacheManager.uncacheQuery(spark.table("view1"), cascade = false)
+      assert(spark.sharedState.cacheManager.isEmpty)

Review comment:
Then, please remove this misleading test case between line 1149 and 1165.
> no cached data leak for persisted view
[GitHub] [spark] AmplabJenkins removed a comment on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax
AmplabJenkins removed a comment on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax URL: https://github.com/apache/spark/pull/27916#issuecomment-602383435 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24889/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax
AmplabJenkins removed a comment on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax URL: https://github.com/apache/spark/pull/27916#issuecomment-602383432 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax
AmplabJenkins commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax URL: https://github.com/apache/spark/pull/27916#issuecomment-602383435 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24889/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax
AmplabJenkins commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax URL: https://github.com/apache/spark/pull/27916#issuecomment-602383432 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax
SparkQA commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax URL: https://github.com/apache/spark/pull/27916#issuecomment-602383174 **[Test build #120176 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120176/testReport)** for PR 27916 at commit [`37f441e`](https://github.com/apache/spark/commit/37f441ed424d2ae2088565cf0ad6ddc68f039b85).
[GitHub] [spark] HyukjinKwon commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax
HyukjinKwon commented on issue #27916: [SPARK-30532] DataFrameStatFunctions to work with TABLE.COLUMN syntax URL: https://github.com/apache/spark/pull/27916#issuecomment-602381874 retest this please
[GitHub] [spark] dongjoon-hyun commented on issue #27977: [SPARK-31183][SQL][FOLLOWUP][3.0] Move rebase tests to `AvroSuite` and check the rebase flag out of function bodies
dongjoon-hyun commented on issue #27977: [SPARK-31183][SQL][FOLLOWUP][3.0] Move rebase tests to `AvroSuite` and check the rebase flag out of function bodies URL: https://github.com/apache/spark/pull/27977#issuecomment-602380901 Thank you, @MaxGekk and @HyukjinKwon !
[GitHub] [spark] nchammas commented on a change in pull request #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies
nchammas commented on a change in pull request #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies URL: https://github.com/apache/spark/pull/27928#discussion_r396197611 ## File path: dev/requirements.txt ## @@ -1,5 +1,10 @@ -flake8==3.5.0 +flake8==3.7.* Review comment: I mentioned it elsewhere but I'll mention it again here: Linters like flake8 and pycodestyle introduce new checks in minor/feature releases. There is a very high chance that every new check they introduce will flag new problems and fail the build. In fact, we saw exactly that behavior with pydocstyle just before we removed it. And I [experienced this](https://github.com/nchammas/flintrock/commit/9157b25d735ff6ef690cfdfb761f336bf999fc82) with pycodestyle in Flintrock before [pinning the version](https://github.com/nchammas/flintrock/commit/6c6d9562673b430d142d9062a99e9ac1c87366b8). I don't understand the point of waiting for the build to break before pinning or severely limiting the versions for libraries like these.
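To make the difference between the two specifiers in the diff concrete, here is a minimal Python sketch of how an exact pin (`==3.5.0`) and a wildcard pin (`==3.7.*`) select releases. It is a simplified illustration, not pip's actual resolver: `parse_release`, `strip_trailing_zeros`, `satisfies`, and the `candidates` list are hypothetical names and sample version strings chosen for this example, and only plain numeric versions with the `==` operator are handled.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn a plain numeric version such as '3.7.9' into (3, 7, 9)."""
    return tuple(int(part) for part in version.split("."))


def strip_trailing_zeros(release: Tuple[int, ...]) -> Tuple[int, ...]:
    """Treat '3.5' and '3.5.0' as the same release."""
    while len(release) > 1 and release[-1] == 0:
        release = release[:-1]
    return release


def satisfies(version: str, specifier: str) -> bool:
    """Simplified check for '==X.Y.Z' (exact) and '==X.Y.*' (prefix) only."""
    if not specifier.startswith("=="):
        raise ValueError("this sketch only understands '==' specifiers")
    wanted = specifier[2:]
    release = parse_release(version)
    if wanted.endswith(".*"):
        # Wildcard pin: only the leading segments have to match.
        prefix = parse_release(wanted[:-2])
        return release[: len(prefix)] == prefix
    # Exact pin: the whole release has to match, ignoring zero padding.
    return strip_trailing_zeros(release) == strip_trailing_zeros(parse_release(wanted))


# Hypothetical sample of version strings, used only for illustration.
candidates = ["3.5.0", "3.7.0", "3.7.9", "3.8.0"]
allowed_by_wildcard = [v for v in candidates if satisfies(v, "==3.7.*")]
allowed_by_exact_pin = [v for v in candidates if satisfies(v, "==3.5.0")]
print(allowed_by_wildcard)
print(allowed_by_exact_pin)
```

Under the wildcard, patch releases in the 3.7 series are accepted while a new feature release such as 3.8.0 is not, which corresponds to the "severely limiting the versions" option mentioned in the comment; the exact pin accepts a single release only.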
[GitHub] [spark] AmplabJenkins removed a comment on issue #27248: [WIP][SPARK-30538][SQL] Control spark sql output small file by merge small partition
AmplabJenkins removed a comment on issue #27248: [WIP][SPARK-30538][SQL] Control spark sql output small file by merge small partition URL: https://github.com/apache/spark/pull/27248#issuecomment-575452587 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #27248: [WIP][SPARK-30538][SQL] Control spark sql output small file by merge small partition
AmplabJenkins commented on issue #27248: [WIP][SPARK-30538][SQL] Control spark sql output small file by merge small partition URL: https://github.com/apache/spark/pull/27248#issuecomment-602376013 Can one of the admins verify this patch?
[GitHub] [spark] nchammas commented on a change in pull request #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies
nchammas commented on a change in pull request #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies URL: https://github.com/apache/spark/pull/27928#discussion_r396197611 ## File path: dev/requirements.txt ## @@ -1,5 +1,10 @@ -flake8==3.5.0 +flake8==3.7.* Review comment: I mentioned it elsewhere but I'll mention it again here: Linters like flake8 and pycodestyle introduce new checks in minor/feature releases. There is a very high chance that every new check they introduce will flag new problems and fail the build. (In fact, we saw exactly that behavior with pydocstyle just before we removed it. And I experienced this with flake8 in Flintrock before [pinning the version](https://github.com/nchammas/flintrock/blob/52c6c84c9a1845b0ce89ca138172a6ec4cf0d632/requirements/developer.in#L4).) I don't understand the point of waiting for the build to break before pinning or severely limiting the versions for libraries like these.
[GitHub] [spark] nchammas commented on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies
nchammas commented on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies
URL: https://github.com/apache/spark/pull/27928#issuecomment-602375329

> With this change, we will have to maintain and keep `dev/requirements.txt` up to date.

Maybe this is the disconnect between our points of view, because so far I haven't really been following your objections to pinning.

Assuming we pin every library, why do we have to keep `dev/requirements.txt` up-to-date? As long as we can build the docs, run tests, and do whatever else we need to do as part of regular development, that file can remain frozen as-is for years. It's only when we specifically want to use some new feature of, say, Sphinx, that we need to bump versions. But that will happen very rarely, I imagine not more than once every couple of years.

Does that address your concern? Why do you think we'd need to touch that file more often than once in a long while?

> We shouldn't pin `numpy` to encourage people to test the highest versions. It should ideally be `numpy>=1.7` according to `setup.py`.
>
> * `numpy` is an explicit dependency for ML/MLlib in PySpark.

But the specification of numpy in `dev/requirements.txt` is so that we can build our docs. (It seems strange, but yes, numpy is a requirement to build our Python API docs.)

Maybe we can improve this by replacing numpy in `dev/requirements.txt` with a reference to `setup.py`. That way we can track PySpark dependencies (whether for building the docs or for general execution) in one place. This will also pick up the Pandas requirement. How does that sound?

A separate issue I raised earlier is that, if we want to not pin our build/test dependencies, we need to figure out what to do about the Spark Docker image and CI. Either those will also source the unpinned requirements from the same file, or we go back to having the requirements specified in duplicate--with pinned versions for Docker and CI, and without pinned versions for developers.

Obviously, I'd prefer to pin everything and keep it in one place, but if you want to go one of these routes I guess I'll do that. I just want to understand and try to address your objections before going there.
[GitHub] [spark] liupc edited a comment on issue #27871: [SPARK-31105][CORE]Respect sql execution id for FIFO scheduling mode
liupc edited a comment on issue #27871: [SPARK-31105][CORE]Respect sql execution id for FIFO scheduling mode URL: https://github.com/apache/spark/pull/27871#issuecomment-602373911 Thanks @dongjoon-hyun, let's split the scope and add an option to respect `jobGroup`-level priority in the `core` module. I think the congestion issue is serious even in the current approach. This PR is not meant to solve it; I proposed another PR for that: https://github.com/apache/spark/pull/27862 I really think this is helpful for OLAP scenarios, and we have tested it on real workloads at Xiaomi.
[GitHub] [spark] HyukjinKwon commented on issue #27977: [SPARK-31183][SQL][FOLLOWUP][3.0] Move rebase tests to `AvroSuite` and check the rebase flag out of function bodies
HyukjinKwon commented on issue #27977: [SPARK-31183][SQL][FOLLOWUP][3.0] Move rebase tests to `AvroSuite` and check the rebase flag out of function bodies URL: https://github.com/apache/spark/pull/27977#issuecomment-602374308 Merged to branch-3.0.
[GitHub] [spark] HyukjinKwon closed pull request #27977: [SPARK-31183][SQL][FOLLOWUP][3.0] Move rebase tests to `AvroSuite` and check the rebase flag out of function bodies
HyukjinKwon closed pull request #27977: [SPARK-31183][SQL][FOLLOWUP][3.0] Move rebase tests to `AvroSuite` and check the rebase flag out of function bodies URL: https://github.com/apache/spark/pull/27977
[GitHub] [spark] liupc edited a comment on issue #27871: [SPARK-31105][CORE]Respect sql execution id for FIFO scheduling mode
liupc edited a comment on issue #27871: [SPARK-31105][CORE]Respect sql execution id for FIFO scheduling mode URL: https://github.com/apache/spark/pull/27871#issuecomment-602373911 Thanks @dongjoon-hyun, let's split the scope and add an option to respect `jobGroup`-level priority in the `core` module. I really think this is helpful for OLAP scenarios, and that's what we do at Xiaomi.
[GitHub] [spark] liupc commented on issue #27871: [SPARK-31105][CORE]Respect sql execution id for FIFO scheduling mode
liupc commented on issue #27871: [SPARK-31105][CORE]Respect sql execution id for FIFO scheduling mode URL: https://github.com/apache/spark/pull/27871#issuecomment-602373911 Thanks @dongjoon-hyun, let's split the scope and add an option to respect `jobGroup`-level priority in the `core` module. I really think this is helpful for OLAP scenarios.
[GitHub] [spark] HyukjinKwon closed pull request #27898: [SPARK-31141][DSTREAMS][DOC] Add version information to the configuration of Dstreams
HyukjinKwon closed pull request #27898: [SPARK-31141][DSTREAMS][DOC] Add version information to the configuration of Dstreams URL: https://github.com/apache/spark/pull/27898
[GitHub] [spark] HyukjinKwon commented on issue #27898: [SPARK-31141][DSTREAMS][DOC] Add version information to the configuration of Dstreams
HyukjinKwon commented on issue #27898: [SPARK-31141][DSTREAMS][DOC] Add version information to the configuration of Dstreams URL: https://github.com/apache/spark/pull/27898#issuecomment-602372894 Merged to master.
[GitHub] [spark] nchammas commented on a change in pull request #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies
nchammas commented on a change in pull request #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies URL: https://github.com/apache/spark/pull/27928#discussion_r396197611 ## File path: dev/requirements.txt ## @@ -1,5 +1,10 @@ -flake8==3.5.0 +flake8==3.7.* Review comment: I mentioned it elsewhere but I'll mention it again here: Linters like flake8 and pycodestyle introduce new checks in minor/feature releases. There is a very high chance that every new check they introduce will flag new problems and fail the build. (In fact, we saw exactly that behavior with pydocstyle just before we removed it.) I don't understand the point of waiting for the build to break before pinning or severely limiting the versions for libraries like these.
[GitHub] [spark] beliefer commented on issue #27497: [SPARK-30245][SQL][FOLLOWUP] Improve regex expression when pattern not changed
beliefer commented on issue #27497: [SPARK-30245][SQL][FOLLOWUP] Improve regex expression when pattern not changed URL: https://github.com/apache/spark/pull/27497#issuecomment-602364756 @HyukjinKwon I'd be glad to do it.
[GitHub] [spark] SparkQA commented on issue #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL
SparkQA commented on issue #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL URL: https://github.com/apache/spark/pull/27981#issuecomment-602364594 **[Test build #120175 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120175/testReport)** for PR 27981 at commit [`86d4eff`](https://github.com/apache/spark/commit/86d4eff3d3f8d5ba37d22eed81de912d8929309b).
[GitHub] [spark] beliefer commented on a change in pull request #27898: [SPARK-31141][DSTREAMS][DOC] Add version information to the configuration of Dstreams
beliefer commented on a change in pull request #27898: [SPARK-31141][DSTREAMS][DOC] Add version information to the configuration of Dstreams URL: https://github.com/apache/spark/pull/27898#discussion_r396195734 ## File path: docs/configuration.md ## @@ -2483,6 +2483,7 @@ Spark subsystems. spark.streaming.receiver.maxRate and spark.streaming.kafka.maxRatePerPartition Review comment: Thanks for your explanation.
[GitHub] [spark] beliefer commented on issue #27898: [SPARK-31141][DSTREAMS][DOC] Add version information to the configuration of Dstreams
beliefer commented on issue #27898: [SPARK-31141][DSTREAMS][DOC] Add version information to the configuration of Dstreams URL: https://github.com/apache/spark/pull/27898#issuecomment-602363897 @HyukjinKwon OK. I updated them.
[GitHub] [spark] zhengruifeng commented on issue #27979: [SPARK-31138][ML][FOLLOWUP] ANOVA optimization
zhengruifeng commented on issue #27979: [SPARK-31138][ML][FOLLOWUP] ANOVA optimization URL: https://github.com/apache/spark/pull/27979#issuecomment-602363561 Merged to master, thanks @srowen @huaxingao
[GitHub] [spark] AmplabJenkins removed a comment on issue #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL
AmplabJenkins removed a comment on issue #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL URL: https://github.com/apache/spark/pull/27981#issuecomment-602363292 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL
AmplabJenkins removed a comment on issue #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL URL: https://github.com/apache/spark/pull/27981#issuecomment-602363298 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24888/ Test PASSed.
[GitHub] [spark] zhengruifeng closed pull request #27979: [SPARK-31138][ML][FOLLOWUP] ANOVA optimization
zhengruifeng closed pull request #27979: [SPARK-31138][ML][FOLLOWUP] ANOVA optimization URL: https://github.com/apache/spark/pull/27979
[GitHub] [spark] AmplabJenkins commented on issue #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL
AmplabJenkins commented on issue #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL URL: https://github.com/apache/spark/pull/27981#issuecomment-602363292 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL
AmplabJenkins commented on issue #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL URL: https://github.com/apache/spark/pull/27981#issuecomment-602363298 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24888/ Test PASSed.
[GitHub] [spark] beliefer commented on issue #27931: [SPARK-31002][CORE][DOC][FOLLOWUP] Add version information to the configuration of Core
beliefer commented on issue #27931: [SPARK-31002][CORE][DOC][FOLLOWUP] Add version information to the configuration of Core URL: https://github.com/apache/spark/pull/27931#issuecomment-602362761 @HyukjinKwon Thanks!
[GitHub] [spark] beliefer commented on issue #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL
beliefer commented on issue #27981: [SPARK-31215][SQL][DOC] Add version information to the static configuration of SQL URL: https://github.com/apache/spark/pull/27981#issuecomment-602362943 retest this please
[GitHub] [spark] AmplabJenkins removed a comment on issue #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty
AmplabJenkins removed a comment on issue #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty URL: https://github.com/apache/spark/pull/27861#issuecomment-602333577 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty
AmplabJenkins removed a comment on issue #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty URL: https://github.com/apache/spark/pull/27861#issuecomment-602333612 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24887/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty
AmplabJenkins commented on issue #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty URL: https://github.com/apache/spark/pull/27861#issuecomment-602333612 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/24887/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty
AmplabJenkins commented on issue #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty URL: https://github.com/apache/spark/pull/27861#issuecomment-602333577 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty
SparkQA commented on issue #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty URL: https://github.com/apache/spark/pull/27861#issuecomment-602332620 **[Test build #120174 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120174/testReport)** for PR 27861 at commit [`d369cbc`](https://github.com/apache/spark/commit/d369cbc61811f4334f511fcda9e191a89c040e3c).
[GitHub] [spark] HyukjinKwon commented on issue #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty
HyukjinKwon commented on issue #27861: [SPARK-30707][SQL]Window function set partitionSpec as order spec when orderSpec is empty URL: https://github.com/apache/spark/pull/27861#issuecomment-602330738 retest this please
[GitHub] [spark] HyukjinKwon commented on a change in pull request #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage
HyukjinKwon commented on a change in pull request #27690: [SPARK-21514][SQL] Added a new option to use non-blobstore storage when writing into blobstore storage URL: https://github.com/apache/spark/pull/27690#discussion_r396186608

## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala ##
@@ -125,10 +163,22 @@ private[hive] trait SaveAsHiveFile extends DataWritingCommand {
     val stagingDir = hadoopConf.get("hive.exec.stagingdir", ".hive-staging")
     val scratchDir = hadoopConf.get("hive.exec.scratchdir", "/tmp/hive")
+    // Hive sets session_path as HDFS_SESSION_PATH_KEY(_hive.hdfs.session.path) in hive config
+    val sessionScratchDir = externalCatalog.unwrapped.asInstanceOf[HiveExternalCatalog]
+      .client.getConf("_hive.hdfs.session.path", "")

Review comment:
Thanks, @moomindani. How does Hive behave when `_hive.hdfs.session.path` is not set?
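The question about an unset `_hive.hdfs.session.path` comes down to what the caller does with the empty-string default passed to `getConf`. The following is only a minimal sketch of one way to make that case explicit. It uses a plain `Map` in place of the Hive client, and the helper name `resolveSessionDir` and the fallback to `scratchDir` are assumptions made for illustration, not behavior taken from the pull request or from Hive.

```scala
object SessionScratchDirSketch {
  // Key quoted in the diff under review.
  val SessionPathKey = "_hive.hdfs.session.path"

  // Hypothetical helper: treat a missing or blank session path as "not set"
  // and fall back to the scratch dir. The fallback policy is an assumption.
  def resolveSessionDir(conf: Map[String, String], scratchDir: String): String =
    conf.get(SessionPathKey)
      .map(_.trim)
      .filter(_.nonEmpty)
      .getOrElse(scratchDir)

  def main(args: Array[String]): Unit = {
    val scratchDir = "/tmp/hive"
    // Key absent, key present but empty, and key set to an example path.
    println(resolveSessionDir(Map.empty, scratchDir))
    println(resolveSessionDir(Map(SessionPathKey -> ""), scratchDir))
    println(resolveSessionDir(Map(SessionPathKey -> "/tmp/hive/session-dir"), scratchDir))
  }
}
```

Wrapping the lookup in an `Option` keeps the "not set" case distinct from a real path, so a caller cannot accidentally build a staging path from an empty string.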