[GitHub] [spark] AmplabJenkins commented on issue #26438: [SPARK-29408][SQL] Support sign before `interval` in interval literals
AmplabJenkins commented on issue #26438: [SPARK-29408][SQL] Support sign before `interval` in interval literals URL: https://github.com/apache/spark/pull/26438#issuecomment-551866065 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #26438: [SPARK-29408][SQL] Support sign before `interval` in interval literals
SparkQA removed a comment on issue #26438: [SPARK-29408][SQL] Support sign before `interval` in interval literals URL: https://github.com/apache/spark/pull/26438#issuecomment-551584500 **[Test build #113457 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113457/testReport)** for PR 26438 at commit [`f7c0f8c`](https://github.com/apache/spark/commit/f7c0f8c053454bd801c8ba9e7cae200b788be511).
[GitHub] [spark] SparkQA commented on issue #26320: [SPARK-29654][CORE] Add configuration to allow disabling registration of static sources to the metrics system
SparkQA commented on issue #26320: [SPARK-29654][CORE] Add configuration to allow disabling registration of static sources to the metrics system URL: https://github.com/apache/spark/pull/26320#issuecomment-551865517 **[Test build #113464 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113464/testReport)** for PR 26320 at commit [`1a03124`](https://github.com/apache/spark/commit/1a031249ca4de6f46f5a49897db72caecf17555d).
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
AngersZhuuuu commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
URL: https://github.com/apache/spark/pull/26437#discussion_r344219600
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala ##
@@ -171,6 +171,63 @@ case class InSubqueryExec(
   }
 }
 
+/**
+ * The physical node of exists-subquery. This is for support use exists in join's on condition,
+ * since some join type we can't pushdown exists condition, we plan it here
+ */
+case class ExistsExec(child: Expression,
+                      subQuery: String,
+                      plan: BaseSubqueryExec,
+                      exprId: ExprId,
+                      private var resultBroadcast: Broadcast[Boolean] = null)
+  extends ExecSubqueryExpression {
+
+  @transient private var result: Boolean = _
+
+  override def dataType: DataType = BooleanType
+  override def children: Seq[Expression] = child :: Nil
+  override def nullable: Boolean = child.nullable
+  override def toString: String = s"EXISTS ${plan.name}"
+  override def withNewPlan(plan: BaseSubqueryExec): ExistsExec = copy(plan = plan)
+
+  override def semanticEquals(other: Expression): Boolean = other match {
+    case in: ExistsExec => child.semanticEquals(in.child) && plan.sameResult(in.plan)
+    case _ => false
+  }
+
+
+  def updateResult(): Unit = {
+    result = !plan.execute().isEmpty()
Review comment:
> seems like this is better to execute a non-correlated EXISTS subquery. Maybe we should update `RewritePredicateSubquery` to only handle correlated EXISTS subquery. @dilipbiswal what do you think?

Yeah, let's wait for his advice.
[GitHub] [spark] LantaoJin commented on a change in pull request #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint
LantaoJin commented on a change in pull request #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint
URL: https://github.com/apache/spark/pull/25971#discussion_r344220026
## File path: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterHeartbeatEndpoint.scala ##
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.storage
+
+import scala.collection.mutable
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.rpc.{IsolatedRpcEndpoint, RpcCallContext, RpcEnv}
+import org.apache.spark.storage.BlockManagerMessages.{BlockManagerHeartbeat, StopBlockManagerMaster}
+
+/**
+ * Separate heartbeat out of BlockManagerMasterEndpoint due to performance consideration.
+ */
+private[spark] class BlockManagerMasterHeartbeatEndpoint(
+    override val rpcEnv: RpcEnv,
+    isLocal: Boolean,
+    blockManagerInfo: mutable.Map[BlockManagerId, BlockManagerInfo])
+  extends IsolatedRpcEndpoint with Logging {
+
+  override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
+    case BlockManagerHeartbeat(blockManagerId) =>
+      context.reply(heartbeatReceived(blockManagerId))
+
+    case StopBlockManagerMaster =>
Review comment:
Do we need to emphasize **reuse**? All endpoints belonging to BlockManagerMaster should stop themselves when they receive the `StopBlockManagerMaster` event.
[GitHub] [spark] SparkQA commented on issue #26438: [SPARK-29408][SQL] Support sign before `interval` in interval literals
SparkQA commented on issue #26438: [SPARK-29408][SQL] Support sign before `interval` in interval literals URL: https://github.com/apache/spark/pull/26438#issuecomment-551865416 **[Test build #113457 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113457/testReport)** for PR 26438 at commit [`f7c0f8c`](https://github.com/apache/spark/commit/f7c0f8c053454bd801c8ba9e7cae200b788be511).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan
AmplabJenkins removed a comment on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan URL: https://github.com/apache/spark/pull/26167#issuecomment-551862516 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan
AmplabJenkins removed a comment on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan URL: https://github.com/apache/spark/pull/26167#issuecomment-551862531 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113458/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan
AmplabJenkins commented on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan URL: https://github.com/apache/spark/pull/26167#issuecomment-551862516 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan
AmplabJenkins commented on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan URL: https://github.com/apache/spark/pull/26167#issuecomment-551862531 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113458/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan
SparkQA removed a comment on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan URL: https://github.com/apache/spark/pull/26167#issuecomment-551584523 **[Test build #113458 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113458/testReport)** for PR 26167 at commit [`391559d`](https://github.com/apache/spark/commit/391559d9457678eee74f40cba52b09a935b9beb9).
[GitHub] [spark] SparkQA commented on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan
SparkQA commented on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan URL: https://github.com/apache/spark/pull/26167#issuecomment-551861873 **[Test build #113458 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113458/testReport)** for PR 26167 at commit [`391559d`](https://github.com/apache/spark/commit/391559d9457678eee74f40cba52b09a935b9beb9).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
  * `case class DeleteAction(condition: Option[Expression]) extends MergeAction(condition)`
  * `case class Assignment(key: Expression, value: Expression) extends Expression with Unevaluable`
[GitHub] [spark] gaborgsomogyi commented on issue #26436: [MINOR]FsHistoryProvider import cleanup
gaborgsomogyi commented on issue #26436: [MINOR]FsHistoryProvider import cleanup URL: https://github.com/apache/spark/pull/26436#issuecomment-551858374 I've gone through https://github.com/apache/spark/pull/25670 + https://github.com/apache/spark/pull/26397 changes and fixed the import there in order to cover this effort. Do you have something specific in mind?
[GitHub] [spark] cloud-fan commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
cloud-fan commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
URL: https://github.com/apache/spark/pull/26097#discussion_r344210658
## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ##
@@ -2817,4 +2817,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
       }
     }
   }
+
+  test("SPARK-29421: Create Table LIKE USING provider") {
+    val catalog = spark.sessionState.catalog
+    withTable("s", "t1", "t2", "t3", "t4", "t5") {
+      sql("CREATE TABLE s(a INT, b INT) USING parquet")
Review comment:
ah nvm, it's already tested in `HiveDDLSuite`
[GitHub] [spark] cloud-fan commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
cloud-fan commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
URL: https://github.com/apache/spark/pull/26097#discussion_r344210033
## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ##
@@ -2817,4 +2817,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
       }
     }
   }
+
+  test("SPARK-29421: Create Table LIKE USING provider") {
+    val catalog = spark.sessionState.catalog
+    withTable("s", "t1", "t2", "t3", "t4", "t5") {
+      sql("CREATE TABLE s(a INT, b INT) USING parquet")
Review comment:
let's test CREATE TABLE LIKE temp view here.
[GitHub] [spark] cloud-fan commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
cloud-fan commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
URL: https://github.com/apache/spark/pull/26097#discussion_r344209322
## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ##
@@ -2817,4 +2817,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
       }
     }
   }
+
+  test("SPARK-29421: Create Table LIKE USING provider") {
+    val catalog = spark.sessionState.catalog
+    withTable("s", "t1", "t2", "t3", "t4", "t5") {
+      sql("CREATE TABLE s(a INT, b INT) USING parquet")
+      val source = catalog.getTableMetadata(TableIdentifier("s"))
+      assert(source.provider == Some("parquet"))
+
+      sql("CREATE TABLE t1 LIKE s USING orc")
+      val table1 = catalog.getTableMetadata(TableIdentifier("t1"))
+      assert(table1.provider == Some("orc"))
+
+      sql("CREATE TABLE t2 LIKE s USING com.databricks.spark.csv")
+      val table2 = catalog.getTableMetadata(TableIdentifier("t2"))
+      assert(table2.provider == Some("com.databricks.spark.csv"))
+
+      val e1 = intercept[ClassNotFoundException] {
+        sql("CREATE TABLE t3 LIKE s USING com.databricks.Spark.csv")
Review comment:
we can just use a fake name like `foo`
[GitHub] [spark] cloud-fan commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
cloud-fan commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
URL: https://github.com/apache/spark/pull/26097#discussion_r344209648
## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ##
@@ -2817,4 +2817,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
       }
     }
   }
+
+  test("SPARK-29421: Create Table LIKE USING provider") {
+    val catalog = spark.sessionState.catalog
+    withTable("s", "t1", "t2", "t3", "t4", "t5") {
+      sql("CREATE TABLE s(a INT, b INT) USING parquet")
+      val source = catalog.getTableMetadata(TableIdentifier("s"))
+      assert(source.provider == Some("parquet"))
+
+      sql("CREATE TABLE t1 LIKE s USING orc")
+      val table1 = catalog.getTableMetadata(TableIdentifier("t1"))
+      assert(table1.provider == Some("orc"))
+
+      sql("CREATE TABLE t2 LIKE s USING com.databricks.spark.csv")
+      val table2 = catalog.getTableMetadata(TableIdentifier("t2"))
+      assert(table2.provider == Some("com.databricks.spark.csv"))
+
+      val e1 = intercept[ClassNotFoundException] {
+        sql("CREATE TABLE t3 LIKE s USING com.databricks.Spark.csv")
+      }.getMessage
+      assert(e1.contains("Failed to find data source: com.databricks.Spark.csv"))
+
+      val e2 = intercept[ClassNotFoundException] {
+        sql("CREATE TABLE t4 LIKE s USING unknown")
Review comment:
This is good enough; there is no need to test `com.databricks.Spark.csv` specifically.
[GitHub] [spark] cloud-fan commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
cloud-fan commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
URL: https://github.com/apache/spark/pull/26097#discussion_r344208711
## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/GlobalTempViewSuite.scala ##
@@ -123,6 +123,19 @@ class GlobalTempViewSuite extends QueryTest with SharedSparkSession {
     }
   }
 
+  test("CREATE TABLE LIKE USING provider for view") {
+    withTable("cloned") {
+      withGlobalTempView("src") {
+        sql("CREATE GLOBAL TEMP VIEW src AS SELECT 1 AS a, '2' AS b")
+        sql(s"CREATE TABLE cloned LIKE $globalTempDB.src USING orc")
Review comment:
There is nothing special about global temp views here; we can just put the test in `DDLSuite`.
[GitHub] [spark] cloud-fan commented on issue #26337: [SPARK-29679][SQL] Make interval type comparable and orderable
cloud-fan commented on issue #26337: [SPARK-29679][SQL] Make interval type comparable and orderable URL: https://github.com/apache/spark/pull/26337#issuecomment-551854901 thanks, merging to master!
[GitHub] [spark] cloud-fan closed pull request #26337: [SPARK-29679][SQL] Make interval type comparable and orderable
cloud-fan closed pull request #26337: [SPARK-29679][SQL] Make interval type comparable and orderable URL: https://github.com/apache/spark/pull/26337
[GitHub] [spark] cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
URL: https://github.com/apache/spark/pull/26437#discussion_r344205166
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala ##
@@ -171,6 +171,63 @@ case class InSubqueryExec(
   }
 }
 
+/**
+ * The physical node of exists-subquery. This is for support use exists in join's on condition,
+ * since some join type we can't pushdown exists condition, we plan it here
+ */
+case class ExistsExec(child: Expression,
+                      subQuery: String,
+                      plan: BaseSubqueryExec,
+                      exprId: ExprId,
+                      private var resultBroadcast: Broadcast[Boolean] = null)
+  extends ExecSubqueryExpression {
+
+  @transient private var result: Boolean = _
+
+  override def dataType: DataType = BooleanType
+  override def children: Seq[Expression] = child :: Nil
+  override def nullable: Boolean = child.nullable
+  override def toString: String = s"EXISTS ${plan.name}"
+  override def withNewPlan(plan: BaseSubqueryExec): ExistsExec = copy(plan = plan)
+
+  override def semanticEquals(other: Expression): Boolean = other match {
+    case in: ExistsExec => child.semanticEquals(in.child) && plan.sameResult(in.plan)
+    case _ => false
+  }
+
+
+  def updateResult(): Unit = {
+    result = !plan.execute().isEmpty()
+    resultBroadcast = plan.sqlContext.sparkContext.broadcast[Boolean](result)
+  }
+
+  def values(): Option[Boolean] = Option(resultBroadcast).map(_.value)
+
+  private def prepareResult(): Unit = {
+    require(resultBroadcast != null, s"$this has not finished")
+    result = resultBroadcast.value
+  }
+
+  override def eval(input: InternalRow): Any = {
+    prepareResult()
+    result
+  }
+
+  override lazy val canonicalized: ExistsExec = {
+    copy(
+      child = child.canonicalized,
+      subQuery = subQuery,
+      plan = plan.canonicalized.asInstanceOf[BaseSubqueryExec],
+      exprId = ExprId(0),
+      resultBroadcast = null)
+  }
+
+  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
+    prepareResult()
+    ExistsSubquery(child, subQuery, result).doGenCode(ctx, ev)
Review comment:
Why do we create `ExistsSubquery` only to do codegen? Can we put the codegen logic in `ExistsExec`?
[GitHub] [spark] cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
URL: https://github.com/apache/spark/pull/26437#discussion_r344203937
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala ##
@@ -171,6 +171,63 @@ case class InSubqueryExec(
   }
 }
 
+/**
+ * The physical node of exists-subquery. This is for support use exists in join's on condition,
+ * since some join type we can't pushdown exists condition, we plan it here
+ */
+case class ExistsExec(child: Expression,
+                      subQuery: String,
+                      plan: BaseSubqueryExec,
+                      exprId: ExprId,
+                      private var resultBroadcast: Broadcast[Boolean] = null)
+  extends ExecSubqueryExpression {
+
+  @transient private var result: Boolean = _
+
+  override def dataType: DataType = BooleanType
+  override def children: Seq[Expression] = child :: Nil
+  override def nullable: Boolean = child.nullable
+  override def toString: String = s"EXISTS ${plan.name}"
+  override def withNewPlan(plan: BaseSubqueryExec): ExistsExec = copy(plan = plan)
+
+  override def semanticEquals(other: Expression): Boolean = other match {
+    case in: ExistsExec => child.semanticEquals(in.child) && plan.sameResult(in.plan)
+    case _ => false
+  }
+
+
+  def updateResult(): Unit = {
+    result = !plan.execute().isEmpty()
Review comment:
seems like this is better to execute a non-correlated EXISTS subquery. Maybe we should update `RewritePredicateSubquery` to only handle correlated EXISTS subquery. @dilipbiswal what do you think?
[GitHub] [spark] AmplabJenkins removed a comment on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint
AmplabJenkins removed a comment on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint URL: https://github.com/apache/spark/pull/25971#issuecomment-551850663 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113463/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint
AmplabJenkins removed a comment on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint URL: https://github.com/apache/spark/pull/25971#issuecomment-551850653 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint
AmplabJenkins commented on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint URL: https://github.com/apache/spark/pull/25971#issuecomment-551850663 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113463/ Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
AngersZhuuuu commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
URL: https://github.com/apache/spark/pull/26437#discussion_r344201008
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala ##
@@ -194,6 +257,19 @@ case class PlanSubqueries(sparkSession: SparkSession) extends Rule[SparkPlan] {
         }
         val executedPlan = new QueryExecution(sparkSession, query).executedPlan
         InSubqueryExec(expr, SubqueryExec(s"subquery#${exprId.id}", executedPlan), exprId)
+      case expressions.Exists(sub, children, exprId) =>
Review comment:
> We can think more about how to solve this problem in your original PR.
I tried several approaches, but the results are not the same as PostgreSQL's, because for a full outer join we can't push the condition down or build a new join.
[GitHub] [spark] AmplabJenkins commented on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint
AmplabJenkins commented on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint URL: https://github.com/apache/spark/pull/25971#issuecomment-551850653 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint
SparkQA commented on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint URL: https://github.com/apache/spark/pull/25971#issuecomment-551849814 **[Test build #113463 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113463/testReport)** for PR 25971 at commit [`7b8b398`](https://github.com/apache/spark/commit/7b8b398633789b65d116ce716d6fb1afcded0427). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint
SparkQA removed a comment on issue #25971: [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint URL: https://github.com/apache/spark/pull/25971#issuecomment-551665691 **[Test build #113463 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113463/testReport)** for PR 25971 at commit [`7b8b398`](https://github.com/apache/spark/commit/7b8b398633789b65d116ce716d6fb1afcded0427). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
AngersZhuuuu commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
URL: https://github.com/apache/spark/pull/26437#discussion_r344200356
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala ##
@@ -194,6 +257,19 @@ case class PlanSubqueries(sparkSession: SparkSession) extends Rule[SparkPlan] {
         }
         val executedPlan = new QueryExecution(sparkSession, query).executedPlan
         InSubqueryExec(expr, SubqueryExec(s"subquery#${exprId.id}", executedPlan), exprId)
+      case expressions.Exists(sub, children, exprId) =>
Review comment:
> We can think more about how to solve this problem in your original PR.
See the current change: it no longer collects the data and only checks whether there is any result, so it may avoid the OOM problem.
[GitHub] [spark] maropu commented on issue #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.
maropu commented on issue #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.
URL: https://github.com/apache/spark/pull/26420#issuecomment-551848131
Ah, I see, nice suggestion. I need more time to think about how to implement it.
[GitHub] [spark] LantaoJin commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
LantaoJin commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#discussion_r344195267 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ## @@ -2817,4 +2817,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { } } } + + test("SPARK-29421: Create Table LIKE USING provider") { Review comment: ok, thanks This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour
AmplabJenkins removed a comment on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-551844715 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113460/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour
AmplabJenkins removed a comment on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-551844706 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] LantaoJin commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
LantaoJin commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
URL: https://github.com/apache/spark/pull/26097#discussion_r344193896
## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ##
@@ -2817,4 +2817,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
       }
     }
   }
+
+  test("SPARK-29421: Create Table LIKE USING provider") {
+    val catalog = spark.sessionState.catalog
+    withTable("s", "t1", "t2", "t3", "t4", "t5") {
+      sql("CREATE TABLE s(a INT, b INT) USING parquet")
+      val source = catalog.getTableMetadata(TableIdentifier("s"))
+      assert(source.provider == Some("parquet"))
+
+      sql("CREATE TABLE t1 LIKE s USING orc")
+      val table1 = catalog.getTableMetadata(TableIdentifier("t1"))
+      assert(table1.provider == Some("orc"))
+
+      sql("CREATE TABLE t2 LIKE s USING com.databricks.spark.csv")
+      val table2 = catalog.getTableMetadata(TableIdentifier("t2"))
+      assert(table2.provider == Some("com.databricks.spark.csv"))
+
+      val e1 = intercept[ClassNotFoundException] {
Review comment:
This tests upper/lower case handling: the S in "Spark" is upper case.
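The case-sensitivity point in the comment above can be sketched in plain Scala. This is only a toy model: `ProviderLookupSketch`, its `aliases` table and its `lookup` method are hypothetical names invented for illustration, and they are not Spark's actual data source resolution, which per the test fails with a `ClassNotFoundException`. The sketch only shows why a provider string that differs in a single letter's case does not match.

```scala
object ProviderLookupSketch {
  // Hypothetical alias table; the entries are illustrative assumptions only.
  val aliases: Map[String, String] = Map(
    "com.databricks.spark.csv" -> "csv",
    "parquet" -> "parquet",
    "orc" -> "orc"
  )

  // Exact, case-sensitive match on the provider name.
  // Unknown names are reported with a message shaped like the one the test expects.
  def lookup(provider: String): Either[String, String] =
    aliases.get(provider).toRight(s"Failed to find data source: $provider")

  def main(args: Array[String]): Unit = {
    println(lookup("com.databricks.spark.csv")) // lower-case "spark": found
    println(lookup("com.databricks.Spark.csv")) // upper-case "S": not found
  }
}
```

Under this model, `t2` resolves while `t3` and `t4` fail, which mirrors the three cases the test exercises.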
[GitHub] [spark] AmplabJenkins commented on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour
AmplabJenkins commented on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-551844715 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113460/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour
AmplabJenkins commented on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-551844706 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour
SparkQA commented on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-551844457 **[Test build #113460 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113460/testReport)** for PR 25728 at commit [`b031199`](https://github.com/apache/spark/commit/b031199bcd5e6b2a0a5e80775a8ed153ee98f454). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour
SparkQA removed a comment on issue #25728: [SPARK-29020][WIP][SQL] Improving array_sort behaviour URL: https://github.com/apache/spark/pull/25728#issuecomment-551608250 **[Test build #113460 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113460/testReport)** for PR 25728 at commit [`b031199`](https://github.com/apache/spark/commit/b031199bcd5e6b2a0a5e80775a8ed153ee98f454). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] LantaoJin commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
LantaoJin commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#discussion_r344192261 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ## @@ -2817,4 +2817,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { } } } + + test("SPARK-29421: Create Table LIKE USING provider") { +val catalog = spark.sessionState.catalog +withTable("s", "t1", "t2", "t3", "t4", "t5") { + sql("CREATE TABLE s(a INT, b INT) USING parquet") + val source = catalog.getTableMetadata(TableIdentifier("s")) + assert(source.provider == Some("parquet")) + + sql("CREATE TABLE t1 LIKE s USING orc") + val table1 = catalog.getTableMetadata(TableIdentifier("t1")) + assert(table1.provider == Some("orc")) + + sql("CREATE TABLE t2 LIKE s USING com.databricks.spark.csv") Review comment: I see. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] LantaoJin commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
LantaoJin commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#discussion_r344191576 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ## @@ -1223,16 +1223,19 @@ class HiveDDLSuite } test("CREATE TABLE LIKE a temporary view") { -// CREATE TABLE LIKE a temporary view. -withCreateTableLikeTempView(location = None) +Seq(None, Some("parquet"), Some("orc")) foreach { provider => Review comment: Add hive or just hive? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] LantaoJin commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
LantaoJin commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
URL: https://github.com/apache/spark/pull/26097#discussion_r344189751
## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ##
@@ -2817,4 +2817,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
       }
     }
   }
+
+  test("SPARK-29421: Create Table LIKE USING provider") {
+    val catalog = spark.sessionState.catalog
+    withTable("s", "t1", "t2", "t3", "t4", "t5") {
+      sql("CREATE TABLE s(a INT, b INT) USING parquet")
+      val source = catalog.getTableMetadata(TableIdentifier("s"))
+      assert(source.provider == Some("parquet"))
+
+      sql("CREATE TABLE t1 LIKE s USING orc")
+      val table1 = catalog.getTableMetadata(TableIdentifier("t1"))
+      assert(table1.provider == Some("orc"))
+
+      sql("CREATE TABLE t2 LIKE s USING com.databricks.spark.csv")
+      val table2 = catalog.getTableMetadata(TableIdentifier("t2"))
+      assert(table2.provider == Some("com.databricks.spark.csv"))
+
+      val e1 = intercept[ClassNotFoundException] {
+        sql("CREATE TABLE t3 LIKE s USING com.databricks.Spark.csv")
+      }.getMessage
+      assert(e1.contains("Failed to find data source: com.databricks.Spark.csv"))
+
+      val e2 = intercept[ClassNotFoundException] {
+        sql("CREATE TABLE t4 LIKE s USING unknown")
+      }.getMessage
+      assert(e2.contains("Failed to find data source"))
+
+      if (spark.sparkContext.conf.get(CATALOG_IMPLEMENTATION) == "hive") {
Review comment:
`DDLSuite` is an abstract class; it has a subclass in the Hive package. cc @wangyum
[GitHub] [spark] cloud-fan commented on a change in pull request #26337: [SPARK-29679][SQL] Make interval type comparable and orderable
cloud-fan commented on a change in pull request #26337: [SPARK-29679][SQL] Make interval type comparable and orderable URL: https://github.com/apache/spark/pull/26337#discussion_r344190185 ## File path: sql/core/src/test/resources/sql-tests/inputs/interval.sql ## @@ -0,0 +1,43 @@ +-- test for intervals Review comment: now we have a dedicated sql test file for interval, maybe we should put all interval related tests here. We can do it in followup This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #26337: [SPARK-29679][SQL] Make interval type comparable and orderable
cloud-fan commented on a change in pull request #26337: [SPARK-29679][SQL] Make interval type comparable and orderable URL: https://github.com/apache/spark/pull/26337#discussion_r344189789 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ## @@ -855,6 +855,11 @@ object TypeCoercion { case Divide(l @ CalendarIntervalType(), r @ NumericType()) => DivideInterval(l, r) + case b @ BinaryOperator(l @ CalendarIntervalType(), r @ NullType()) => Review comment: This is a little hacky. Maybe we should introduce `UnresolvedMultiply` and `UnresolvedDivide`, so that we don't need to hack the type coercion rules. We can try it in followup. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
AngersZhuuuu commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
URL: https://github.com/apache/spark/pull/26437#discussion_r344188653
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala ##
@@ -171,6 +171,69 @@ case class InSubqueryExec(
   }
 }
+/**
+ * The physical node of exists-subquery. This is for support use exists in join's on condition,
+ * since some join type we can't pushdown exists condition, we plan it here
+ */
+case class ExistsExec(child: Expression,
+                      subQuery: String,
+                      plan: BaseSubqueryExec,
+                      exprId: ExprId,
+                      private var resultBroadcast: Broadcast[Array[Any]] = null)
+  extends ExecSubqueryExpression {
+
+  @transient private var result: Array[Any] = _
+
+  override def dataType: DataType = BooleanType
+  override def children: Seq[Expression] = child :: Nil
+  override def nullable: Boolean = child.nullable
+  override def toString: String = s"EXISTS ${plan.name}"
+  override def withNewPlan(plan: BaseSubqueryExec): ExistsExec = copy(plan = plan)
+
+  override def semanticEquals(other: Expression): Boolean = other match {
+    case in: ExistsExec => child.semanticEquals(in.child) && plan.sameResult(in.plan)
+    case _ => false
+  }
+
+
+  def updateResult(): Unit = {
+    val rows = plan.executeCollect()
Review comment:
> The reason why we don't have a physical plan for Exists is: it's not robust. Collecting the entire result of a query plan at the driver side is very likely to hit OOM. That's why we have to convert Exists to a join.
We can make it just return `rdd.isEmpty()`, since EXISTS only needs to know whether the result is empty.
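The difference between collecting the whole result and only checking for emptiness can be illustrated without Spark. The sketch below is a plain-Scala model under stated assumptions: `ExistsSketch` and `CountingSource` are hypothetical names, and the lazy `Iterator` stands in for a query result. It is not the PR's implementation; it only counts how many rows each strategy forces the source to produce.

```scala
object ExistsSketch {
  // A lazy row source that counts how many rows it has actually produced.
  final class CountingSource(n: Int) {
    var produced: Int = 0
    def rows: Iterator[Int] = Iterator.range(0, n).map { i => produced += 1; i }
  }

  // Collect-then-check: materializes every row before answering.
  def existsByCollect(src: CountingSource): Boolean = src.rows.toArray.nonEmpty

  // Emptiness check: only asks whether a first row is available.
  def existsByHasNext(src: CountingSource): Boolean = src.rows.hasNext

  def main(args: Array[String]): Unit = {
    val a = new CountingSource(1000)
    println(s"collect: ${existsByCollect(a)}, rows produced: ${a.produced}")
    val b = new CountingSource(1000)
    println(s"hasNext: ${existsByHasNext(b)}, rows produced: ${b.produced}")
  }
}
```

Both strategies give the same answer, but the first holds the full result in memory at once, which is the OOM risk raised in the quoted review comment.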
[GitHub] [spark] cloud-fan commented on issue #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.
cloud-fan commented on issue #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.
URL: https://github.com/apache/spark/pull/26420#issuecomment-551840315
@maropu looks like a good idea. But we need to make sure the aggregate function ignores nulls. It may not work for `count(*) filter (where a > 1)`.
[GitHub] [spark] maropu commented on issue #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.
maropu commented on issue #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.
URL: https://github.com/apache/spark/pull/26420#issuecomment-551834128
I just thought of a super simple case like this:
```
postgres=# select * from t;
 k | v1 | v2
---+----+----
 1 |  2 |  3
 1 |  4 |  5
 1 |    |  9
 2 |  3 |
 2 |  5 |  8
(5 rows)

postgres=# select k, sum(v1) filter (where v1 > 2), avg(v2) filter (where v2 < 6) from t group by k;
 k | sum | avg
---+-----+-----
 2 |   8 |
 1 |   4 | 4.
(2 rows)
```
The query above might be transformed into...
```
scala> sql("select k, sum(v1), avg(v2) from (select k, if(v1 > 2, v1, null) v1, if(v2 < 6, v2, null) v2 from t) group by k").show()
+---+-------+-------+
|  k|sum(v1)|avg(v2)|
+---+-------+-------+
|  1|      4|    4.0|
|  2|      8|   null|
+---+-------+-------+
```
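The rewrite discussed in this thread can be checked on the same five rows with ordinary Scala collections. This is a minimal sketch, not Spark code: `FilterRewriteSketch`, `Row`, `sqlSum` and the other helpers are hypothetical names, and `None` models SQL NULL. It compares `sum(v1) filter (where v1 > 2)` with `sum(if(v1 > 2, v1, null))`, and then shows the `count(*)` case that cloud-fan flagged, where the two forms disagree.

```scala
object FilterRewriteSketch {
  // One row of the example table t; None models SQL NULL.
  final case class Row(k: Int, v1: Option[Int], v2: Option[Int])

  val t: Seq[Row] = Seq(
    Row(1, Some(2), Some(3)),
    Row(1, Some(4), Some(5)),
    Row(1, None, Some(9)),
    Row(2, Some(3), None),
    Row(2, Some(5), Some(8))
  )

  // SQL SUM: ignores NULL inputs and yields NULL when no input remains.
  def sqlSum(xs: Seq[Option[Int]]): Option[Int] = {
    val present = xs.flatten
    if (present.isEmpty) None else Some(present.sum)
  }

  // if(pred, v, null): keep the value only when the predicate holds.
  def nullUnless(v: Option[Int])(pred: Int => Boolean): Option[Int] = v.filter(pred)

  // sum(v1) FILTER (WHERE v1 > 2): drop rows first, then aggregate.
  def sumWithFilter(rows: Seq[Row]): Option[Int] =
    sqlSum(rows.filter(_.v1.exists(_ > 2)).map(_.v1))

  // sum(if(v1 > 2, v1, null)): null out values, then aggregate every row.
  def sumWithIf(rows: Seq[Row]): Option[Int] =
    sqlSum(rows.map(r => nullUnless(r.v1)(_ > 2)))

  // count(*) FILTER (WHERE v1 > 2): counts only rows passing the predicate.
  def countStarWithFilter(rows: Seq[Row]): Int = rows.count(_.v1.exists(_ > 2))

  // count(*) over the if-projected rows: still counts every row.
  def countStarWithIf(rows: Seq[Row]): Int =
    rows.map(r => nullUnless(r.v1)(_ > 2)).size

  def main(args: Array[String]): Unit =
    t.groupBy(_.k).toSeq.sortBy(_._1).foreach { case (k, rows) =>
      println(s"k=$k sum filter=${sumWithFilter(rows)} sum if=${sumWithIf(rows)} " +
        s"count(*) filter=${countStarWithFilter(rows)} count(*) if=${countStarWithIf(rows)}")
    }
}
```

The sums agree because `sum` skips NULLs, while `count(*)` counts rows rather than non-null values, so nulling out a column does not remove the row from the count.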
[GitHub] [spark] cloud-fan closed pull request #26347: [SPARK-29688][SQL] Support average for interval type values
cloud-fan closed pull request #26347: [SPARK-29688][SQL] Support average for interval type values URL: https://github.com/apache/spark/pull/26347 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on issue #26347: [SPARK-29688][SQL] Support average for interval type values
cloud-fan commented on issue #26347: [SPARK-29688][SQL] Support average for interval type values URL: https://github.com/apache/spark/pull/26347#issuecomment-551832075 thanks, merging to master! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries URL: https://github.com/apache/spark/pull/26437#discussion_r344183324 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala ## @@ -194,6 +257,19 @@ case class PlanSubqueries(sparkSession: SparkSession) extends Rule[SparkPlan] { } val executedPlan = new QueryExecution(sparkSession, query).executedPlan InSubqueryExec(expr, SubqueryExec(s"subquery#${exprId.id}", executedPlan), exprId) + case expressions.Exists(sub, children, exprId) => Review comment: We should simply throw exception for any other `SubqueryExpression`, explicitly saying that it's not supported. We can think more about how to solve this problem in your original PR. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries
cloud-fan commented on a change in pull request #26437: [SPARK-29800][SQL] Plan Exists 's subquery in PlanSubqueries URL: https://github.com/apache/spark/pull/26437#discussion_r344182146 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala ## @@ -171,6 +171,69 @@ case class InSubqueryExec( } } +/** + * The physical node of exists-subquery. This is for support use exists in join's on condition, + * since some join type we can't pushdown exists condition, we plan it here + */ +case class ExistsExec(child: Expression, + subQuery: String, + plan: BaseSubqueryExec, + exprId: ExprId, + private var resultBroadcast: Broadcast[Array[Any]] = null) + extends ExecSubqueryExpression { + + @transient private var result: Array[Any] = _ + + override def dataType: DataType = BooleanType + override def children: Seq[Expression] = child :: Nil + override def nullable: Boolean = child.nullable + override def toString: String = s"EXISTS ${plan.name}" + override def withNewPlan(plan: BaseSubqueryExec): ExistsExec = copy(plan = plan) + + override def semanticEquals(other: Expression): Boolean = other match { +case in: ExistsExec => child.semanticEquals(in.child) && plan.sameResult(in.plan) +case _ => false + } + + + def updateResult(): Unit = { +val rows = plan.executeCollect() Review comment: The reason why we don't have a physical plan for Exists is: it's not robust. Collecting the entire result of a query plan at the driver side is very likely to hit OOM. That's why we have to convert Exists to a join. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #26337: [SPARK-29679][SQL] Make interval type comparable and orderable
AmplabJenkins removed a comment on issue #26337: [SPARK-29679][SQL] Make interval type comparable and orderable URL: https://github.com/apache/spark/pull/26337#issuecomment-551823137 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113459/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26337: [SPARK-29679][SQL] Make interval type comparable and orderable
AmplabJenkins removed a comment on issue #26337: [SPARK-29679][SQL] Make interval type comparable and orderable URL: https://github.com/apache/spark/pull/26337#issuecomment-551823102 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26337: [SPARK-29679][SQL] Make interval type comparable and orderable
AmplabJenkins commented on issue #26337: [SPARK-29679][SQL] Make interval type comparable and orderable URL: https://github.com/apache/spark/pull/26337#issuecomment-551823102 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26337: [SPARK-29679][SQL] Make interval type comparable and orderable
AmplabJenkins commented on issue #26337: [SPARK-29679][SQL] Make interval type comparable and orderable URL: https://github.com/apache/spark/pull/26337#issuecomment-551823137 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113459/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26337: [SPARK-29679][SQL] Make interval type comparable and orderable
SparkQA commented on issue #26337: [SPARK-29679][SQL] Make interval type comparable and orderable URL: https://github.com/apache/spark/pull/26337#issuecomment-551820312 **[Test build #113459 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113459/testReport)** for PR 26337 at commit [`5404d70`](https://github.com/apache/spark/commit/5404d701d702e9308ca91f354aa5304d7beac789). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #26337: [SPARK-29679][SQL] Make interval type comparable and orderable
SparkQA removed a comment on issue #26337: [SPARK-29679][SQL] Make interval type comparable and orderable URL: https://github.com/apache/spark/pull/26337#issuecomment-551596740 **[Test build #113459 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113459/testReport)** for PR 26337 at commit [`5404d70`](https://github.com/apache/spark/commit/5404d701d702e9308ca91f354aa5304d7beac789).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan
AmplabJenkins removed a comment on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan URL: https://github.com/apache/spark/pull/26167#issuecomment-551813972 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113455/ Test PASSed.
[GitHub] [spark] davidvrba commented on issue #26294: [SPARK-28477] [SQL] Rewrite CaseWhen with single branch to If
davidvrba commented on issue #26294: [SPARK-28477] [SQL] Rewrite CaseWhen with single branch to If URL: https://github.com/apache/spark/pull/26294#issuecomment-551814825 Thank you very much!
[GitHub] [spark] AmplabJenkins removed a comment on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan
AmplabJenkins removed a comment on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan URL: https://github.com/apache/spark/pull/26167#issuecomment-551813938 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan
AmplabJenkins commented on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan URL: https://github.com/apache/spark/pull/26167#issuecomment-551813938 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan
AmplabJenkins commented on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan URL: https://github.com/apache/spark/pull/26167#issuecomment-551813972 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113455/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan
SparkQA removed a comment on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan URL: https://github.com/apache/spark/pull/26167#issuecomment-551478479 **[Test build #113455 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113455/testReport)** for PR 26167 at commit [`b7c228d`](https://github.com/apache/spark/commit/b7c228de6b7c0a6acc767ee31b0468a06ac4631b).
[GitHub] [spark] SparkQA commented on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan
SparkQA commented on issue #26167: [SPARK-28893][SQL] Support MERGE INTO in the parser and add the corresponding logical plan URL: https://github.com/apache/spark/pull/26167#issuecomment-551811485 **[Test build #113455 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113455/testReport)** for PR 26167 at commit [`b7c228d`](https://github.com/apache/spark/commit/b7c228de6b7c0a6acc767ee31b0468a06ac4631b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] srowen commented on a change in pull request #26315: [SPARK-29152] Executor Plugin shutdown when dynamic allocation is ena…
srowen commented on a change in pull request #26315: [SPARK-29152] Executor Plugin shutdown when dynamic allocation is ena…
URL: https://github.com/apache/spark/pull/26315#discussion_r344175574

## File path: core/src/main/scala/org/apache/spark/executor/Executor.scala ##

@@ -294,13 +300,16 @@ private[spark] class Executor(
       threadPool.shutdown()

       // Notify plugins that executor is shutting down so they can terminate cleanly
-      Utils.withContextClassLoader(replClassLoader) {
-        executorPlugins.foreach { plugin =>
-          try {
-            plugin.shutdown()
-          } catch {
-            case e: Exception =>
-              logWarning("Plugin " + plugin.getClass().getCanonicalName() + " shutdown failed", e)
+      if (!executorShutdown) {

Review comment:
This is too late to check; you'd potentially execute some of stop() twice
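One common way to address the concern in the review comment above is to guard the whole of `stop()` rather than only the plugin block. The following is a minimal sketch, not the code from the pull request: `StoppableSketch` and `shutdownPlugins` are hypothetical names standing in for the executor and its plugin teardown.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for an executor; not org.apache.spark.executor.Executor.
class StoppableSketch(shutdownPlugins: () => Unit) {
  private val stopped = new AtomicBoolean(false)

  def stop(): Unit = {
    // compareAndSet succeeds for exactly one caller, so the entire body
    // (not just the plugin shutdown) runs at most once.
    if (stopped.compareAndSet(false, true)) {
      // ... other teardown steps would go here ...
      shutdownPlugins()
    }
  }
}
```

Placing the check at the top of the method means no teardown step before the plugin shutdown can run a second time, which is the failure mode the comment points at.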
[GitHub] [spark] yaooqinn commented on a change in pull request #26418: [SPARK-29783][SQL] Support SQL Standard output style for interval type
yaooqinn commented on a change in pull request #26418: [SPARK-29783][SQL] Support SQL Standard output style for interval type
URL: https://github.com/apache/spark/pull/26418#discussion_r344174342

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ##

@@ -1774,6 +1773,19 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)

+  object IntervalStyle extends Enumeration {
+    val SQL_STANDARD, MULTI_UNITS = Value
+  }
+
+  val INTERVAL_STYLE = buildConf("spark.sql.IntervalOutputStyle")
+    .doc("Display format for interval values. The value SQL_STANDARD will produce output" +
+      " matching SQL standard interval literals. The value MULTI_UNITS (which is the default)" +
+      " will produce output in form of value unit pairs, i.e. '3 year 2 months 10 days'")
+    .stringConf
+    .transform(_.toUpperCase(Locale.ROOT))
+    .checkValues(IntervalStyle.values.map(_.toString))
+    .createWithDefault(IntervalStyle.MULTI_UNITS.toString)

Review comment:
yes, I guess some users may already rely on the output string
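The normalize-then-validate behaviour that the `transform` and `checkValues` calls in the diff above describe can be illustrated outside Spark with a plain `Enumeration`. This is only a sketch under that assumption: `IntervalStyleSketch` and `parse` are hypothetical names, and Spark's own config builder is not used here.

```scala
import java.util.Locale

// Hypothetical standalone analogue of the IntervalStyle enumeration in the diff.
object IntervalStyleSketch extends Enumeration {
  val SQL_STANDARD, MULTI_UNITS = Value

  // Upper-cases the input with a fixed locale, then rejects unknown style names.
  def parse(raw: String): Value = {
    val upper = raw.toUpperCase(Locale.ROOT)
    values.find(_.toString == upper).getOrElse(
      throw new IllegalArgumentException(s"Invalid interval style: $raw"))
  }
}
```

Using `Locale.ROOT` keeps the case conversion independent of the JVM's default locale, so a value such as `sql_standard` is accepted regardless of where the code runs.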
[GitHub] [spark] cloud-fan commented on a change in pull request #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.
cloud-fan commented on a change in pull request #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.
URL: https://github.com/apache/spark/pull/26420#discussion_r344173936

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/higherOrderFunctions.scala ##

@@ -33,7 +33,7 @@ import org.apache.spark.sql.types.DataType
 case class ResolveHigherOrderFunctions(catalog: SessionCatalog) extends Rule[LogicalPlan] {

   override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveExpressions {
-    case u @ UnresolvedFunction(fn, children, false)
+    case u @ UnresolvedFunction(fn, children, false, _)

Review comment:
it's better if we can throw exception if FILTER is specified in an improper place.
[GitHub] [spark] nerush commented on a change in pull request #26391: [SPARK-29749][SQL] Add ParquetScan statistics
nerush commented on a change in pull request #26391: [SPARK-29749][SQL] Add ParquetScan statistics
URL: https://github.com/apache/spark/pull/26391#discussion_r343203769

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetUtils.scala ##

@@ -107,6 +114,29 @@ object ParquetUtils {
     ParquetFileFormat.mergeSchemasInParallel(filesToTouch, sparkSession)
   }

+  def getStatistics(files: Array[String], configuration: Configuration): Statistics = {
+    var bytes = 0L
+    var rows = 0L
+    files.foreach { file =>

Review comment:
We can make statistics collection more functional with `foldLeft` on the given array and avoid using a mutable state. Furthermore, is it not reasonable to collect it in parallel? Should we have a more proper exception handling when file cannot be opened?
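The `foldLeft` suggestion in the review comment above could look roughly like the sketch below. It is an illustration only: `FileStats` and `readFooter` are hypothetical stand-ins, not Spark or Parquet APIs, and the actual patch would obtain sizes and row counts from Parquet footers.

```scala
// Hypothetical per-file summary; the real patch would derive this from a Parquet footer.
final case class FileStats(bytes: Long, rows: Long)

object StatsSketch {
  // Sums sizes and row counts without mutable vars.
  // `readFooter` is an assumed helper supplied by the caller.
  def getStatistics(files: Seq[String], readFooter: String => FileStats): FileStats =
    files.foldLeft(FileStats(0L, 0L)) { (acc, file) =>
      val s = readFooter(file)
      FileStats(acc.bytes + s.bytes, acc.rows + s.rows)
    }
}
```

Because the accumulator is an immutable value combined with an associative addition, the same shape would also lend itself to the parallel collection the comment asks about, though that is not shown here.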
[GitHub] [spark] cloud-fan commented on issue #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression.
cloud-fan commented on issue #26420: [SPARK-27986][SQL] Support ANSI SQL filter predicate for aggregate expression. URL: https://github.com/apache/spark/pull/26420#issuecomment-551794254 This is a nice feature! I'd like to know how it's implemented. Seems like we can't transform it into another logical form that we support, and we need to adjust our backend engine.
[GitHub] [spark] cloud-fan commented on issue #26294: [SPARK-28477] [SQL] Rewrite CaseWhen with single branch to If
cloud-fan commented on issue #26294: [SPARK-28477] [SQL] Rewrite CaseWhen with single branch to If URL: https://github.com/apache/spark/pull/26294#issuecomment-551789572 thanks, merging to master!
[GitHub] [spark] maropu commented on a change in pull request #26418: [SPARK-29783][SQL] Support SQL Standard output style for interval type
maropu commented on a change in pull request #26418: [SPARK-29783][SQL] Support SQL Standard output style for interval type
URL: https://github.com/apache/spark/pull/26418#discussion_r344171443

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ##

@@ -1774,6 +1773,19 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)

+  object IntervalStyle extends Enumeration {
+    val SQL_STANDARD, MULTI_UNITS = Value
+  }
+
+  val INTERVAL_STYLE = buildConf("spark.sql.IntervalOutputStyle")
+    .doc("Display format for interval values. The value SQL_STANDARD will produce output" +
+      " matching SQL standard interval literals. The value MULTI_UNITS (which is the default)" +
+      " will produce output in form of value unit pairs, i.e. '3 year 2 months 10 days'")
+    .stringConf
+    .transform(_.toUpperCase(Locale.ROOT))
+    .checkValues(IntervalStyle.values.map(_.toString))
+    .createWithDefault(IntervalStyle.MULTI_UNITS.toString)

Review comment:
I personally think `ansiEnabled` is enough for this feature. Any concern?
[GitHub] [spark] cloud-fan closed pull request #26294: [SPARK-28477] [SQL] Rewrite CaseWhen with single branch to If
cloud-fan closed pull request #26294: [SPARK-28477] [SQL] Rewrite CaseWhen with single branch to If URL: https://github.com/apache/spark/pull/26294
[GitHub] [spark] SparkQA removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method
SparkQA removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-551665715 **[Test build #113462 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113462/testReport)** for PR 26439 at commit [`2251bd7`](https://github.com/apache/spark/commit/2251bd7ff2790c4adc5c0f2a1701aa59781cd799).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-551759440 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113462/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins removed a comment on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-551759409 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-551759440 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113462/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
AmplabJenkins commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-551759409 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method
SparkQA commented on issue #26439: [SPARK-29801][ML] ML models unify toString method URL: https://github.com/apache/spark/pull/26439#issuecomment-551757971 **[Test build #113462 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113462/testReport)** for PR 26439 at commit [`2251bd7`](https://github.com/apache/spark/commit/2251bd7ff2790c4adc5c0f2a1701aa59781cd799). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26436: [MINOR]FsHistoryProvider import cleanup
AmplabJenkins removed a comment on issue #26436: [MINOR]FsHistoryProvider import cleanup URL: https://github.com/apache/spark/pull/26436#issuecomment-551746992 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113456/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26436: [MINOR]FsHistoryProvider import cleanup
AmplabJenkins commented on issue #26436: [MINOR]FsHistoryProvider import cleanup URL: https://github.com/apache/spark/pull/26436#issuecomment-551746992 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113456/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26436: [MINOR]FsHistoryProvider import cleanup
AmplabJenkins removed a comment on issue #26436: [MINOR]FsHistoryProvider import cleanup URL: https://github.com/apache/spark/pull/26436#issuecomment-551746960 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26436: [MINOR]FsHistoryProvider import cleanup
AmplabJenkins commented on issue #26436: [MINOR]FsHistoryProvider import cleanup URL: https://github.com/apache/spark/pull/26436#issuecomment-551746960 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26436: [MINOR]FsHistoryProvider import cleanup
AmplabJenkins removed a comment on issue #26436: [MINOR]FsHistoryProvider import cleanup URL: https://github.com/apache/spark/pull/26436#issuecomment-551538301 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA removed a comment on issue #26436: [MINOR]FsHistoryProvider import cleanup
SparkQA removed a comment on issue #26436: [MINOR]FsHistoryProvider import cleanup URL: https://github.com/apache/spark/pull/26436#issuecomment-551536768 **[Test build #113456 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113456/testReport)** for PR 26436 at commit [`e6a8364`](https://github.com/apache/spark/commit/e6a836484cd2f1c18b1802c455fbb7244232e552).
[GitHub] [spark] maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
URL: https://github.com/apache/spark/pull/26097#discussion_r344159404

## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ##

@@ -2817,4 +2817,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
       }
     }
   }
+
+  test("SPARK-29421: Create Table LIKE USING provider") {
+    val catalog = spark.sessionState.catalog
+    withTable("s", "t1", "t2", "t3", "t4", "t5") {
+      sql("CREATE TABLE s(a INT, b INT) USING parquet")
+      val source = catalog.getTableMetadata(TableIdentifier("s"))
+      assert(source.provider == Some("parquet"))
+
+      sql("CREATE TABLE t1 LIKE s USING orc")
+      val table1 = catalog.getTableMetadata(TableIdentifier("t1"))
+      assert(table1.provider == Some("orc"))
+
+      sql("CREATE TABLE t2 LIKE s USING com.databricks.spark.csv")
+      val table2 = catalog.getTableMetadata(TableIdentifier("t2"))
+      assert(table2.provider == Some("com.databricks.spark.csv"))
+
+      val e1 = intercept[ClassNotFoundException] {
+        sql("CREATE TABLE t3 LIKE s USING com.databricks.Spark.csv")
+      }.getMessage
+      assert(e1.contains("Failed to find data source: com.databricks.Spark.csv"))
+
+      val e2 = intercept[ClassNotFoundException] {
+        sql("CREATE TABLE t4 LIKE s USING unknown")
+      }.getMessage
+      assert(e2.contains("Failed to find data source"))
+
+      if (spark.sparkContext.conf.get(CATALOG_IMPLEMENTATION) == "hive") {

Review comment:
This is the `core` package, so you don't need to add tests for hive here.
[GitHub] [spark] maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#discussion_r344154231 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ## @@ -2817,4 +2817,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { } } } + + test("SPARK-29421: Create Table LIKE USING provider") { +val catalog = spark.sessionState.catalog +withTable("s", "t1", "t2", "t3", "t4", "t5") { + sql("CREATE TABLE s(a INT, b INT) USING parquet") + val source = catalog.getTableMetadata(TableIdentifier("s")) + assert(source.provider == Some("parquet")) + + sql("CREATE TABLE t1 LIKE s USING orc") + val table1 = catalog.getTableMetadata(TableIdentifier("t1")) + assert(table1.provider == Some("orc")) + + sql("CREATE TABLE t2 LIKE s USING com.databricks.spark.csv") + val table2 = catalog.getTableMetadata(TableIdentifier("t2")) + assert(table2.provider == Some("com.databricks.spark.csv")) + + val e1 = intercept[ClassNotFoundException] { Review comment: Is this test the same as the test in L2841-L2844?
[GitHub] [spark] SparkQA commented on issue #26436: [MINOR]FsHistoryProvider import cleanup
SparkQA commented on issue #26436: [MINOR]FsHistoryProvider import cleanup URL: https://github.com/apache/spark/pull/26436#issuecomment-551744392 **[Test build #113456 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113456/testReport)** for PR 26436 at commit [`e6a8364`](https://github.com/apache/spark/commit/e6a836484cd2f1c18b1802c455fbb7244232e552). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#discussion_r344154954 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ## @@ -57,23 +55,32 @@ import org.apache.spark.sql.util.SchemaUtils * The CatalogTable attributes copied from the source table are storage(inputFormat, outputFormat, * serde, compressed, properties), schema, provider, partitionColumnNames, bucketSpec. * + * Use "CREATE TABLE t1 LIKE t2 USING file_format" + * to specify new file format for t1 from a data source table t2. + * * The syntax of using this command in SQL is: * {{{ * CREATE TABLE [IF NOT EXISTS] [db_name.]table_name - * LIKE [other_db_name.]existing_table_name [locationSpec] + * LIKE [other_db_name.]existing_table_name [USING provider] [locationSpec] * }}} */ case class CreateTableLikeCommand( targetTable: TableIdentifier, sourceTable: TableIdentifier, +provider: Option[String], location: Option[String], ifNotExists: Boolean) extends RunnableCommand { override def run(sparkSession: SparkSession): Seq[Row] = { val catalog = sparkSession.sessionState.catalog val sourceTableDesc = catalog.getTempViewOrPermanentTableMetadata(sourceTable) -val newProvider = if (sourceTableDesc.tableType == CatalogTableType.VIEW) { +val newProvider = if (provider.isDefined) { + // check the validation of provider input, invalid provider will throw + // AnalysisException or ClassNotFoundException or NoSuchMethodException Review comment: nit: AnalysisException, ClassNotFoundException, or NoSuchMethodException
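For readers following the #26097 thread, here is a minimal, Spark-free sketch of the provider choice that the diff above appears to introduce: an explicit `USING provider` is validated and wins, otherwise the source table's provider is reused. All names here (`ProviderChoice`, `SourceTable`, `knownProviders`, `DefaultProvider`) are hypothetical stand-ins, not Spark APIs, and the fallback to a default provider for views is an assumption, since the diff excerpt does not show that branch's body.

```scala
// Hypothetical, Spark-free model of the provider choice discussed in the review.
sealed trait TableKind
case object View extends TableKind
case object ManagedTable extends TableKind

// Assumed stand-in for the source table's catalog metadata.
final case class SourceTable(kind: TableKind, provider: Option[String])

object ProviderChoice {
  // Assumed stand-in for the session's default data source name.
  val DefaultProvider = "parquet"

  // Assumed stand-in for a data source lookup; only these names are "known" here.
  private val knownProviders = Set("parquet", "orc", "json", "csv")

  // Throws for unknown names, echoing the error text asserted in the PR's test.
  def validate(name: String): Unit =
    if (!knownProviders.contains(name.toLowerCase)) {
      throw new ClassNotFoundException(s"Failed to find data source: $name")
    }

  // An explicit provider is validated and takes precedence; otherwise a view
  // falls back to the default (assumption) and a table keeps its own provider.
  def choose(explicit: Option[String], source: SourceTable): Option[String] =
    explicit match {
      case Some(name) =>
        validate(name)
        explicit
      case None if source.kind == View => Some(DefaultProvider)
      case None => source.provider
    }
}
```

With this model, `ProviderChoice.choose(Some("orc"), SourceTable(ManagedTable, Some("parquet")))` yields `Some("orc")`, while an unrecognised name such as `unknown` raises `ClassNotFoundException`, which is the behaviour the test under review checks with `intercept`.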
[GitHub] [spark] maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#discussion_r344154231 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ## @@ -2817,4 +2817,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { } } } + + test("SPARK-29421: Create Table LIKE USING provider") { +val catalog = spark.sessionState.catalog +withTable("s", "t1", "t2", "t3", "t4", "t5") { + sql("CREATE TABLE s(a INT, b INT) USING parquet") + val source = catalog.getTableMetadata(TableIdentifier("s")) + assert(source.provider == Some("parquet")) + + sql("CREATE TABLE t1 LIKE s USING orc") + val table1 = catalog.getTableMetadata(TableIdentifier("t1")) + assert(table1.provider == Some("orc")) + + sql("CREATE TABLE t2 LIKE s USING com.databricks.spark.csv") + val table2 = catalog.getTableMetadata(TableIdentifier("t2")) + assert(table2.provider == Some("com.databricks.spark.csv")) + + val e1 = intercept[ClassNotFoundException] { Review comment: Is this test exactly the same as the test in L2841-L2844?
[GitHub] [spark] maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#discussion_r344155524 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ## @@ -1223,16 +1223,19 @@ class HiveDDLSuite } test("CREATE TABLE LIKE a temporary view") { -// CREATE TABLE LIKE a temporary view. -withCreateTableLikeTempView(location = None) +Seq(None, Some("parquet"), Some("orc")) foreach { provider => Review comment: How about just using `hiveFormats` here?
[GitHub] [spark] maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#discussion_r344154795 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ## @@ -2817,4 +2817,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { } } } + + test("SPARK-29421: Create Table LIKE USING provider") { +val catalog = spark.sessionState.catalog +withTable("s", "t1", "t2", "t3", "t4", "t5") { + sql("CREATE TABLE s(a INT, b INT) USING parquet") + val source = catalog.getTableMetadata(TableIdentifier("s")) + assert(source.provider == Some("parquet")) + + sql("CREATE TABLE t1 LIKE s USING orc") + val table1 = catalog.getTableMetadata(TableIdentifier("t1")) + assert(table1.provider == Some("orc")) + + sql("CREATE TABLE t2 LIKE s USING com.databricks.spark.csv") + val table2 = catalog.getTableMetadata(TableIdentifier("t2")) + assert(table2.provider == Some("com.databricks.spark.csv")) + + val e1 = intercept[ClassNotFoundException] { +sql("CREATE TABLE t3 LIKE s USING com.databricks.Spark.csv") + }.getMessage + assert(e1.contains("Failed to find data source: com.databricks.Spark.csv")) + + val e2 = intercept[ClassNotFoundException] { +sql("CREATE TABLE t4 LIKE s USING unknown") + }.getMessage + assert(e2.contains("Failed to find data source")) + Review comment: How about tests for `NoSuchMethodException`? https://github.com/apache/spark/pull/26097/files#diff-a53c8b7022d13417a2ef33372464f9b5R80
[GitHub] [spark] maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#discussion_r344158547 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ## @@ -2817,4 +2817,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { } } } + + test("SPARK-29421: Create Table LIKE USING provider") { +val catalog = spark.sessionState.catalog +withTable("s", "t1", "t2", "t3", "t4", "t5") { + sql("CREATE TABLE s(a INT, b INT) USING parquet") + val source = catalog.getTableMetadata(TableIdentifier("s")) + assert(source.provider == Some("parquet")) + + sql("CREATE TABLE t1 LIKE s USING orc") + val table1 = catalog.getTableMetadata(TableIdentifier("t1")) + assert(table1.provider == Some("orc")) + + sql("CREATE TABLE t2 LIKE s USING com.databricks.spark.csv") Review comment: Please do not depend on the 3rd-party package here (I personally think this test is not needed).
[GitHub] [spark] maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider
maropu commented on a change in pull request #26097: [SPARK-29421][SQL] Supporting Create Table Like Using Provider URL: https://github.com/apache/spark/pull/26097#discussion_r344153840 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ## @@ -2817,4 +2817,42 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { } } } + + test("SPARK-29421: Create Table LIKE USING provider") { Review comment: Since this is not a bug fix, you don't need the prefix: `SPARK-29421: `.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26347: [SPARK-29688][SQL] Support average for interval type values
AmplabJenkins removed a comment on issue #26347: [SPARK-29688][SQL] Support average for interval type values URL: https://github.com/apache/spark/pull/26347#issuecomment-551725695 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113452/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26347: [SPARK-29688][SQL] Support average for interval type values
AmplabJenkins commented on issue #26347: [SPARK-29688][SQL] Support average for interval type values URL: https://github.com/apache/spark/pull/26347#issuecomment-551725695 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/113452/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26347: [SPARK-29688][SQL] Support average for interval type values
AmplabJenkins commented on issue #26347: [SPARK-29688][SQL] Support average for interval type values URL: https://github.com/apache/spark/pull/26347#issuecomment-551725670 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26347: [SPARK-29688][SQL] Support average for interval type values
AmplabJenkins removed a comment on issue #26347: [SPARK-29688][SQL] Support average for interval type values URL: https://github.com/apache/spark/pull/26347#issuecomment-551725670 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #26347: [SPARK-29688][SQL] Support average for interval type values
SparkQA removed a comment on issue #26347: [SPARK-29688][SQL] Support average for interval type values URL: https://github.com/apache/spark/pull/26347#issuecomment-551438951 **[Test build #113452 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/113452/testReport)** for PR 26347 at commit [`2e06305`](https://github.com/apache/spark/commit/2e06305647f79d3e80a5c72d303ef88bc2ef8258).