[GitHub] [spark] huaxingao commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference
huaxingao commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference URL: https://github.com/apache/spark/pull/28087#discussion_r402068421 ## File path: docs/sql-ref-functions-udf-aggregate.md ## @@ -1,22 +1,130 @@ --- layout: global -title: User defined Aggregate Functions (UDAF) -displayTitle: User defined Aggregate Functions (UDAF) +title: User Defined Aggregate Functions (UDAFs) +displayTitle: User Defined Aggregate Functions (UDAFs) license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - + http://www.apache.org/licenses/LICENSE-2.0 - + Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --- -**This page is under construction** +### Description + +User-Defined Aggregate Functions (UDAFs) are user-programmable routines that act on multiple rows at once and return a single aggregated value as a result. This documentation lists the classes that are required for creating and registering UDAFs. It also contains examples that demonstrate how to define and register UDAFs in Scala and invoke them in Spark SQL. + +### org.apache.spark.sql.expressions.Aggregator[-IN, BUF, OUT] + +A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value. +- IN The input type for the aggregation. 
+ +- BUF The type of the intermediate value of the reduction. + +- OUT The type of the final output result. + + + bufferEncoder: Encoder[BUF] + +Register a deterministic Java UDF(0-22) instance as user-defined function (UDF). + + + + + + finish(reduction: BUF): OUT + +Transform the output of the reduction. + + + + + merge(b1: BUF, b2: BUF): BUF + +Merge two intermediate values. + + + + + outputEncoder: Encoder[OUT] + +Specifies the Encoder for the final output value type. + + + + + reduce(b: BUF, a: IN): BUF + +Combine two values to produce a new value. For performance, the function may modify b and return it instead of constructing new object for b. + + + + + zero: BUF + +A zero value for this aggregation. + + + +### org.apache.spark.sql.UDFRegistration + +Functions for registering user-defined functions. Use `SparkSession.udf` to access this: `spark.udf` + + + register(name: String, udf: UserDefinedFunction): UserDefinedFunction + +Registers a user-defined function (UDF). + + + +### Examples + +{% highlight sql %} + +// Define and register a UDAF to calculate the sum of product of two columns +// Scala +import org.apache.spark.sql.expressions.Aggregator +import org.apache.spark.sql.functions.udaf + +val agg = udaf(new Aggregator[(Long, Long), Long, Long] { + def zero: Long = 0L + def reduce(b: Long, a: (Long, Long)): Long = b + (a._1 * a._2) + def merge(b1: Long, b2: Long): Long = b1 + b2 + def finish(r: Long): Long = r + def bufferEncoder: Encoder[Long] = Encoders.scalaLong + def outputEncoder: Encoder[Long] = Encoders.scalaLong +}) + +spark.udf.register("agg", agg) + +val df = Seq( + (1, 1), + (1, 5), + (2, 10), + (2, -1), + (4, 7), + (3, 8), + (2, 4)) + .toDF("a", "b") + +df.createOrReplaceTempView("testUDAF") + +-- SQL +SELECT agg(a, b) FROM testUDAF; Review comment: I will change all examples to this format This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
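The `zero` / `reduce` / `merge` / `finish` contract quoted in the diff above can be illustrated without Spark. The following is a minimal plain-Scala sketch, not Spark's implementation: `SimpleAggregator`, `AggregatorSketch`, and `runPartitioned` are hypothetical names introduced here, and encoders are left out entirely. It assumes each partition is folded from `zero` with `reduce`, and the partial buffers are then combined with `merge`.

```scala
// Plain-Scala sketch of the zero/reduce/merge/finish contract; no Spark required.
// SimpleAggregator and AggregatorSketch are hypothetical names for illustration only.
trait SimpleAggregator[IN, BUF, OUT] {
  def zero: BUF
  def reduce(b: BUF, a: IN): BUF
  def merge(b1: BUF, b2: BUF): BUF
  def finish(reduction: BUF): OUT
}

object AggregatorSketch {
  // Sum of the products of two columns, mirroring the example in the diff.
  val sumOfProducts: SimpleAggregator[(Long, Long), Long, Long] =
    new SimpleAggregator[(Long, Long), Long, Long] {
      def zero: Long = 0L
      def reduce(b: Long, a: (Long, Long)): Long = b + a._1 * a._2
      def merge(b1: Long, b2: Long): Long = b1 + b2
      def finish(reduction: Long): Long = reduction
    }

  // Fold each partition from zero, merge the partial buffers, then finish.
  def runPartitioned[IN, BUF, OUT](
      agg: SimpleAggregator[IN, BUF, OUT],
      partitions: Seq[Seq[IN]]): OUT = {
    val partials = partitions.map(_.foldLeft(agg.zero)(agg.reduce))
    agg.finish(partials.foldLeft(agg.zero)(agg.merge))
  }

  // The (a, b) rows from the diff's example.
  val rows: Seq[(Long, Long)] =
    Seq((1L, 1L), (1L, 5L), (2L, 10L), (2L, -1L), (4L, 7L), (3L, 8L), (2L, 4L))
}
```

For the rows in the example, `AggregatorSketch.runPartitioned(AggregatorSketch.sumOfProducts, Seq(AggregatorSketch.rows))` yields 84, and the result does not depend on how the rows are split into partitions because `merge` is plain addition.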
[GitHub] [spark] huaxingao commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference
huaxingao commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference URL: https://github.com/apache/spark/pull/28087#discussion_r402068474 ## File path: docs/sql-ref-functions-udf-aggregate.md ## @@ -1,22 +1,130 @@ --- layout: global -title: User defined Aggregate Functions (UDAF) -displayTitle: User defined Aggregate Functions (UDAF) +title: User Defined Aggregate Functions (UDAFs) +displayTitle: User Defined Aggregate Functions (UDAFs) license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - + http://www.apache.org/licenses/LICENSE-2.0 - + Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --- -**This page is under construction** +### Description + +User-Defined Aggregate Functions (UDAFs) are user-programmable routines that act on multiple rows at once and return a single aggregated value as a result. This documentation lists the classes that are required for creating and registering UDAFs. It also contains examples that demonstrate how to define and register UDAFs in Scala and invoke them in Spark SQL. + +### org.apache.spark.sql.expressions.Aggregator[-IN, BUF, OUT] + +A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value. +- IN The input type for the aggregation. 
+
+- BUF The type of the intermediate value of the reduction.
+
+- OUT The type of the final output result.
+
+ bufferEncoder: Encoder[BUF]
+
+Register a deterministic Java UDF(0-22) instance as user-defined function (UDF).
+
+ finish(reduction: BUF): OUT
+
+Transform the output of the reduction.
+
+ merge(b1: BUF, b2: BUF): BUF
+
+Merge two intermediate values.
+
+ outputEncoder: Encoder[OUT]
+
+Specifies the Encoder for the final output value type.
+
+ reduce(b: BUF, a: IN): BUF
+
+Combine two values to produce a new value. For performance, the function may modify b and return it instead of constructing new object for b.

Review comment: changed to b because ` doesn't work inside html.
[GitHub] [spark] SparkQA removed a comment on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric
SparkQA removed a comment on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric URL: https://github.com/apache/spark/pull/28040#issuecomment-607572692 **[Test build #120703 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120703/testReport)** for PR 28040 at commit [`68848d9`](https://github.com/apache/spark/commit/68848d993a7a2fc7c7b860d4129bf27147c57a81).
[GitHub] [spark] SparkQA commented on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric
SparkQA commented on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric URL: https://github.com/apache/spark/pull/28040#issuecomment-607637753 **[Test build #120703 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120703/testReport)** for PR 28040 at commit [`68848d9`](https://github.com/apache/spark/commit/68848d993a7a2fc7c7b860d4129bf27147c57a81). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] cloud-fan commented on a change in pull request #27897: [SPARK-31113][SQL] Add SHOW VIEWS command
cloud-fan commented on a change in pull request #27897: [SPARK-31113][SQL] Add SHOW VIEWS command URL: https://github.com/apache/spark/pull/27897#discussion_r402067498
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ##
@@ -295,6 +295,41 @@ case class AlterViewAsCommand(
   }
 }

+/**
+ * A command for users to get views in the given database.
+ * If a databaseName is not given, the current database will be used.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   SHOW VIEWS [(IN|FROM) database_name] [[LIKE] 'identifier_with_wildcards'];
+ * }}}
+ */
+case class ShowViewsCommand(
+    databaseName: Option[String],
+    tableIdentifierPattern: Option[String]) extends RunnableCommand {
+
+  // The result of SHOW VIEWS has three basic columns: namespace, viewName and isTemporary.
+  override val output: Seq[Attribute] = Seq(
+    AttributeReference("namespace", StringType, nullable = false)(),
+    AttributeReference("viewName", StringType, nullable = false)(),
+    AttributeReference("isTemporary", BooleanType, nullable = false)())
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    val db = databaseName.getOrElse(catalog.getCurrentDatabase)
+
+    // Show the information of views.
+    val views = tableIdentifierPattern.map(catalog.listViews(db, _))
+      .getOrElse(catalog.listViews(db, "*"))
+    views.map { tableIdent =>
+      val namespace = tableIdent.database.getOrElse("")

Review comment: How about `tableIdent.database.toSeq.quoted`? We should quote the database name if it contains a dot, to be consistent with the view name in the future v2 command.
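The quoting concern raised in this review comment can be shown with a small plain-Scala sketch. `QuotingSketch`, `quoteIfNeeded`, and `quoted` below are hypothetical stand-ins written for illustration; they are not the Spark helper the reviewer refers to, and the escaping rule (wrap in backticks, double any embedded backtick) is an assumption.

```scala
// Hypothetical stand-ins for illustration; not Spark's own helpers.
object QuotingSketch {
  // Wrap a name part in backticks when it contains a dot or a backtick,
  // doubling any embedded backticks (assumed escaping rule).
  def quoteIfNeeded(part: String): String =
    if (part.contains(".") || part.contains("`")) "`" + part.replace("`", "``") + "`"
    else part

  // Quote each part as needed, then join the parts with dots.
  def quoted(parts: Seq[String]): String = parts.map(quoteIfNeeded).mkString(".")
}
```

Under these assumptions an absent database (`Option.empty[String].toSeq`) still renders as an empty string, matching the `getOrElse("")` behavior in the diff, while a name such as `my.db` is rendered with backticks so the dot is not mistaken for a namespace separator.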
[GitHub] [spark] cloud-fan commented on a change in pull request #27897: [SPARK-31113][SQL] Add SHOW VIEWS command
cloud-fan commented on a change in pull request #27897: [SPARK-31113][SQL] Add SHOW VIEWS command URL: https://github.com/apache/spark/pull/27897#discussion_r402067008
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ##
@@ -295,6 +295,41 @@ case class AlterViewAsCommand(
   }
 }

+/**
+ * A command for users to get views in the given database.
+ * If a databaseName is not given, the current database will be used.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *   SHOW VIEWS [(IN|FROM) database_name] [[LIKE] 'identifier_with_wildcards'];
+ * }}}
+ */
+case class ShowViewsCommand(
+    databaseName: Option[String],
+    tableIdentifierPattern: Option[String]) extends RunnableCommand {
+
+  // The result of SHOW VIEWS has three basic columns: database, viewName and isTemporary.
+  override val output: Seq[Attribute] = Seq(
+    AttributeReference("namespace", StringType, nullable = false)(),

Review comment: Let's use namespace to avoid potential breaking changes. `SHOW TABLES` will have a breaking change when we fully migrate v1 commands to v2, but that's inevitable. At least we can avoid it for `SHOW VIEWS`.
[GitHub] [spark] AngersZhuuuu commented on issue #28098: [SPARK-30973][SQL]ScriptTransformationExec should wait for the termination
AngersZhuuuu commented on issue #28098: [SPARK-30973][SQL]ScriptTransformationExec should wait for the termination URL: https://github.com/apache/spark/pull/28098#issuecomment-607630408 cc @Ngone51 @slamke I hit the same problem recently and reproduced it in noserde mode.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28098: [SPARK-30973][SQL]ScriptTransformationExec should wait for the termination
AmplabJenkins removed a comment on issue #28098: [SPARK-30973][SQL]ScriptTransformationExec should wait for the termination URL: https://github.com/apache/spark/pull/28098#issuecomment-607630148 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #28098: [SPARK-30973][SQL]ScriptTransformationExec should wait for the termination
AmplabJenkins commented on issue #28098: [SPARK-30973][SQL]ScriptTransformationExec should wait for the termination URL: https://github.com/apache/spark/pull/28098#issuecomment-607630508 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #28098: [SPARK-30973][SQL]ScriptTransformationExec should wait for the termination
AmplabJenkins commented on issue #28098: [SPARK-30973][SQL]ScriptTransformationExec should wait for the termination URL: https://github.com/apache/spark/pull/28098#issuecomment-607630148 Can one of the admins verify this patch?
[GitHub] [spark] AngersZhuuuu commented on issue #28098: [SPARK-30973][SQL]ScriptTransformationExec should wait for the termination
AngersZhuuuu commented on issue #28098: [SPARK-30973][SQL]ScriptTransformationExec should wait for the termination URL: https://github.com/apache/spark/pull/28098#issuecomment-607630040 @dongjoon-hyun For the error I hit in https://github.com/apache/spark/pull/27983#issuecomment-607599637, I opened a new PR against origin master. If it fixes the problem, I will merge it into https://github.com/apache/spark/pull/27983
[GitHub] [spark] AngersZhuuuu opened a new pull request #28098: [SPARK-30973][SQL]ScriptTransformationExec should wait for the termination
AngersZhuuuu opened a new pull request #28098: [SPARK-30973][SQL]ScriptTransformationExec should wait for the termination URL: https://github.com/apache/spark/pull/28098
### What changes were proposed in this pull request?
In `org.apache.spark.sql.hive.execution.ScriptTransformationExec`, error checking sometimes fails to catch a write error in the subprocess. When `checkFailureAndPropagate` is called, we should always wait for the subprocess to stop.
### Why are the changes needed?
To catch errors in script transform.
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
Added UT
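The idea in this pull request, waiting for the subprocess to stop before checking whether it failed, can be illustrated outside Spark with the JDK process API. This is a minimal sketch under stated assumptions and not the code from the pull request: `WaitForExitSketch` and `exitCodeAfterWait` are hypothetical names, and the timeout parameter is an addition made here for illustration.

```scala
import java.util.concurrent.TimeUnit

// Hypothetical helper for illustration; not the code from the pull request.
object WaitForExitSketch {
  // Start a command and wait (up to timeoutSeconds) for it to exit before reading its status.
  // Returns Some(exitCode) if the process terminated in time, None otherwise.
  def exitCodeAfterWait(command: Seq[String], timeoutSeconds: Long): Option[Int] = {
    val process = new ProcessBuilder(command: _*).inheritIO().start()
    // exitValue() throws IllegalThreadStateException while the process is still running,
    // so wait for termination first.
    if (process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
      Some(process.exitValue())
    } else {
      process.destroy()
      None
    }
  }
}
```

On a POSIX system, `WaitForExitSketch.exitCodeAfterWait(Seq("/bin/sh", "-c", "exit 3"), 10L)` returns `Some(3)`. Checking the status without waiting could observe a process that has not yet exited, which is the kind of race the pull request description points at.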
[GitHub] [spark] AngersZhuuuu commented on issue #27724: [SPARK-30973][SQL] ScriptTransformationExec should wait for the termination …
AngersZhuuuu commented on issue #27724: [SPARK-30973][SQL] ScriptTransformationExec should wait for the termination … URL: https://github.com/apache/spark/pull/27724#issuecomment-607628004
With the test below, `noserde` mode hits a similar problem:
```
test("SPARK-30973 ScriptTransformationExec should wait for the termination") {
  // Run the same scenario 10 times.
  (0 until 10).foreach { index =>
    assume(TestUtils.testCommandAvailable("/bin/bash"))
    val rowsDf = Seq("a", "b", "c").map(Tuple1.apply).toDF("a")
    val e = intercept[SparkException] {
      val plan = new ScriptTransformationExec(
        input = Seq(rowsDf.col("a").expr),
        script = "some_non_existent_command",
        output = Seq(AttributeReference("a", StringType)()),
        child = rowsDf.queryExecution.sparkPlan,
        ioschema = noSerdeIOSchema)
      SparkPlanTest.executePlan(plan, hiveContext)
    }
    assert(e.getMessage.contains("Subprocess exited with status"))
    assert(uncaughtExceptionHandler.exception.isEmpty)
  }
}
```
[GitHub] [spark] AmplabJenkins removed a comment on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies
AmplabJenkins removed a comment on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies URL: https://github.com/apache/spark/pull/27928#issuecomment-607627435 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies
AmplabJenkins removed a comment on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies URL: https://github.com/apache/spark/pull/27928#issuecomment-607627444 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120706/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies
AmplabJenkins commented on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies URL: https://github.com/apache/spark/pull/27928#issuecomment-607627444 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120706/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies
AmplabJenkins commented on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies URL: https://github.com/apache/spark/pull/27928#issuecomment-607627435 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies
SparkQA removed a comment on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies URL: https://github.com/apache/spark/pull/27928#issuecomment-607589044 **[Test build #120706 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120706/testReport)** for PR 27928 at commit [`b088d15`](https://github.com/apache/spark/commit/b088d151c6102e13f124318d8b14e9279da3287a).
[GitHub] [spark] SparkQA commented on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies
SparkQA commented on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies URL: https://github.com/apache/spark/pull/27928#issuecomment-607626894 **[Test build #120706 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120706/testReport)** for PR 27928 at commit [`b088d15`](https://github.com/apache/spark/commit/b088d151c6102e13f124318d8b14e9279da3287a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference
maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference URL: https://github.com/apache/spark/pull/28087#discussion_r402057017 ## File path: docs/sql-ref-functions-udf-scalar.md ## @@ -1,22 +1,187 @@ --- layout: global -title: User defined Scalar Functions (UDF) -displayTitle: User defined Scalar Functions (UDF) +title: Scalar User Defined Functions (UDFs) +displayTitle: Scalar User Defined Functions (UDFs) license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - + http://www.apache.org/licenses/LICENSE-2.0 - + Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --- -**This page is under construction** +### Description + +User-Defined Functions (UDFs) are user-programmable routines that act on one row. This documentation lists the classes that are required for creating and registering UDFs. It also contains examples that demonstrate how to define, register UDFs and invoke them in Spark SQL. + + +### org.apache.spark.sql.expressions.UserDefinedFunction + +A user-defined function. To create one, use the `udf` functions in `functions`. + + + asNonNullable(): UserDefinedFunction + +Updates UserDefinedFunction to non-nullable. + + + + + asNondeterministic(): UserDefinedFunction + +Updates UserDefinedFunction to nondeterministic. + + + + + deterministic: Boolean + +Returns true iff the UDF is deterministic, i.e. 
the UDF produces the same output given the same input.
+
+ nullable: Boolean
+
+Returns true when the UDF can return a nullable value.
+
+ withName(name: String): UserDefinedFunction
+
+Updates UserDefinedFunction with a given name.
+
+### org.apache.spark.sql.UDFRegistration
+
+Functions for registering user-defined functions. Use `SparkSession.udf` to access this: `spark.udf`
+
+ register(name: String, udf: UserDefinedFunction): UserDefinedFunction
+
+Registers a user-defined function (UDF).
+
+### Examples
+
+{% highlight sql %}
+
+// Define and register a zero argument non-deterministic UDF
+// UDF is deterministic by default, i.e. produces the same result for the same input.
+// Scala
+import org.apache.spark.sql.functions.udf
+
+val foo = udf(() => Math.random())
+spark.udf.register("random", foo.asNondeterministic())
+
+-- SQL
+SELECT random();
+
++------------------+
+|UDF()             |
++------------------+
+|0.9199799737037972|
++------------------+
+
+// Define and register a one argument UDF
+// Scala
+import org.apache.spark.sql.functions.udf
+
+val plusOne = udf((x: Int) => x + 1)
+spark.udf.register("plusOne", plusOne)
+
+-- SQL
+SELECT plusOne(5);
+
++------+
+|UDF(5)|
++------+
+|     6|
++------+
+
+// Define a two arguments UDF and register it with Spark in one step

Review comment: `Define a two arguments` -> `Define two arguments`?
[GitHub] [spark] maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference
maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference URL: https://github.com/apache/spark/pull/28087#discussion_r402052896 ## File path: docs/sql-ref-functions-udf-aggregate.md ## @@ -1,22 +1,130 @@ --- layout: global -title: User defined Aggregate Functions (UDAF) -displayTitle: User defined Aggregate Functions (UDAF) +title: User Defined Aggregate Functions (UDAFs) +displayTitle: User Defined Aggregate Functions (UDAFs) license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - + http://www.apache.org/licenses/LICENSE-2.0 - + Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --- -**This page is under construction** +### Description + +User-Defined Aggregate Functions (UDAFs) are user-programmable routines that act on multiple rows at once and return a single aggregated value as a result. This documentation lists the classes that are required for creating and registering UDAFs. It also contains examples that demonstrate how to define and register UDAFs in Scala and invoke them in Spark SQL. + +### org.apache.spark.sql.expressions.Aggregator[-IN, BUF, OUT] + +A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value. +- IN The input type for the aggregation. 
+
+- BUF The type of the intermediate value of the reduction.
+
+- OUT The type of the final output result.
+
+ bufferEncoder: Encoder[BUF]
+
+Register a deterministic Java UDF(0-22) instance as user-defined function (UDF).

Review comment: `as user-defined function (UDF).` -> `as an user-defined function (UDF).`?
[GitHub] [spark] maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference
maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference URL: https://github.com/apache/spark/pull/28087#discussion_r402055366 ## File path: docs/sql-ref-functions-udf-aggregate.md ## @@ -1,22 +1,130 @@ --- layout: global -title: User defined Aggregate Functions (UDAF) -displayTitle: User defined Aggregate Functions (UDAF) +title: User Defined Aggregate Functions (UDAFs) +displayTitle: User Defined Aggregate Functions (UDAFs) license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - + http://www.apache.org/licenses/LICENSE-2.0 - + Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --- -**This page is under construction** +### Description + +User-Defined Aggregate Functions (UDAFs) are user-programmable routines that act on multiple rows at once and return a single aggregated value as a result. This documentation lists the classes that are required for creating and registering UDAFs. It also contains examples that demonstrate how to define and register UDAFs in Scala and invoke them in Spark SQL. + +### org.apache.spark.sql.expressions.Aggregator[-IN, BUF, OUT] + +A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value. +- IN The input type for the aggregation. 
+ +- BUF The type of the intermediate value of the reduction. + +- OUT The type of the final output result. + + + bufferEncoder: Encoder[BUF] + +Register a deterministic Java UDF(0-22) instance as user-defined function (UDF). + + + + + + finish(reduction: BUF): OUT + +Transform the output of the reduction. + + + + + merge(b1: BUF, b2: BUF): BUF + +Merge two intermediate values. + + + + + outputEncoder: Encoder[OUT] + +Specifies the Encoder for the final output value type. + + + + + reduce(b: BUF, a: IN): BUF + +Combine two values to produce a new value. For performance, the function may modify b and return it instead of constructing new object for b. + + + + + zero: BUF + +A zero value for this aggregation. + + + +### org.apache.spark.sql.UDFRegistration + +Functions for registering user-defined functions. Use `SparkSession.udf` to access this: `spark.udf` + + + register(name: String, udf: UserDefinedFunction): UserDefinedFunction + +Registers a user-defined function (UDF). Review comment: nit: `a` -> `an`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference
maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference URL: https://github.com/apache/spark/pull/28087#discussion_r402055366 ## File path: docs/sql-ref-functions-udf-aggregate.md ## @@ -1,22 +1,130 @@ --- layout: global -title: User defined Aggregate Functions (UDAF) -displayTitle: User defined Aggregate Functions (UDAF) +title: User Defined Aggregate Functions (UDAFs) +displayTitle: User Defined Aggregate Functions (UDAFs) license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - + http://www.apache.org/licenses/LICENSE-2.0 - + Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --- -**This page is under construction** +### Description + +User-Defined Aggregate Functions (UDAFs) are user-programmable routines that act on multiple rows at once and return a single aggregated value as a result. This documentation lists the classes that are required for creating and registering UDAFs. It also contains examples that demonstrate how to define and register UDAFs in Scala and invoke them in Spark SQL. + +### org.apache.spark.sql.expressions.Aggregator[-IN, BUF, OUT] + +A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value. +- IN The input type for the aggregation. 
+ +- BUF The type of the intermediate value of the reduction. + +- OUT The type of the final output result. + + + bufferEncoder: Encoder[BUF] + +Register a deterministic Java UDF(0-22) instance as user-defined function (UDF). + + + + + + finish(reduction: BUF): OUT + +Transform the output of the reduction. + + + + + merge(b1: BUF, b2: BUF): BUF + +Merge two intermediate values. + + + + + outputEncoder: Encoder[OUT] + +Specifies the Encoder for the final output value type. + + + + + reduce(b: BUF, a: IN): BUF + +Combine two values to produce a new value. For performance, the function may modify b and return it instead of constructing new object for b. + + + + + zero: BUF + +A zero value for this aggregation. + + + +### org.apache.spark.sql.UDFRegistration + +Functions for registering user-defined functions. Use `SparkSession.udf` to access this: `spark.udf` + + + register(name: String, udf: UserDefinedFunction): UserDefinedFunction + +Registers a user-defined function (UDF). Review comment: nit: `a` -> `an` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference
maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference URL: https://github.com/apache/spark/pull/28087#discussion_r402055226 ## File path: docs/sql-ref-functions-udf-aggregate.md ## @@ -1,22 +1,130 @@ --- layout: global -title: User defined Aggregate Functions (UDAF) -displayTitle: User defined Aggregate Functions (UDAF) +title: User Defined Aggregate Functions (UDAFs) +displayTitle: User Defined Aggregate Functions (UDAFs) license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - + http://www.apache.org/licenses/LICENSE-2.0 - + Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --- -**This page is under construction** +### Description + +User-Defined Aggregate Functions (UDAFs) are user-programmable routines that act on multiple rows at once and return a single aggregated value as a result. This documentation lists the classes that are required for creating and registering UDAFs. It also contains examples that demonstrate how to define and register UDAFs in Scala and invoke them in Spark SQL. + +### org.apache.spark.sql.expressions.Aggregator[-IN, BUF, OUT] + +A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value. +- IN The input type for the aggregation. 
+ +- BUF The type of the intermediate value of the reduction. + +- OUT The type of the final output result. + + + bufferEncoder: Encoder[BUF] + +Register a deterministic Java UDF(0-22) instance as user-defined function (UDF). + + + + + + finish(reduction: BUF): OUT + +Transform the output of the reduction. + + + + + merge(b1: BUF, b2: BUF): BUF + +Merge two intermediate values. + + + + + outputEncoder: Encoder[OUT] + +Specifies the Encoder for the final output value type. + + + + + reduce(b: BUF, a: IN): BUF + +Combine two values to produce a new value. For performance, the function may modify b and return it instead of constructing new object for b. + + + + + zero: BUF + +A zero value for this aggregation. + + + +### org.apache.spark.sql.UDFRegistration Review comment: ditto: how about "Interface of `UDFRegistration`"? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference
maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference URL: https://github.com/apache/spark/pull/28087#discussion_r402055097 ## File path: docs/sql-ref-functions-udf-aggregate.md ## @@ -1,22 +1,130 @@ --- layout: global -title: User defined Aggregate Functions (UDAF) -displayTitle: User defined Aggregate Functions (UDAF) +title: User Defined Aggregate Functions (UDAFs) +displayTitle: User Defined Aggregate Functions (UDAFs) license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - + http://www.apache.org/licenses/LICENSE-2.0 - + Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --- -**This page is under construction** +### Description + +User-Defined Aggregate Functions (UDAFs) are user-programmable routines that act on multiple rows at once and return a single aggregated value as a result. This documentation lists the classes that are required for creating and registering UDAFs. It also contains examples that demonstrate how to define and register UDAFs in Scala and invoke them in Spark SQL. + +### org.apache.spark.sql.expressions.Aggregator[-IN, BUF, OUT] Review comment: How about "Interface of `Aggregator[-IN, BUF, OUT]`"? Do we need the fully qualified name here? This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] kiszk edited a comment on issue #28041: [SPARK-30564][SQL] Improved extra new line and comment remove
kiszk edited a comment on issue #28041: [SPARK-30564][SQL] Improved extra new line and comment remove URL: https://github.com/apache/spark/pull/28041#issuecomment-607624670 Got it. @maropu pointed out to me that `registerComment` translates a comment to an empty string by default. I overlooked that `spark.sql.codegen.comments` is false by default. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] kiszk commented on issue #28041: [SPARK-30564][SQL] Improved extra new line and comment remove
kiszk commented on issue #28041: [SPARK-30564][SQL] Improved extra new line and comment remove URL: https://github.com/apache/spark/pull/28041#issuecomment-607624670 Got it. @maropu pointed out to me that `registerComment` translates a comment to an empty string by default. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference
maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference URL: https://github.com/apache/spark/pull/28087#discussion_r402054064 ## File path: docs/sql-ref-functions-udf-aggregate.md ## @@ -1,22 +1,130 @@ --- layout: global -title: User defined Aggregate Functions (UDAF) -displayTitle: User defined Aggregate Functions (UDAF) +title: User Defined Aggregate Functions (UDAFs) +displayTitle: User Defined Aggregate Functions (UDAFs) license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - + http://www.apache.org/licenses/LICENSE-2.0 - + Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --- -**This page is under construction** +### Description + +User-Defined Aggregate Functions (UDAFs) are user-programmable routines that act on multiple rows at once and return a single aggregated value as a result. This documentation lists the classes that are required for creating and registering UDAFs. It also contains examples that demonstrate how to define and register UDAFs in Scala and invoke them in Spark SQL. + +### org.apache.spark.sql.expressions.Aggregator[-IN, BUF, OUT] + +A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value. +- IN The input type for the aggregation. 
+ +- BUF The type of the intermediate value of the reduction. + +- OUT The type of the final output result. + + + bufferEncoder: Encoder[BUF] + +Register a deterministic Java UDF(0-22) instance as user-defined function (UDF). + + + + + + finish(reduction: BUF): OUT + +Transform the output of the reduction. + + + + + merge(b1: BUF, b2: BUF): BUF + +Merge two intermediate values. + + + + + outputEncoder: Encoder[OUT] + +Specifies the Encoder for the final output value type. + + + + + reduce(b: BUF, a: IN): BUF + +Combine two values to produce a new value. For performance, the function may modify b and return it instead of constructing new object for b. Review comment: better to wrap b with \`, e.g., \`b\`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
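The `reduce` description quoted in the comment above says the function may modify `b` and return it rather than constructing a new object. A minimal plain-Scala sketch of that pattern follows; `AvgBuffer` and `InPlaceReduce` are hypothetical names used only for illustration and are not part of Spark, and the sketch assumes each fold starts from a freshly created buffer so that mutation is never shared.

```scala
// Hypothetical mutable buffer; illustrative only, not a Spark class.
final class AvgBuffer(var sum: Long, var count: Long)

object InPlaceReduce {
  // Updates `b` in place and returns the same instance,
  // so no new buffer object is allocated per input row.
  def reduce(b: AvgBuffer, a: Long): AvgBuffer = {
    b.sum += a
    b.count += 1
    b
  }

  def main(args: Array[String]): Unit = {
    // A fresh buffer is created here, so the in-place updates are not shared.
    val buf = Seq(3L, 5L, 10L).foldLeft(new AvgBuffer(0L, 0L))(reduce)
    println(buf.sum.toDouble / buf.count) // prints 6.0
  }
}
```

Wrapping `b` in backticks, as the reviewer suggests, makes it clearer in the rendered doc that the sentence refers to this first parameter of `reduce`.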
[GitHub] [spark] maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference
maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference URL: https://github.com/apache/spark/pull/28087#discussion_r402053134 ## File path: docs/sql-ref-functions-udf-aggregate.md ## @@ -1,22 +1,130 @@ --- layout: global -title: User defined Aggregate Functions (UDAF) -displayTitle: User defined Aggregate Functions (UDAF) +title: User Defined Aggregate Functions (UDAFs) +displayTitle: User Defined Aggregate Functions (UDAFs) license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - + http://www.apache.org/licenses/LICENSE-2.0 - + Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --- -**This page is under construction** +### Description + +User-Defined Aggregate Functions (UDAFs) are user-programmable routines that act on multiple rows at once and return a single aggregated value as a result. This documentation lists the classes that are required for creating and registering UDAFs. It also contains examples that demonstrate how to define and register UDAFs in Scala and invoke them in Spark SQL. + +### org.apache.spark.sql.expressions.Aggregator[-IN, BUF, OUT] + +A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value. +- IN The input type for the aggregation. 
+ +- BUF The type of the intermediate value of the reduction. + +- OUT The type of the final output result. + + + bufferEncoder: Encoder[BUF] + +Register a deterministic Java UDF(0-22) instance as user-defined function (UDF). + + + + + + finish(reduction: BUF): OUT + +Transform the output of the reduction. + + + + + merge(b1: BUF, b2: BUF): BUF + +Merge two intermediate values. + + + + + outputEncoder: Encoder[OUT] + +Specifies the Encoder for the final output value type. + + + + + reduce(b: BUF, a: IN): BUF + +Combine two values to produce a new value. For performance, the function may modify b and return it instead of constructing new object for b. Review comment: `a new value` -> `a new aggregated value`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference
maropu commented on a change in pull request #28087: [SPARK-31319][SQL][DOCS] Document UDFs/UDAFs in SQL Reference URL: https://github.com/apache/spark/pull/28087#discussion_r402052361 ## File path: docs/sql-ref-functions-udf-aggregate.md ## @@ -1,22 +1,130 @@ --- layout: global -title: User defined Aggregate Functions (UDAF) -displayTitle: User defined Aggregate Functions (UDAF) +title: User Defined Aggregate Functions (UDAFs) +displayTitle: User Defined Aggregate Functions (UDAFs) license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at - + http://www.apache.org/licenses/LICENSE-2.0 - + Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --- -**This page is under construction** +### Description + +User-Defined Aggregate Functions (UDAFs) are user-programmable routines that act on multiple rows at once and return a single aggregated value as a result. This documentation lists the classes that are required for creating and registering UDAFs. It also contains examples that demonstrate how to define and register UDAFs in Scala and invoke them in Spark SQL. + +### org.apache.spark.sql.expressions.Aggregator[-IN, BUF, OUT] + +A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value. +- IN The input type for the aggregation. 
+ +- BUF The type of the intermediate value of the reduction. + +- OUT The type of the final output result. + + + bufferEncoder: Encoder[BUF] + +Register a deterministic Java UDF(0-22) instance as user-defined function (UDF). + + + + + + finish(reduction: BUF): OUT + +Transform the output of the reduction. + + + + + merge(b1: BUF, b2: BUF): BUF + +Merge two intermediate values. + + + + + outputEncoder: Encoder[OUT] + +Specifies the Encoder for the final output value type. + + + + + reduce(b: BUF, a: IN): BUF + +Combine two values to produce a new value. For performance, the function may modify b and return it instead of constructing new object for b. + + + + + zero: BUF + +A zero value for this aggregation. + + + +### org.apache.spark.sql.UDFRegistration + +Functions for registering user-defined functions. Use `SparkSession.udf` to access this: `spark.udf` + + + register(name: String, udf: UserDefinedFunction): UserDefinedFunction + +Registers a user-defined function (UDF). + + + +### Examples + +{% highlight sql %} + +// Define and register a UDAF to calculate the sum of product of two columns +// Scala +import org.apache.spark.sql.expressions.Aggregator +import org.apache.spark.sql.functions.udaf + +val agg = udaf(new Aggregator[(Long, Long), Long, Long] { + def zero: Long = 0L + def reduce(b: Long, a: (Long, Long)): Long = b + (a._1 * a._2) + def merge(b1: Long, b2: Long): Long = b1 + b2 + def finish(r: Long): Long = r + def bufferEncoder: Encoder[Long] = Encoders.scalaLong + def outputEncoder: Encoder[Long] = Encoders.scalaLong +}) + +spark.udf.register("agg", agg) + +val df = Seq( + (1, 1), + (1, 5), + (2, 10), + (2, -1), + (4, 7), + (3, 8), + (2, 4)) + .toDF("a", "b") + +df.createOrReplaceTempView("testUDAF") + +-- SQL +SELECT agg(a, b) FROM testUDAF; Review comment: If this is a Scala case, `sql("SELECT agg(a, b) FROM testUDAF").show()` is better? This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
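To make the roles of `zero`, `reduce`, `merge` and `finish` in the quoted example easier to follow, here is a plain-Scala sketch that runs the same sum-of-products logic without Spark. `SimpleAggregator`, `SumOfProducts` and `AggregatorSketch` are hypothetical stand-ins that only mirror the method names from the diff; they are not Spark's `Aggregator`, and splitting the rows into groups of three merely imitates partitioning.

```scala
// Hypothetical stand-in mirroring the documented method names; not Spark's class.
trait SimpleAggregator[IN, BUF, OUT] {
  def zero: BUF
  def reduce(b: BUF, a: IN): BUF
  def merge(b1: BUF, b2: BUF): BUF
  def finish(reduction: BUF): OUT
}

// Same logic as the example in the diff: sum of the product of two columns.
object SumOfProducts extends SimpleAggregator[(Long, Long), Long, Long] {
  def zero: Long = 0L
  def reduce(b: Long, a: (Long, Long)): Long = b + a._1 * a._2
  def merge(b1: Long, b2: Long): Long = b1 + b2
  def finish(reduction: Long): Long = reduction
}

object AggregatorSketch {
  // Fold each partition from `zero`, merge the partial buffers, then finish.
  def run[IN, BUF, OUT](agg: SimpleAggregator[IN, BUF, OUT],
                        partitions: Seq[Seq[IN]]): OUT = {
    val partials = partitions.map(_.foldLeft(agg.zero)(agg.reduce))
    agg.finish(partials.foldLeft(agg.zero)(agg.merge))
  }

  def main(args: Array[String]): Unit = {
    // Same rows as the quoted example, written as Long pairs.
    val rows = Seq((1L, 1L), (1L, 5L), (2L, 10L), (2L, -1L),
                   (4L, 7L), (3L, 8L), (2L, 4L))
    val partitions = rows.grouped(3).toSeq // imitate three partitions
    println(run(SumOfProducts, partitions)) // prints 84
  }
}
```

In a Spark session, the reviewer's suggested form `sql("SELECT agg(a, b) FROM testUDAF").show()` would play the role of `run` here, with Spark deciding how rows are partitioned.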
[GitHub] [spark] maropu commented on issue #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf
maropu commented on issue #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf URL: https://github.com/apache/spark/pull/28097#issuecomment-607620515 @dongjoon-hyun @cloud-fan @gengliangwang How about this update? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf
AmplabJenkins removed a comment on issue #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf URL: https://github.com/apache/spark/pull/28097#issuecomment-607620214 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf
AmplabJenkins removed a comment on issue #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf URL: https://github.com/apache/spark/pull/28097#issuecomment-607620219 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25407/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf
AmplabJenkins commented on issue #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf URL: https://github.com/apache/spark/pull/28097#issuecomment-607620219 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25407/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf
AmplabJenkins commented on issue #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf URL: https://github.com/apache/spark/pull/28097#issuecomment-607620214 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf
SparkQA commented on issue #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf URL: https://github.com/apache/spark/pull/28097#issuecomment-607619951 **[Test build #120709 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120709/testReport)** for PR 28097 at commit [`0cad76a`](https://github.com/apache/spark/commit/0cad76abbd6c6bf214d020110daea4afb8a4e940).
[GitHub] [spark] maropu opened a new pull request #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf
maropu opened a new pull request #28097: [SPARK-31325][SQL][Web UI] Control a plan explain mode in the events of SQL listeners via SQLConf URL: https://github.com/apache/spark/pull/28097 ### What changes were proposed in this pull request? This PR adds a new SQL config that controls the plan explain mode used in SQL listener events (e.g., `SparkListenerSQLExecutionStart` and `SparkListenerSQLAdaptiveExecutionUpdate`). In the current master, these events store the output of `QueryExecution.toString`, which is equivalent to the "extended" explain mode. I think it is useful to control this content via `SQLConf`. For example, the query "Details" content (TPCDS q66 query) in the SQL tab of the Spark web UI changes as follows. Before this PR: ![q66-extended](https://user-images.githubusercontent.com/692303/78211668-950b4580-74e8-11ea-90c6-db52d437534b.png) After this PR: ![q66-formatted](https://user-images.githubusercontent.com/692303/78211674-9ccaea00-74e8-11ea-9d1d-43c7e2b0f314.png) ### Why are the changes needed? For better usability. ### Does this PR introduce any user-facing change? Yes; since Spark 3.1, the `formatted` query explain mode is used in the events (e.g., `SparkListenerSQLExecutionStart` and `SparkListenerSQLAdaptiveExecutionUpdate`) of SQL listeners. To restore the behavior before Spark 3.0, you can set `spark.sql.queryDescriptionModeInListeners` to `extended`. ### How was this patch tested? Added unit tests.
[GitHub] [spark] HeartSaVioR commented on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric
HeartSaVioR commented on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric URL: https://github.com/apache/spark/pull/28040#issuecomment-607607804 Looks like the flaky test is below (and this PR made it flaky), not in streaming aggregation. ``` org.apache.spark.sql.kafka010.KafkaSinkMicroBatchStreamingSuite.streaming - sink progress is produced ```
[GitHub] [spark] AmplabJenkins removed a comment on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests
AmplabJenkins removed a comment on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests URL: https://github.com/apache/spark/pull/28085#issuecomment-607604786 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120693/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests
AmplabJenkins removed a comment on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests URL: https://github.com/apache/spark/pull/28085#issuecomment-607604784 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests
SparkQA removed a comment on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests URL: https://github.com/apache/spark/pull/28085#issuecomment-607488934 **[Test build #120693 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120693/testReport)** for PR 28085 at commit [`2d754f7`](https://github.com/apache/spark/commit/2d754f7e630bab3b10dfebe142698067e667121e).
[GitHub] [spark] AmplabJenkins commented on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests
AmplabJenkins commented on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests URL: https://github.com/apache/spark/pull/28085#issuecomment-607604786 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120693/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests
AmplabJenkins commented on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests URL: https://github.com/apache/spark/pull/28085#issuecomment-607604784 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests
SparkQA commented on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests URL: https://github.com/apache/spark/pull/28085#issuecomment-607604514 **[Test build #120693 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120693/testReport)** for PR 28085 at commit [`2d754f7`](https://github.com/apache/spark/commit/2d754f7e630bab3b10dfebe142698067e667121e). * This patch **fails from timeout after a configured wait of `400m`**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests
AmplabJenkins removed a comment on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests URL: https://github.com/apache/spark/pull/28085#issuecomment-607603105 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120692/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message
AmplabJenkins commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message URL: https://github.com/apache/spark/pull/28095#issuecomment-607603286 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message
AmplabJenkins removed a comment on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message URL: https://github.com/apache/spark/pull/28095#issuecomment-607603286 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message
AmplabJenkins removed a comment on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message URL: https://github.com/apache/spark/pull/28095#issuecomment-607603291 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120697/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message
AmplabJenkins commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message URL: https://github.com/apache/spark/pull/28095#issuecomment-607603291 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120697/ Test PASSed.
[GitHub] [spark] cloud-fan commented on issue #28090: [SPARK-31321][SQL] Remove SaveMode check in v2 FileWriteBuilder
cloud-fan commented on issue #28090: [SPARK-31321][SQL] Remove SaveMode check in v2 FileWriteBuilder URL: https://github.com/apache/spark/pull/28090#issuecomment-607603165 I'm OK with removing dead code; this was probably left over from when we removed `SaveMode` from the v2 API. Can we make the PR description clearer?
[GitHub] [spark] AmplabJenkins commented on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests
AmplabJenkins commented on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests URL: https://github.com/apache/spark/pull/28085#issuecomment-607603099 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests
AmplabJenkins removed a comment on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests URL: https://github.com/apache/spark/pull/28085#issuecomment-607603099 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests
AmplabJenkins commented on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests URL: https://github.com/apache/spark/pull/28085#issuecomment-607603105 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120692/ Test FAILed.
[GitHub] [spark] HyukjinKwon closed pull request #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message
HyukjinKwon closed pull request #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message URL: https://github.com/apache/spark/pull/28095
[GitHub] [spark] HyukjinKwon commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message
HyukjinKwon commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message URL: https://github.com/apache/spark/pull/28095#issuecomment-607602945 Merged to master and branch-3.0.
[GitHub] [spark] SparkQA commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message
SparkQA commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message URL: https://github.com/apache/spark/pull/28095#issuecomment-607602790 **[Test build #120697 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120697/testReport)** for PR 28095 at commit [`062398e`](https://github.com/apache/spark/commit/062398e0f68d238c2d3c20276d9345053cc21083). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests
SparkQA removed a comment on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests URL: https://github.com/apache/spark/pull/28085#issuecomment-607485698 **[Test build #120692 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120692/testReport)** for PR 28085 at commit [`6a90fbe`](https://github.com/apache/spark/commit/6a90fbed75bfdc67d45807295d6d093e6afaf778).
[GitHub] [spark] SparkQA removed a comment on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message
SparkQA removed a comment on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message URL: https://github.com/apache/spark/pull/28095#issuecomment-607537035 **[Test build #120697 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120697/testReport)** for PR 28095 at commit [`062398e`](https://github.com/apache/spark/commit/062398e0f68d238c2d3c20276d9345053cc21083).
[GitHub] [spark] SparkQA commented on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests
SparkQA commented on issue #28085: [SPARK-29641][PYTHON][CORE] Stage Level Sched: Add python api's and tests URL: https://github.com/apache/spark/pull/28085#issuecomment-607602819 **[Test build #120692 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120692/testReport)** for PR 28085 at commit [`6a90fbe`](https://github.com/apache/spark/commit/6a90fbed75bfdc67d45807295d6d093e6afaf778). * This patch **fails from timeout after a configured wait of `400m`**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric
AmplabJenkins removed a comment on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric URL: https://github.com/apache/spark/pull/28040#issuecomment-607602162 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120701/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message
AmplabJenkins removed a comment on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message URL: https://github.com/apache/spark/pull/28095#issuecomment-607602313 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message
AmplabJenkins commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message URL: https://github.com/apache/spark/pull/28095#issuecomment-607602313 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric
AmplabJenkins removed a comment on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric URL: https://github.com/apache/spark/pull/28040#issuecomment-607602159 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message
AmplabJenkins commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message URL: https://github.com/apache/spark/pull/28095#issuecomment-607602318 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25406/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message
AmplabJenkins removed a comment on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message URL: https://github.com/apache/spark/pull/28095#issuecomment-607602318 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25406/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric
SparkQA commented on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric URL: https://github.com/apache/spark/pull/28040#issuecomment-607602021 **[Test build #120701 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120701/testReport)** for PR 28040 at commit [`d501127`](https://github.com/apache/spark/commit/d50112747aa14a17e8ebe52bf8935736fa25edf0). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric
AmplabJenkins commented on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric URL: https://github.com/apache/spark/pull/28040#issuecomment-607602162 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120701/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric
AmplabJenkins commented on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric URL: https://github.com/apache/spark/pull/28040#issuecomment-607602159 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric
SparkQA removed a comment on issue #28040: [DO NOT MERGE][SPARK-31278][SS] Fix StreamingQuery output rows metric URL: https://github.com/apache/spark/pull/28040#issuecomment-607557366 **[Test build #120701 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120701/testReport)** for PR 28040 at commit [`d501127`](https://github.com/apache/spark/commit/d50112747aa14a17e8ebe52bf8935736fa25edf0).
[GitHub] [spark] SparkQA commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message
SparkQA commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message URL: https://github.com/apache/spark/pull/28095#issuecomment-607601973 **[Test build #120708 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120708/testReport)** for PR 28095 at commit [`062398e`](https://github.com/apache/spark/commit/062398e0f68d238c2d3c20276d9345053cc21083).
[GitHub] [spark] Eric5553 commented on a change in pull request #27897: [SPARK-31113][SQL] Add SHOW VIEWS command
Eric5553 commented on a change in pull request #27897: [SPARK-31113][SQL] Add SHOW VIEWS command URL: https://github.com/apache/spark/pull/27897#discussion_r402031682 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ## @@ -295,6 +295,41 @@ case class AlterViewAsCommand( } } +/** + * A command for users to get views in the given database. + * If a databaseName is not given, the current database will be used. + * The syntax of using this command in SQL is: + * {{{ + * SHOW VIEWS [(IN|FROM) database_name] [[LIKE] 'identifier_with_wildcards']; + * }}} + */ +case class ShowViewsCommand( +databaseName: Option[String], +tableIdentifierPattern: Option[String]) extends RunnableCommand { + + // The result of SHOW VIEWS has three basic columns: database, viewName and isTemporary. + override val output: Seq[Attribute] = Seq( +AttributeReference("namespace", StringType, nullable = false)(), Review comment: IMO, maybe we should use `namespace` for consistency? Anyway, I'm changing the comment (line 310) to match the current code/doc, and waiting for the final decision. Thanks.
[GitHub] [spark] AmplabJenkins removed a comment on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message
AmplabJenkins removed a comment on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message URL: https://github.com/apache/spark/pull/28095#issuecomment-607537274 Can one of the admins verify this patch?
[GitHub] [spark] HyukjinKwon commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message
HyukjinKwon commented on issue #28095: [SPARK-31324][SS] Include stream ID in the termination timeout error message URL: https://github.com/apache/spark/pull/28095#issuecomment-607601318 ok to test
[GitHub] [spark] Eric5553 commented on a change in pull request #27897: [SPARK-31113][SQL] Add SHOW VIEWS command
Eric5553 commented on a change in pull request #27897: [SPARK-31113][SQL] Add SHOW VIEWS command URL: https://github.com/apache/spark/pull/27897#discussion_r402031682 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ## @@ -295,6 +295,41 @@ case class AlterViewAsCommand( } } +/** + * A command for users to get views in the given database. + * If a databaseName is not given, the current database will be used. + * The syntax of using this command in SQL is: + * {{{ + * SHOW VIEWS [(IN|FROM) database_name] [[LIKE] 'identifier_with_wildcards']; + * }}} + */ +case class ShowViewsCommand( +databaseName: Option[String], +tableIdentifierPattern: Option[String]) extends RunnableCommand { + + // The result of SHOW VIEWS has three basic columns: database, viewName and isTemporary. + override val output: Seq[Attribute] = Seq( +AttributeReference("namespace", StringType, nullable = false)(), Review comment: IMO, as cloud-fan suggested, maybe we should use `namespace` for consistency? Anyway, I'm changing the comment (line 310) to match the current code/doc, and waiting for the final decision. Thanks.
[GitHub] [spark] AmplabJenkins commented on issue #27897: [SPARK-31113][SQL] Add SHOW VIEWS command
AmplabJenkins commented on issue #27897: [SPARK-31113][SQL] Add SHOW VIEWS command URL: https://github.com/apache/spark/pull/27897#issuecomment-607600424 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25405/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27897: [SPARK-31113][SQL] Add SHOW VIEWS command
AmplabJenkins removed a comment on issue #27897: [SPARK-31113][SQL] Add SHOW VIEWS command URL: https://github.com/apache/spark/pull/27897#issuecomment-607600417 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27897: [SPARK-31113][SQL] Add SHOW VIEWS command
AmplabJenkins commented on issue #27897: [SPARK-31113][SQL] Add SHOW VIEWS command URL: https://github.com/apache/spark/pull/27897#issuecomment-607600417 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27897: [SPARK-31113][SQL] Add SHOW VIEWS command
AmplabJenkins removed a comment on issue #27897: [SPARK-31113][SQL] Add SHOW VIEWS command URL: https://github.com/apache/spark/pull/27897#issuecomment-607600424 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25405/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #27897: [SPARK-31113][SQL] Add SHOW VIEWS command
SparkQA commented on issue #27897: [SPARK-31113][SQL] Add SHOW VIEWS command URL: https://github.com/apache/spark/pull/27897#issuecomment-607600026 **[Test build #120707 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120707/testReport)** for PR 27897 at commit [`55a5997`](https://github.com/apache/spark/commit/55a5997bdfe34439a171d3603fc98975b7c18129).
[GitHub] [spark] Eric5553 commented on a change in pull request #27897: [SPARK-31113][SQL] Add SHOW VIEWS command
Eric5553 commented on a change in pull request #27897: [SPARK-31113][SQL] Add SHOW VIEWS command URL: https://github.com/apache/spark/pull/27897#discussion_r402030769 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ## @@ -860,7 +860,7 @@ case class ShowTablesCommand( } override def run(sparkSession: SparkSession): Seq[Row] = { -// Since we need to return a Seq of rows, we will call getTables directly +// Since we need to return a Seq of rows, we will call listTables directly Review comment: Sure, reverted the changes in 55a5997bdfe34439a171d3603fc98975b7c18129.
[GitHub] [spark] AngersZhuuuu commented on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core
AngersZhuuuu commented on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core URL: https://github.com/apache/spark/pull/27983#issuecomment-607599637 @dongjoon-hyun In the latest test run, ``` org.apache.spark.sql.execution.script.ScriptTransformationSuite.SPARK-14400 script transformation should fail for bad script command ``` sometimes does not seem to catch the subprocess's exception. I ran it many times; it sometimes fails and sometimes succeeds. There may be a problem with how the subprocess's exception is caught.
[GitHub] [spark] Eric5553 commented on a change in pull request #27897: [SPARK-31113][SQL] Add SHOW VIEWS command
Eric5553 commented on a change in pull request #27897: [SPARK-31113][SQL] Add SHOW VIEWS command URL: https://github.com/apache/spark/pull/27897#discussion_r402030477 ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -230,22 +230,19 @@ class ResolveSessionCatalog( case d @ DescribeNamespace(SessionCatalogAndNamespace(_, ns), _) => if (ns.length != 1) { -throw new AnalysisException( - s"The database name is not valid: ${ns.quoted}") +throw new AnalysisException(s"The database name is not valid: ${ns.quoted}") Review comment: Sure, I've reverted these untouched lines. Thanks as always @dongjoon-hyun @maropu !
[GitHub] [spark] AngersZhuuuu commented on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core
AngersZhuuuu commented on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core URL: https://github.com/apache/spark/pull/27983#issuecomment-607593288 > Hi, @AngersZhuuuu. This is too big as a follow-up. Please create a new JIRA issue for this. It seems there is no need to create a new JIRA issue? Can we just delete the follow-up?
[GitHub] [spark] AmplabJenkins removed a comment on issue #28096: [SPARK-31295][DOC][FOLLOWUP] Supplement version for configuration appear in doc
AmplabJenkins removed a comment on issue #28096: [SPARK-31295][DOC][FOLLOWUP] Supplement version for configuration appear in doc URL: https://github.com/apache/spark/pull/28096#issuecomment-607592191 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #28096: [SPARK-31295][DOC][FOLLOWUP] Supplement version for configuration appear in doc
SparkQA removed a comment on issue #28096: [SPARK-31295][DOC][FOLLOWUP] Supplement version for configuration appear in doc URL: https://github.com/apache/spark/pull/28096#issuecomment-607589045 **[Test build #120705 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120705/testReport)** for PR 28096 at commit [`88d955f`](https://github.com/apache/spark/commit/88d955fcb061461743575833677fbe8388af845d).
[GitHub] [spark] AmplabJenkins removed a comment on issue #28096: [SPARK-31295][DOC][FOLLOWUP] Supplement version for configuration appear in doc
AmplabJenkins removed a comment on issue #28096: [SPARK-31295][DOC][FOLLOWUP] Supplement version for configuration appear in doc URL: https://github.com/apache/spark/pull/28096#issuecomment-607592194 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120705/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28096: [SPARK-31295][DOC][FOLLOWUP] Supplement version for configuration appear in doc
AmplabJenkins commented on issue #28096: [SPARK-31295][DOC][FOLLOWUP] Supplement version for configuration appear in doc URL: https://github.com/apache/spark/pull/28096#issuecomment-607592194 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120705/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #28096: [SPARK-31295][DOC][FOLLOWUP] Supplement version for configuration appear in doc
AmplabJenkins commented on issue #28096: [SPARK-31295][DOC][FOLLOWUP] Supplement version for configuration appear in doc URL: https://github.com/apache/spark/pull/28096#issuecomment-607592191 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #28096: [SPARK-31295][DOC][FOLLOWUP] Supplement version for configuration appear in doc
SparkQA commented on issue #28096: [SPARK-31295][DOC][FOLLOWUP] Supplement version for configuration appear in doc URL: https://github.com/apache/spark/pull/28096#issuecomment-607592096 **[Test build #120705 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120705/testReport)** for PR 28096 at commit [`88d955f`](https://github.com/apache/spark/commit/88d955fcb061461743575833677fbe8388af845d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core
AmplabJenkins removed a comment on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core URL: https://github.com/apache/spark/pull/27983#issuecomment-607589889 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120696/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core
AmplabJenkins removed a comment on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core URL: https://github.com/apache/spark/pull/27983#issuecomment-607589882 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core
AmplabJenkins commented on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core URL: https://github.com/apache/spark/pull/27983#issuecomment-607589889 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/120696/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core
SparkQA commented on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core URL: https://github.com/apache/spark/pull/27983#issuecomment-607589680 **[Test build #120696 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120696/testReport)** for PR 27983 at commit [`0d9c437`](https://github.com/apache/spark/commit/0d9c43788b0dfd33a9e37fe7c0dc6d1e1d238f86). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core
SparkQA removed a comment on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core URL: https://github.com/apache/spark/pull/27983#issuecomment-607521027 **[Test build #120696 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120696/testReport)** for PR 27983 at commit [`0d9c437`](https://github.com/apache/spark/commit/0d9c43788b0dfd33a9e37fe7c0dc6d1e1d238f86).
[GitHub] [spark] AmplabJenkins commented on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core
AmplabJenkins commented on issue #27983: [SPARK-15694][SQL]Implement ScriptTransformation in sql/core URL: https://github.com/apache/spark/pull/27983#issuecomment-607589882 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies
AmplabJenkins removed a comment on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies URL: https://github.com/apache/spark/pull/27928#issuecomment-607589450 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies
AmplabJenkins commented on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies URL: https://github.com/apache/spark/pull/27928#issuecomment-607589450 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies
AmplabJenkins removed a comment on issue #27928: [SPARK-31167][BUILD] Refactor how we track Python test/build dependencies URL: https://github.com/apache/spark/pull/27928#issuecomment-607589458 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/25404/ Test PASSed.