[GitHub] [spark] SparkQA removed a comment on pull request #32552: [SPARK-34819][SQL] MapType supports comparable semantics
SparkQA removed a comment on pull request #32552: URL: https://github.com/apache/spark/pull/32552#issuecomment-871011016 **[Test build #140411 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140411/testReport)** for PR 32552 at commit [`29dd475`](https://github.com/apache/spark/commit/29dd475457d1285257478ca866cff600e3f34a26). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #33093: [SPARK-35897][SS][WIP] Support user defined initial state with flatMapGroupsWithState in Structured Streaming
SparkQA removed a comment on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-871030899 **[Test build #140413 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140413/testReport)** for PR 33093 at commit [`f221358`](https://github.com/apache/spark/commit/f221358daab3c8fcc3fb178b8c76ee0861bfbdae).
[GitHub] [spark] SparkQA commented on pull request #32552: [SPARK-34819][SQL] MapType supports comparable semantics
SparkQA commented on pull request #32552: URL: https://github.com/apache/spark/pull/32552#issuecomment-871118314 **[Test build #140411 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140411/testReport)** for PR 32552 at commit [`29dd475`](https://github.com/apache/spark/commit/29dd475457d1285257478ca866cff600e3f34a26).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
   * `case class SortMapKeys(child: Expression) extends UnaryExpression with ExpectsInputTypes`
[GitHub] [spark] SparkQA commented on pull request #33093: [SPARK-35897][SS][WIP] Support user defined initial state with flatMapGroupsWithState in Structured Streaming
SparkQA commented on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-871117789 **[Test build #140413 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140413/testReport)** for PR 33093 at commit [`f221358`](https://github.com/apache/spark/commit/f221358daab3c8fcc3fb178b8c76ee0861bfbdae).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] dongjoon-hyun commented on pull request #33130: [SPARK-35928][BUILD] Upgrade ASM to 9.1
dongjoon-hyun commented on pull request #33130: URL: https://github.com/apache/spark/pull/33130#issuecomment-871116923

Here is an update. Although the Jenkins status is super noisy due to the timeout, we got the green light for at least the following.
- Maven with Hadoop 2.7/Java11: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7-jdk-11/
- SBT with Hadoop 3.2: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-3.2/

I'll update here when I collect more info.
[GitHub] [spark] gengliangwang closed pull request #33138: [SPARK-35937][SQL] Extracting date field from timestamp should work in ANSI mode
gengliangwang closed pull request #33138: URL: https://github.com/apache/spark/pull/33138
[GitHub] [spark] gengliangwang commented on pull request #33138: [SPARK-35937][SQL] Extracting date field from timestamp should work in ANSI mode
gengliangwang commented on pull request #33138: URL: https://github.com/apache/spark/pull/33138#issuecomment-871116358 Merging to master
[GitHub] [spark] cfmcgrady commented on a change in pull request #33146: [WIP][SPARK-35912][SQL] Fix cast struct contains null value to string
cfmcgrady commented on a change in pull request #33146: URL: https://github.com/apache/spark/pull/33146#discussion_r661147901

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ##

@@ -1098,29 +1100,37 @@ abstract class CastBase extends UnaryExpression with TimeZoneAwareExpression wit
   }

   private def writeStructToStringBuilder(
-      st: Seq[DataType],
+      st: Seq[StructField],
       row: ExprValue,
       buffer: ExprValue,
       ctx: CodegenContext): Block = {
-    val structToStringCode = st.zipWithIndex.map { case (ft, i) =>
-      val fieldToStringCode = castToStringCode(ft, ctx)
-      val field = ctx.freshVariable("field", ft)
-      val fieldStr = ctx.freshVariable("fieldStr", StringType)
-      val javaType = JavaCode.javaType(ft)
-      code"""
-         |${if (i != 0) code"""$buffer.append(",");""" else EmptyBlock}
-         |if ($row.isNullAt($i)) {

Review comment: When the actual value is null, for a primitive-type field `row.isNullAt(i)` returns true, but `row.getXXX` returns a default value. For example:

```scala
val r = new org.apache.spark.sql.catalyst.expressions.GenericInternalRow(Array(1, null))
println(r.getInt(0)) // 1
println(r.getInt(1)) // 0
println(r.isNullAt(1)) // true
```

So we can't only check `row.isNullAt(i)` here; we need to apply the same logic as `BoundReference.doGenCode()` and add a nullable check.
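The nullable check mentioned in the review comment could look roughly like the sketch below. It is not the code from the PR and does not use Spark's codegen classes: `Field` is a hypothetical stand-in for `StructField`, the generated Java is built as a plain string, and appending the literal `"null"` for a null field is an assumption made only for illustration. The point it shows is that the `isNullAt` branch is emitted only when the field is declared nullable.

```scala
object StructFieldCodegenSketch {
  // Hypothetical stand-in for Spark's StructField; only what the sketch needs.
  final case class Field(name: String, getter: String, nullable: Boolean)

  // Returns Java source text that appends field `i` of `row` to `buffer`.
  // The isNullAt guard is generated only for nullable fields.
  def fieldToStringCode(f: Field, i: Int, row: String, buffer: String): String = {
    val read = s"$buffer.append(String.valueOf($row.${f.getter}($i)));"
    if (f.nullable) {
      // Assumption: a null field is rendered as the text "null".
      s"""if ($row.isNullAt($i)) {
         |  $buffer.append("null");
         |} else {
         |  $read
         |}""".stripMargin
    } else {
      read
    }
  }
}
```

Calling `fieldToStringCode(Field("a", "getInt", nullable = false), 0, "row", "buf")` yields a single `append` statement with no null branch, while a nullable field gets the guarded form.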
[GitHub] [spark] SparkQA commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre
SparkQA commented on pull request #29326: URL: https://github.com/apache/spark/pull/29326#issuecomment-871110977 **[Test build #140430 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140430/testReport)** for PR 29326 at commit [`4e6da9c`](https://github.com/apache/spark/commit/4e6da9c6e730e5564c33012c2a1d72ac7c383cda).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33105: [SPARK-35908][SQL] Remove repartition if the child maximum number of rows less than or equal to 1
AmplabJenkins removed a comment on pull request #33105: URL: https://github.com/apache/spark/pull/33105#issuecomment-870942164 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140363/
[GitHub] [spark] SparkQA commented on pull request #33105: [SPARK-35908][SQL] Remove repartition if the child maximum number of rows less than or equal to 1
SparkQA commented on pull request #33105: URL: https://github.com/apache/spark/pull/33105#issuecomment-871110491 **[Test build #140429 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140429/testReport)** for PR 33105 at commit [`ae4531b`](https://github.com/apache/spark/commit/ae4531bdfd4cff5a9bcb87c7cb7cd649b7c21986).
[GitHub] [spark] SparkQA commented on pull request #33137: [SPARK-35935][SQL] Prevent failure of `MSCK REPAIR TABLE` on table refreshing
SparkQA commented on pull request #33137: URL: https://github.com/apache/spark/pull/33137#issuecomment-871110470 **[Test build #140428 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140428/testReport)** for PR 33137 at commit [`85954ae`](https://github.com/apache/spark/commit/85954aef15ec30a14c4b4ec762d32b85d69bd133).
[GitHub] [spark] dongjoon-hyun commented on pull request #33133: [SPARK-35930][BUILD] Upgrade kinesis-client to 1.14.4
dongjoon-hyun commented on pull request #33133: URL: https://github.com/apache/spark/pull/33133#issuecomment-871110316 Since SPARK-34549 is reverted, I'll close this PR, @sarutak. Thank you for checking. After the Guava issue is resolved, feel free to reopen this.
[GitHub] [spark] dongjoon-hyun closed pull request #33133: [SPARK-35930][BUILD] Upgrade kinesis-client to 1.14.4
dongjoon-hyun closed pull request #33133: URL: https://github.com/apache/spark/pull/33133
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33147: [SPARK-35946][PYTHON] Respect Py4J server in InheritableThread API
AmplabJenkins removed a comment on pull request #33147: URL: https://github.com/apache/spark/pull/33147#issuecomment-871109578 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44940/
[GitHub] [spark] AmplabJenkins commented on pull request #33147: [SPARK-35946][PYTHON] Respect Py4J server in InheritableThread API
AmplabJenkins commented on pull request #33147: URL: https://github.com/apache/spark/pull/33147#issuecomment-871109578 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44940/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
AmplabJenkins removed a comment on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-871108990
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33140: [SPARK-35881][SQL] Add support for columnar execution of final query stage in AdaptiveSparkPlanExec
AmplabJenkins removed a comment on pull request #33140: URL: https://github.com/apache/spark/pull/33140#issuecomment-871071885
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32850: [SPARK-34920][CORE][SQL] Add error classes with SQLSTATE
AmplabJenkins removed a comment on pull request #32850: URL: https://github.com/apache/spark/pull/32850#issuecomment-871108995
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33147: [SPARK-35946][PYTHON] Respect Py4J server in InheritableThread API
AmplabJenkins removed a comment on pull request #33147: URL: https://github.com/apache/spark/pull/33147#issuecomment-871108992 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140425/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33091: [SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress
AmplabJenkins removed a comment on pull request #33091: URL: https://github.com/apache/spark/pull/33091#issuecomment-871108991
[GitHub] [spark] AmplabJenkins commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
AmplabJenkins commented on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-871108990
[GitHub] [spark] AmplabJenkins commented on pull request #33140: [SPARK-35881][SQL] Add support for columnar execution of final query stage in AdaptiveSparkPlanExec
AmplabJenkins commented on pull request #33140: URL: https://github.com/apache/spark/pull/33140#issuecomment-871108994 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44931/
[GitHub] [spark] AmplabJenkins commented on pull request #33147: [SPARK-35946][PYTHON] Respect Py4J server in InheritableThread API
AmplabJenkins commented on pull request #33147: URL: https://github.com/apache/spark/pull/33147#issuecomment-871108992 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140425/
[GitHub] [spark] AmplabJenkins commented on pull request #32850: [SPARK-34920][CORE][SQL] Add error classes with SQLSTATE
AmplabJenkins commented on pull request #32850: URL: https://github.com/apache/spark/pull/32850#issuecomment-871108997
[GitHub] [spark] AmplabJenkins commented on pull request #33091: [SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress
AmplabJenkins commented on pull request #33091: URL: https://github.com/apache/spark/pull/33091#issuecomment-871109000
[GitHub] [spark] SparkQA commented on pull request #33091: [SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress
SparkQA commented on pull request #33091: URL: https://github.com/apache/spark/pull/33091#issuecomment-871107428 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44936/
[GitHub] [spark] SparkQA commented on pull request #33138: [SPARK-35937][SQL] Extracting date field from timestamp should work in ANSI mode
SparkQA commented on pull request #33138: URL: https://github.com/apache/spark/pull/33138#issuecomment-871106259 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44938/
[GitHub] [spark] SparkQA commented on pull request #33148: [SPARK-33298][CORE][FOLLOWUP] Add Unstable annotation to `FileCommitProtocol`
SparkQA commented on pull request #33148: URL: https://github.com/apache/spark/pull/33148#issuecomment-871106119 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44939/
[GitHub] [spark] SparkQA commented on pull request #32850: [SPARK-34920][CORE][SQL] Add error classes with SQLSTATE
SparkQA commented on pull request #32850: URL: https://github.com/apache/spark/pull/32850#issuecomment-871105704 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44937/
[GitHub] [spark] wangyum commented on a change in pull request #33105: [SPARK-35908][SQL] Remove repartition if the child maximum number of rows less than or equal to 1
wangyum commented on a change in pull request #33105: URL: https://github.com/apache/spark/pull/33105#discussion_r661139555

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ##

@@ -24,7 +24,7 @@ import org.apache.spark.sql.catalyst.catalog.{InMemoryCatalog, SessionCatalog}
 import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.expressions.aggregate._
 import org.apache.spark.sql.catalyst.plans._
-import org.apache.spark.sql.catalyst.plans.logical._
+import org.apache.spark.sql.catalyst.plans.logical.{RepartitionOperation, _}

Review comment: Removed it.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #33143: Revert "[SPARK-33995][SQL] Expose make_interval as a Scala function"
HyukjinKwon commented on a change in pull request #33143: URL: https://github.com/apache/spark/pull/33143#discussion_r661139243

## File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ##

@@ -2928,31 +2928,6 @@ object functions {
   // DateTime functions
   //

-  /**
-   * (Scala-specific) Creates a datetime interval
-   *
-   * @param years Number of years
-   * @param months Number of months
-   * @param weeks Number of weeks
-   * @param days Number of days
-   * @param hours Number of hours
-   * @param mins Number of mins
-   * @param secs Number of secs
-   * @return A datetime interval
-   * @group datetime_funcs
-   * @since 3.2.0
-   */
-  def make_interval(

Review comment: okie then i am good
[GitHub] [spark] viirya commented on pull request #33142: [SPARK-35940][SQL] Refactor EquivalentExpressions to make it more efficient
viirya commented on pull request #33142: URL: https://github.com/apache/spark/pull/33142#issuecomment-871104576

> Can you briefly introduce your idea? Sorting by height is stable and fast now.

Basically, the steps are:

1. Propagate the `SubExprEliminationState` map for all subexprs (they do not need to be sorted). Only create the value and isNull variables; don't do codegen yet.
2. Iterate over all subexprs to do codegen. Because expression codegen looks at the map to replace subexprs, any subexpr among the children will be replaced and chained, so we don't need to sort subexprs in advance.
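The two steps above can be sketched on a toy expression tree. This is only an illustration under stated assumptions, not the code in the PR: `Expr`, `Leaf`, `Add`, `assignVars`, and `genAll` are hypothetical names, the "state" is reduced to a variable name per subexpression, and generated code is plain text. It shows why the input order does not matter for the replacement: every variable exists before any code is generated, so a child that is itself a common subexpression is always found in the map.

```scala
object SubExprCodegenSketch {
  // Toy expression tree; case-class equality makes equal subtrees equal map keys.
  sealed trait Expr
  final case class Leaf(name: String) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr

  // Step 1: give every common subexpression a variable name. No codegen yet.
  def assignVars(subExprs: Seq[Expr]): Map[Expr, String] =
    subExprs.zipWithIndex.map { case (e, i) => e -> s"subExpr_$i" }.toMap

  // A child is replaced by its variable if it is a known subexpression.
  private def ref(e: Expr, vars: Map[Expr, String]): String =
    vars.getOrElse(e, gen(e, vars))

  // Generates code for the body of `e`, consulting the map for its children.
  def gen(e: Expr, vars: Map[Expr, String]): String = e match {
    case Leaf(n)   => n
    case Add(l, r) => s"(${ref(l, vars)} + ${ref(r, vars)})"
  }

  // Step 2: iterate over the subexpressions in whatever order they arrive.
  def genAll(subExprs: Seq[Expr]): Seq[String] = {
    val vars = assignVars(subExprs)
    subExprs.map(e => s"int ${vars(e)} = ${gen(e, vars)};")
  }
}
```

With `a = Add(Leaf("x"), Leaf("y"))` and `b = Add(a, Leaf("z"))`, `genAll(Seq(b, a))` produces `int subExpr_0 = (subExpr_1 + z);` followed by `int subExpr_1 = (x + y);`. The sketch covers only the replacement of children by variables; ordering the emitted statements so that each variable is evaluated before it is used is outside its scope.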
[GitHub] [spark] HyukjinKwon edited a comment on pull request #33147: [SPARK-35946][PYTHON] Respect Py4J server in InheritableThread API
HyukjinKwon edited a comment on pull request #33147: URL: https://github.com/apache/spark/pull/33147#issuecomment-871104226

Thank you @dongjoon-hyun !!

> Since this is only at util.py, we are still able to turn off the pinned mode by PYSPARK_PIN_THREAD=false via java_gateway and Py4JServer.scala, right?

Yes. This fix is for the case where other projects (e.g., Zeppelin) create the Java Gateway by themselves (and set it to `SparkContext`). In this case, the `PYSPARK_PIN_THREAD` env won't be respected. (server side)

I fixed `util.py` to respect the created Java Gateway (instead of the `PYSPARK_PIN_THREAD` environment variable) so it won't cause breakage in that case. (client side)
[GitHub] [spark] HyukjinKwon commented on pull request #33147: [SPARK-35946][PYTHON] Respect Py4J server in InheritableThread API
HyukjinKwon commented on pull request #33147: URL: https://github.com/apache/spark/pull/33147#issuecomment-871104226

Thank you @dongjoon-hyun !!

> Since this is only at util.py, we are still able to turn off the pinned mode by PYSPARK_PIN_THREAD=false via java_gateway and Py4JServer.scala, right?

Yes. This fix is for the case where other projects (e.g., Zeppelin) create the Java Gateway by themselves (and set it to `SparkContext`). In this case, the `PYSPARK_PIN_THREAD` env won't be respected. (server side)

I fixed `util.py` to respect the created Java Gateway so it won't cause breakage in that case. (client side)
[GitHub] [spark] dongjoon-hyun commented on pull request #33147: [SPARK-35946][PYTHON] Respect Py4J server in InheritableThread API
dongjoon-hyun commented on pull request #33147: URL: https://github.com/apache/spark/pull/33147#issuecomment-871104019 Thank you, @HyukjinKwon and @WeichenXu123 . Merged to master.
[GitHub] [spark] dongjoon-hyun closed pull request #33147: [SPARK-35946][PYTHON] Respect Py4J server in InheritableThread API
dongjoon-hyun closed pull request #33147: URL: https://github.com/apache/spark/pull/33147
[GitHub] [spark] viirya closed pull request #32980: [SPARK-35829][SQL] Clean up evaluates subexpressions and add more flexibility to evaluate particular subexpressoin
viirya closed pull request #32980: URL: https://github.com/apache/spark/pull/32980
[GitHub] [spark] viirya commented on pull request #32980: [SPARK-35829][SQL] Clean up evaluates subexpressions and add more flexibility to evaluate particular subexpressoin
viirya commented on pull request #32980: URL: https://github.com/apache/spark/pull/32980#issuecomment-871102030 Thanks for review! Merging to master.
[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre
viirya commented on pull request #29326: URL: https://github.com/apache/spark/pull/29326#issuecomment-871101530 retest this please
[GitHub] [spark] SparkQA commented on pull request #33091: [SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress
SparkQA commented on pull request #33091: URL: https://github.com/apache/spark/pull/33091#issuecomment-871100812 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44932/
[GitHub] [spark] SparkQA commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA commented on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-871100625 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44933/
[GitHub] [spark] SparkQA removed a comment on pull request #33147: [SPARK-35946][PYTHON] Respect Py4J server in InheritableThread API
SparkQA removed a comment on pull request #33147: URL: https://github.com/apache/spark/pull/33147#issuecomment-871088117 **[Test build #140425 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140425/testReport)** for PR 33147 at commit [`27199ac`](https://github.com/apache/spark/commit/27199acb240d16311a44cc953dbe1098fd9340bf).
[GitHub] [spark] SparkQA commented on pull request #33147: [SPARK-35946][PYTHON] Respect Py4J server in InheritableThread API
SparkQA commented on pull request #33147: URL: https://github.com/apache/spark/pull/33147#issuecomment-871099454 **[Test build #140425 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140425/testReport)** for PR 33147 at commit [`27199ac`](https://github.com/apache/spark/commit/27199acb240d16311a44cc953dbe1098fd9340bf).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA removed a comment on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-871051176 **[Test build #140418 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140418/testReport)** for PR 33028 at commit [`54ed99e`](https://github.com/apache/spark/commit/54ed99e5780c36c834d2a76b34adb80e01ce7d4a).
[GitHub] [spark] SparkQA removed a comment on pull request #32850: [SPARK-34920][CORE][SQL] Add error classes with SQLSTATE
SparkQA removed a comment on pull request #32850: URL: https://github.com/apache/spark/pull/32850#issuecomment-871051277 **[Test build #140419 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140419/testReport)** for PR 32850 at commit [`d73bb83`](https://github.com/apache/spark/commit/d73bb83fa61ad551f634018f6e8cbf3d2b45842c).
[GitHub] [spark] SparkQA commented on pull request #32850: [SPARK-34920][CORE][SQL] Add error classes with SQLSTATE
SparkQA commented on pull request #32850: URL: https://github.com/apache/spark/pull/32850#issuecomment-871098737 **[Test build #140419 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140419/testReport)** for PR 32850 at commit [`d73bb83`](https://github.com/apache/spark/commit/d73bb83fa61ad551f634018f6e8cbf3d2b45842c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA commented on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-871098634 **[Test build #140418 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140418/testReport)** for PR 33028 at commit [`54ed99e`](https://github.com/apache/spark/commit/54ed99e5780c36c834d2a76b34adb80e01ce7d4a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #32850: [SPARK-34920][CORE][SQL] Add error classes with SQLSTATE
SparkQA commented on pull request #32850: URL: https://github.com/apache/spark/pull/32850#issuecomment-871098074 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44934/
[GitHub] [spark] MaxGekk commented on a change in pull request #33143: Revert "[SPARK-33995][SQL] Expose make_interval as a Scala function"
MaxGekk commented on a change in pull request #33143: URL: https://github.com/apache/spark/pull/33143#discussion_r661131121
## File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ##
@@ -2928,31 +2928,6 @@ object functions {
   // DateTime functions
   //
-  /**
-   * (Scala-specific) Creates a datetime interval
-   *
-   * @param years Number of years
-   * @param months Number of months
-   * @param weeks Number of weeks
-   * @param days Number of days
-   * @param hours Number of hours
-   * @param mins Number of mins
-   * @param secs Number of secs
-   * @return A datetime interval
-   * @group datetime_funcs
-   * @since 3.2.0
-   */
-  def make_interval(
Review comment:
> do we have make_ym_interval and make_dt_interval now?
@cloud-fan As SQL functions, see https://github.com/apache/spark/pull/32645 and https://github.com/apache/spark/pull/32601 . The functions haven't been exposed to Scala/Python/R APIs yet.
> Will we deprecate and remove CalendarIntervalType?
@HyukjinKwon Yes, I hope we will. The only question is when.
[GitHub] [spark] SparkQA removed a comment on pull request #32883: [SPARK-35725][SQL] Support optimize skewed partitions in RebalancePartitions
SparkQA removed a comment on pull request #32883: URL: https://github.com/apache/spark/pull/32883#issuecomment-871091753 **[Test build #140427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140427/testReport)** for PR 32883 at commit [`409688d`](https://github.com/apache/spark/commit/409688db1ee4d5b82fb4ec7920594b5338ba9c45).
[GitHub] [spark] SparkQA commented on pull request #33140: [SPARK-35881][SQL] Add support for columnar execution of final query stage in AdaptiveSparkPlanExec
SparkQA commented on pull request #33140: URL: https://github.com/apache/spark/pull/33140#issuecomment-871095284 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44931/
[GitHub] [spark] SparkQA commented on pull request #32883: [SPARK-35725][SQL] Support optimize skewed partitions in RebalancePartitions
SparkQA commented on pull request #32883: URL: https://github.com/apache/spark/pull/32883#issuecomment-871093680 **[Test build #140427 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140427/testReport)** for PR 32883 at commit [`409688d`](https://github.com/apache/spark/commit/409688db1ee4d5b82fb4ec7920594b5338ba9c45).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #32883: [SPARK-35725][SQL] Support optimize skewed partitions in RebalancePartitions
AmplabJenkins commented on pull request #32883: URL: https://github.com/apache/spark/pull/32883#issuecomment-871093708 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140427/
[GitHub] [spark] HyukjinKwon commented on a change in pull request #33143: Revert "[SPARK-33995][SQL] Expose make_interval as a Scala function"
HyukjinKwon commented on a change in pull request #33143: URL: https://github.com/apache/spark/pull/33143#discussion_r661128457
## File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ##
@@ -2928,31 +2928,6 @@ object functions {
   // DateTime functions
   //
-  /**
-   * (Scala-specific) Creates a datetime interval
-   *
-   * @param years Number of years
-   * @param months Number of months
-   * @param weeks Number of weeks
-   * @param days Number of days
-   * @param hours Number of hours
-   * @param mins Number of mins
-   * @param secs Number of secs
-   * @return A datetime interval
-   * @group datetime_funcs
-   * @since 3.2.0
-   */
-  def make_interval(
Review comment:
Will we deprecate and remove `CalendarIntervalType`?
[GitHub] [spark] SparkQA commented on pull request #32883: [SPARK-35725][SQL] Support optimize skewed partitions in RebalancePartitions
SparkQA commented on pull request #32883: URL: https://github.com/apache/spark/pull/32883#issuecomment-871091753 **[Test build #140427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140427/testReport)** for PR 32883 at commit [`409688d`](https://github.com/apache/spark/commit/409688db1ee4d5b82fb4ec7920594b5338ba9c45).
[GitHub] [spark] yaooqinn commented on a change in pull request #33143: Revert "[SPARK-33995][SQL] Expose make_interval as a Scala function"
yaooqinn commented on a change in pull request #33143: URL: https://github.com/apache/spark/pull/33143#discussion_r661125753
## File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ##
@@ -2928,31 +2928,6 @@ object functions {
   // DateTime functions
   //
-  /**
-   * (Scala-specific) Creates a datetime interval
-   *
-   * @param years Number of years
-   * @param months Number of months
-   * @param weeks Number of weeks
-   * @param days Number of days
-   * @param hours Number of hours
-   * @param mins Number of mins
-   * @param secs Number of secs
-   * @return A datetime interval
-   * @group datetime_funcs
-   * @since 3.2.0
-   */
-  def make_interval(
Review comment:
Is it better to remove it before a release? `@since 3.2.0`
[GitHub] [spark] sarutak commented on pull request #33145: Revert "[SPARK-34549][BUILD] Upgrade aws kinesis to 1.14.0 and java sdk 1.11.844"
sarutak commented on pull request #33145: URL: https://github.com/apache/spark/pull/33145#issuecomment-871090622 Let's try to upgrade again once Guava is successfully upgraded.
[GitHub] [spark] HeartSaVioR closed pull request #33091: [SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress
HeartSaVioR closed pull request #33091: URL: https://github.com/apache/spark/pull/33091
[GitHub] [spark] HeartSaVioR commented on pull request #33091: [SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress
HeartSaVioR commented on pull request #33091: URL: https://github.com/apache/spark/pull/33091#issuecomment-871089628 GA passed. Thanks! Merging to master.
[GitHub] [spark] cxzl25 commented on a change in pull request #33114: [SPARK-35913][SQL] Create hive permanent function with owner name
cxzl25 commented on a change in pull request #33114: URL: https://github.com/apache/spark/pull/33114#discussion_r661123735
## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ##
@@ -925,19 +925,19 @@ private[hive] class HiveClientImpl(
   }
   override def createFunction(db: String, func: CatalogFunction): Unit = withHiveState {
-    shim.createFunction(client, db, func)
+    shim.createFunction(client, db, func, userName)
   }
   override def dropFunction(db: String, name: String): Unit = withHiveState {
     shim.dropFunction(client, db, name)
   }
   override def renameFunction(db: String, oldName: String, newName: String): Unit = withHiveState {
-    shim.renameFunction(client, db, oldName, newName)
+    shim.renameFunction(client, db, oldName, newName, userName)
   }
   override def alterFunction(db: String, func: CatalogFunction): Unit = withHiveState {
-    shim.alterFunction(client, db, func)
+    shim.alterFunction(client, db, func, userName)
Review comment:
`renameFunction` can keep the original owner name, whereas with `alterFunction` the function definition has been modified (everything except the function name).
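One way to read the distinction drawn in this review comment is sketched below with a toy in-memory catalog. Everything here is hypothetical (`ToyCatalog`, `FunctionEntry`, and the method names are not Spark or Hive APIs), and the choice to record the calling user on alter but keep the existing owner on rename is an assumption about the intended semantics, not something the thread confirms.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FunctionEntry:
    name: str
    class_name: str
    owner: str


class ToyCatalog:
    """Hypothetical in-memory catalog used only to illustrate owner handling."""

    def __init__(self):
        self._functions = {}

    def create_function(self, name, class_name, user_name):
        # Creation records the calling user as the owner.
        self._functions[name] = FunctionEntry(name, class_name, user_name)

    def rename_function(self, old_name, new_name):
        # Only the name changes, so the existing owner is carried over.
        entry = self._functions.pop(old_name)
        self._functions[new_name] = replace(entry, name=new_name)

    def alter_function(self, name, class_name, user_name):
        # The definition changes; this sketch assumes the caller becomes owner.
        entry = self._functions[name]
        self._functions[name] = replace(entry, class_name=class_name, owner=user_name)

    def get(self, name):
        return self._functions[name]


catalog = ToyCatalog()
catalog.create_function("f", "com.example.A", "alice")
catalog.rename_function("f", "g")
catalog.alter_function("g", "com.example.B", "bob")
print(catalog.get("g"))
```

Under these assumptions a rename needs no user name at all, while create and alter both take one.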
[GitHub] [spark] SparkQA commented on pull request #33136: [SPARK-35932][SQL] Support extracting hour/minute/second from timestamp without time zone
SparkQA commented on pull request #33136: URL: https://github.com/apache/spark/pull/33136#issuecomment-871088160 **[Test build #140426 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140426/testReport)** for PR 33136 at commit [`0596534`](https://github.com/apache/spark/commit/05965340184900acddd043a7d453741dd148e9bb).
[GitHub] [spark] SparkQA commented on pull request #33147: [SPARK-35946][PYTHON] Respect Py4J server in InheritableThread API
SparkQA commented on pull request #33147: URL: https://github.com/apache/spark/pull/33147#issuecomment-871088117 **[Test build #140425 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140425/testReport)** for PR 33147 at commit [`27199ac`](https://github.com/apache/spark/commit/27199acb240d16311a44cc953dbe1098fd9340bf).
[GitHub] [spark] SparkQA commented on pull request #33148: [SPARK-33298][CORE][FOLLOWUP] Add Unstable annotation to `FileCommitProtocol`
SparkQA commented on pull request #33148: URL: https://github.com/apache/spark/pull/33148#issuecomment-871088106 **[Test build #140424 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140424/testReport)** for PR 33148 at commit [`d41ba14`](https://github.com/apache/spark/commit/d41ba14b7789e8d0806d248b199b983dfa4946ce).
[GitHub] [spark] SparkQA commented on pull request #33091: [SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress
SparkQA commented on pull request #33091: URL: https://github.com/apache/spark/pull/33091#issuecomment-871088025 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44932/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32883: [SPARK-35725][SQL] Support optimize skewed partitions in RebalancePartitions
AmplabJenkins removed a comment on pull request #32883: URL: https://github.com/apache/spark/pull/32883#issuecomment-871087817 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44929/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33093: [SPARK-35897][SS][WIP] Support user defined initial state with flatMapGroupsWithState in Structured Streaming
AmplabJenkins removed a comment on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-871087814 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44930/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32552: [SPARK-34819][SQL] MapType supports comparable semantics
AmplabJenkins removed a comment on pull request #32552: URL: https://github.com/apache/spark/pull/32552#issuecomment-871087818 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44926/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32365: [SPARK-35228][SQL] Add expression ToHiveString for keep consistent between hive/spark format in df.show
AmplabJenkins removed a comment on pull request #32365: URL: https://github.com/apache/spark/pull/32365#issuecomment-871087815
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33145: Revert "[SPARK-34549][BUILD] Upgrade aws kinesis to 1.14.0 and java sdk 1.11.844"
AmplabJenkins removed a comment on pull request #33145: URL: https://github.com/apache/spark/pull/33145#issuecomment-871087813 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140412/
[GitHub] [spark] AmplabJenkins commented on pull request #32552: [SPARK-34819][SQL] MapType supports comparable semantics
AmplabJenkins commented on pull request #32552: URL: https://github.com/apache/spark/pull/32552#issuecomment-871087818 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44926/
[GitHub] [spark] AmplabJenkins commented on pull request #32365: [SPARK-35228][SQL] Add expression ToHiveString for keep consistent between hive/spark format in df.show
AmplabJenkins commented on pull request #32365: URL: https://github.com/apache/spark/pull/32365#issuecomment-871087816
[GitHub] [spark] AmplabJenkins commented on pull request #33093: [SPARK-35897][SS][WIP] Support user defined initial state with flatMapGroupsWithState in Structured Streaming
AmplabJenkins commented on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-871087814 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44930/
[GitHub] [spark] AmplabJenkins commented on pull request #32883: [SPARK-35725][SQL] Support optimize skewed partitions in RebalancePartitions
AmplabJenkins commented on pull request #32883: URL: https://github.com/apache/spark/pull/32883#issuecomment-871087817 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44929/
[GitHub] [spark] AmplabJenkins commented on pull request #33145: Revert "[SPARK-34549][BUILD] Upgrade aws kinesis to 1.14.0 and java sdk 1.11.844"
AmplabJenkins commented on pull request #33145: URL: https://github.com/apache/spark/pull/33145#issuecomment-871087813 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140412/
[GitHub] [spark] SparkQA commented on pull request #32365: [SPARK-35228][SQL] Add expression ToHiveString for keep consistent between hive/spark format in df.show
SparkQA commented on pull request #32365: URL: https://github.com/apache/spark/pull/32365#issuecomment-871087620 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44935/
[GitHub] [spark] SparkQA commented on pull request #33028: [SPARK-35714][FOLLOW-UP][CORE] Use a shared stopping flag for WorkerWatcher to avoid the duplicate System.exit
SparkQA commented on pull request #33028: URL: https://github.com/apache/spark/pull/33028#issuecomment-871087512 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44933/
[GitHub] [spark] dongjoon-hyun commented on pull request #33131: [SPARK-35920][FOLLOWUP][BUILD] Fix Kryo Shaded dependency
dongjoon-hyun commented on pull request #33131: URL: https://github.com/apache/spark/pull/33131#issuecomment-871086897 Thank you for the confirmation, @gaoyajun02 !
[GitHub] [spark] SparkQA commented on pull request #32850: [SPARK-34920][CORE][SQL] Add error classes with SQLSTATE
SparkQA commented on pull request #32850: URL: https://github.com/apache/spark/pull/32850#issuecomment-871085966 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44934/
[GitHub] [spark] SparkQA removed a comment on pull request #33145: Revert "[SPARK-34549][BUILD] Upgrade aws kinesis to 1.14.0 and java sdk 1.11.844"
SparkQA removed a comment on pull request #33145: URL: https://github.com/apache/spark/pull/33145#issuecomment-871030849 **[Test build #140412 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140412/testReport)** for PR 33145 at commit [`ef6f752`](https://github.com/apache/spark/commit/ef6f752a597449c8c8a86192e120800fe9810f2e).
[GitHub] [spark] SparkQA commented on pull request #33145: Revert "[SPARK-34549][BUILD] Upgrade aws kinesis to 1.14.0 and java sdk 1.11.844"
SparkQA commented on pull request #33145: URL: https://github.com/apache/spark/pull/33145#issuecomment-871085470 **[Test build #140412 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140412/testReport)** for PR 33145 at commit [`ef6f752`](https://github.com/apache/spark/commit/ef6f752a597449c8c8a86192e120800fe9810f2e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #33140: [SPARK-35881][SQL] Add support for columnar execution of final query stage in AdaptiveSparkPlanExec
SparkQA commented on pull request #33140: URL: https://github.com/apache/spark/pull/33140#issuecomment-871084337 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44931/
[GitHub] [spark] SparkQA commented on pull request #32883: [SPARK-35725][SQL] Support optimize skewed partitions in RebalancePartitions
SparkQA commented on pull request #32883: URL: https://github.com/apache/spark/pull/32883#issuecomment-871084038 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44929/
[GitHub] [spark] SparkQA commented on pull request #33093: [SPARK-35897][SS][WIP] Support user defined initial state with flatMapGroupsWithState in Structured Streaming
SparkQA commented on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-871083639 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44930/
[GitHub] [spark] c21 commented on a change in pull request #33148: [SPARK-33298][CORE][FOLLOWUP] Add Unstable annotation to `FileCommitProtocol`
c21 commented on a change in pull request #33148: URL: https://github.com/apache/spark/pull/33148#discussion_r661118652 ## File path: core/src/main/scala/org/apache/spark/internal/io/FileCommitProtocol.scala ## @@ -41,7 +42,11 @@ import org.apache.spark.util.Utils *(or abortTask if task failed). * 3. When all necessary tasks completed successfully, the driver calls commitJob. If the job *failed to execute (e.g. too many failed tasks), the job should call abortJob. + * + * NOTE: this class is exposed as an API considering the usage of many downstream custom Review comment: sure, updated.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #33148: [SPARK-33298][CORE][FOLLOWUP] Add Unstable annotation to `FileCommitProtocol`
HyukjinKwon commented on a change in pull request #33148: URL: https://github.com/apache/spark/pull/33148#discussion_r661117977 ## File path: core/src/main/scala/org/apache/spark/internal/io/FileCommitProtocol.scala ## @@ -41,7 +42,11 @@ import org.apache.spark.util.Utils *(or abortTask if task failed). * 3. When all necessary tasks completed successfully, the driver calls commitJob. If the job *failed to execute (e.g. too many failed tasks), the job should call abortJob. + * + * NOTE: this class is exposed as an API considering the usage of many downstream custom Review comment: no big deal but I would use `@note` instead.
[GitHub] [spark] c21 commented on a change in pull request #33148: [SPARK-33298][CORE][FOLLOWUP] Add Unstable annotation to newly added methods of `FileCommitProtocol`
c21 commented on a change in pull request #33148: URL: https://github.com/apache/spark/pull/33148#discussion_r661117657 ## File path: core/src/main/scala/org/apache/spark/internal/io/FileCommitProtocol.scala ## @@ -107,10 +111,9 @@ abstract class FileCommitProtocol extends Logging { * if a task is going to write out multiple files to the same dir. The file commit protocol only * guarantees that files written by different tasks will not conflict. * - * This API should be implemented and called, instead of - * [[newTaskTempFile(taskContest, dir, ext)]]. Provide a default implementation here to be - * backward compatible with custom [[FileCommitProtocol]] implementations before Spark 3.2.0. + * @since 3.2.0 */ + @Unstable Review comment: @HyukjinKwon - ah I see, updated.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #33143: Revert "[SPARK-33995][SQL] Expose make_interval as a Scala function"
HyukjinKwon commented on a change in pull request #33143: URL: https://github.com/apache/spark/pull/33143#discussion_r661117394 ## File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ## @@ -2928,31 +2928,6 @@ object functions { // DateTime functions // - /** - * (Scala-specific) Creates a datetime interval - * - * @param years Number of years - * @param months Number of months - * @param weeks Number of weeks - * @param days Number of days - * @param hours Number of hours - * @param mins Number of mins - * @param secs Number of secs - * @return A datetime interval - * @group datetime_funcs - * @since 3.2.0 - */ - def make_interval( Review comment: I wouldn't remove until we decide to deprecate `CalendarIntervalType`?
[GitHub] [spark] HyukjinKwon commented on a change in pull request #33148: [SPARK-33298][CORE][FOLLOWUP] Add Unstable annotation to newly added methods of `FileCommitProtocol`
HyukjinKwon commented on a change in pull request #33148: URL: https://github.com/apache/spark/pull/33148#discussion_r661117144 ## File path: core/src/main/scala/org/apache/spark/internal/io/FileCommitProtocol.scala ## @@ -107,10 +111,9 @@ abstract class FileCommitProtocol extends Logging { * if a task is going to write out multiple files to the same dir. The file commit protocol only * guarantees that files written by different tasks will not conflict. * - * This API should be implemented and called, instead of - * [[newTaskTempFile(taskContest, dir, ext)]]. Provide a default implementation here to be - * backward compatible with custom [[FileCommitProtocol]] implementations before Spark 3.2.0. + * @since 3.2.0 */ + @Unstable Review comment: @c21 can we mark this whole class as `@Unstable`?
[GitHub] [spark] c21 commented on pull request #33148: [SPARK-33298][CORE][FOLLOWUP] Add Unstable annotation to newly added methods of `FileCommitProtocol`
c21 commented on pull request #33148: URL: https://github.com/apache/spark/pull/33148#issuecomment-871081348 cc @HyukjinKwon and @cloud-fan please take a look when you have time, thanks.
[GitHub] [spark] c21 opened a new pull request #33148: [SPARK-33298][CORE][FOLLOWUP] Add Unstable annotation to newly added methods of `FileCommitProtocol`
c21 opened a new pull request #33148: URL: https://github.com/apache/spark/pull/33148

### What changes were proposed in this pull request?

This is the followup from https://github.com/apache/spark/pull/33012#discussion_r659440833, where we want to add `@Unstable` to newly added methods of `FileCommitProtocol`, to give people a better idea of API.

### Why are the changes needed?

Make it easier for people to follow and understand code. Clean up code.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing unit tests, as no real logic change.
[GitHub] [spark] yaooqinn commented on a change in pull request #33114: [SPARK-35913][SQL] Create hive permanent function with owner name
yaooqinn commented on a change in pull request #33114: URL: https://github.com/apache/spark/pull/33114#discussion_r661116166 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -925,19 +925,19 @@ private[hive] class HiveClientImpl( } override def createFunction(db: String, func: CatalogFunction): Unit = withHiveState { -shim.createFunction(client, db, func) +shim.createFunction(client, db, func, userName) } override def dropFunction(db: String, name: String): Unit = withHiveState { shim.dropFunction(client, db, name) } override def renameFunction(db: String, oldName: String, newName: String): Unit = withHiveState { -shim.renameFunction(client, db, oldName, newName) +shim.renameFunction(client, db, oldName, newName, userName) } override def alterFunction(db: String, func: CatalogFunction): Unit = withHiveState { -shim.alterFunction(client, db, func) +shim.alterFunction(client, db, func, userName) Review comment: we shall keep the original owner name
[GitHub] [spark] cloud-fan commented on a change in pull request #33143: Revert "[SPARK-33995][SQL] Expose make_interval as a Scala function"
cloud-fan commented on a change in pull request #33143: URL: https://github.com/apache/spark/pull/33143#discussion_r661116046 ## File path: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ## @@ -2928,31 +2928,6 @@ object functions { // DateTime functions // - /** - * (Scala-specific) Creates a datetime interval - * - * @param years Number of years - * @param months Number of months - * @param weeks Number of weeks - * @param days Number of days - * @param hours Number of hours - * @param mins Number of mins - * @param secs Number of secs - * @return A datetime interval - * @group datetime_funcs - * @since 3.2.0 - */ - def make_interval( Review comment: do we have `make_ym_interval` and `make_dt_interval` now?
[GitHub] [spark] MrPowers commented on pull request #33143: Revert "[SPARK-33995][SQL] Expose make_interval as a Scala function"
MrPowers commented on pull request #33143: URL: https://github.com/apache/spark/pull/33143#issuecomment-871080235 Thanks for creating something better @MaxGekk and thanks for the ping @dongjoon-hyun.
[GitHub] [spark] HyukjinKwon commented on pull request #33147: [SPARK-35946][PYTHON] Respect Py4J server if InheritableThread API
HyukjinKwon commented on pull request #33147: URL: https://github.com/apache/spark/pull/33147#issuecomment-871077382 @WeichenXu123 can you take a quick look please?
[GitHub] [spark] c21 commented on a change in pull request #33012: [SPARK-33298][CORE] Introduce new API to FileCommitProtocol allow flexible file naming
c21 commented on a change in pull request #33012: URL: https://github.com/apache/spark/pull/33012#discussion_r661112797 ## File path: core/src/main/scala/org/apache/spark/internal/io/FileCommitProtocol.scala ## @@ -92,6 +92,35 @@ abstract class FileCommitProtocol extends Logging { */ def newTaskTempFile(taskContext: TaskAttemptContext, dir: Option[String], ext: String): String + /** + * Notifies the commit protocol to add a new file, and gets back the full path that should be + * used. Must be called on the executors when running tasks. + * + * Note that the returned temp file may have an arbitrary path. The commit protocol only + * promises that the file will be at the location specified by the arguments after job commit. + * + * The "dir" parameter specifies the sub-directory within the base path, used to specify + * partitioning. The "spec" parameter specifies the file name. The rest are left to the commit + * protocol implementation to decide. + * + * Important: it is the caller's responsibility to add uniquely identifying content to "spec" + * if a task is going to write out multiple files to the same dir. The file commit protocol only + * guarantees that files written by different tasks will not conflict. + * + * This API should be implemented and called, instead of + * [[newTaskTempFile(taskContest, dir, ext)]]. Provide a default implementation here to be + * backward compatible with custom [[FileCommitProtocol]] implementations before Spark 3.2.0. Review comment: Sounds good. Let me create a PR now.
[GitHub] [spark] HyukjinKwon opened a new pull request #33147: [SPARK-35946][PYTHON] Respect Py4J server if InheritableThread API
HyukjinKwon opened a new pull request #33147: URL: https://github.com/apache/spark/pull/33147

### What changes were proposed in this pull request?

Currently, we set the environment variable `PYSPARK_PIN_THREAD` at the client side of the `InheritableThread` API for Py4J (`python/pyspark/util.py`). If the Py4J gateway is created somewhere else (e.g., Zeppelin), it could introduce a breakage at:

```python
from pyspark import SparkContext

jvm = SparkContext._jvm
thread_connection = jvm._gateway_client.get_thread_connection()
# `AttributeError: 'GatewayClient' object has no attribute 'get_thread_connection'` (non-pinned thread mode)
# `get_thread_connection` is only in 'ClientServer' (pinned thread mode)
```

This PR proposes to check the gateway that was actually created and apply the pinned thread mode behaviour accordingly, so we can avoid any breakage when the Py4J server/gateway is created separately, somewhere else, without pinned thread mode.

### Why are the changes needed?

To avoid any potential breakage.

### Does this PR introduce _any_ user-facing change?

No, the change happened only in the master (https://github.com/apache/spark/commit/fdd7ca5f4e35a906090f3c6b160bdba9ac9fd0ca).

### How was this patch tested?

This is actually a partial revert of https://github.com/apache/spark/commit/fdd7ca5f4e35a906090f3c6b160bdba9ac9fd0ca. As long as the existing tests pass, I guess we're all good. It is also difficult to test.
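The check proposed in #33147 might look roughly like the sketch below. This is not the code from the PR: `FakeGatewayClient`, `FakeClientServer`, `supports_pinned_thread`, and `thread_connection_or_none` are hypothetical stand-ins so that the example runs without Py4J or a JVM. The only thing taken from the PR description is that `get_thread_connection` exists on the pinned thread mode client and not on the non-pinned one.

```python
class FakeGatewayClient:
    """Hypothetical stand-in for a Py4J client in non-pinned thread mode."""


class FakeClientServer:
    """Hypothetical stand-in for a Py4J client in pinned thread mode."""

    def get_thread_connection(self):
        return "connection-for-current-thread"


def supports_pinned_thread(gateway_client):
    # Inspect the client object that was actually created, rather than
    # trusting an environment variable that another process may not honour.
    return callable(getattr(gateway_client, "get_thread_connection", None))


def thread_connection_or_none(gateway_client):
    # Only touch the per-thread connection when the client provides one;
    # otherwise fall back to non-pinned behaviour (represented here by None).
    if supports_pinned_thread(gateway_client):
        return gateway_client.get_thread_connection()
    return None
```

With a guard of this kind, a gateway created elsewhere without pinned thread mode takes the fallback path instead of raising `AttributeError`.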
[GitHub] [spark] SparkQA removed a comment on pull request #32365: [SPARK-35228][SQL] Add expression ToHiveString for keep consistent between hive/spark format in df.show
SparkQA removed a comment on pull request #32365: URL: https://github.com/apache/spark/pull/32365#issuecomment-871052721 **[Test build #140420 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140420/testReport)** for PR 32365 at commit [`65bf300`](https://github.com/apache/spark/commit/65bf30089e475098c16349ca45a08003badaa049).
[GitHub] [spark] SparkQA commented on pull request #32552: [SPARK-34819][SQL] MapType supports comparable semantics
SparkQA commented on pull request #32552: URL: https://github.com/apache/spark/pull/32552#issuecomment-871076683 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44926/