[GitHub] [spark] AmplabJenkins removed a comment on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
AmplabJenkins removed a comment on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873526983 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140616/ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
AmplabJenkins commented on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873526983 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140616/
[GitHub] [spark] SparkQA removed a comment on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
SparkQA removed a comment on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873504243 **[Test build #140616 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140616/testReport)** for PR 33203 at commit [`ba54bd2`](https://github.com/apache/spark/commit/ba54bd2118f0660108c1bab6235b47593da4b206).
[GitHub] [spark] SparkQA commented on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
SparkQA commented on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873524618 **[Test build #140616 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140616/testReport)** for PR 33203 at commit [`ba54bd2`](https://github.com/apache/spark/commit/ba54bd2118f0660108c1bab6235b47593da4b206).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
AmplabJenkins removed a comment on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873521212
[GitHub] [spark] AmplabJenkins commented on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
AmplabJenkins commented on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873521212
[GitHub] [spark] SparkQA removed a comment on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
SparkQA removed a comment on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873499056 **[Test build #140615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140615/testReport)** for PR 33203 at commit [`e635b8e`](https://github.com/apache/spark/commit/e635b8e860b72d256c2a2c81fec015cca5745554).
[GitHub] [spark] SparkQA commented on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
SparkQA commented on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873518711 **[Test build #140615 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140615/testReport)** for PR 33203 at commit [`e635b8e`](https://github.com/apache/spark/commit/e635b8e860b72d256c2a2c81fec015cca5745554).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
SparkQA commented on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873517385 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45129/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
AmplabJenkins removed a comment on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873515749 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45128/
[GitHub] [spark] AmplabJenkins commented on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
AmplabJenkins commented on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873515749 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45128/
[GitHub] [spark] SparkQA commented on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
SparkQA commented on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873509186 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45129/
[GitHub] [spark] srowen commented on pull request #33199: [SPARK-36004][INFRA] Update MiMa and audit API changes
srowen commented on pull request #33199: URL: https://github.com/apache/spark/pull/33199#issuecomment-873506320 I think the current state is that this passes only because some exclusions have to be retained that seem unintended: they were excluded for 3.0 - 3.1, not 3.1 - 3.2, but we were still using them. So I don't know if we should merge it; we could, but we need to verify that the exclusions are legitimate again, or else undo the unintended API change.
[GitHub] [spark] SparkQA commented on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
SparkQA commented on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873505692 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45128/
[GitHub] [spark] SparkQA commented on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
SparkQA commented on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873504243 **[Test build #140616 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140616/testReport)** for PR 33203 at commit [`ba54bd2`](https://github.com/apache/spark/commit/ba54bd2118f0660108c1bab6235b47593da4b206).
[GitHub] [spark] SparkQA commented on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
SparkQA commented on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873502895 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45128/
[GitHub] [spark] HyukjinKwon commented on a change in pull request #32081: [SPARK-34674][CORE][K8S] Close SparkContext after the Main method has finished
HyukjinKwon commented on a change in pull request #32081: URL: https://github.com/apache/spark/pull/32081#discussion_r663434317
## File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ##
@@ -952,6 +952,12 @@ private[spark] class SparkSubmit extends Logging {
     } catch {
       case t: Throwable =>
         throw findCause(t)
+    } finally {
+      try {
+        SparkContext.getActive.foreach(_.stop())
Review comment: cc @sunpe too FYI
[GitHub] [spark] HyukjinKwon commented on a change in pull request #32081: [SPARK-34674][CORE][K8S] Close SparkContext after the Main method has finished
HyukjinKwon commented on a change in pull request #32081: URL: https://github.com/apache/spark/pull/32081#discussion_r663434289
## File path: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ##
@@ -952,6 +952,12 @@ private[spark] class SparkSubmit extends Logging {
     } catch {
       case t: Throwable =>
         throw findCause(t)
+    } finally {
+      try {
+        SparkContext.getActive.foreach(_.stop())
Review comment: Just reading this and https://github.com/apache/spark/pull/33154, shouldn't we enable this only w/ Kubernetes (and also when it's not a Thriftserver, shell, etc.)? Also, I think we might have to add some comments on that.
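The guard suggested in the review comment above could look roughly like the following. This is a minimal Java sketch, not the SparkSubmit code (which is Scala): `StopPolicy`, `shouldStopContext`, and the `isLongRunningService` flag are hypothetical names introduced for illustration. The only fact relied on is that Kubernetes master URLs start with `k8s://`.

```java
// Hypothetical helper illustrating the reviewer's suggestion; not part of Spark.
final class StopPolicy {
    private StopPolicy() {}

    /**
     * Returns true when the submitter should stop the active context after main() returns.
     * Assumption: Kubernetes masters are identified by the "k8s://" URL prefix.
     *
     * @param master               the master URL passed to spark-submit (may be null)
     * @param isLongRunningService true for a Thrift server, a shell, and similar
     *                             (hypothetical flag, not an existing Spark option)
     */
    static boolean shouldStopContext(String master, boolean isLongRunningService) {
        boolean onKubernetes = master != null && master.startsWith("k8s://");
        return onKubernetes && !isLongRunningService;
    }

    public static void main(String[] args) {
        System.out.println(shouldStopContext("k8s://https://example:6443", false));
        System.out.println(shouldStopContext("k8s://https://example:6443", true));
        System.out.println(shouldStopContext("local[*]", false));
    }
}
```

Keeping the decision in one small predicate would also give a natural place for the explanatory comment the reviewer asks for.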
[GitHub] [spark] pingsutw commented on a change in pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
pingsutw commented on a change in pull request #33203: URL: https://github.com/apache/spark/pull/33203#discussion_r663433955
## File path: .github/workflows/benchmark.yml ##
@@ -85,6 +85,7 @@ jobs:
         # In benchmark, we use local as master so set driver memory only. Note that GitHub Actions has 7 GB memory limit.
         bin/spark-submit \
Review comment: Thanks for the review. Updated it and retriggered the benchmark workflow in my fork. https://github.com/pingsutw/spark/actions/runs/997492502
[GitHub] [spark] HyukjinKwon commented on a change in pull request #33154: [SPARK-35949][CORE]Add `is-server` arg for to prevent closing spark context when starting as a server.
HyukjinKwon commented on a change in pull request #33154: URL: https://github.com/apache/spark/pull/33154#discussion_r663433603
## File path: python/pyspark/pandas/tests/test_stats.py ##
@@ -283,7 +283,7 @@ def test_cov_corr_meta(self):
             index=pd.Index([1, 2, 3], name="myindex"),
         )
         psdf = ps.from_pandas(pdf)
-        self.assert_eq(psdf.corr(), pdf.corr())
+        self.assert_eq(psdf.corr(), pdf.corr(), check_exact=False)
Review comment: Looks like it was mistakenly cherry-picked together.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #33188: [SPARK-35989][SQL] Do not remove REPARTITION_BY_NUM shuffle if AQE is enabled
HyukjinKwon commented on a change in pull request #33188: URL: https://github.com/apache/spark/pull/33188#discussion_r663433014
## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/exchange/EnsureRequirementsSuite.scala ##
@@ -133,4 +134,27 @@ class EnsureRequirementsSuite extends SharedSparkSession {
       }.size == 2)
     }
   }
+
+  test("SPARK-35989: Do not remove REPARTITION_BY_NUM shuffle if AQE is enabled") {
+    import testImplicits._
+    Seq(true, false).foreach { enableAqe =>
+      withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> enableAqe.toString,
+        SQLConf.SHUFFLE_PARTITIONS.key -> "3",
+        SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
+        val df1 = Seq((1, 2)).toDF("c1", "c2")
+        val df2 = Seq((1, 3)).toDF("c3", "c4")
+        val res = df1.join(df2, $"c1" === $"c3").repartition(3, $"c1")
+        res.collect()
Review comment: Why should we call collect?
[GitHub] [spark] HyukjinKwon commented on a change in pull request #33188: [SPARK-35989][SQL] Do not remove REPARTITION_BY_NUM shuffle if AQE is enabled
HyukjinKwon commented on a change in pull request #33188: URL: https://github.com/apache/spark/pull/33188#discussion_r663432910
## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/exchange/EnsureRequirementsSuite.scala ##
@@ -133,4 +134,27 @@ class EnsureRequirementsSuite extends SharedSparkSession {
       }.size == 2)
     }
   }
+
+  test("SPARK-35989: Do not remove REPARTITION_BY_NUM shuffle if AQE is enabled") {
+    import testImplicits._
+    Seq(true, false).foreach { enableAqe =>
+      withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> enableAqe.toString,
+        SQLConf.SHUFFLE_PARTITIONS.key -> "3",
+        SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
Review comment: nit:
```suggestion
        SQLConf.SHUFFLE_PARTITIONS.key -> "3",
        SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {
```
[GitHub] [spark] HyukjinKwon commented on pull request #33188: [SPARK-35989][SQL] Do not remove REPARTITION_BY_NUM shuffle if AQE is enabled
HyukjinKwon commented on pull request #33188: URL: https://github.com/apache/spark/pull/33188#issuecomment-873499590 @ulysses-you mind updating PR description and title too?
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
AmplabJenkins removed a comment on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873379366 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA commented on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
SparkQA commented on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873499056 **[Test build #140615 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140615/testReport)** for PR 33203 at commit [`e635b8e`](https://github.com/apache/spark/commit/e635b8e860b72d256c2a2c81fec015cca5745554).
[GitHub] [spark] HyukjinKwon commented on a change in pull request #32944: [SPARK-35794][SQL] Allow custom plugin for AQE cost evaluator
HyukjinKwon commented on a change in pull request #32944: URL: https://github.com/apache/spark/pull/32944#discussion_r663432340
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala ##
@@ -130,7 +130,11 @@ case class AdaptiveSparkPlanExec(
     }
   }

-  @transient private val costEvaluator = SimpleCostEvaluator
+  @transient private val costEvaluator =
+    conf.getConf(SQLConf.ADAPTIVE_CUSTOM_COST_EVALUATOR_CLASS) match {
Review comment: Yeah, we can add `@Unstable` for now but I would also add a note that this class is supposed to be moved or changed in the near future.
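The diff above selects a cost evaluator from a configuration value and falls back to the built-in one when nothing is configured. A minimal Java sketch of that general pattern follows. It is not Spark's implementation: `CostEvaluator`, `SimpleEvaluator`, `EvaluatorLoader`, and the placeholder cost model are all hypothetical names, and the reflective loading is an assumption about how such a plugin might be instantiated.

```java
import java.util.Optional;

// All names below are hypothetical stand-ins for illustration, not Spark's classes.
interface CostEvaluator {
    long evaluateCost(int shuffleCount);
}

final class SimpleEvaluator implements CostEvaluator {
    @Override
    public long evaluateCost(int shuffleCount) {
        return shuffleCount; // placeholder cost model for this sketch
    }
}

final class EvaluatorLoader {
    private EvaluatorLoader() {}

    /** Uses the configured class when present, otherwise the built-in default. */
    static CostEvaluator load(Optional<String> configuredClassName) {
        if (!configuredClassName.isPresent()) {
            return new SimpleEvaluator();
        }
        try {
            // Instantiate the plugin through its no-argument constructor.
            Class<?> cls = Class.forName(configuredClassName.get());
            return (CostEvaluator) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalArgumentException(
                "Cannot instantiate cost evaluator " + configuredClassName.get(), e);
        }
    }

    public static void main(String[] args) {
        CostEvaluator evaluator = load(Optional.empty());
        System.out.println(evaluator.getClass().getSimpleName());
    }
}
```

Failing fast with a clear message when the configured class cannot be loaded is one possible design choice; a plugin interface that is still marked unstable may want exactly that kind of early, explicit failure.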
[GitHub] [spark] HyukjinKwon closed pull request #33178: [SPARK-35980][CORE] ThreadAudit logs whether thread is daemon
HyukjinKwon closed pull request #33178: URL: https://github.com/apache/spark/pull/33178
[GitHub] [spark] HyukjinKwon commented on pull request #33178: [SPARK-35980][CORE] ThreadAudit logs whether thread is daemon
HyukjinKwon commented on pull request #33178: URL: https://github.com/apache/spark/pull/33178#issuecomment-873498265 Merged to master.
[GitHub] [spark] HyukjinKwon commented on pull request #33201: [SPARK-36005][SQL] The canCast method of type of char/varchar is modified to be consistent with StringType
HyukjinKwon commented on pull request #33201: URL: https://github.com/apache/spark/pull/33201#issuecomment-873498125 @zheniantoushipashi does it cause any user-facing behaviour change?
[GitHub] [spark] HyukjinKwon commented on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
HyukjinKwon commented on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873497353
[GitHub] [spark] HyukjinKwon commented on a change in pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
HyukjinKwon commented on a change in pull request #33203: URL: https://github.com/apache/spark/pull/33203#discussion_r663430840
## File path: .github/workflows/benchmark.yml ##
@@ -85,6 +85,7 @@ jobs:
         # In benchmark, we use local as master so set driver memory only. Note that GitHub Actions has 7 GB memory limit.
         bin/spark-submit \
Review comment: I think it's better to set the `SPARK_HOME` env instead. Can you add:
```
SPARK_HOME: ${{ github.workspace }}
```
under `env` with a comment that explains why we should set it?
[GitHub] [spark] HyukjinKwon edited a comment on pull request #33185: [SPARK-35986][PYSPARK] Fix type hint for RDD.histogram's buckets
HyukjinKwon edited a comment on pull request #33185: URL: https://github.com/apache/spark/pull/33185#issuecomment-873496182 Merged to master, branch-3.2 and branch-3.1.
[GitHub] [spark] HyukjinKwon edited a comment on pull request #33185: [SPARK-35986][PYSPARK] Fix type hint for RDD.histogram's buckets
HyukjinKwon edited a comment on pull request #33185: URL: https://github.com/apache/spark/pull/33185#issuecomment-873496182 Merged to master, branch-3.2, branch-3.1 and branch-3.0.
[GitHub] [spark] HyukjinKwon closed pull request #33185: [SPARK-35986][PYSPARK] Fix type hint for RDD.histogram's buckets
HyukjinKwon closed pull request #33185: URL: https://github.com/apache/spark/pull/33185
[GitHub] [spark] HyukjinKwon commented on pull request #33185: [SPARK-35986][PYSPARK] Fix type hint for RDD.histogram's buckets
HyukjinKwon commented on pull request #33185: URL: https://github.com/apache/spark/pull/33185#issuecomment-873496182 Merged to master and branch-3.2.
[GitHub] [spark] zhouyejoe commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the sta
zhouyejoe commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r663428118
## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ##
@@ -403,38 +394,78 @@ public MergeStatuses finalizeShuffleMerge(FinalizeShuffleMerge msg) throws IOExc
               reduceIds.add(partition.reduceId);
               sizes.add(partition.getLastChunkOffset());
             } catch (IOException ioe) {
-              logger.warn("Exception while finalizing shuffle partition {} {} {}", msg.appId,
-                msg.shuffleId, partition.reduceId, ioe);
+              logger.warn("Exception while finalizing shuffle partition {}_{} {} {}", msg.appId,
+                msg.attemptId, msg.shuffleId, partition.reduceId, ioe);
             } finally {
               partition.closeAllFiles();
-              // The partition should be removed after the files are written so that any new stream
-              // for the same reduce partition will see that the data file exists.
-              partitionsIter.remove();
             }
           }
         }
       mergeStatuses = new MergeStatuses(msg.shuffleId,
         bitmaps.toArray(new RoaringBitmap[bitmaps.size()]), Ints.toArray(reduceIds),
         Longs.toArray(sizes));
     }
-    partitions.remove(appShuffleId);
-    logger.info("Finalized shuffle {} from Application {}.", msg.shuffleId, msg.appId);
+    logger.info("Finalized shuffle {} from Application {}_{}.",
+      msg.shuffleId, msg.appId, msg.attemptId);
     return mergeStatuses;
   }

   @Override
   public void registerExecutor(String appId, ExecutorShuffleInfo executorInfo) {
     if (logger.isDebugEnabled()) {
       logger.debug("register executor with RemoteBlockPushResolver {} local-dirs {} "
-        + "num sub-dirs {}", appId, Arrays.toString(executorInfo.localDirs),
-        executorInfo.subDirsPerLocalDir);
+        + "num sub-dirs {} shuffleManager {}", appId, Arrays.toString(executorInfo.localDirs),
+        executorInfo.subDirsPerLocalDir, executorInfo.shuffleManager);
+    }
+    String shuffleManagerMeta = executorInfo.shuffleManager;
+    if (shuffleManagerMeta.contains(":")) {
+      String mergeDirInfo = shuffleManagerMeta.substring(shuffleManagerMeta.indexOf(":") + 1);
+      try {
+        ObjectMapper mapper = new ObjectMapper();
+        MergeDirectoryMeta mergeDirectoryMeta =
+          mapper.readValue(mergeDirInfo, MergeDirectoryMeta.class);
+        if (mergeDirectoryMeta.attemptId == ATTEMPT_ID_UNDEFINED) {
+          // When attemptId is -1, there is no attemptId stored in the ExecutorShuffleInfo.
+          // Only the first ExecutorRegister message can register the merge dirs
+          appsShuffleInfo.computeIfAbsent(appId, id ->
+            new AppShuffleInfo(
+              appId, mergeDirectoryMeta.attemptId,
+              new AppPathsInfo(appId, executorInfo.localDirs,
+                mergeDirectoryMeta.mergeDir, executorInfo.subDirsPerLocalDir)
+            ));
+        } else {
+          // If attemptId is not -1, there is attemptId stored in the ExecutorShuffleInfo.
+          // The first ExecutorRegister message from the same application attempt wil register
+          // the merge dirs in External Shuffle Service. Any later ExecutorRegister message
+          // from the same application attempt will not override the merge dirs. But it can
+          // be overridden by ExecutorRegister message from newer application attempt,
+          // and former attempts' shuffle partitions information will also be cleaned up.
+          ConcurrentMap appShuffleInfoToBeCleanedUp =
Review comment: Not necessary after changing to ConcurrentHashMap.
[GitHub] [spark] zhouyejoe commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the sta
zhouyejoe commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r663427925

## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ##
@@ -112,34 +116,48 @@ public ShuffleIndexInformation load(File file) throws IOException {
     this.errorHandler = new ErrorHandler.BlockPushErrorHandler();
   }

+  private AppShuffleInfo validateAndGetAppShuffleInfo(String appId) {
+    // TODO: [SPARK-33236] Change the message when this service is able to handle NM restart
+    AppShuffleInfo appShuffleInfo =
+      Preconditions.checkNotNull(appsShuffleInfo.get(appId),
+        "application " + appId + " is not registered or NM was restarted.");
+    return appShuffleInfo;
+  }
+
   /**
    * Given the appShuffleId and reduceId that uniquely identifies a given shuffle partition of an
    * application, retrieves the associated metadata. If not present and the corresponding merged
    * shuffle does not exist, initializes the metadata.
    */
   private AppShufflePartitionInfo getOrCreateAppShufflePartitionInfo(
-      AppShuffleId appShuffleId,
+      AppShuffleInfo appShuffleInfo,
+      int shuffleId,
       int reduceId) {
-    File dataFile = getMergedShuffleDataFile(appShuffleId, reduceId);
-    if (!partitions.containsKey(appShuffleId) && dataFile.exists()) {
+    File dataFile = appShuffleInfo.getMergedShuffleDataFile(shuffleId, reduceId);
+    ConcurrentMap> partitions =
+      appShuffleInfo.partitions;
+    if (!partitions.containsKey(shuffleId) && dataFile.exists()) {
       // If this partition is already finalized then the partitions map will not contain
       // the appShuffleId but the data file would exist. In that case the block is considered late.
       return null;
     }

Review comment:
Updated.
[GitHub] [spark] zhouyejoe commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the sta
zhouyejoe commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r663427910

## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ##
@@ -403,38 +394,78 @@ public MergeStatuses finalizeShuffleMerge(FinalizeShuffleMerge msg) throws IOExc
             reduceIds.add(partition.reduceId);
             sizes.add(partition.getLastChunkOffset());
           } catch (IOException ioe) {
-            logger.warn("Exception while finalizing shuffle partition {} {} {}", msg.appId,
-              msg.shuffleId, partition.reduceId, ioe);
+            logger.warn("Exception while finalizing shuffle partition {}_{} {} {}", msg.appId,
+              msg.attemptId, msg.shuffleId, partition.reduceId, ioe);
           } finally {
             partition.closeAllFiles();
-            // The partition should be removed after the files are written so that any new stream
-            // for the same reduce partition will see that the data file exists.
-            partitionsIter.remove();
           }
         }
       }
       mergeStatuses = new MergeStatuses(msg.shuffleId,
         bitmaps.toArray(new RoaringBitmap[bitmaps.size()]), Ints.toArray(reduceIds),
         Longs.toArray(sizes));
     }
-    partitions.remove(appShuffleId);
-    logger.info("Finalized shuffle {} from Application {}.", msg.shuffleId, msg.appId);
+    logger.info("Finalized shuffle {} from Application {}_{}.",
+      msg.shuffleId, msg.appId, msg.attemptId);
     return mergeStatuses;
   }

   @Override
   public void registerExecutor(String appId, ExecutorShuffleInfo executorInfo) {
     if (logger.isDebugEnabled()) {
       logger.debug("register executor with RemoteBlockPushResolver {} local-dirs {} "
-        + "num sub-dirs {}", appId, Arrays.toString(executorInfo.localDirs),
-        executorInfo.subDirsPerLocalDir);
+        + "num sub-dirs {} shuffleManager {}", appId, Arrays.toString(executorInfo.localDirs),
+        executorInfo.subDirsPerLocalDir, executorInfo.shuffleManager);
+    }
+    String shuffleManagerMeta = executorInfo.shuffleManager;
+    if (shuffleManagerMeta.contains(":")) {
+      String mergeDirInfo = shuffleManagerMeta.substring(shuffleManagerMeta.indexOf(":") + 1);
+      try {
+        ObjectMapper mapper = new ObjectMapper();
+        MergeDirectoryMeta mergeDirectoryMeta =
+          mapper.readValue(mergeDirInfo, MergeDirectoryMeta.class);
+        if (mergeDirectoryMeta.attemptId == ATTEMPT_ID_UNDEFINED) {
+          // When attemptId is -1, there is no attemptId stored in the ExecutorShuffleInfo.
+          // Only the first ExecutorRegister message can register the merge dirs
+          appsShuffleInfo.computeIfAbsent(appId, id ->
+            new AppShuffleInfo(
+              appId, mergeDirectoryMeta.attemptId,
+              new AppPathsInfo(appId, executorInfo.localDirs,
+                mergeDirectoryMeta.mergeDir, executorInfo.subDirsPerLocalDir)
+            ));
+        } else {
+          // If attemptId is not -1, there is attemptId stored in the ExecutorShuffleInfo.
+          // The first ExecutorRegister message from the same application attempt wil register
+          // the merge dirs in External Shuffle Service. Any later ExecutorRegister message
+          // from the same application attempt will not override the merge dirs. But it can
+          // be overridden by ExecutorRegister message from newer application attempt,
+          // and former attempts' shuffle partitions information will also be cleaned up.
+          ConcurrentMap appShuffleInfoToBeCleanedUp =
+            Maps.newConcurrentMap();
+          appsShuffleInfo.compute(appId, (id, appShuffleInfo) -> {
+            if (appShuffleInfo == null || (appShuffleInfo != null
+              && mergeDirectoryMeta.attemptId > appShuffleInfo.attemptId)) {
+              appShuffleInfoToBeCleanedUp.putIfAbsent(appShuffleInfo.attemptId, appShuffleInfo);

Review comment:
After switching the nested maps to ConcurrentHashMap, we no longer need appShuffleInfoToBeCleanedUp to store multiple AppShuffleInfo objects. Instead, we use an AtomicReference to hold the single invalidated AppShuffleInfo.
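The approach described in the review comment above (capture the single superseded entry in an AtomicReference while replacing it inside ConcurrentHashMap.compute) can be illustrated with a minimal sketch. This is not the code from the pull request: AttemptRegistry, AppInfo, register and currentAttempt are hypothetical, simplified stand-ins, and only java.util.concurrent classes are used. The sketch also checks for a missing previous entry before reading its attemptId, which the quoted condition (appShuffleInfo == null || ...) does not appear to do.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical registry illustrating the compute + AtomicReference pattern. */
final class AttemptRegistry {

  /** Hypothetical, simplified stand-in for AppShuffleInfo. */
  static final class AppInfo {
    final String appId;
    final int attemptId;

    AppInfo(String appId, int attemptId) {
      this.appId = appId;
      this.attemptId = attemptId;
    }
  }

  private final ConcurrentMap<String, AppInfo> apps = new ConcurrentHashMap<>();

  /**
   * Registers an application attempt. A newer attempt replaces an older one;
   * the same or an older attempt leaves the existing entry untouched.
   * Returns the superseded entry that the caller should clean up, or null.
   */
  AppInfo register(String appId, int attemptId) {
    // Holds at most one value: the entry displaced by this call, if any.
    AtomicReference<AppInfo> superseded = new AtomicReference<>();
    apps.compute(appId, (id, current) -> {
      if (current == null) {
        // First registration for this application: nothing to clean up.
        return new AppInfo(id, attemptId);
      }
      if (attemptId > current.attemptId) {
        // Newer attempt: remember the old entry so it can be cleaned up
        // after compute returns, outside the map's per-key lock.
        superseded.set(current);
        return new AppInfo(id, attemptId);
      }
      // Same or older attempt: keep the existing entry.
      return current;
    });
    return superseded.get();
  }

  /** Returns the registered attempt id, or -1 if the application is unknown. */
  int currentAttempt(String appId) {
    AppInfo info = apps.get(appId);
    return info == null ? -1 : info.attemptId;
  }

  public static void main(String[] args) {
    AttemptRegistry registry = new AttemptRegistry();
    registry.register("app-1", 1);
    AppInfo old = registry.register("app-1", 2);
    System.out.println("superseded attempt: " + old.attemptId); // prints "superseded attempt: 1"
  }
}
```

Because a single compute call replaces at most one entry for the key, a one-slot holder is enough to carry the displaced value out of the lambda; the cleanup itself can then run after compute returns rather than inside the remapping function.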
[GitHub] [spark] github-actions[bot] commented on pull request #29935: [SPARK-33055][PYTHON][SQL] Add Python CalendarIntervalType
github-actions[bot] commented on pull request #29935: URL: https://github.com/apache/spark/pull/29935#issuecomment-873487919 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!
[GitHub] [spark] github-actions[bot] closed pull request #31598: [SPARK-34478][SQL] When build SparkSession, we should check config keys
github-actions[bot] closed pull request #31598: URL: https://github.com/apache/spark/pull/31598
[GitHub] [spark] SparkQA commented on pull request #33200: [SPARK-36006][SQL] Migrate ALTER TABLE ... ADD/REPLACE COLUMNS commands to use UnresolvedTable to resolve the identifier
SparkQA commented on pull request #33200: URL: https://github.com/apache/spark/pull/33200#issuecomment-873446417 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45127/
[GitHub] [spark] AmplabJenkins commented on pull request #33200: [SPARK-36006][SQL] Migrate ALTER TABLE ... ADD/REPLACE COLUMNS commands to use UnresolvedTable to resolve the identifier
AmplabJenkins commented on pull request #33200: URL: https://github.com/apache/spark/pull/33200#issuecomment-873446425 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45127/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33200: [SPARK-36006][SQL] Migrate ALTER TABLE ... ADD/REPLACE COLUMNS commands to use UnresolvedTable to resolve the identifier
AmplabJenkins removed a comment on pull request #33200: URL: https://github.com/apache/spark/pull/33200#issuecomment-873437050 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140614/
[GitHub] [spark] SparkQA commented on pull request #33200: [SPARK-36006][SQL] Migrate ALTER TABLE ... ADD/REPLACE COLUMNS commands to use UnresolvedTable to resolve the identifier
SparkQA commented on pull request #33200: URL: https://github.com/apache/spark/pull/33200#issuecomment-873441599 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45127/
[GitHub] [spark] SparkQA removed a comment on pull request #33200: [SPARK-36006][SQL] Migrate ALTER TABLE ... ADD/REPLACE COLUMNS commands to use UnresolvedTable to resolve the identifier
SparkQA removed a comment on pull request #33200: URL: https://github.com/apache/spark/pull/33200#issuecomment-873436260 **[Test build #140614 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140614/testReport)** for PR 33200 at commit [`0c83e37`](https://github.com/apache/spark/commit/0c83e37a9fc9ba22e864a906c322572f6ae41450).
[GitHub] [spark] AmplabJenkins commented on pull request #33200: [SPARK-36006][SQL] Migrate ALTER TABLE ... ADD/REPLACE COLUMNS commands to use UnresolvedTable to resolve the identifier
AmplabJenkins commented on pull request #33200: URL: https://github.com/apache/spark/pull/33200#issuecomment-873437050 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140614/
[GitHub] [spark] SparkQA commented on pull request #33200: [SPARK-36006][SQL] Migrate ALTER TABLE ... ADD/REPLACE COLUMNS commands to use UnresolvedTable to resolve the identifier
SparkQA commented on pull request #33200: URL: https://github.com/apache/spark/pull/33200#issuecomment-873437041 **[Test build #140614 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140614/testReport)** for PR 33200 at commit [`0c83e37`](https://github.com/apache/spark/commit/0c83e37a9fc9ba22e864a906c322572f6ae41450).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33183: [SPARK-35972][SQL] When replace ExtractValue in NestedColumnAliasing we should use semanticEquals
AmplabJenkins removed a comment on pull request #33183: URL: https://github.com/apache/spark/pull/33183#issuecomment-873294316 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140596/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33200: [SPARK-36006][SQL] Migrate ALTER TABLE ... ADD/REPLACE COLUMNS commands to use UnresolvedTable to resolve the identifier
AmplabJenkins removed a comment on pull request #33200: URL: https://github.com/apache/spark/pull/33200#issuecomment-873358952
[GitHub] [spark] SparkQA commented on pull request #33200: [SPARK-36006][SQL] Migrate ALTER TABLE ... ADD/REPLACE COLUMNS commands to use UnresolvedTable to resolve the identifier
SparkQA commented on pull request #33200: URL: https://github.com/apache/spark/pull/33200#issuecomment-873436260 **[Test build #140614 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140614/testReport)** for PR 33200 at commit [`0c83e37`](https://github.com/apache/spark/commit/0c83e37a9fc9ba22e864a906c322572f6ae41450).
[GitHub] [spark] viirya closed pull request #33142: [SPARK-35940][SQL] Refactor EquivalentExpressions to make it more efficient
viirya closed pull request #33142: URL: https://github.com/apache/spark/pull/33142
[GitHub] [spark] viirya commented on pull request #33142: [SPARK-35940][SQL] Refactor EquivalentExpressions to make it more efficient
viirya commented on pull request #33142: URL: https://github.com/apache/spark/pull/33142#issuecomment-873424131 Thanks! Merging to master/branch-3.2.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
AmplabJenkins removed a comment on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-873410235 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140613/
[GitHub] [spark] AmplabJenkins commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
AmplabJenkins commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-873410235 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140613/
[GitHub] [spark] SparkQA removed a comment on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
SparkQA removed a comment on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-873373461 **[Test build #140613 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140613/testReport)** for PR 33182 at commit [`f3474a2`](https://github.com/apache/spark/commit/f3474a20e712b40997a52f6f23605e7a51524f0f).
[GitHub] [spark] SparkQA commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
SparkQA commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-873407812 **[Test build #140613 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140613/testReport)** for PR 33182 at commit [`f3474a2`](https://github.com/apache/spark/commit/f3474a20e712b40997a52f6f23605e7a51524f0f).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] gengliangwang commented on pull request #33196: [SPARK-35996][BUILD] Setting version to 3.3.0-SNAPSHOT
gengliangwang commented on pull request #33196: URL: https://github.com/apache/spark/pull/33196#issuecomment-873405034 @dongjoon-hyun Thank you so much, have fun on your vacation!
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
AmplabJenkins removed a comment on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-873386756 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45126/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33188: [SPARK-35989][SQL] Do not remove REPARTITION_BY_NUM shuffle if AQE is enabled
AmplabJenkins removed a comment on pull request #33188: URL: https://github.com/apache/spark/pull/33188#issuecomment-873386755 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140610/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range
AmplabJenkins removed a comment on pull request #32959: URL: https://github.com/apache/spark/pull/32959#issuecomment-873386754
[GitHub] [spark] AmplabJenkins commented on pull request #33188: [SPARK-35989][SQL] Do not remove REPARTITION_BY_NUM shuffle if AQE is enabled
AmplabJenkins commented on pull request #33188: URL: https://github.com/apache/spark/pull/33188#issuecomment-873386755 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140610/
[GitHub] [spark] AmplabJenkins commented on pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range
AmplabJenkins commented on pull request #32959: URL: https://github.com/apache/spark/pull/32959#issuecomment-873386757
[GitHub] [spark] AmplabJenkins commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
AmplabJenkins commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-873386756 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45126/
[GitHub] [spark] SparkQA removed a comment on pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range
SparkQA removed a comment on pull request #32959: URL: https://github.com/apache/spark/pull/32959#issuecomment-873372513 **[Test build #140612 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140612/testReport)** for PR 32959 at commit [`e94885c`](https://github.com/apache/spark/commit/e94885cc2e697031a95df6bccb51fc196296233e).
[GitHub] [spark] SparkQA commented on pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range
SparkQA commented on pull request #32959: URL: https://github.com/apache/spark/pull/32959#issuecomment-873386207 **[Test build #140612 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140612/testReport)** for PR 32959 at commit [`e94885c`](https://github.com/apache/spark/commit/e94885cc2e697031a95df6bccb51fc196296233e).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] zero323 removed a comment on pull request #33185: [SPARK-35986][PYSPARK] Fix type hint for RDD.histogram's buckets
zero323 removed a comment on pull request #33185: URL: https://github.com/apache/spark/pull/33185#issuecomment-873383580 ok to test
[GitHub] [spark] zero323 commented on pull request #33185: [SPARK-35986][PYSPARK] Fix type hint for RDD.histogram's buckets
zero323 commented on pull request #33185: URL: https://github.com/apache/spark/pull/33185#issuecomment-873383580 ok to test
[GitHub] [spark] SparkQA commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
SparkQA commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-873382869 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45126/
[GitHub] [spark] SparkQA removed a comment on pull request #33188: [SPARK-35989][SQL] Do not remove REPARTITION_BY_NUM shuffle if AQE is enabled
SparkQA removed a comment on pull request #33188: URL: https://github.com/apache/spark/pull/33188#issuecomment-873352553 **[Test build #140610 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140610/testReport)** for PR 33188 at commit [`b2822d5`](https://github.com/apache/spark/commit/b2822d5e4260d4c5fb002236218db7d9c01755a1).
[GitHub] [spark] SparkQA commented on pull request #33188: [SPARK-35989][SQL] Do not remove REPARTITION_BY_NUM shuffle if AQE is enabled
SparkQA commented on pull request #33188: URL: https://github.com/apache/spark/pull/33188#issuecomment-873382395 **[Test build #140610 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140610/testReport)** for PR 33188 at commit [`b2822d5`](https://github.com/apache/spark/commit/b2822d5e4260d4c5fb002236218db7d9c01755a1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range
SparkQA commented on pull request #32959: URL: https://github.com/apache/spark/pull/32959#issuecomment-873382245 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45125/
[GitHub] [spark] AmplabJenkins commented on pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
AmplabJenkins commented on pull request #33203: URL: https://github.com/apache/spark/pull/33203#issuecomment-873379366 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33101: [SPARK-35907][CORE] Instead of File#mkdirs, Files#createDirectories is expected.
AmplabJenkins removed a comment on pull request #33101: URL: https://github.com/apache/spark/pull/33101#issuecomment-873379231 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140611/
[GitHub] [spark] AmplabJenkins commented on pull request #33101: [SPARK-35907][CORE] Instead of File#mkdirs, Files#createDirectories is expected.
AmplabJenkins commented on pull request #33101: URL: https://github.com/apache/spark/pull/33101#issuecomment-873379231 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140611/
[GitHub] [spark] SparkQA commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
SparkQA commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-873378994 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45126/
[GitHub] [spark] pingsutw opened a new pull request #33203: [SPARK-36007][INFRA] Failed to run benchmark in GA
pingsutw opened a new pull request #33203: URL: https://github.com/apache/spark/pull/33203

### What changes were proposed in this pull request?
When running the benchmark in GA, I hit the error below. https://github.com/pingsutw/spark/runs/2867617238?check_suite_focus=true
```
java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
21/06/20 07:40:02 ERROR SparkContext: Error initializing SparkContext.
java.lang.AssertionError: assertion failed: spark.test.home is not set!
	at scala.Predef$.assert(Predef.scala:223)
	at org.apache.spark.deploy.worker.Worker.<init>(Worker.scala:148)
	at org.apache.spark.deploy.worker.Worker$.startRpcEnvAndEndpoint(Worker.scala:954)
	at org.apache.spark.deploy.LocalSparkCluster.$anonfun$start$2(LocalSparkCluster.scala:68)
	at org.apache.spark.deploy.LocalSparkCluster.$anonfun$start$2$adapted(LocalSparkCluster.scala:65)
	at scala.collection.immutable.Range.foreach(Range.scala:158)
	at org.apache.spark.deploy.LocalSparkCluster.start(LocalSparkCluster.scala:65)
	at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2954)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:559)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:137)
	at org.apache.spark.serializer.KryoSerializerBenchmark$.createSparkContext(KryoSerializerBenchmark.scala:86)
	at org.apache.spark.serializer.KryoSerializerBenchmark$.sc$lzycompute$1(KryoSerializerBenchmark.scala:58)
	at org.apache.spark.serializer.KryoSerializerBenchmark$.sc$1(KryoSerializerBenchmark.scala:58)
	at org.apache.spark.serializer.KryoSerializerBenchmark$.$anonfun$run$3(KryoSerializerBenchmark.scala:63)
```

### Why are the changes needed?
Set `spark.test.home` in the benchmark workflow.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Rerun the benchmark in my fork.
https://github.com/pingsutw/spark/actions/runs/995783069
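
For readers following the stack trace in the message above: the assertion message suggests that, in test mode, the worker refuses to start unless `spark.test.home` is set. The Scala sketch below is a simplified, hypothetical paraphrase of that kind of check, not Spark's actual `Worker` code. The object and function names are made up for illustration, and using `spark.testing` as the test-mode flag is an assumption.

```scala
// Hypothetical sketch; SparkHomeSketch and resolveSparkHome are illustrative names,
// not part of Spark. It models "in test mode, spark.test.home must be set".
object SparkHomeSketch {
  def resolveSparkHome(props: Map[String, String], env: Map[String, String]): String =
    if (props.contains("spark.testing")) {
      // Assumed test-mode branch: fail fast with the message seen in the stack trace.
      assert(props.contains("spark.test.home"), "spark.test.home is not set!")
      props("spark.test.home")
    } else {
      // Outside test mode, fall back to SPARK_HOME or the current directory.
      env.getOrElse("SPARK_HOME", ".")
    }
}
```

Under these assumptions, a benchmark job that runs with the test flag set but does not pass `spark.test.home` would fail in the first branch, which is consistent with the fix described in the PR (setting the property in the workflow).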
[GitHub] [spark] SparkQA commented on pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range
SparkQA commented on pull request #32959: URL: https://github.com/apache/spark/pull/32959#issuecomment-873377950 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45125/
[GitHub] [spark] SparkQA removed a comment on pull request #33101: [SPARK-35907][CORE] Instead of File#mkdirs, Files#createDirectories is expected.
SparkQA removed a comment on pull request #33101: URL: https://github.com/apache/spark/pull/33101#issuecomment-873357656 **[Test build #140611 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140611/testReport)** for PR 33101 at commit [`6d123ba`](https://github.com/apache/spark/commit/6d123ba1258250e681a6f812303299f5fb67d90c).
[GitHub] [spark] SparkQA commented on pull request #33101: [SPARK-35907][CORE] Instead of File#mkdirs, Files#createDirectories is expected.
SparkQA commented on pull request #33101: URL: https://github.com/apache/spark/pull/33101#issuecomment-873375348 **[Test build #140611 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140611/testReport)** for PR 33101 at commit [`6d123ba`](https://github.com/apache/spark/commit/6d123ba1258250e681a6f812303299f5fb67d90c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #33200: [SPARK-36006][SQL] Migrate ALTER TABLE ... ADD/REPLACE COLUMNS commands to use UnresolvedTable to resolve the identifier
SparkQA removed a comment on pull request #33200: URL: https://github.com/apache/spark/pull/33200#issuecomment-873351326 **[Test build #140609 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140609/testReport)** for PR 33200 at commit [`f729d8d`](https://github.com/apache/spark/commit/f729d8d7db011f90c286206481ecbac019e2ff1c).
[GitHub] [spark] AmplabJenkins commented on pull request #33200: [SPARK-36006][SQL] Migrate ALTER TABLE ... ADD/REPLACE COLUMNS commands to use UnresolvedTable to resolve the identifier
AmplabJenkins commented on pull request #33200: URL: https://github.com/apache/spark/pull/33200#issuecomment-873373911 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140609/
[GitHub] [spark] SparkQA commented on pull request #33200: [SPARK-36006][SQL] Migrate ALTER TABLE ... ADD/REPLACE COLUMNS commands to use UnresolvedTable to resolve the identifier
SparkQA commented on pull request #33200: URL: https://github.com/apache/spark/pull/33200#issuecomment-873373753 **[Test build #140609 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140609/testReport)** for PR 33200 at commit [`f729d8d`](https://github.com/apache/spark/commit/f729d8d7db011f90c286206481ecbac019e2ff1c). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #33182: [SPARK-35984][SQL] Config to force applying shuffled hash join
SparkQA commented on pull request #33182: URL: https://github.com/apache/spark/pull/33182#issuecomment-873373461 **[Test build #140613 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140613/testReport)** for PR 33182 at commit [`f3474a2`](https://github.com/apache/spark/commit/f3474a20e712b40997a52f6f23605e7a51524f0f).
[GitHub] [spark] SparkQA commented on pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range
SparkQA commented on pull request #32959: URL: https://github.com/apache/spark/pull/32959#issuecomment-873372513 **[Test build #140612 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140612/testReport)** for PR 32959 at commit [`e94885c`](https://github.com/apache/spark/commit/e94885cc2e697031a95df6bccb51fc196296233e).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33101: [SPARK-35907][CORE] Instead of File#mkdirs, Files#createDirectories is expected.
AmplabJenkins removed a comment on pull request #33101: URL: https://github.com/apache/spark/pull/33101#issuecomment-873366551 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45124/
[GitHub] [spark] SparkQA commented on pull request #33101: [SPARK-35907][CORE] Instead of File#mkdirs, Files#createDirectories is expected.
SparkQA commented on pull request #33101: URL: https://github.com/apache/spark/pull/33101#issuecomment-873366545 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45124/
[GitHub] [spark] AmplabJenkins commented on pull request #33101: [SPARK-35907][CORE] Instead of File#mkdirs, Files#createDirectories is expected.
AmplabJenkins commented on pull request #33101: URL: https://github.com/apache/spark/pull/33101#issuecomment-873366551 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45124/
[GitHub] [spark] zhaomin1423 closed pull request #33202: remove redundant code
zhaomin1423 closed pull request #33202: URL: https://github.com/apache/spark/pull/33202
[GitHub] [spark] zhaomin1423 opened a new pull request #33202: remove redundant code
zhaomin1423 opened a new pull request #33202: URL: https://github.com/apache/spark/pull/33202

### What changes were proposed in this pull request?
Remove redundant code.

### Why are the changes needed?
Keep the code style consistent and avoid unnecessary operations.

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
[GitHub] [spark] AmplabJenkins commented on pull request #33201: [SPARK-36005][SQL] The canCast method of type of char/varchar is modified to be consistent with StringType
AmplabJenkins commented on pull request #33201: URL: https://github.com/apache/spark/pull/33201#issuecomment-873364769 Can one of the admins verify this patch?
[GitHub] [spark] zheniantoushipashi opened a new pull request #33201: [SPARK-36005][SQL] The canCast method of type of char/varchar is modified to be consistent with StringType
zheniantoushipashi opened a new pull request #33201: URL: https://github.com/apache/spark/pull/33201

### What changes were proposed in this pull request?
The `canCast` method for the char/varchar types is modified to be consistent with StringType. The `cast` method already changes char/varchar to StringType:

    def cast(to: DataType): Column = withExpr {
      val cast = Cast(expr, CharVarcharUtils.replaceCharVarcharWithStringForCast(to))
      cast.setTagValue(Cast.USER_SPECIFIED_CAST, true)
      cast
    }

So the `canCast` method for char/varchar must be consistent with StringType.

### Why are the changes needed?
Previously I used StringType instead of char/varchar, and my application code uses `canCast` for its checks. That worked before, but now that the types are char/varchar, the `canCast` check fails. If this change is not accepted, I need to change a lot of application code.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
I added a UT.
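
To make the consistency argument in the message above concrete, here is a minimal, self-contained Scala sketch. It uses toy stand-in types rather than Spark's real `org.apache.spark.sql.types` classes, and the rule set in `canCastBase` is invented purely for illustration. The only idea taken from the PR description is that char/varchar are replaced with string before a cast is built, so a castability check arguably should normalize them the same way.

```scala
// Toy stand-ins for illustration only; these are NOT Spark's DataType classes.
sealed trait DataType
case object StringType extends DataType
case object IntegerType extends DataType
final case class CharType(length: Int) extends DataType
final case class VarcharType(length: Int) extends DataType

object CanCastSketch {
  // Mirrors the idea of replacing char/varchar with string before casting.
  def normalize(dt: DataType): DataType = dt match {
    case CharType(_) | VarcharType(_) => StringType
    case other => other
  }

  // Hypothetical base rules that know nothing about char/varchar.
  def canCastBase(from: DataType, to: DataType): Boolean = (from, to) match {
    case (a, b) if a == b => true
    case (StringType, IntegerType) => true
    case (IntegerType, StringType) => true
    case _ => false
  }

  // Normalizing both sides makes char/varchar answer exactly like StringType.
  def canCast(from: DataType, to: DataType): Boolean =
    canCastBase(normalize(from), normalize(to))
}
```

In this toy model, `canCastBase(IntegerType, VarcharType(10))` is rejected while `canCast(IntegerType, VarcharType(10))` is accepted, which is the kind of mismatch the PR author reports between the check and the actual cast.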
[GitHub] [spark] Peng-Lei commented on a change in pull request #33175: [SPARK-35973][SQL] DataSourceV2: Support SHOW CATALOGS
Peng-Lei commented on a change in pull request #33175: URL: https://github.com/apache/spark/pull/33175#discussion_r663311232

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala ##
@@ -128,6 +128,10 @@ class CatalogManager(
     }
   }
+  def listCatalogs(): mutable.HashMap[String, CatalogPlugin] = synchronized {
+    catalogs

Review comment: Yes, this behavior is what I want: the registered catalogs that are displayed are meaningful because they can be loaded correctly. @imback82 @yaooqinn @cloud-fan Do you have any suggestions? Should we show all configured catalogs, or only the registered catalogs? Thank you.
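
The question in the review comment above (list every configured catalog, or only those already registered) can be illustrated with a small toy model. The class below is hypothetical and is not Spark's `CatalogManager`; it only assumes that catalogs are registered lazily the first time they are loaded, and the method names are made up for this sketch.

```scala
import scala.collection.mutable

// Hypothetical toy model; not Spark's CatalogManager API.
final class ToyCatalogManager(configured: Set[String]) {
  // Catalogs that have been loaded so far, keyed by name.
  private val loaded = mutable.HashMap.empty[String, String]

  // Loading registers the catalog on first use, as a lazy registry would.
  def catalog(name: String): String = synchronized {
    require(configured.contains(name), s"Catalog '$name' is not configured")
    loaded.getOrElseUpdate(name, s"plugin-for-$name")
  }

  // Option A: only catalogs that have been loaded successfully.
  def listRegistered(): Seq[String] = synchronized { loaded.keys.toSeq.sorted }

  // Option B: every catalog named in configuration, loaded or not.
  def listConfigured(): Seq[String] = configured.toSeq.sorted
}
```

With two configured catalogs and only one of them used so far, `listRegistered()` returns one name while `listConfigured()` returns both, which is the difference in SHOW CATALOGS output that the reviewers are being asked to decide on.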
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33188: [SPARK-35989][SQL] Do not remove REPARTITION_BY_NUM shuffle if AQE is enabled
AmplabJenkins removed a comment on pull request #33188: URL: https://github.com/apache/spark/pull/33188#issuecomment-873364246 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45123/
[GitHub] [spark] AmplabJenkins commented on pull request #33188: [SPARK-35989][SQL] Do not remove REPARTITION_BY_NUM shuffle if AQE is enabled
AmplabJenkins commented on pull request #33188: URL: https://github.com/apache/spark/pull/33188#issuecomment-873364246 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45123/
[GitHub] [spark] viirya edited a comment on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre
viirya edited a comment on pull request #29326: URL: https://github.com/apache/spark/pull/29326#issuecomment-873359259 I ran into some issues. Although we can switch to hive-exec without a classifier (the shaded version) to get rid of the Guava version issue above, the shaded hive-exec contains (without relocation) some dependencies such as commons-lang3, orc, and parquet whose versions differ from Spark's, so they conflict. Because the shaded hive-exec jar already includes these dependency jars, it seems that dependency exclusions in the pom cannot exclude them. For now, it seems the only option is to go back to Hive and shade every included dependency there. Any other thoughts?
[GitHub] [spark] SparkQA commented on pull request #33101: [SPARK-35907][CORE] Instead of File#mkdirs, Files#createDirectories is expected.
SparkQA commented on pull request #33101: URL: https://github.com/apache/spark/pull/33101#issuecomment-873362664 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45124/
[GitHub] [spark] SparkQA commented on pull request #33188: [SPARK-35989][SQL] Do not remove REPARTITION_BY_NUM shuffle if AQE is enabled
SparkQA commented on pull request #33188: URL: https://github.com/apache/spark/pull/33188#issuecomment-873360438 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45123/
[GitHub] [spark] AmplabJenkins commented on pull request #33200: [SPARK-36006][SQL] Migrate ALTER TABLE ... ADD/REPLACE COLUMNS commands to use UnresolvedTable to resolve the identifier
AmplabJenkins commented on pull request #33200: URL: https://github.com/apache/spark/pull/33200#issuecomment-873359540 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45122/
[GitHub] [spark] SparkQA commented on pull request #33200: [SPARK-36006][SQL] Migrate ALTER TABLE ... ADD/REPLACE COLUMNS commands to use UnresolvedTable to resolve the identifier
SparkQA commented on pull request #33200: URL: https://github.com/apache/spark/pull/33200#issuecomment-873359537 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45122/
[GitHub] [spark] viirya commented on pull request #29326: [WIP][SPARK-32502][BUILD] Upgrade Guava to 27.0-jre
viirya commented on pull request #29326: URL: https://github.com/apache/spark/pull/29326#issuecomment-873359259 I ran into some issues. Although we can switch to hive-exec without a classifier (the shaded version) to get rid of the Guava version issue above, the shaded hive-exec contains (without relocation) some dependencies such as commons-lang3 and orc whose versions differ from Spark's, so they conflict. Because the shaded hive-exec jar already includes these dependency jars, it seems that dependency exclusions in the pom cannot exclude them. For now, it seems the only option is to go back to Hive and shade every included dependency there. Any other thoughts?