[GitHub] [spark] AngersZhuuuu commented on pull request #28034: [SPARK-31268][CORE]Initial Task Executor Metrics with latestMetrics

2021-07-22 Thread GitBox
AngersZhuuuu commented on pull request #28034: URL: https://github.com/apache/spark/pull/28034#issuecomment-885439832 gentle ping @jiangxb1987 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

[GitHub] [spark] venkata91 commented on a change in pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-22 Thread GitBox
venkata91 commented on a change in pull request #33253: URL: https://github.com/apache/spark/pull/33253#discussion_r675343104 ## File path: core/src/main/scala/org/apache/spark/status/api/v1/api.scala ## @@ -89,6 +89,14 @@ class ExecutorStageSummary private[spark]( val pea

[GitHub] [spark] venkata91 commented on a change in pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-22 Thread GitBox
venkata91 commented on a change in pull request #33034: URL: https://github.com/apache/spark/pull/33034#discussion_r675339561 ## File path: core/src/main/scala/org/apache/spark/Dependency.scala ## @@ -122,6 +119,18 @@ class ShuffleDependency[K: ClassTag, V: ClassTag, C: ClassT

[GitHub] [spark] viirya edited a comment on pull request #33447: [SPARK-36270][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
viirya edited a comment on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885433617 > So after this change, PRs won't pass Jenkins tests unless we increase the time out of it? Yea, this is also a concern. Let me try two changes. 1. Only set l

[GitHub] [spark] viirya commented on pull request #33447: [SPARK-36270][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
viirya commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885433617 > So after this change, PRs won't pass Jenkins tests unless we increase the time out of it? Yea, this is also a concern. Let me try two changes. 1. Only set lower me

[GitHub] [spark] dominikgehl commented on a change in pull request #33481: [SPARK-36258][PYTHON] Exposing functionExists in pyspark sql catalog

2021-07-22 Thread GitBox
dominikgehl commented on a change in pull request #33481: URL: https://github.com/apache/spark/pull/33481#discussion_r675335369 ## File path: python/pyspark/sql/catalog.py ## @@ -132,6 +132,30 @@ def listFunctions(self, dbName=None): isTemporary=jfunction.isTem

[GitHub] [spark] SparkQA commented on pull request #33490: [SPARK-35780][SQL][FOLLOW-UP] Block some invalid datetime string

2021-07-22 Thread GitBox
SparkQA commented on pull request #33490: URL: https://github.com/apache/spark/pull/33490#issuecomment-885430481 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46058/

[GitHub] [spark] SparkQA removed a comment on pull request #32776: [SPARK-35639][SQL] Add metrics about coalesced partitions to CustomShuffleReader in AQE

2021-07-22 Thread GitBox
SparkQA removed a comment on pull request #32776: URL: https://github.com/apache/spark/pull/32776#issuecomment-885351226 **[Test build #141526 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141526/testReport)** for PR 32776 at commit [`83a4082`](https://gi

[GitHub] [spark] SparkQA commented on pull request #32776: [SPARK-35639][SQL] Add metrics about coalesced partitions to CustomShuffleReader in AQE

2021-07-22 Thread GitBox
SparkQA commented on pull request #32776: URL: https://github.com/apache/spark/pull/32776#issuecomment-885430071 **[Test build #141526 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141526/testReport)** for PR 32776 at commit [`83a4082`](https://github.co

[GitHub] [spark] EnricoMi commented on a change in pull request #33484: [SPARK-36263][SQL][PYTHON] Add Dataframe.observation to PySpark

2021-07-22 Thread GitBox
EnricoMi commented on a change in pull request #33484: URL: https://github.com/apache/spark/pull/33484#discussion_r67567 ## File path: python/pyspark/sql/tests/test_dataframe.py ## @@ -389,6 +391,32 @@ def test_extended_hint_types(self): self.assertEqual(1, logical

[GitHub] [spark] EnricoMi commented on a change in pull request #33484: [SPARK-36263][SQL][PYTHON] Add Dataframe.observation to PySpark

2021-07-22 Thread GitBox
EnricoMi commented on a change in pull request #33484: URL: https://github.com/apache/spark/pull/33484#discussion_r675333083 ## File path: python/pyspark/sql/observation.py ## @@ -0,0 +1,100 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contribu

[GitHub] [spark] SparkQA commented on pull request #33485: [SPARK-36261][PYTHON] Add remove_unused_categories to CategoricalAccessor and CategoricalIndex

2021-07-22 Thread GitBox
SparkQA commented on pull request #33485: URL: https://github.com/apache/spark/pull/33485#issuecomment-885429557 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46057/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33489: [SPARK-36269][SQL] Fix only set data columns to Hive column names config

2021-07-22 Thread GitBox
AmplabJenkins removed a comment on pull request #33489: URL: https://github.com/apache/spark/pull/33489#issuecomment-885424941 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141538/

[GitHub] [spark] gengliangwang commented on pull request #33447: [SPARK-36270][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
gengliangwang commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885425699 > This patch fails from timeout after a configured wait of 500m. So after this change, PRs won't pass Jenkins tests unless we increase the time out of it?

[GitHub] [spark] HyukjinKwon commented on pull request #33447: [SPARK-36270][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
HyukjinKwon commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885425378 I prefer whichever way you want to do :-)

[GitHub] [spark] AmplabJenkins commented on pull request #33489: [SPARK-36269][SQL] Fix only set data columns to Hive column names config

2021-07-22 Thread GitBox
AmplabJenkins commented on pull request #33489: URL: https://github.com/apache/spark/pull/33489#issuecomment-885424941 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141538/

[GitHub] [spark] SparkQA removed a comment on pull request #33489: [SPARK-36269][SQL] Fix only set data columns to Hive column names config

2021-07-22 Thread GitBox
SparkQA removed a comment on pull request #33489: URL: https://github.com/apache/spark/pull/33489#issuecomment-885387251 **[Test build #141538 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141538/testReport)** for PR 33489 at commit [`6c81618`](https://gi

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-36270][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885424374 **[Test build #141541 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141541/testReport)** for PR 33447 at commit [`6c6930f`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33489: [SPARK-36269][SQL] Fix only set data columns to Hive column names config

2021-07-22 Thread GitBox
SparkQA commented on pull request #33489: URL: https://github.com/apache/spark/pull/33489#issuecomment-885424334 **[Test build #141538 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141538/testReport)** for PR 33489 at commit [`6c81618`](https://github.co

[GitHub] [spark] viirya commented on pull request #33447: [SPARK-36270][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
viirya commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885423347 `SQLMetricsSuite.SPARK-32629` can pass locally in master branch. The test looks related to the build side size of shuffle hash join. So probably memory setting can affect it.

[GitHub] [spark] viirya removed a comment on pull request #33447: [SPARK-36270][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
viirya removed a comment on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885417386 And, maven's metaspace size config is not big enough for Hive test in Jenkins. Currently I configure it in `build_and_test.yml` for GA. Maybe I need to set it back to 2g

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33460: [SPARK-36242][CORE] Ensure spill file closed before set `success = true` in `ExternalSorter.spillMemoryIteratorToDisk` method

2021-07-22 Thread GitBox
AmplabJenkins removed a comment on pull request #33460: URL: https://github.com/apache/spark/pull/33460#issuecomment-885420116 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141530/

[GitHub] [spark] SparkQA removed a comment on pull request #33447: [SPARK-36270][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
SparkQA removed a comment on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885241699 **[Test build #141520 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141520/testReport)** for PR 33447 at commit [`6488ca1`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33447: [SPARK-36270][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
AmplabJenkins removed a comment on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885420113

[GitHub] [spark] SparkQA removed a comment on pull request #33487: [SPARK-36268][PYTHON] Set the lowerbound of mypy version to 0.910

2021-07-22 Thread GitBox
SparkQA removed a comment on pull request #33487: URL: https://github.com/apache/spark/pull/33487#issuecomment-885367191

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33487: [SPARK-36268][PYTHON] Set the lowerbound of mypy version to 0.910

2021-07-22 Thread GitBox
AmplabJenkins removed a comment on pull request #33487: URL: https://github.com/apache/spark/pull/33487#issuecomment-885420111

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33484: [SPARK-36263][SQL][PYTHON] Add Dataframe.observation to PySpark

2021-07-22 Thread GitBox
AmplabJenkins removed a comment on pull request #33484: URL: https://github.com/apache/spark/pull/33484#issuecomment-885420115 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46054/

[GitHub] [spark] SparkQA removed a comment on pull request #33460: [SPARK-36242][CORE] Ensure spill file closed before set `success = true` in `ExternalSorter.spillMemoryIteratorToDisk` method

2021-07-22 Thread GitBox
SparkQA removed a comment on pull request #33460: URL: https://github.com/apache/spark/pull/33460#issuecomment-885367259 **[Test build #141530 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141530/testReport)** for PR 33460 at commit [`a57e76a`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33468: [SPARK-36247][SQL] Check string length for char/varchar and apply type coercion in UPDATE/MERGE command

2021-07-22 Thread GitBox
AmplabJenkins removed a comment on pull request #33468: URL: https://github.com/apache/spark/pull/33468#issuecomment-885420120 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46055/

[GitHub] [spark] AmplabJenkins commented on pull request #33487: [SPARK-36268][PYTHON] Set the lowerbound of mypy version to 0.910

2021-07-22 Thread GitBox
AmplabJenkins commented on pull request #33487: URL: https://github.com/apache/spark/pull/33487#issuecomment-885420111

[GitHub] [spark] AmplabJenkins commented on pull request #33484: [SPARK-36263][SQL][PYTHON] Add Dataframe.observation to PySpark

2021-07-22 Thread GitBox
AmplabJenkins commented on pull request #33484: URL: https://github.com/apache/spark/pull/33484#issuecomment-885420115 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46054/

[GitHub] [spark] AmplabJenkins commented on pull request #33447: [SPARK-36270][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
AmplabJenkins commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885420113

[GitHub] [spark] AmplabJenkins commented on pull request #33468: [SPARK-36247][SQL] Check string length for char/varchar and apply type coercion in UPDATE/MERGE command

2021-07-22 Thread GitBox
AmplabJenkins commented on pull request #33468: URL: https://github.com/apache/spark/pull/33468#issuecomment-885420120 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46055/

[GitHub] [spark] AmplabJenkins commented on pull request #33460: [SPARK-36242][CORE] Ensure spill file closed before set `success = true` in `ExternalSorter.spillMemoryIteratorToDisk` method

2021-07-22 Thread GitBox
AmplabJenkins commented on pull request #33460: URL: https://github.com/apache/spark/pull/33460#issuecomment-885420116 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141530/

[GitHub] [spark] SparkQA commented on pull request #33490: [SPARK-35780][SQL][FOLLOW-UP] Block some invalid datetime string

2021-07-22 Thread GitBox
SparkQA commented on pull request #33490: URL: https://github.com/apache/spark/pull/33490#issuecomment-885418338 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46058/

[GitHub] [spark] SparkQA commented on pull request #33460: [SPARK-36242][CORE] Ensure spill file closed before set `success = true` in `ExternalSorter.spillMemoryIteratorToDisk` method

2021-07-22 Thread GitBox
SparkQA commented on pull request #33460: URL: https://github.com/apache/spark/pull/33460#issuecomment-885417668 **[Test build #141530 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141530/testReport)** for PR 33460 at commit [`a57e76a`](https://github.co

[GitHub] [spark] viirya commented on pull request #33447: [SPARK-36270][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
viirya commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885417386 And, maven's metaspace size config is not big enough for Hive test in Jenkins. Currently I configure it in `build_and_test.yml` for GA. Maybe I need to set it back to 2g in `pom.

[GitHub] [spark] SparkQA commented on pull request #33485: [SPARK-36261][PYTHON] Add remove_unused_categories to CategoricalAccessor and CategoricalIndex

2021-07-22 Thread GitBox
SparkQA commented on pull request #33485: URL: https://github.com/apache/spark/pull/33485#issuecomment-885416561 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46057/

[GitHub] [spark] SparkQA commented on pull request #33487: [SPARK-36268][PYTHON] Set the lowerbound of mypy version to 0.910

2021-07-22 Thread GitBox
SparkQA commented on pull request #33487: URL: https://github.com/apache/spark/pull/33487#issuecomment-885416465 **[Test build #141534 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141534/testReport)** for PR 33487 at commit [`290fbab`](https://github.co

[GitHub] [spark] viirya commented on pull request #33447: [SPARK-36270][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
viirya commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885416085 > Yeah, is it not related to this PR? I think the test failure in `SQLMetricsSuite.SPARK-32629` seems new to me. This only touched memory settings of building. Sounds unlik

[GitHub] [spark] venkata91 commented on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-22 Thread GitBox
venkata91 commented on pull request #33253: URL: https://github.com/apache/spark/pull/33253#issuecomment-885415899 > > Ah I see, currently we are populating killedTasksSummary in SpeculationStageSummary only if the killed reason is another attempt succeeded. For other reasons, we are not p

[GitHub] [spark] SparkQA commented on pull request #33468: [SPARK-36247][SQL] Check string length for char/varchar and apply type coercion in UPDATE/MERGE command

2021-07-22 Thread GitBox
SparkQA commented on pull request #33468: URL: https://github.com/apache/spark/pull/33468#issuecomment-885415676 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46055/

[GitHub] [spark] SparkQA commented on pull request #33487: [SPARK-36268][PYTHON] Set the lowerbound of mypy version to 0.910

2021-07-22 Thread GitBox
SparkQA commented on pull request #33487: URL: https://github.com/apache/spark/pull/33487#issuecomment-885412734 **[Test build #141529 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141529/testReport)** for PR 33487 at commit [`220c5d6`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-88541 **[Test build #141520 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141520/testReport)** for PR 33447 at commit [`6488ca1`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #33484: [SPARK-36263][SQL][PYTHON] Add Dataframe.observation to PySpark

2021-07-22 Thread GitBox
SparkQA commented on pull request #33484: URL: https://github.com/apache/spark/pull/33484#issuecomment-885411519 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46054/

[GitHub] [spark] Ngone51 commented on a change in pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-22 Thread GitBox
Ngone51 commented on a change in pull request #33034: URL: https://github.com/apache/spark/pull/33034#discussion_r675315413 ## File path: core/src/main/scala/org/apache/spark/Dependency.scala ## @@ -122,6 +119,18 @@ class ShuffleDependency[K: ClassTag, V: ClassTag, C: ClassTag

[GitHub] [spark] SparkQA removed a comment on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
SparkQA removed a comment on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885233710 **[Test build #141518 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141518/testReport)** for PR 33447 at commit [`593f7c9`](https://gi

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885408691 **[Test build #141518 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141518/testReport)** for PR 33447 at commit [`593f7c9`](https://github.co

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33488: [WIP][SPARK-36241][SQL] Support creating tables with void column

2021-07-22 Thread GitBox
AmplabJenkins removed a comment on pull request #33488: URL: https://github.com/apache/spark/pull/33488#issuecomment-885407123 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46053/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33489: [SPARK-36269][SQL] Fix only set data columns to Hive column names config

2021-07-22 Thread GitBox
AmplabJenkins removed a comment on pull request #33489: URL: https://github.com/apache/spark/pull/33489#issuecomment-885407262 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46056/

[GitHub] [spark] AmplabJenkins commented on pull request #33489: [SPARK-36269][SQL] Fix only set data columns to Hive column names config

2021-07-22 Thread GitBox
AmplabJenkins commented on pull request #33489: URL: https://github.com/apache/spark/pull/33489#issuecomment-885407262 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46056/

[GitHub] [spark] SparkQA commented on pull request #33489: [SPARK-36269][SQL] Fix only set data columns to Hive column names config

2021-07-22 Thread GitBox
SparkQA commented on pull request #33489: URL: https://github.com/apache/spark/pull/33489#issuecomment-885407253 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46056/

[GitHub] [spark] SparkQA commented on pull request #33488: [WIP][SPARK-36241][SQL] Support creating tables with void column

2021-07-22 Thread GitBox
SparkQA commented on pull request #33488: URL: https://github.com/apache/spark/pull/33488#issuecomment-885407105 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46053/

[GitHub] [spark] AmplabJenkins commented on pull request #33488: [WIP][SPARK-36241][SQL] Support creating tables with void column

2021-07-22 Thread GitBox
AmplabJenkins commented on pull request #33488: URL: https://github.com/apache/spark/pull/33488#issuecomment-885407123 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46053/

[GitHub] [spark] sarutak commented on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-22 Thread GitBox
sarutak commented on pull request #33253: URL: https://github.com/apache/spark/pull/33253#issuecomment-885405263 > Ah I see, currently we are populating killedTasksSummary in SpeculationStageSummary only if the killed reason is another attempt succeeded. For other reasons, we are not popul

[GitHub] [spark] SparkQA commented on pull request #33468: [SPARK-36247][SQL] Check string length for char/varchar and apply type coercion in UPDATE/MERGE command

2021-07-22 Thread GitBox
SparkQA commented on pull request #33468: URL: https://github.com/apache/spark/pull/33468#issuecomment-885403605 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46055/

[GitHub] [spark] Ngone51 commented on a change in pull request #33460: [SPARK-36242][CORE] Ensure spill file closed before set `success = true` in `ExternalSorter.spillMemoryIteratorToDisk` method

2021-07-22 Thread GitBox
Ngone51 commented on a change in pull request #33460: URL: https://github.com/apache/spark/pull/33460#discussion_r675309400 ## File path: core/src/test/scala/org/apache/spark/util/collection/ExternalSorterSpillSuite.scala ## @@ -0,0 +1,148 @@ +/* + * Licensed to the Apache Sof

[GitHub] [spark] sarutak commented on a change in pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-22 Thread GitBox
sarutak commented on a change in pull request #33253: URL: https://github.com/apache/spark/pull/33253#discussion_r675309447 ## File path: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ## @@ -746,6 +752,24 @@ private[spark] class AppStatusListener(

[GitHub] [spark] SparkQA commented on pull request #33490: [SPARK-35780][SQL][FOLLOW-UP] Block some invalid datetime string

2021-07-22 Thread GitBox
SparkQA commented on pull request #33490: URL: https://github.com/apache/spark/pull/33490#issuecomment-885402485 **[Test build #141540 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141540/testReport)** for PR 33490 at commit [`4a18456`](https://github.com

[GitHub] [spark] Victsm commented on a change in pull request #33340: [SPARK-36266][SHUFFLE] Rename classes in shuffle RPC used for block push operations

2021-07-22 Thread GitBox
Victsm commented on a change in pull request #33340: URL: https://github.com/apache/spark/pull/33340#discussion_r675308898 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/BlockPushingListener.java ## @@ -25,6 +25,8 @@ * code reuse for hand

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33270: [SPARK-35956][K8S] Support auto assigning labels to decommissioning pods

2021-07-22 Thread GitBox
AmplabJenkins removed a comment on pull request #33270: URL: https://github.com/apache/spark/pull/33270#issuecomment-885401004 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46051/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33487: [SPARK-36268][PYTHON] Set the lowerbound of mypy version to 0.910

2021-07-22 Thread GitBox
AmplabJenkins removed a comment on pull request #33487: URL: https://github.com/apache/spark/pull/33487#issuecomment-885401002 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46047/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
AmplabJenkins removed a comment on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885401003 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46049/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33460: [SPARK-36242][CORE] Ensure spill file closed before set `success = true` in `ExternalSorter.spillMemoryIteratorToDisk` method

2021-07-22 Thread GitBox
AmplabJenkins removed a comment on pull request #33460: URL: https://github.com/apache/spark/pull/33460#issuecomment-885401006 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46048/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33485: [SPARK-36261][PYTHON] Add remove_unused_categories to CategoricalAccessor and CategoricalIndex

2021-07-22 Thread GitBox
AmplabJenkins removed a comment on pull request #33485: URL: https://github.com/apache/spark/pull/33485#issuecomment-885401005 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141539/

[GitHub] [spark] HyukjinKwon closed pull request #33485: [SPARK-36261][PYTHON] Add remove_unused_categories to CategoricalAccessor and CategoricalIndex

2021-07-22 Thread GitBox
HyukjinKwon closed pull request #33485: URL: https://github.com/apache/spark/pull/33485

[GitHub] [spark] SparkQA removed a comment on pull request #33485: [SPARK-36261][PYTHON] Add remove_unused_categories to CategoricalAccessor and CategoricalIndex

2021-07-22 Thread GitBox
SparkQA removed a comment on pull request #33485: URL: https://github.com/apache/spark/pull/33485#issuecomment-885388581 **[Test build #141539 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141539/testReport)** for PR 33485 at commit [`dc29a39`](https://gi

[GitHub] [spark] HyukjinKwon commented on pull request #33485: [SPARK-36261][PYTHON] Add remove_unused_categories to CategoricalAccessor and CategoricalIndex

2021-07-22 Thread GitBox
HyukjinKwon commented on pull request #33485: URL: https://github.com/apache/spark/pull/33485#issuecomment-885401282 Merged to master and branch-3.2.

[GitHub] [spark] AmplabJenkins commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
AmplabJenkins commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885401003 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46049/

[GitHub] [spark] AmplabJenkins commented on pull request #33487: [SPARK-36268][PYTHON] Set the lowerbound of mypy version to 0.910

2021-07-22 Thread GitBox
AmplabJenkins commented on pull request #33487: URL: https://github.com/apache/spark/pull/33487#issuecomment-885401002 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46047/

[GitHub] [spark] AmplabJenkins commented on pull request #33460: [SPARK-36242][CORE] Ensure spill file closed before set `success = true` in `ExternalSorter.spillMemoryIteratorToDisk` method

2021-07-22 Thread GitBox
AmplabJenkins commented on pull request #33460: URL: https://github.com/apache/spark/pull/33460#issuecomment-885401006 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46048/

[GitHub] [spark] AmplabJenkins commented on pull request #33485: [SPARK-36261][PYTHON] Add remove_unused_categories to CategoricalAccessor and CategoricalIndex

2021-07-22 Thread GitBox
AmplabJenkins commented on pull request #33485: URL: https://github.com/apache/spark/pull/33485#issuecomment-885401005 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141539/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33270: [SPARK-35956][K8S] Support auto assigning labels to decommissioning pods

2021-07-22 Thread GitBox
AmplabJenkins commented on pull request #33270: URL: https://github.com/apache/spark/pull/33270#issuecomment-885401004 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46051/ -- T

[GitHub] [spark] Victsm commented on pull request #33340: [SPARK-36266][SHUFFLE] Rename classes in shuffle RPC used for block push operations

2021-07-22 Thread GitBox
Victsm commented on pull request #33340: URL: https://github.com/apache/spark/pull/33340#issuecomment-88537 The tests seem to have been abnormally disrupted based on what I can see from the log. Not sure what's causing this. cc @mridulm -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #33484: [SPARK-36263][SQL][PYTHON] Add Dataframe.observation to PySpark

2021-07-22 Thread GitBox
SparkQA commented on pull request #33484: URL: https://github.com/apache/spark/pull/33484#issuecomment-885399911 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46054/ -- This is an automated message from the Apache

[GitHub] [spark] c21 commented on a change in pull request #33465: [SPARK-36245][SQL] Deduplicate the right side of left semi/anti join

2021-07-22 Thread GitBox
c21 commented on a change in pull request #33465: URL: https://github.com/apache/spark/pull/33465#discussion_r675305294 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/DeduplicateRightSideOfLeftSemiAntiJoin.scala ## @@ -0,0 +1,41 @@ +/* + * Li

[GitHub] [spark] sarutak commented on a change in pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-22 Thread GitBox
sarutak commented on a change in pull request #33253: URL: https://github.com/apache/spark/pull/33253#discussion_r675304991 ## File path: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala ## @@ -524,6 +529,11 @@ private[spark] class AppStatusStore( } els

[GitHub] [spark] SparkQA commented on pull request #33488: [WIP][SPARK-36241][SQL] Support creating tables with void column

2021-07-22 Thread GitBox
SparkQA commented on pull request #33488: URL: https://github.com/apache/spark/pull/33488#issuecomment-885397080 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46053/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
SparkQA commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885396871 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46049/ -- This is an automated message from the A

[GitHub] [spark] SparkQA commented on pull request #33485: [SPARK-36261][PYTHON] Add remove_unused_categories to CategoricalAccessor and CategoricalIndex

2021-07-22 Thread GitBox
SparkQA commented on pull request #33485: URL: https://github.com/apache/spark/pull/33485#issuecomment-885396700 **[Test build #141539 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141539/testReport)** for PR 33485 at commit [`dc29a39`](https://github.co

[GitHub] [spark] linhongliu-db commented on a change in pull request #33490: [SPARK-35780][SQL][FOLLOW-UP] Block some invalid datetime string

2021-07-22 Thread GitBox
linhongliu-db commented on a change in pull request #33490: URL: https://github.com/apache/spark/pull/33490#discussion_r675302316 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/util/DateTimeUtilsSuite.scala ## @@ -289,6 +290,11 @@ class DateTimeUtilsSu

[GitHub] [spark] c21 commented on pull request #33489: [SPARK-36269][SQL] Fix only set data columns to Hive column names config

2021-07-22 Thread GitBox
c21 commented on pull request #33489: URL: https://github.com/apache/spark/pull/33489#issuecomment-885393499 @cloud-fan could you help take a look when you have time? Thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub an

[GitHub] [spark] linhongliu-db opened a new pull request #33490: [SPARK-35780][SQL][FOLLOW-UP] Block some invalid datetime string

2021-07-22 Thread GitBox
linhongliu-db opened a new pull request #33490: URL: https://github.com/apache/spark/pull/33490 ### What changes were proposed in this pull request? In PR #32959, we found some weird datetime strings that can be parsed. ([details](https://github.com/apache/spark/pull/32959#discussion_r66

[GitHub] [spark] SparkQA commented on pull request #33487: [SPARK-36268][PYTHON] Set the lowerbound of mypy version to 0.910

2021-07-22 Thread GitBox
SparkQA commented on pull request #33487: URL: https://github.com/apache/spark/pull/33487#issuecomment-885393140 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46047/ -- This is an automated message from the A

[GitHub] [spark] SparkQA commented on pull request #33460: [SPARK-36242][CORE] Ensure spill file closed before set `success = true` in `ExternalSorter.spillMemoryIteratorToDisk` method

2021-07-22 Thread GitBox
SparkQA commented on pull request #33460: URL: https://github.com/apache/spark/pull/33460#issuecomment-885392575 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46048/ -- This is an automated message from the A

[GitHub] [spark] SparkQA commented on pull request #33270: [SPARK-35956][K8S] Support auto assigning labels to decommissioning pods

2021-07-22 Thread GitBox
SparkQA commented on pull request #33270: URL: https://github.com/apache/spark/pull/33270#issuecomment-885391389 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46051/ -- This is an automated message from the A

[GitHub] [spark] venkata91 commented on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-22 Thread GitBox
venkata91 commented on pull request #33253: URL: https://github.com/apache/spark/pull/33253#issuecomment-885391140 > > killedTaskSummary when original attempt succeeded - https://user-images.githubusercontent.com/8871522/126575315-5e766870-eab4-4afc-a02f-ede63148692b.png > > killedTaskSu

[GitHub] [spark] HyukjinKwon commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
HyukjinKwon commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885391058 ohh can you file a JIRA please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[GitHub] [spark] HyukjinKwon commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
HyukjinKwon commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885390958 Actually, let me just merge this in first. Let's see if it fails consistently. -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] HyukjinKwon commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
HyukjinKwon commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885390894 Yeah, is it not related to this PR? I think the test failure in `SQLMetricsSuite.SPARK-32629` seems new to me. -- This is an automated message from the Apache Git Serv

[GitHub] [spark] sarutak commented on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-22 Thread GitBox
sarutak commented on pull request #33253: URL: https://github.com/apache/spark/pull/33253#issuecomment-885389696 > killedTaskSummary when original attempt succeeded - https://user-images.githubusercontent.com/8871522/126575315-5e766870-eab4-4afc-a02f-ede63148692b.png killedTaskSummary wh

[GitHub] [spark] SparkQA commented on pull request #33485: [SPARK-36261][PYTHON] Add remove_unused_categories to CategoricalAccessor and CategoricalIndex

2021-07-22 Thread GitBox
SparkQA commented on pull request #33485: URL: https://github.com/apache/spark/pull/33485#issuecomment-885388581 **[Test build #141539 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141539/testReport)** for PR 33485 at commit [`dc29a39`](https://github.com

[GitHub] [spark] ueshin commented on pull request #33485: [SPARK-36261][PYTHON] Add remove_unused_categories to CategoricalAccessor and CategoricalIndex

2021-07-22 Thread GitBox
ueshin commented on pull request #33485: URL: https://github.com/apache/spark/pull/33485#issuecomment-885388366 Jenkins, retest this please. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the s

[GitHub] [spark] venkata91 edited a comment on pull request #33426: [SPARK-32920][FOLLOW-UP] Fix shuffleMergeFinalized directly calling rdd.getNumPartitions as RDD is not serialized to executor

2021-07-22 Thread GitBox
venkata91 edited a comment on pull request #33426: URL: https://github.com/apache/spark/pull/33426#issuecomment-885387767 > ``` > [error] /home/runner/work/spark/spark/core/src/main/scala/org/apache/spark/shuffle/ShuffleWriteProcessor.scala:24:30: Unused import > [error] import org.a

[GitHub] [spark] venkata91 commented on pull request #33426: [SPARK-32920][FOLLOW-UP] Fix shuffleMergeFinalized directly calling rdd.getNumPartitions as RDD is not serialized to executor

2021-07-22 Thread GitBox
venkata91 commented on pull request #33426: URL: https://github.com/apache/spark/pull/33426#issuecomment-885387767 > ``` > [error] /home/runner/work/spark/spark/core/src/main/scala/org/apache/spark/shuffle/ShuffleWriteProcessor.scala:24:30: Unused import > [error] import org.apache.s

[GitHub] [spark] viirya commented on pull request #33447: [SPARK-xxxxx][BUILD] Change memory settings for enabling GA

2021-07-22 Thread GitBox
viirya commented on pull request #33447: URL: https://github.com/apache/spark/pull/33447#issuecomment-885387520 > Can we skip/ignore the test `SPARK-32629` for now? Otherwise looks pretty good Ignore the test in this PR? -- This is an automated message from the Apache Git Service.

[GitHub] [spark] SparkQA commented on pull request #33489: [SPARK-36269][SQL] Fix only set data columns to Hive column names config

2021-07-22 Thread GitBox
SparkQA commented on pull request #33489: URL: https://github.com/apache/spark/pull/33489#issuecomment-885387251 **[Test build #141538 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141538/testReport)** for PR 33489 at commit [`6c81618`](https://github.com

[GitHub] [spark] venkata91 commented on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-22 Thread GitBox
venkata91 commented on pull request #33253: URL: https://github.com/apache/spark/pull/33253#issuecomment-885386967 @sarutak Gentle ping. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the speci
