[GitHub] [spark] cloud-fan commented on a change in pull request #34575: [SPARK-37273][SQL] Support hidden file metadata columns in Spark SQL

2021-12-21 Thread GitBox
cloud-fan commented on a change in pull request #34575: URL: https://github.com/apache/spark/pull/34575#discussion_r773672644 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormat.scala ## @@ -171,6 +171,29 @@ trait FileFormat { def sup

[GitHub] [spark] cloud-fan commented on a change in pull request #34575: [SPARK-37273][SQL] Support hidden file metadata columns in Spark SQL

2021-12-21 Thread GitBox
cloud-fan commented on a change in pull request #34575: URL: https://github.com/apache/spark/pull/34575#discussion_r773672049 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala ## @@ -103,6 +115,101 @@ class FileScanRDD(

[GitHub] [spark] SparkQA commented on pull request #34982: [SPARK-37712][YARN] Spark request yarn cluster metrics slow cause delay

2021-12-21 Thread GitBox
SparkQA commented on pull request #34982: URL: https://github.com/apache/spark/pull/34982#issuecomment-999360090 **[Test build #146473 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146473/testReport)** for PR 34982 at commit [`bafb1ce`](https://github.com

[GitHub] [spark] AngersZhuuuu commented on pull request #34982: [SPARK-37712][YARN] Spark request yarn cluster metrics slow cause delay

2021-12-21 Thread GitBox
AngersZhuuuu commented on pull request #34982: URL: https://github.com/apache/spark/pull/34982#issuecomment-99936 ping @tgravescs @mridulm -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to th

[GitHub] [spark] AngersZhuuuu opened a new pull request #34982: [SPARK-37712][YARN] Spark request yarn cluster metrics slow cause delay

2021-12-21 Thread GitBox
AngersZhuuuu opened a new pull request #34982: URL: https://github.com/apache/spark/pull/34982 ### What changes were proposed in this pull request? Spark will request yarn cluster metrics and print a log about nodemanager number, it's not so important and this rpc is always slow ![im

[GitHub] [spark] SparkQA commented on pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
SparkQA commented on pull request #34931: URL: https://github.com/apache/spark/pull/34931#issuecomment-999358047 **[Test build #146472 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146472/testReport)** for PR 34931 at commit [`b883ffe`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #34965: [SPARK-37700][CORE][TEST] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
SparkQA commented on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999357979 **[Test build #146471 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146471/testReport)** for PR 34965 at commit [`d951d5d`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34965: [SPARK-37700][CORE][TEST] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999356498 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34931: URL: https://github.com/apache/spark/pull/34931#issuecomment-999356493 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146466/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999356495 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50939/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34976: [SPARK-37707][SQL] Allow store assignment and implicit cast among datetime types

2021-12-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34976: URL: https://github.com/apache/spark/pull/34976#issuecomment-999356496 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50938/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34970: [DO NOT MERGE] investigate test failures if we test ANSI mode in github actions

2021-12-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34970: URL: https://github.com/apache/spark/pull/34970#issuecomment-999356494 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146469/ -

[GitHub] [spark] c21 commented on a change in pull request #34575: [SPARK-37273][SQL] Support hidden file metadata columns in Spark SQL

2021-12-21 Thread GitBox
c21 commented on a change in pull request #34575: URL: https://github.com/apache/spark/pull/34575#discussion_r773663087 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala ## @@ -103,6 +115,101 @@ class FileScanRDD( conte

[GitHub] [spark] AmplabJenkins commented on pull request #34965: [SPARK-37700][CORE][TEST] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
AmplabJenkins commented on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999356498 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [spark] AmplabJenkins commented on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
AmplabJenkins commented on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999356495 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50939/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #34979: Docker integration tests: Tweak docs and remove unneeded dependency

2021-12-21 Thread GitBox
AmplabJenkins commented on pull request #34979: URL: https://github.com/apache/spark/pull/34979#issuecomment-999356511 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [spark] AmplabJenkins commented on pull request #34976: [SPARK-37707][SQL] Allow store assignment and implicit cast among datetime types

2021-12-21 Thread GitBox
AmplabJenkins commented on pull request #34976: URL: https://github.com/apache/spark/pull/34976#issuecomment-999356496 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50938/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
AmplabJenkins commented on pull request #34931: URL: https://github.com/apache/spark/pull/34931#issuecomment-999356493 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146466/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #34970: [DO NOT MERGE] investigate test failures if we test ANSI mode in github actions

2021-12-21 Thread GitBox
AmplabJenkins commented on pull request #34970: URL: https://github.com/apache/spark/pull/34970#issuecomment-999356494 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146469/ -- This

[GitHub] [spark] SparkQA commented on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-12-21 Thread GitBox
SparkQA commented on pull request #32875: URL: https://github.com/apache/spark/pull/32875#issuecomment-999355248 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50945/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34976: [SPARK-37707][SQL] Allow store assignment and implicit cast among datetime types

2021-12-21 Thread GitBox
SparkQA commented on pull request #34976: URL: https://github.com/apache/spark/pull/34976#issuecomment-999355027 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50938/ -- This is an automated message from the A

[GitHub] [spark] cloud-fan commented on a change in pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-12-21 Thread GitBox
cloud-fan commented on a change in pull request #32875: URL: https://github.com/apache/spark/pull/32875#discussion_r773663230 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/exchange/EnsureRequirementsSuite.scala ## @@ -135,4 +137,489 @@ class EnsureRequir

[GitHub] [spark] SparkQA commented on pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
SparkQA commented on pull request #34931: URL: https://github.com/apache/spark/pull/34931#issuecomment-999351715 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50944/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34970: [DO NOT MERGE] investigate test failures if we test ANSI mode in github actions

2021-12-21 Thread GitBox
SparkQA commented on pull request #34970: URL: https://github.com/apache/spark/pull/34970#issuecomment-999351318 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50943/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-12-21 Thread GitBox
SparkQA commented on pull request #32875: URL: https://github.com/apache/spark/pull/32875#issuecomment-999350573 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50942/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
SparkQA commented on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999348942 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50939/ -- This is an automated message from the A

[GitHub] [spark] itholic commented on a change in pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
itholic commented on a change in pull request #34931: URL: https://github.com/apache/spark/pull/34931#discussion_r773657795 ## File path: python/pyspark/pandas/frame.py ## @@ -8828,22 +8847,138 @@ def describe(self, percentiles: Optional[List[float]] = None) -> "DataFrame":

[GitHub] [spark] SparkQA commented on pull request #34965: [SPARK-37700][CORE][TEST] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
SparkQA commented on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999347717 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50940/ -- This is an automated message from the A

[GitHub] [spark] SparkQA removed a comment on pull request #34970: [DO NOT MERGE] investigate test failures if we test ANSI mode in github actions

2021-12-21 Thread GitBox
SparkQA removed a comment on pull request #34970: URL: https://github.com/apache/spark/pull/34970#issuecomment-999330103 **[Test build #146469 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146469/testReport)** for PR 34970 at commit [`7d7ab2d`](https://gi

[GitHub] [spark] SparkQA commented on pull request #34970: [DO NOT MERGE] investigate test failures if we test ANSI mode in github actions

2021-12-21 Thread GitBox
SparkQA commented on pull request #34970: URL: https://github.com/apache/spark/pull/34970#issuecomment-999344378 **[Test build #146469 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146469/testReport)** for PR 34970 at commit [`7d7ab2d`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
SparkQA removed a comment on pull request #34931: URL: https://github.com/apache/spark/pull/34931#issuecomment-999328440 **[Test build #146466 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146466/testReport)** for PR 34931 at commit [`440fb56`](https://gi

[GitHub] [spark] SparkQA commented on pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
SparkQA commented on pull request #34931: URL: https://github.com/apache/spark/pull/34931#issuecomment-999341090 **[Test build #146466 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146466/testReport)** for PR 34931 at commit [`440fb56`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #34981: [SPARK-35437][SQL][FOLLOWUP] Relax cast if does not need timezone with PrunePartitionsFastFallback

2021-12-21 Thread GitBox
SparkQA commented on pull request #34981: URL: https://github.com/apache/spark/pull/34981#issuecomment-999335434 **[Test build #146470 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146470/testReport)** for PR 34981 at commit [`3a70db0`](https://github.com

[GitHub] [spark] itholic commented on a change in pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
itholic commented on a change in pull request #34931: URL: https://github.com/apache/spark/pull/34931#discussion_r773645465 ## File path: python/pyspark/pandas/frame.py ## @@ -8828,22 +8847,138 @@ def describe(self, percentiles: Optional[List[float]] = None) -> "DataFrame":

[GitHub] [spark] ulysses-you opened a new pull request #34981: [SPARK-35437][SQL][FOLLOWUP] Relax cast if does not need timezone with PrunePartitionsFastFallback

2021-12-21 Thread GitBox
ulysses-you opened a new pull request #34981: URL: https://github.com/apache/spark/pull/34981 ### What changes were proposed in this pull request? Allow `Cast` during `prunePartitionsFastFallback` if it actually does not require the timezone. ### Why are the changes ne

[GitHub] [spark] itholic commented on a change in pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
itholic commented on a change in pull request #34931: URL: https://github.com/apache/spark/pull/34931#discussion_r773645465 ## File path: python/pyspark/pandas/frame.py ## @@ -8828,22 +8847,138 @@ def describe(self, percentiles: Optional[List[float]] = None) -> "DataFrame":

[GitHub] [spark] SparkQA commented on pull request #34970: [DO NOT MERGE] investigate test failures if we test ANSI mode in github actions

2021-12-21 Thread GitBox
SparkQA commented on pull request #34970: URL: https://github.com/apache/spark/pull/34970#issuecomment-999330103 **[Test build #146469 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146469/testReport)** for PR 34970 at commit [`7d7ab2d`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
SparkQA commented on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999330036 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50941/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-12-21 Thread GitBox
SparkQA commented on pull request #32875: URL: https://github.com/apache/spark/pull/32875#issuecomment-999329169 **[Test build #146468 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146468/testReport)** for PR 32875 at commit [`a6f4a89`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
SparkQA commented on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999328531 **[Test build #146467 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146467/testReport)** for PR 34904 at commit [`4575c71`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
SparkQA commented on pull request #34931: URL: https://github.com/apache/spark/pull/34931#issuecomment-999328440 **[Test build #146466 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146466/testReport)** for PR 34931 at commit [`440fb56`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34976: [SPARK-37707][SQL] Allow store assignment and implicit cast among datetime types

2021-12-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34976: URL: https://github.com/apache/spark/pull/34976#issuecomment-999327773 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50937/

[GitHub] [spark] AmplabJenkins commented on pull request #34976: [SPARK-37707][SQL] Allow store assignment and implicit cast among datetime types

2021-12-21 Thread GitBox
AmplabJenkins commented on pull request #34976: URL: https://github.com/apache/spark/pull/34976#issuecomment-999327773 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50937/ -- T

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
HyukjinKwon commented on a change in pull request #34931: URL: https://github.com/apache/spark/pull/34931#discussion_r773637338 ## File path: python/pyspark/pandas/frame.py ## @@ -8828,22 +8847,138 @@ def describe(self, percentiles: Optional[List[float]] = None) -> "DataFrame"

[GitHub] [spark] SparkQA commented on pull request #34965: [SPARK-37700][CORE][TEST] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
SparkQA commented on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999326788 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50940/ -- This is an automated message from the Apache

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
HyukjinKwon commented on a change in pull request #34931: URL: https://github.com/apache/spark/pull/34931#discussion_r773636875 ## File path: python/pyspark/pandas/frame.py ## @@ -8809,16 +8814,30 @@ def describe(self, percentiles: Optional[List[float]] = None) -> "DataFrame":

[GitHub] [spark] SparkQA commented on pull request #34976: [SPARK-37707][SQL] Allow store assignment and implicit cast among datetime types

2021-12-21 Thread GitBox
SparkQA commented on pull request #34976: URL: https://github.com/apache/spark/pull/34976#issuecomment-999323800 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50938/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34976: [SPARK-37707][SQL] Allow store assignment and implicit cast among datetime types

2021-12-21 Thread GitBox
SparkQA commented on pull request #34976: URL: https://github.com/apache/spark/pull/34976#issuecomment-999323571 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50937/ -- This is an automated message from the A

[GitHub] [spark] SparkQA commented on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
SparkQA commented on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999323083 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50939/ -- This is an automated message from the Apache

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
HyukjinKwon commented on a change in pull request #34931: URL: https://github.com/apache/spark/pull/34931#discussion_r773632275 ## File path: python/pyspark/pandas/frame.py ## @@ -8828,22 +8847,138 @@ def describe(self, percentiles: Optional[List[float]] = None) -> "DataFrame"

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
HyukjinKwon commented on a change in pull request #34931: URL: https://github.com/apache/spark/pull/34931#discussion_r773631933 ## File path: python/pyspark/pandas/frame.py ## @@ -8828,22 +8847,138 @@ def describe(self, percentiles: Optional[List[float]] = None) -> "DataFrame"

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
HyukjinKwon commented on a change in pull request #34931: URL: https://github.com/apache/spark/pull/34931#discussion_r773631706 ## File path: python/pyspark/pandas/frame.py ## @@ -8828,22 +8847,138 @@ def describe(self, percentiles: Optional[List[float]] = None) -> "DataFrame"

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
HyukjinKwon commented on a change in pull request #34931: URL: https://github.com/apache/spark/pull/34931#discussion_r773631455 ## File path: python/pyspark/pandas/frame.py ## @@ -8828,22 +8847,138 @@ def describe(self, percentiles: Optional[List[float]] = None) -> "DataFrame"

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
HyukjinKwon commented on a change in pull request #34931: URL: https://github.com/apache/spark/pull/34931#discussion_r773630902 ## File path: python/pyspark/pandas/frame.py ## @@ -8828,22 +8847,138 @@ def describe(self, percentiles: Optional[List[float]] = None) -> "DataFrame"

[GitHub] [spark] cloud-fan commented on pull request #34866: [SPARK-27974][SQL] Support ANSI Aggregate Function: array_agg

2021-12-21 Thread GitBox
cloud-fan commented on pull request #34866: URL: https://github.com/apache/spark/pull/34866#issuecomment-999318984 > Does this PR introduce any user-facing change? > 'No'. New feature. Please be more careful when writing PR descriptions. How can this be not user-facing? -- This

[GitHub] [spark] viirya commented on pull request #34965: [SPARK-37700][CORE][TEST][test-maven] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
viirya commented on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999315639 I checked the Maven test on Jenkins; the `hive-thriftserver` module passed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] viirya commented on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-12-21 Thread GitBox
viirya commented on pull request #32875: URL: https://github.com/apache/spark/pull/32875#issuecomment-999312380 retest this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

[GitHub] [spark] viirya commented on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-12-21 Thread GitBox
viirya commented on pull request #32875: URL: https://github.com/apache/spark/pull/32875#issuecomment-999310092 I think it is because the log4j log event is mutable, which makes a few tests a bit flaky. It rarely happens, but I'm addressing it in #34965. -- This is an automated message fr

[GitHub] [spark] SparkQA commented on pull request #34965: [SPARK-37700][CORE][TEST][test-maven] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
SparkQA commented on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999308487 **[Test build #146465 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146465/testReport)** for PR 34965 at commit [`2efd418`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999304786 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146464/ -

[GitHub] [spark] SparkQA removed a comment on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
SparkQA removed a comment on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999304275 **[Test build #146464 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146464/testReport)** for PR 34904 at commit [`448bc7f`](https://gi

[GitHub] [spark] SparkQA commented on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
SparkQA commented on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999304771 **[Test build #146464 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146464/testReport)** for PR 34904 at commit [`448bc7f`](https://github.co

[GitHub] [spark] AmplabJenkins commented on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
AmplabJenkins commented on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999304786 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146464/ -- This

[GitHub] [spark] SparkQA commented on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
SparkQA commented on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999304275 **[Test build #146464 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146464/testReport)** for PR 34904 at commit [`448bc7f`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #34976: [SPARK-37707][SQL] Allow store assignment and implicit cast among datetime types

2021-12-21 Thread GitBox
SparkQA commented on pull request #34976: URL: https://github.com/apache/spark/pull/34976#issuecomment-999304200 **[Test build #146463 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146463/testReport)** for PR 34976 at commit [`1add49d`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34965: [SPARK-37700][CORE][TEST][test-maven] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999303958 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50936/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34976: [SPARK-37707][SQL] Allow store assignment and implicit cast among datetime types

2021-12-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34976: URL: https://github.com/apache/spark/pull/34976#issuecomment-999303957 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146462/ -

[GitHub] [spark] AmplabJenkins commented on pull request #34965: [SPARK-37700][CORE][TEST][test-maven] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
AmplabJenkins commented on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999303958 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50936/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #34976: [SPARK-37707][SQL] Allow store assignment and implicit cast among datetime types

2021-12-21 Thread GitBox
AmplabJenkins commented on pull request #34976: URL: https://github.com/apache/spark/pull/34976#issuecomment-999303957 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146462/ -- This

[GitHub] [spark] SparkQA commented on pull request #34965: [SPARK-37700][CORE][TEST][test-maven] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
SparkQA commented on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999302371 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50936/ -- This is an automated message from the A

[GitHub] [spark] SparkQA commented on pull request #34976: [SPARK-37707][SQL] Allow store assignment and implicit cast among datetime types

2021-12-21 Thread GitBox
SparkQA commented on pull request #34976: URL: https://github.com/apache/spark/pull/34976#issuecomment-999299386 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50937/ -- This is an automated message from the Apache

[GitHub] [spark] gengliangwang commented on a change in pull request #34976: [SPARK-37707][SQL] Allow store assignment between TimestampNTZ and Date/Timestamp

2021-12-21 Thread GitBox
gengliangwang commented on a change in pull request #34976: URL: https://github.com/apache/spark/pull/34976#discussion_r773609089 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ## @@ -966,9 +966,7 @@ object TypeCoercion exte

[GitHub] [spark] cloud-fan commented on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-12-21 Thread GitBox
cloud-fan commented on pull request #32875: URL: https://github.com/apache/spark/pull/32875#issuecomment-999297491 @sunchao seems a real test failure? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

[GitHub] [spark] SparkQA removed a comment on pull request #34976: [SPARK-37707][SQL] Allow store assignment between TimestampNTZ and Date/Timestamp

2021-12-21 Thread GitBox
SparkQA removed a comment on pull request #34976: URL: https://github.com/apache/spark/pull/34976#issuecomment-999284771 **[Test build #146462 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146462/testReport)** for PR 34976 at commit [`03bc6f8`](https://gi

[GitHub] [spark] SparkQA commented on pull request #34976: [SPARK-37707][SQL] Allow store assignment between TimestampNTZ and Date/Timestamp

2021-12-21 Thread GitBox
SparkQA commented on pull request #34976: URL: https://github.com/apache/spark/pull/34976#issuecomment-999293511 **[Test build #146462 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146462/testReport)** for PR 34976 at commit [`03bc6f8`](https://github.co

[GitHub] [spark] viirya commented on pull request #34965: [SPARK-37700][CORE][TEST][test-maven] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
viirya commented on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999288711 Ok, let me fix these together. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] LuciferYang commented on pull request #34965: [SPARK-37700][CORE][TEST][test-maven] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
LuciferYang commented on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999287098 Because this PR can also contain other `some improvements`, can the following places be fixed in this PR? 1. https://github.com/apache/spark/blob/master/core/src/mai

[GitHub] [spark] cloud-fan commented on a change in pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
cloud-fan commented on a change in pull request #34904: URL: https://github.com/apache/spark/pull/34904#discussion_r773599203 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala ## @@ -189,6 +207,13 @@ object V2ScanR

[GitHub] [spark] cloud-fan commented on a change in pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
cloud-fan commented on a change in pull request #34904: URL: https://github.com/apache/spark/pull/34904#discussion_r773599024 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala ## @@ -147,40 +148,57 @@ object V2Scan

[GitHub] [spark] SparkQA commented on pull request #34976: [SPARK-37707][SQL] Allow store assignment between TimestampNTZ and Date/Timestamp

2021-12-21 Thread GitBox
SparkQA commented on pull request #34976: URL: https://github.com/apache/spark/pull/34976#issuecomment-999284771 **[Test build #146462 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146462/testReport)** for PR 34976 at commit [`03bc6f8`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34965: [SPARK-37700][CORE][TEST][test-maven] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999283578 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146453/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999283579 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] AmplabJenkins commented on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
AmplabJenkins commented on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999283580 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [spark] AmplabJenkins commented on pull request #34965: [SPARK-37700][CORE][TEST][test-maven] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
AmplabJenkins commented on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999283578 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146453/ -- This

[GitHub] [spark] cloud-fan commented on a change in pull request #34729: [SPARK-37475][SQL] Add scale parameter to floor and ceil functions

2021-12-21 Thread GitBox
cloud-fan commented on a change in pull request #34729: URL: https://github.com/apache/spark/pull/34729#discussion_r773595988 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala ## @@ -249,9 +249,9 @@ case class Cbrt(child:

[GitHub] [spark] cloud-fan commented on a change in pull request #34976: [SPARK-37707][SQL] Allow store assignment between TimestampNTZ and Date/Timestamp

2021-12-21 Thread GitBox
cloud-fan commented on a change in pull request #34976: URL: https://github.com/apache/spark/pull/34976#discussion_r773593434 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ## @@ -966,9 +966,7 @@ object TypeCoercion extends

[GitHub] [spark] viirya commented on pull request #34965: [SPARK-37700][CORE][TEST][test-maven] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
viirya commented on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999279604 > @viirya thanks for great work on the log4j issues 👍 are there any plans to backport these patches to branch 3.1 and 3.2? Hmm, as I know, log4j 1.x is not directly impacte

[GitHub] [spark] SparkQA commented on pull request #34965: [SPARK-37700][CORE][TEST][test-maven] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
SparkQA commented on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999278823 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50936/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA removed a comment on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
SparkQA removed a comment on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999223617 **[Test build #146460 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146460/testReport)** for PR 34904 at commit [`ee36dbb`](https://gi

[GitHub] [spark] SparkQA commented on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
SparkQA commented on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999275911 **[Test build #146460 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146460/testReport)** for PR 34904 at commit [`ee36dbb`](https://github.co

[GitHub] [spark] gengliangwang commented on a change in pull request #34976: [SPARK-37707][SQL] Allow store assignment between TimestampNTZ and Date/Timestamp

2021-12-21 Thread GitBox
gengliangwang commented on a change in pull request #34976: URL: https://github.com/apache/spark/pull/34976#discussion_r773589688 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ## @@ -192,31 +192,33 @@ object Cast { * In prac

[GitHub] [spark] SparkQA commented on pull request #34904: [SPARK-37644][SQL] Support datasource v2 complete aggregate pushdown

2021-12-21 Thread GitBox
SparkQA commented on pull request #34904: URL: https://github.com/apache/spark/pull/34904#issuecomment-999271490 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50935/ -- This is an automated message from the A

[GitHub] [spark] HeartSaVioR commented on a change in pull request #34942: [SPARK-37680][CORE] Support RocksDB backend in Spark History Server

2021-12-21 Thread GitBox
HeartSaVioR commented on a change in pull request #34942: URL: https://github.com/apache/spark/pull/34942#discussion_r773582447 ## File path: core/src/main/scala/org/apache/spark/internal/config/History.scala ## @@ -211,4 +211,11 @@ private[spark] object History { .version

[GitHub] [spark] SparkQA commented on pull request #34965: [SPARK-37700][CORE][TEST][test-maven] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
SparkQA commented on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999264148 **[Test build #146461 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/146461/testReport)** for PR 34965 at commit [`374669e`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34965: [SPARK-37700][CORE][TEST][test-maven] Add LoggingSuite and some improvements

2021-12-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34965: URL: https://github.com/apache/spark/pull/34965#issuecomment-999263268 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50933/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
AmplabJenkins removed a comment on pull request #34931: URL: https://github.com/apache/spark/pull/34931#issuecomment-999263347 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50934/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-12-21 Thread GitBox
AmplabJenkins removed a comment on pull request #32875: URL: https://github.com/apache/spark/pull/32875#issuecomment-999263269 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146457/ -

[GitHub] [spark] AmplabJenkins commented on pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
AmplabJenkins commented on pull request #34931: URL: https://github.com/apache/spark/pull/34931#issuecomment-999263347 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50934/ -- T

[GitHub] [spark] SparkQA commented on pull request #34931: [SPARK-37657][PYTHON] Support str and timestamp for (Series|DataFrame).describe()

2021-12-21 Thread GitBox
SparkQA commented on pull request #34931: URL: https://github.com/apache/spark/pull/34931#issuecomment-999263332 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50934/ -- This is an automated message from the A

[GitHub] [spark] AmplabJenkins commented on pull request #32875: [SPARK-35703][SQL] Relax constraint for bucket join and remove HashClusteredDistribution

2021-12-21 Thread GitBox
AmplabJenkins commented on pull request #32875: URL: https://github.com/apache/spark/pull/32875#issuecomment-999263269 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/146457/ -- This
