[GitHub] [spark] viirya commented on a change in pull request #32332: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

2021-07-26 Thread GitBox
viirya commented on a change in pull request #32332: URL: https://github.com/apache/spark/pull/32332#discussion_r676968916 ## File path: python/pyspark/sql/session.py ## @@ -697,12 +712,19 @@ def prepare(obj): verify_func(obj) return obj,

[GitHub] [spark] SparkQA commented on pull request #33466: [SPARK-36143][PYTHON] Adjust `astype` of fractional Series with missing values to follow pandas

2021-07-26 Thread GitBox
SparkQA commented on pull request #33466: URL: https://github.com/apache/spark/pull/33466#issuecomment-887048551 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46175/ -- This is an automated message from the

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33498: [SPARK-36275][SQL] ResolveAggregateFunctions should works with nested fields

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33498: URL: https://github.com/apache/spark/pull/33498#issuecomment-887043984 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46174/

[GitHub] [spark] AmplabJenkins commented on pull request #33498: [SPARK-36275][SQL] ResolveAggregateFunctions should works with nested fields

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33498: URL: https://github.com/apache/spark/pull/33498#issuecomment-887043984 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46174/ --

[GitHub] [spark] SparkQA commented on pull request #33498: [SPARK-36275][SQL] ResolveAggregateFunctions should works with nested fields

2021-07-26 Thread GitBox
SparkQA commented on pull request #33498: URL: https://github.com/apache/spark/pull/33498#issuecomment-887043958 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46174/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #33526: [SPARK-34952][SQL][FOLLOW-UP] DSv2 aggregate push down: update doc

2021-07-26 Thread GitBox
SparkQA commented on pull request #33526: URL: https://github.com/apache/spark/pull/33526#issuecomment-887038531 **[Test build #141663 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141663/testReport)** for PR 33526 at commit

[GitHub] [spark] huaxingao commented on pull request #33526: [SPARK-34952][SQL][FOLLOW-UP] DSv2 aggregate push down: update doc

2021-07-26 Thread GitBox
huaxingao commented on pull request #33526: URL: https://github.com/apache/spark/pull/33526#issuecomment-887038330 cc @cloud-fan @viirya -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] huaxingao opened a new pull request #33526: [SPARK-34952][SQL][FOLLOW-UP] DSv2 aggregate push down: update doc

2021-07-26 Thread GitBox
huaxingao opened a new pull request #33526: URL: https://github.com/apache/spark/pull/33526 ### What changes were proposed in this pull request? Update the Java doc and JDBC data source doc. ### Why are the changes needed? Update the docs to reflect the new changes. ### Does

[GitHub] [spark] SparkQA commented on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
SparkQA commented on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-887037243 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46176/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
SparkQA commented on pull request #33034: URL: https://github.com/apache/spark/pull/33034#issuecomment-887037038 **[Test build #141662 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141662/testReport)** for PR 33034 at commit

[GitHub] [spark] SparkQA commented on pull request #33425: [SPARK-32919][FOLLOW-UP] Filter out driver in the merger locations and fix the return type of RemoveShufflePushMergerLocations

2021-07-26 Thread GitBox
SparkQA commented on pull request #33425: URL: https://github.com/apache/spark/pull/33425#issuecomment-887036793 **[Test build #141661 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141661/testReport)** for PR 33425 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33466: [SPARK-36143][PYTHON] Adjust `astype` of fractional Series with missing values to follow pandas

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33466: URL: https://github.com/apache/spark/pull/33466#issuecomment-887034375 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141659/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-887034374 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141660/

[GitHub] [spark] AmplabJenkins commented on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-887034374 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141660/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33466: [SPARK-36143][PYTHON] Adjust `astype` of fractional Series with missing values to follow pandas

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33466: URL: https://github.com/apache/spark/pull/33466#issuecomment-887034375 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141659/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
SparkQA removed a comment on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-887009598 **[Test build #141660 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141660/testReport)** for PR 33506 at commit

[GitHub] [spark] SparkQA commented on pull request #33466: [SPARK-36143][PYTHON] Adjust `astype` of fractional Series with missing values to follow pandas

2021-07-26 Thread GitBox
SparkQA commented on pull request #33466: URL: https://github.com/apache/spark/pull/33466#issuecomment-887029839 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46175/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
SparkQA commented on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-887025911 **[Test build #141660 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141660/testReport)** for PR 33506 at commit

[GitHub] [spark] SparkQA commented on pull request #33498: [SPARK-36275][SQL] ResolveAggregateFunctions should works with nested fields

2021-07-26 Thread GitBox
SparkQA commented on pull request #33498: URL: https://github.com/apache/spark/pull/33498#issuecomment-887024626 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46174/ -- This is an automated message from the Apache

[GitHub] [spark] venkata91 commented on a change in pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
venkata91 commented on a change in pull request #33034: URL: https://github.com/apache/spark/pull/33034#discussion_r676934985 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -408,14 +507,24 @@ public

[GitHub] [spark] venkata91 commented on a change in pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
venkata91 commented on a change in pull request #33034: URL: https://github.com/apache/spark/pull/33034#discussion_r676936951 ## File path: core/src/main/scala/org/apache/spark/shuffle/ShuffleBlockPusher.scala ## @@ -361,7 +366,8 @@ private[spark] class

[GitHub] [spark] venkata91 commented on a change in pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
venkata91 commented on a change in pull request #33034: URL: https://github.com/apache/spark/pull/33034#discussion_r676936259 ## File path: core/src/main/scala/org/apache/spark/Dependency.scala ## @@ -122,6 +120,14 @@ class ShuffleDependency[K: ClassTag, V: ClassTag, C:

[GitHub] [spark] SparkQA removed a comment on pull request #33466: [SPARK-36143][PYTHON] Adjust `astype` of fractional Series with missing values to follow pandas

2021-07-26 Thread GitBox
SparkQA removed a comment on pull request #33466: URL: https://github.com/apache/spark/pull/33466#issuecomment-887001359 **[Test build #141659 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141659/testReport)** for PR 33466 at commit

[GitHub] [spark] SparkQA commented on pull request #33466: [SPARK-36143][PYTHON] Adjust `astype` of fractional Series with missing values to follow pandas

2021-07-26 Thread GitBox
SparkQA commented on pull request #33466: URL: https://github.com/apache/spark/pull/33466#issuecomment-887017015 **[Test build #141659 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141659/testReport)** for PR 33466 at commit

[GitHub] [spark] venkata91 commented on a change in pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
venkata91 commented on a change in pull request #33034: URL: https://github.com/apache/spark/pull/33034#discussion_r676934391 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -135,51 +150,87 @@ protected

[GitHub] [spark] venkata91 commented on a change in pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
venkata91 commented on a change in pull request #33034: URL: https://github.com/apache/spark/pull/33034#discussion_r676933588 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -78,6 +80,19 @@ public

[GitHub] [spark] SparkQA commented on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
SparkQA commented on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-887009598 **[Test build #141660 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141660/testReport)** for PR 33506 at commit

[GitHub] [spark] viirya commented on pull request #32332: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

2021-07-26 Thread GitBox
viirya commented on pull request #32332: URL: https://github.com/apache/spark/pull/32332#issuecomment-887007635 > The result is not correct because of incorrect type conversion. Could you also post the incorrect result in the PR? -- This is an automated message from the Apache Git

[GitHub] [spark] sunchao commented on pull request #33350: [SPARK-36136][SQL][TESTS] Refactor PruneFileSourcePartitionsSuite etc to a different package

2021-07-26 Thread GitBox
sunchao commented on pull request #33350: URL: https://github.com/apache/spark/pull/33350#issuecomment-887005723 Thanks @viirya! Yes, I agree we should backport and keep master & 3.2 consistent. -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] SparkQA commented on pull request #33498: [SPARK-36275][SQL] ResolveAggregateFunctions should works with nested fields

2021-07-26 Thread GitBox
SparkQA commented on pull request #33498: URL: https://github.com/apache/spark/pull/33498#issuecomment-887001376 **[Test build #141658 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141658/testReport)** for PR 33498 at commit

[GitHub] [spark] SparkQA commented on pull request #33466: [SPARK-36143][PYTHON] Adjust `astype` of fractional Series with missing values to follow pandas

2021-07-26 Thread GitBox
SparkQA commented on pull request #33466: URL: https://github.com/apache/spark/pull/33466#issuecomment-887001359 **[Test build #141659 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141659/testReport)** for PR 33466 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33034: URL: https://github.com/apache/spark/pull/33034#issuecomment-886998592 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46173/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32776: [SPARK-35639][SQL] Add metrics about coalesced partitions to CustomShuffleReader in AQE

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #32776: URL: https://github.com/apache/spark/pull/32776#issuecomment-886998600 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46171/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33340: [SPARK-36266][SHUFFLE] Rename classes in shuffle RPC used for block push operations

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33340: URL: https://github.com/apache/spark/pull/33340#issuecomment-886998596 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141652/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-886998589 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33468: [SPARK-36247][SQL] Check string length for char/varchar and apply type coercion in UPDATE/MERGE command

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33468: URL: https://github.com/apache/spark/pull/33468#issuecomment-886998599 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141640/

[GitHub] [spark] AmplabJenkins commented on pull request #32776: [SPARK-35639][SQL] Add metrics about coalesced partitions to CustomShuffleReader in AQE

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #32776: URL: https://github.com/apache/spark/pull/32776#issuecomment-886998600 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46171/ --

[GitHub] [spark] AmplabJenkins commented on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-886998589 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] AmplabJenkins commented on pull request #33340: [SPARK-36266][SHUFFLE] Rename classes in shuffle RPC used for block push operations

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33340: URL: https://github.com/apache/spark/pull/33340#issuecomment-886998596 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141652/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33034: URL: https://github.com/apache/spark/pull/33034#issuecomment-886998592 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46173/ --

[GitHub] [spark] AmplabJenkins commented on pull request #33468: [SPARK-36247][SQL] Check string length for char/varchar and apply type coercion in UPDATE/MERGE command

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33468: URL: https://github.com/apache/spark/pull/33468#issuecomment-886998599 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141640/ -- This

[GitHub] [spark] viirya commented on a change in pull request #33525: [SPARK-35320][SQL] Improve error message for unsupported key types in MapType in from_json expression

2021-07-26 Thread GitBox
viirya commented on a change in pull request #33525: URL: https://github.com/apache/spark/pull/33525#discussion_r676909418 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ## @@ -561,7 +561,17 @@ case class

[GitHub] [spark] SparkQA commented on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
SparkQA commented on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-886997693 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46172/ --

[GitHub] [spark] SparkQA removed a comment on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
SparkQA removed a comment on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-886963605 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] SparkQA commented on pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
SparkQA commented on pull request #33034: URL: https://github.com/apache/spark/pull/33034#issuecomment-886990754 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46173/ -- This is an automated message from the

[GitHub] [spark] xinrong-databricks edited a comment on pull request #33521: [SPARK-36142][PYTHON] Follow Pandas when pow between Series with Na and bool literal

2021-07-26 Thread GitBox
xinrong-databricks edited a comment on pull request #33521: URL: https://github.com/apache/spark/pull/33521#issuecomment-886990183 I think the PR does introduce a user-facing change, resulting in a different result for pow between fractional Series with missing values and bool literal.

[GitHub] [spark] SparkQA commented on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
SparkQA commented on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-886990318 **[Test build #141654 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141654/testReport)** for PR 33506 at commit

[GitHub] [spark] xinrong-databricks commented on pull request #33521: [SPARK-36142][PYTHON] Follow Pandas when pow between Series with Na and bool literal

2021-07-26 Thread GitBox
xinrong-databricks commented on pull request #33521: URL: https://github.com/apache/spark/pull/33521#issuecomment-886990183 I think the PR does introduce a user-facing change, resulting in a different result for pow between fractional Series with missing values and bool literal. Would you

[GitHub] [spark] viirya closed pull request #33350: [SPARK-36136][SQL][TESTS] Refactor PruneFileSourcePartitionsSuite etc to a different package

2021-07-26 Thread GitBox
viirya closed pull request #33350: URL: https://github.com/apache/spark/pull/33350 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] viirya commented on pull request #33350: [SPARK-36136][SQL][TESTS] Refactor PruneFileSourcePartitionsSuite etc to a different package

2021-07-26 Thread GitBox
viirya commented on pull request #33350: URL: https://github.com/apache/spark/pull/33350#issuecomment-886988096 Thanks. Merging to master/3.2. Although this is not a bug fix, it is test-only, and I think it is better to keep master/3.2 consistent so it is easier to backport

[GitHub] [spark] allisonwang-db commented on a change in pull request #33498: [SPARK-36275][SQL] ResolveAggregateFunctions should works with nested fields

2021-07-26 Thread GitBox
allisonwang-db commented on a change in pull request #33498: URL: https://github.com/apache/spark/pull/33498#discussion_r676903691 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -2553,8 +2553,9 @@ class Analyzer(override

[GitHub] [spark] SparkQA commented on pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
SparkQA commented on pull request #33034: URL: https://github.com/apache/spark/pull/33034#issuecomment-886986594 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46173/ -- This is an automated message from the Apache

[GitHub] [spark] xinrong-databricks commented on a change in pull request #33466: [SPARK-36143][PYTHON] Adjust `astype` of fractional Series with missing values to follow pandas

2021-07-26 Thread GitBox
xinrong-databricks commented on a change in pull request #33466: URL: https://github.com/apache/spark/pull/33466#discussion_r676903582 ## File path: python/pyspark/pandas/data_type_ops/num_ops.py ## @@ -371,15 +384,7 @@ def isnull(self, index_ops: IndexOpsLike) ->

[GitHub] [spark] xinrong-databricks commented on a change in pull request #33466: [SPARK-36143][PYTHON] Adjust `astype` of fractional Series with missing values to follow pandas

2021-07-26 Thread GitBox
xinrong-databricks commented on a change in pull request #33466: URL: https://github.com/apache/spark/pull/33466#discussion_r676903062 ## File path: python/pyspark/pandas/data_type_ops/num_ops.py ## @@ -34,15 +34,29 @@ _as_string_type, ) from pyspark.pandas.spark import

[GitHub] [spark] ueshin commented on a change in pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
ueshin commented on a change in pull request #33506: URL: https://github.com/apache/spark/pull/33506#discussion_r676896916 ## File path: python/pyspark/pandas/categorical.py ## @@ -680,12 +681,152 @@ def reorder_categories( def set_categories( self, -

[GitHub] [spark] SparkQA commented on pull request #32776: [SPARK-35639][SQL] Add metrics about coalesced partitions to CustomShuffleReader in AQE

2021-07-26 Thread GitBox
SparkQA commented on pull request #32776: URL: https://github.com/apache/spark/pull/32776#issuecomment-886983675 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46171/ -- This is an automated message from the

[GitHub] [spark] xinrong-databricks edited a comment on pull request #33521: [SPARK-36142][PYTHON] Follow Pandas when pow between Series with Na and bool literal

2021-07-26 Thread GitBox
xinrong-databricks edited a comment on pull request #33521: URL: https://github.com/apache/spark/pull/33521#issuecomment-886982057 The PR looks good. Since it doesn't adjust the `pow` issue of Decimal(`NaN`) and ExtensionDtypes, which is inconsistent with the ticket title and description,

[GitHub] [spark] SparkQA commented on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
SparkQA commented on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-886982679 **[Test build #141656 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141656/testReport)** for PR 33506 at commit

[GitHub] [spark] xinrong-databricks commented on pull request #33521: [SPARK-36142][PYTHON] Follow Pandas when pow between Series with Na and bool literal

2021-07-26 Thread GitBox
xinrong-databricks commented on pull request #33521: URL: https://github.com/apache/spark/pull/33521#issuecomment-886982057 The PR looks good. Since it doesn't adjust the issue of Decimal(`NaN`) and ExtensionDtypes, which is inconsistent with the ticket title and description, would you

[GitHub] [spark] SparkQA removed a comment on pull request #33340: [SPARK-36266][SHUFFLE] Rename classes in shuffle RPC used for block push operations

2021-07-26 Thread GitBox
SparkQA removed a comment on pull request #33340: URL: https://github.com/apache/spark/pull/33340#issuecomment-886884804 **[Test build #141652 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141652/testReport)** for PR 33340 at commit

[GitHub] [spark] SparkQA commented on pull request #33340: [SPARK-36266][SHUFFLE] Rename classes in shuffle RPC used for block push operations

2021-07-26 Thread GitBox
SparkQA commented on pull request #33340: URL: https://github.com/apache/spark/pull/33340#issuecomment-886979180 **[Test build #141652 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141652/testReport)** for PR 33340 at commit

[GitHub] [spark] xinrong-databricks removed a comment on pull request #33521: [SPARK-36142][PYTHON] Follow Pandas when pow between Series with Na and bool literal

2021-07-26 Thread GitBox
xinrong-databricks removed a comment on pull request #33521: URL: https://github.com/apache/spark/pull/33521#issuecomment-886974967 Shall we rename the PR since the issue of Series with Decimal(`NaN`) is not adjusted here? -- This is an automated message from the Apache Git Service. To

[GitHub] [spark] SparkQA commented on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
SparkQA commented on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-886976451 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46170/ -- This is an automated message from the

[GitHub] [spark] mridulm commented on a change in pull request #33451: [SPARK-36206][CORE] Support shuffle data corruption diagnosis via shuffle checksum

2021-07-26 Thread GitBox
mridulm commented on a change in pull request #33451: URL: https://github.com/apache/spark/pull/33451#discussion_r676874825 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java ## @@ -374,6 +379,27 @@ public int

[GitHub] [spark] mridulm commented on a change in pull request #33451: [SPARK-36206][CORE] Support shuffle data corruption diagnosis via shuffle checksum

2021-07-26 Thread GitBox
mridulm commented on a change in pull request #33451: URL: https://github.com/apache/spark/pull/33451#discussion_r676871069 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/checksum/ShuffleChecksumHelper.java ## @@ -0,0 +1,160 @@ +/* + *

[GitHub] [spark] mridulm commented on a change in pull request #33451: [SPARK-36206][CORE] Support shuffle data corruption diagnosis via shuffle checksum

2021-07-26 Thread GitBox
mridulm commented on a change in pull request #33451: URL: https://github.com/apache/spark/pull/33451#discussion_r676870269 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java ## @@ -374,6 +379,27 @@ public int

[GitHub] [spark] xinrong-databricks commented on pull request #33521: [SPARK-36142][PYTHON] Follow Pandas when pow between Series with Na and bool literal

2021-07-26 Thread GitBox
xinrong-databricks commented on pull request #33521: URL: https://github.com/apache/spark/pull/33521#issuecomment-886974967 Shall we rename the PR since the issue of Series with Decimal(`NaN`) is not adjusted here? -- This is an automated message from the Apache Git Service. To respond

[GitHub] [spark] SparkQA removed a comment on pull request #33468: [SPARK-36247][SQL] Check string length for char/varchar and apply type coercion in UPDATE/MERGE command

2021-07-26 Thread GitBox
SparkQA removed a comment on pull request #33468: URL: https://github.com/apache/spark/pull/33468#issuecomment-886745694 **[Test build #141640 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141640/testReport)** for PR 33468 at commit

[GitHub] [spark] SparkQA commented on pull request #33468: [SPARK-36247][SQL] Check string length for char/varchar and apply type coercion in UPDATE/MERGE command

2021-07-26 Thread GitBox
SparkQA commented on pull request #33468: URL: https://github.com/apache/spark/pull/33468#issuecomment-886974164 **[Test build #141640 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141640/testReport)** for PR 33468 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33034: URL: https://github.com/apache/spark/pull/33034#issuecomment-886964928 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141657/

[GitHub] [spark] SparkQA removed a comment on pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
SparkQA removed a comment on pull request #33034: URL: https://github.com/apache/spark/pull/33034#issuecomment-886963940 **[Test build #141657 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141657/testReport)** for PR 33034 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33034: URL: https://github.com/apache/spark/pull/33034#issuecomment-886964928 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141657/ -- This

[GitHub] [spark] SparkQA commented on pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
SparkQA commented on pull request #33034: URL: https://github.com/apache/spark/pull/33034#issuecomment-886964905 **[Test build #141657 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141657/testReport)** for PR 33034 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33034: URL: https://github.com/apache/spark/pull/33034#issuecomment-866473755 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33350: [SPARK-36136][SQL][TESTS] Refactor PruneFileSourcePartitionsSuite etc to a different package

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33350: URL: https://github.com/apache/spark/pull/33350#issuecomment-886964287 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141651/

[GitHub] [spark] AmplabJenkins commented on pull request #33350: [SPARK-36136][SQL][TESTS] Refactor PruneFileSourcePartitionsSuite etc to a different package

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33350: URL: https://github.com/apache/spark/pull/33350#issuecomment-886964287 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141651/ -- This

[GitHub] [spark] SparkQA commented on pull request #33034: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-26 Thread GitBox
SparkQA commented on pull request #33034: URL: https://github.com/apache/spark/pull/33034#issuecomment-886963940 **[Test build #141657 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141657/testReport)** for PR 33034 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #33350: [SPARK-36136][SQL][TESTS] Refactor PruneFileSourcePartitionsSuite etc to a different package

2021-07-26 Thread GitBox
SparkQA removed a comment on pull request #33350: URL: https://github.com/apache/spark/pull/33350#issuecomment-886884805 **[Test build #141651 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141651/testReport)** for PR 33350 at commit

[GitHub] [spark] SparkQA commented on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
SparkQA commented on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-886963605 **[Test build #141656 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141656/testReport)** for PR 33506 at commit

[GitHub] [spark] EnricoMi commented on pull request #33484: [SPARK-36263][SQL][PYTHON] Add Dataframe.observation to PySpark

2021-07-26 Thread GitBox
EnricoMi commented on pull request #33484: URL: https://github.com/apache/spark/pull/33484#issuecomment-886963496 > * This patch **fails Spark unit tests**. @HyukjinKwon These test failures seem to be transient and unrelated. This time

[GitHub] [spark] SparkQA commented on pull request #33350: [SPARK-36136][SQL][TESTS] Refactor PruneFileSourcePartitionsSuite etc to a different package

2021-07-26 Thread GitBox
SparkQA commented on pull request #33350: URL: https://github.com/apache/spark/pull/33350#issuecomment-886963363 **[Test build #141651 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/141651/testReport)** for PR 33350 at commit

[GitHub] [spark] SparkQA commented on pull request #32776: [SPARK-35639][SQL] Add metrics about coalesced partitions to CustomShuffleReader in AQE

2021-07-26 Thread GitBox
SparkQA commented on pull request #32776: URL: https://github.com/apache/spark/pull/32776#issuecomment-886962850 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46171/ -- This is an automated message from the Apache

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33523: [SPARK-35259][SHUFFLE][3.1] Update ExternalBlockHandler Timer variables to expose correct units

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33523: URL: https://github.com/apache/spark/pull/33523#issuecomment-886961586 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141645/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33522: [SPARK-36290][SQL] Push down join condition evaluation

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33522: URL: https://github.com/apache/spark/pull/33522#issuecomment-886961587 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141646/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33516: [SPARK-34249][DOCS] Add documentation for ANSI implicit cast rules

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33516: URL: https://github.com/apache/spark/pull/33516#issuecomment-886961588 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46165/

[GitHub] [spark] AmplabJenkins commented on pull request #33525: [SPARK-35320][SQL] Improve error message for unsuported key types in Map type in from_json expression

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33525: URL: https://github.com/apache/spark/pull/33525#issuecomment-886962058 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33308: [SPARK-35918][AVRO] Unify schema mismatch handling for read/write and enhance error messages

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33308: URL: https://github.com/apache/spark/pull/33308#issuecomment-886961590 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46169/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33524: [SPARK-35259][SHUFFLE][3.0] Update ExternalBlockHandler Timer variables to expose correct units

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33524: URL: https://github.com/apache/spark/pull/33524#issuecomment-886961595 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141644/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-886961592 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46166/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33350: [SPARK-36136][SQL][TESTS] Refactor PruneFileSourcePartitionsSuite etc to a different package

2021-07-26 Thread GitBox
AmplabJenkins removed a comment on pull request #33350: URL: https://github.com/apache/spark/pull/33350#issuecomment-886961591 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46167/

[GitHub] [spark] AmplabJenkins commented on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-886961592 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46166/ --

[GitHub] [spark] AmplabJenkins commented on pull request #33522: [SPARK-36290][SQL] Push down join condition evaluation

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33522: URL: https://github.com/apache/spark/pull/33522#issuecomment-886961587 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141646/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33523: [SPARK-35259][SHUFFLE][3.1] Update ExternalBlockHandler Timer variables to expose correct units

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33523: URL: https://github.com/apache/spark/pull/33523#issuecomment-886961586 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141645/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33516: [SPARK-34249][DOCS] Add documentation for ANSI implicit cast rules

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33516: URL: https://github.com/apache/spark/pull/33516#issuecomment-886961588 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46165/ --

[GitHub] [spark] AmplabJenkins commented on pull request #33524: [SPARK-35259][SHUFFLE][3.0] Update ExternalBlockHandler Timer variables to expose correct units

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33524: URL: https://github.com/apache/spark/pull/33524#issuecomment-886961595 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/141644/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33308: [SPARK-35918][AVRO] Unify schema mismatch handling for read/write and enhance error messages

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33308: URL: https://github.com/apache/spark/pull/33308#issuecomment-886961590 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46169/ --

[GitHub] [spark] AmplabJenkins commented on pull request #33350: [SPARK-36136][SQL][TESTS] Refactor PruneFileSourcePartitionsSuite etc to a different package

2021-07-26 Thread GitBox
AmplabJenkins commented on pull request #33350: URL: https://github.com/apache/spark/pull/33350#issuecomment-886961591 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46167/ --

[GitHub] [spark] SparkQA commented on pull request #33506: [SPARK-36260][PYTHON] Add set_categories to CategoricalAccessor and CategoricalIndex

2021-07-26 Thread GitBox
SparkQA commented on pull request #33506: URL: https://github.com/apache/spark/pull/33506#issuecomment-886956272 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46170/ -- This is an automated message from the Apache
