[GitHub] [spark] AngersZhuuuu commented on pull request #33092: [SPARK-35905][SQL][FOLLOWUP][TESTS] Fix UT mistake in SQLQuerySuite

2021-06-25 Thread GitBox
AngersZhuuuu commented on pull request #33092: URL: https://github.com/apache/spark/pull/33092#issuecomment-868952001 @dongjoon-hyun Should I add SPARK-35905 to UT title? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] yaooqinn commented on pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
yaooqinn commented on pull request #33063: URL: https://github.com/apache/spark/pull/33063#issuecomment-868947497 thanks all! merged to master/3.1 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] yaooqinn closed pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
yaooqinn closed pull request #33063: URL: https://github.com/apache/spark/pull/33063 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] yaooqinn commented on pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-25 Thread GitBox
yaooqinn commented on pull request #33063: URL: https://github.com/apache/spark/pull/33063#issuecomment-868943286 @mridulm @dongjoon-hyun, I re-run the benchmark based on the final commit manually. The debug log below shows the performance regression is gone. ```log 21/06/26 04:04:01

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #32140: URL: https://github.com/apache/spark/pull/32140#issuecomment-818410050 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868940480 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44884/

[GitHub] [spark] AmplabJenkins commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868940480 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44884/ --

[GitHub] [spark] mridulm commented on pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-25 Thread GitBox
mridulm commented on pull request #32140: URL: https://github.com/apache/spark/pull/32140#issuecomment-868939032 The github actions test failure looks unrelated, let me try jenkins anyway -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [spark] mridulm commented on pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-25 Thread GitBox
mridulm commented on pull request #32140: URL: https://github.com/apache/spark/pull/32140#issuecomment-868938908 Jenkins, test this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] wangyum commented on pull request #33099: [SPARK-35904][SQL] Collapse above RebalancePartitions

2021-06-25 Thread GitBox
wangyum commented on pull request #33099: URL: https://github.com/apache/spark/pull/33099#issuecomment-868937773 cc @ulysses-you @cloud-fan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] wangyum opened a new pull request #33099: [SPARK-35904][SQL] Collapse above RebalancePartitions

2021-06-25 Thread GitBox
wangyum opened a new pull request #33099: URL: https://github.com/apache/spark/pull/33099 ### What changes were proposed in this pull request? Make `RebalancePartitions` extend `RepartitionOperation`. ### Why are the changes needed? `CollapseRepartition` can optimize

[GitHub] [spark] SparkQA removed a comment on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868868259 **[Test build #140348 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140348/testReport)** for PR 32921 at commit

[GitHub] [spark] SparkQA commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
SparkQA commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868935984 **[Test build #140348 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140348/testReport)** for PR 32921 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33097: [SPARK-35901][PYTHON] Refine type hints in pyspark.pandas.window

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33097: URL: https://github.com/apache/spark/pull/33097#issuecomment-868935705 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44883/

[GitHub] [spark] AmplabJenkins commented on pull request #33097: [SPARK-35901][PYTHON] Refine type hints in pyspark.pandas.window

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33097: URL: https://github.com/apache/spark/pull/33097#issuecomment-868935705 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44883/ --

[GitHub] [spark] SparkQA commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
SparkQA commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868934750 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44884/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #33097: [SPARK-35901][PYTHON] Refine type hints in pyspark.pandas.window

2021-06-25 Thread GitBox
SparkQA commented on pull request #33097: URL: https://github.com/apache/spark/pull/33097#issuecomment-868934724 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44883/ -- This is an automated message from the

[GitHub] [spark] Yikun commented on pull request #32867: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-06-25 Thread GitBox
Yikun commented on pull request #32867: URL: https://github.com/apache/spark/pull/32867#issuecomment-868934485 @HyukjinKwon Ready for review, it would be good if you could take a look again. : ) -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #32753: URL: https://github.com/apache/spark/pull/32753#discussion_r659108805 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetReadState.java ## @@ -33,31 +51,107 @@ /** The

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #32753: URL: https://github.com/apache/spark/pull/32753#discussion_r659108572 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetReadState.java ## @@ -33,31 +51,107 @@ /** The

[GitHub] [spark] SparkQA removed a comment on pull request #33097: [SPARK-35901][PYTHON] Refine type hints in pyspark.pandas.window

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #33097: URL: https://github.com/apache/spark/pull/33097#issuecomment-868928627 **[Test build #140352 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140352/testReport)** for PR 33097 at commit

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #32753: URL: https://github.com/apache/spark/pull/32753#discussion_r659108216 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetReadState.java ## @@ -33,31 +51,107 @@ /** The

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-25 Thread GitBox
dongjoon-hyun commented on a change in pull request #32753: URL: https://github.com/apache/spark/pull/32753#discussion_r659107932 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetReadState.java ## @@ -17,13 +17,31 @@ package

[GitHub] [spark] SparkQA commented on pull request #33097: [SPARK-35901][PYTHON] Refine type hints in pyspark.pandas.window

2021-06-25 Thread GitBox
SparkQA commented on pull request #33097: URL: https://github.com/apache/spark/pull/33097#issuecomment-868932228 **[Test build #140352 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140352/testReport)** for PR 33097 at commit

[GitHub] [spark] dongjoon-hyun commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
dongjoon-hyun commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868932131 Thank you for rebasing, @aokolnychyi . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] SparkQA commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
SparkQA commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868930686 **[Test build #140353 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140353/testReport)** for PR 32921 at commit

[GitHub] [spark] SparkQA commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
SparkQA commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868930518 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44884/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #33097: [SPARK-35901][PYTHON] Refine type hints in pyspark.pandas.window

2021-06-25 Thread GitBox
SparkQA commented on pull request #33097: URL: https://github.com/apache/spark/pull/33097#issuecomment-868930350 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44883/ -- This is an automated message from the Apache

[GitHub] [spark] dongjoon-hyun opened a new pull request #33098: [SPARK-35903][TESTS] Parameterize 'master' in TPCDSQueryBenchmark

2021-06-25 Thread GitBox
dongjoon-hyun opened a new pull request #33098: URL: https://github.com/apache/spark/pull/33098 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] SparkQA commented on pull request #33097: [SPARK-35901][PYTHON] Refine type hints in pyspark.pandas.window

2021-06-25 Thread GitBox
SparkQA commented on pull request #33097: URL: https://github.com/apache/spark/pull/33097#issuecomment-868928627 **[Test build #140352 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140352/testReport)** for PR 33097 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #33065: [SPARK-35880][SS] Track the duplicates dropped count in dedupe operator

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #33065: URL: https://github.com/apache/spark/pull/33065#issuecomment-868853841 **[Test build #140346 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140346/testReport)** for PR 33065 at commit

[GitHub] [spark] SparkQA commented on pull request #33065: [SPARK-35880][SS] Track the duplicates dropped count in dedupe operator

2021-06-25 Thread GitBox
SparkQA commented on pull request #33065: URL: https://github.com/apache/spark/pull/33065#issuecomment-868927200 **[Test build #140346 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140346/testReport)** for PR 33065 at commit

[GitHub] [spark] mridulm commented on a change in pull request #33034: WIP: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-06-25 Thread GitBox
mridulm commented on a change in pull request #33034: URL: https://github.com/apache/spark/pull/33034#discussion_r659103667 ## File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java ## @@ -222,7 +223,7 @@ public void

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #32753: URL: https://github.com/apache/spark/pull/32753#issuecomment-868924203 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140345/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33096: URL: https://github.com/apache/spark/pull/33096#issuecomment-868924198 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44881/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33095: [WIP][SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33095: URL: https://github.com/apache/spark/pull/33095#issuecomment-868924197 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44882/

[GitHub] [spark] wangyum commented on a change in pull request #32932: [SPARK-35786][SQL] Add a new operator to rebalance the query output if AQE is enabled

2021-06-25 Thread GitBox
wangyum commented on a change in pull request #32932: URL: https://github.com/apache/spark/pull/32932#discussion_r659102206 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ## @@ -1351,6 +1351,31 @@ object

[GitHub] [spark] AmplabJenkins commented on pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #32753: URL: https://github.com/apache/spark/pull/32753#issuecomment-868924203 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140345/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33096: URL: https://github.com/apache/spark/pull/33096#issuecomment-868924198 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44881/ --

[GitHub] [spark] AmplabJenkins commented on pull request #33095: [WIP][SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33095: URL: https://github.com/apache/spark/pull/33095#issuecomment-868924197 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44882/ --

[GitHub] [spark] AmplabJenkins commented on pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the state in a

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33078: URL: https://github.com/apache/spark/pull/33078#issuecomment-868924160 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] aokolnychyi commented on a change in pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
aokolnychyi commented on a change in pull request #33096: URL: https://github.com/apache/spark/pull/33096#discussion_r659099103 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/V2ExpressionUtils.scala ## @@ -0,0 +1,80 @@ +/* + * Licensed to

[GitHub] [spark] viirya commented on pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
viirya commented on pull request #33096: URL: https://github.com/apache/spark/pull/33096#issuecomment-868902634 lgtm too -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] viirya commented on a change in pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
viirya commented on a change in pull request #33096: URL: https://github.com/apache/spark/pull/33096#discussion_r659099034 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/V2ExpressionUtils.scala ## @@ -0,0 +1,80 @@ +/* + * Licensed to the

[GitHub] [spark] aokolnychyi commented on a change in pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
aokolnychyi commented on a change in pull request #33096: URL: https://github.com/apache/spark/pull/33096#discussion_r659098919 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/V2ExpressionUtils.scala ## @@ -0,0 +1,80 @@ +/* + * Licensed to

[GitHub] [spark] ueshin closed pull request #33094: [SPARK-35466][PYTHON] Fix disallow_untyped_defs mypy checks for pyspark.pandas.data_type_ops.*

2021-06-25 Thread GitBox
ueshin closed pull request #33094: URL: https://github.com/apache/spark/pull/33094 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] ueshin commented on pull request #33094: [SPARK-35466][PYTHON] Fix disallow_untyped_defs mypy checks for pyspark.pandas.data_type_ops.*

2021-06-25 Thread GitBox
ueshin commented on pull request #33094: URL: https://github.com/apache/spark/pull/33094#issuecomment-868902241 Thanks! merging to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] SparkQA removed a comment on pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #32753: URL: https://github.com/apache/spark/pull/32753#issuecomment-868829589 **[Test build #140345 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140345/testReport)** for PR 32753 at commit

[GitHub] [spark] SparkQA commented on pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-25 Thread GitBox
SparkQA commented on pull request #32753: URL: https://github.com/apache/spark/pull/32753#issuecomment-868901657 **[Test build #140345 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140345/testReport)** for PR 32753 at commit

[GitHub] [spark] viirya commented on a change in pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
viirya commented on a change in pull request #33096: URL: https://github.com/apache/spark/pull/33096#discussion_r659097582 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/V2ExpressionUtils.scala ## @@ -0,0 +1,80 @@ +/* + * Licensed to the

[GitHub] [spark] aokolnychyi commented on pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
aokolnychyi commented on pull request #33096: URL: https://github.com/apache/spark/pull/33096#issuecomment-868900677 Thank you, @dongjoon-hyun! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] SparkQA commented on pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
SparkQA commented on pull request #33096: URL: https://github.com/apache/spark/pull/33096#issuecomment-868900609 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44881/ -- This is an automated message from the

[GitHub] [spark] ueshin opened a new pull request #33097: [SPARK-35901][PYTHON] Refine type hints in pyspark.pandas.window

2021-06-25 Thread GitBox
ueshin opened a new pull request #33097: URL: https://github.com/apache/spark/pull/33097 ### What changes were proposed in this pull request? Refines type hints in `pyspark.pandas.window`. Also, some refactoring is included to clean up the type hierarchy of `Rolling` and

[GitHub] [spark] dongjoon-hyun closed pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
dongjoon-hyun closed pull request #33096: URL: https://github.com/apache/spark/pull/33096 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] dongjoon-hyun commented on pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
dongjoon-hyun commented on pull request #33096: URL: https://github.com/apache/spark/pull/33096#issuecomment-868900348 I checked the GitHub Action. There is one irrelevant failure and the others passed. ``` - SPARK-29022: Commands using SerDe provided in --hive.aux.jars.path ***

[GitHub] [spark] SparkQA commented on pull request #33095: [WIP][SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-06-25 Thread GitBox
SparkQA commented on pull request #33095: URL: https://github.com/apache/spark/pull/33095#issuecomment-868900344 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44882/ -- This is an automated message from the

[GitHub] [spark] AmplabJenkins commented on pull request #33080: [SPARK-35728][SPARK-35778][FOLLOWUP][TESTS] Add test case to check multiply/divide of day-time interval and year-month interval of any

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33080: URL: https://github.com/apache/spark/pull/33080#issuecomment-868899123 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] aokolnychyi commented on a change in pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
aokolnychyi commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r659095549 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala ## @@ -227,3 +228,14 @@ object ReuseSubquery extends

[GitHub] [spark] aokolnychyi commented on a change in pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
aokolnychyi commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r659095329 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala ## @@ -96,6 +96,7 @@ case class

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868896849 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44879/

[GitHub] [spark] aokolnychyi commented on pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
aokolnychyi commented on pull request #33096: URL: https://github.com/apache/spark/pull/33096#issuecomment-868897551 Good call, @dongjoon-hyun. Added to the PR description. Could you check, please? -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] HeartSaVioR closed pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
HeartSaVioR closed pull request #33085: URL: https://github.com/apache/spark/pull/33085 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] AmplabJenkins commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868896849 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44879/ --

[GitHub] [spark] HeartSaVioR commented on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
HeartSaVioR commented on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868896779 Thanks, merging to master! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] SparkQA commented on pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
SparkQA commented on pull request #33096: URL: https://github.com/apache/spark/pull/33096#issuecomment-868895421 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44881/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #33095: [WIP][SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-06-25 Thread GitBox
SparkQA commented on pull request #33095: URL: https://github.com/apache/spark/pull/33095#issuecomment-868894980 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44882/ -- This is an automated message from the Apache

[GitHub] [spark] ueshin commented on pull request #33094: [SPARK-35466][PYTHON] Fix disallow_untyped_defs mypy checks for pyspark.pandas.data_type_ops.*

2021-06-25 Thread GitBox
ueshin commented on pull request #33094: URL: https://github.com/apache/spark/pull/33094#issuecomment-868893445 cc @HyukjinKwon @itholic -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] github-actions[bot] commented on pull request #29113: [SPARK-32314][SHS] Add config to control whether log old format of stacktrace

2021-06-25 Thread GitBox
github-actions[bot] commented on pull request #29113: URL: https://github.com/apache/spark/pull/29113#issuecomment-868891093 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue

[GitHub] [spark] github-actions[bot] closed pull request #31840: [SPARK-34745][SQL] Unify overflow exception error message of integral types

2021-06-25 Thread GitBox
github-actions[bot] closed pull request #31840: URL: https://github.com/apache/spark/pull/31840 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] github-actions[bot] commented on pull request #30763: [SPARK-31801][API][SHUFFLE] Register map output metadata

2021-06-25 Thread GitBox
github-actions[bot] commented on pull request #30763: URL: https://github.com/apache/spark/pull/30763#issuecomment-868891088 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue

[GitHub] [spark] SparkQA commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
SparkQA commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868890661 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44879/ -- This is an automated message from the

[GitHub] [spark] AmplabJenkins commented on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-868889338 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44880/ --

[GitHub] [spark] SparkQA commented on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-06-25 Thread GitBox
SparkQA commented on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-868889333 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44880/ -- This is an automated message from the

[GitHub] [spark] AmplabJenkins commented on pull request #33083: Allow sequences (tuples and lists) as pivot values argument in PySpark.

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33083: URL: https://github.com/apache/spark/pull/33083#issuecomment-86298 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] SparkQA removed a comment on pull request #33095: [WIP][SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #33095: URL: https://github.com/apache/spark/pull/33095#issuecomment-868886086 **[Test build #140351 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140351/testReport)** for PR 33095 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-868885474 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140349/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32787: [SPARK-35618][SQL] Resolve star expressions in subqueries using outer query plans

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #32787: URL: https://github.com/apache/spark/pull/32787#issuecomment-868885472 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140339/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33095: [WIP][SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-06-25 Thread GitBox
AmplabJenkins removed a comment on pull request #33095: URL: https://github.com/apache/spark/pull/33095#issuecomment-868886338 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140351/

[GitHub] [spark] SparkQA commented on pull request #33095: [WIP][SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-06-25 Thread GitBox
SparkQA commented on pull request #33095: URL: https://github.com/apache/spark/pull/33095#issuecomment-868886329 **[Test build #140351 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140351/testReport)** for PR 33095 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #33095: [WIP][SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #33095: URL: https://github.com/apache/spark/pull/33095#issuecomment-868886338 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140351/ -- This

[GitHub] [spark] SparkQA commented on pull request #33095: [WIP][SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-06-25 Thread GitBox
SparkQA commented on pull request #33095: URL: https://github.com/apache/spark/pull/33095#issuecomment-868886086 **[Test build #140351 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140351/testReport)** for PR 33095 at commit

[GitHub] [spark] SparkQA commented on pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
SparkQA commented on pull request #33096: URL: https://github.com/apache/spark/pull/33096#issuecomment-868886081 **[Test build #140350 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140350/testReport)** for PR 33096 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #32787: [SPARK-35618][SQL] Resolve star expressions in subqueries using outer query plans

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #32787: URL: https://github.com/apache/spark/pull/32787#issuecomment-868885472 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140339/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-06-25 Thread GitBox
AmplabJenkins commented on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-868885474 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140349/ -- This

[GitHub] [spark] HeartSaVioR commented on pull request #33085: [SPARK-35894][BUILD] Introduce new style enforce to not import scala.collection.Seq/IndexedSeq

2021-06-25 Thread GitBox
HeartSaVioR commented on pull request #33085: URL: https://github.com/apache/spark/pull/33085#issuecomment-868883808 The GA build passed for Scala 2.13, and the style check with the new rule now passes. @srowen Would it be good to go? -- This is an automated message from the Apache

[GitHub] [spark] dongjoon-hyun commented on pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
dongjoon-hyun commented on pull request #33096: URL: https://github.com/apache/spark/pull/33096#issuecomment-868883368 cc @viirya -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] dongjoon-hyun commented on pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
dongjoon-hyun commented on pull request #33096: URL: https://github.com/apache/spark/pull/33096#issuecomment-868883315 Thank you for pinging me, @aokolnychyi ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] SparkQA commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
SparkQA commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868882933 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44879/ -- This is an automated message from the Apache

[GitHub] [spark] aokolnychyi commented on pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
aokolnychyi commented on pull request #33096: URL: https://github.com/apache/spark/pull/33096#issuecomment-868882776 This PR contains a utility class I need for dynamic filtering. cc @sunchao @huaxingao @viirya @dongjoon-hyun @cloud-fan @HyukjinKwon @rdblue @holdenk -- This is an

[GitHub] [spark] aokolnychyi commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-25 Thread GitBox
aokolnychyi commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868882181 Submitted #33096 for the utility class. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] aokolnychyi opened a new pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-25 Thread GitBox
aokolnychyi opened a new pull request #33096: URL: https://github.com/apache/spark/pull/33096 ### What changes were proposed in this pull request? This PR adds a utility to convert public connector expressions to Catalyst expressions. ### Why are the changes

[GitHub] [spark] SparkQA commented on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-06-25 Thread GitBox
SparkQA commented on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-868882011 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44880/ -- This is an automated message from the Apache

[GitHub] [spark] ueshin commented on a change in pull request #33094: [SPARK-35466][PYTHON] Fix disallow_untyped_defs mypy checks for pyspark.pandas.data_type_ops.*

2021-06-25 Thread GitBox
ueshin commented on a change in pull request #33094: URL: https://github.com/apache/spark/pull/33094#discussion_r659078486 ## File path: python/pyspark/pandas/data_type_ops/base.py ## @@ -65,6 +65,7 @@ T_IndexOps = TypeVar("T_IndexOps", bound="IndexOpsMixin")

[GitHub] [spark] SparkQA removed a comment on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-868868541 **[Test build #140349 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140349/testReport)** for PR 31490 at commit

[GitHub] [spark] SparkQA commented on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-06-25 Thread GitBox
SparkQA commented on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-868877249 **[Test build #140349 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140349/testReport)** for PR 31490 at commit

[GitHub] [spark] Victsm commented on a change in pull request #33034: WIP: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-06-25 Thread GitBox
Victsm commented on a change in pull request #33034: URL: https://github.com/apache/spark/pull/33034#discussion_r659076673 ## File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java ## @@ -222,7 +223,7 @@ public void

[GitHub] [spark] xinrong-databricks commented on pull request #33094: [SPARK-35466][PYTHON] Fix disallow_untyped_defs mypy checks for pyspark.pandas.data_type_ops.*

2021-06-25 Thread GitBox
xinrong-databricks commented on pull request #33094: URL: https://github.com/apache/spark/pull/33094#issuecomment-868875080 Thanks for working on that! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] xinrong-databricks commented on a change in pull request #33094: [SPARK-35466][PYTHON] Fix disallow_untyped_defs mypy checks for pyspark.pandas.data_type_ops.*

2021-06-25 Thread GitBox
xinrong-databricks commented on a change in pull request #33094: URL: https://github.com/apache/spark/pull/33094#discussion_r659076301 ## File path: python/pyspark/pandas/data_type_ops/base.py ## @@ -65,6 +65,7 @@ T_IndexOps = TypeVar("T_IndexOps", bound="IndexOpsMixin")

[GitHub] [spark] SparkQA removed a comment on pull request #32787: [SPARK-35618][SQL] Resolve star expressions in subqueries using outer query plans

2021-06-25 Thread GitBox
SparkQA removed a comment on pull request #32787: URL: https://github.com/apache/spark/pull/32787#issuecomment-868762786 **[Test build #140339 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140339/testReport)** for PR 32787 at commit
