[GitHub] [spark] otterc commented on pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-28 Thread GitBox
otterc commented on pull request #32140: URL: https://github.com/apache/spark/pull/32140#issuecomment-870260954 > Sorry for the delay. I'll do a review today. BTW, are there any other necessary magnet PRs that have to be merged for the 3.2 cut/release? There are 2 pending tasks

[GitHub] [spark] dongjoon-hyun commented on pull request #33125: [SPARK-35483][TESTS] Enable docker_integration_tests for catalyst/sql module changes too

2021-06-28 Thread GitBox
dongjoon-hyun commented on pull request #33125: URL: https://github.com/apache/spark/pull/33125#issuecomment-870260591 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] viirya commented on a change in pull request #32928: [SPARK-35784][SS] Implementation for RocksDB instance

2021-06-28 Thread GitBox
viirya commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r660301446 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,451 @@ +/* + * Licensed to the Apache

[GitHub] [spark] viirya commented on a change in pull request #32928: [SPARK-35784][SS] Implementation for RocksDB instance

2021-06-28 Thread GitBox
viirya commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r660300861 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,451 @@ +/* + * Licensed to the Apache

[GitHub] [spark] viirya commented on a change in pull request #32928: [SPARK-35784][SS] Implementation for RocksDB instance

2021-06-28 Thread GitBox
viirya commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r660300690 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,451 @@ +/* + * Licensed to the Apache

[GitHub] [spark] williamhyun opened a new pull request #33126: [SPARK-35924][BUILD][TESTS] Add Java 17 ea build test to GitHub action

2021-06-28 Thread GitBox
williamhyun opened a new pull request #33126: URL: https://github.com/apache/spark/pull/33126 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] viirya commented on a change in pull request #32928: [SPARK-35784][SS] Implementation for RocksDB instance

2021-06-28 Thread GitBox
viirya commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r660299140 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,451 @@ +/* + * Licensed to the Apache

[GitHub] [spark] viirya commented on a change in pull request #32928: [SPARK-35784][SS] Implementation for RocksDB instance

2021-06-28 Thread GitBox
viirya commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r660298585 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,451 @@ +/* + * Licensed to the Apache

[GitHub] [spark] dongjoon-hyun commented on pull request #33125: [SPARK-35483][TESTS] Enable docker_integration_tests for catalyst/sql module changes too

2021-06-28 Thread GitBox
dongjoon-hyun commented on pull request #33125: URL: https://github.com/apache/spark/pull/33125#issuecomment-870255930 Thank you, @sarutak and @HyukjinKwon .

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33125: [SPARK-35483][TESTS] Add sql dependency to docker_integration_tests

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #33125: URL: https://github.com/apache/spark/pull/33125#discussion_r660297375 ## File path: dev/sparktestsupport/modules.py ## @@ -769,7 +769,7 @@ def __hash__(self): docker_integration_tests = Module(

[GitHub] [spark] ulysses-you commented on a change in pull request #32883: [SPARK-35725][SQL] Support optimize skewed partitions in RebalancePartitions

2021-06-28 Thread GitBox
ulysses-you commented on a change in pull request #32883: URL: https://github.com/apache/spark/pull/32883#discussion_r660297252 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -642,6 +642,15 @@ object SQLConf {

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33125: [SPARK-35483][TESTS] Add sql dependency to docker_integration_tests

2021-06-28 Thread GitBox
dongjoon-hyun commented on a change in pull request #33125: URL: https://github.com/apache/spark/pull/33125#discussion_r660297221 ## File path: dev/sparktestsupport/modules.py ## @@ -769,7 +769,7 @@ def __hash__(self): docker_integration_tests = Module(

[GitHub] [spark] viirya commented on a change in pull request #32928: [SPARK-35784][SS] Implementation for RocksDB instance

2021-06-28 Thread GitBox
viirya commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r660297030 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,453 @@ +/* + * Licensed to the Apache

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33125: [SPARK-35483][TESTS] Add sql dependency to docker_integration_tests

2021-06-28 Thread GitBox
dongjoon-hyun commented on a change in pull request #33125: URL: https://github.com/apache/spark/pull/33125#discussion_r660296765 ## File path: dev/sparktestsupport/modules.py ## @@ -769,7 +769,7 @@ def __hash__(self): docker_integration_tests = Module(

[GitHub] [spark] viirya commented on a change in pull request #32928: [SPARK-35784][SS] Implementation for RocksDB instance

2021-06-28 Thread GitBox
viirya commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r660296809 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,453 @@ +/* + * Licensed to the Apache

[GitHub] [spark] viirya commented on a change in pull request #32928: [SPARK-35784][SS] Implementation for RocksDB instance

2021-06-28 Thread GitBox
viirya commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r660296038 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,455 @@ +/* + * Licensed to the Apache

[GitHub] [spark] gengliangwang commented on a change in pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-06-28 Thread GitBox
gengliangwang commented on a change in pull request #31490: URL: https://github.com/apache/spark/pull/31490#discussion_r660295813 ## File path: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSerdeSuite.scala ## @@ -144,6 +173,43 @@ object AvroSerdeSuite {

[GitHub] [spark] ulysses-you commented on a change in pull request #32883: [SPARK-35725][SQL] Support optimize skewed partitions in RebalancePartitions

2021-06-28 Thread GitBox
ulysses-you commented on a change in pull request #32883: URL: https://github.com/apache/spark/pull/32883#discussion_r649828089 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/CoalesceShufflePartitions.scala ## @@ -91,18 +91,3 @@ case class

[GitHub] [spark] viirya commented on a change in pull request #32928: [SPARK-35784][SS] Implementation for RocksDB instance

2021-06-28 Thread GitBox
viirya commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r660295530 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,451 @@ +/* + * Licensed to the Apache

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33125: [SPARK-35483][TESTS] Add sql dependency to docker_integration_tests

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #33125: URL: https://github.com/apache/spark/pull/33125#discussion_r660295439 ## File path: dev/sparktestsupport/modules.py ## @@ -769,7 +769,7 @@ def __hash__(self): docker_integration_tests = Module(

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33125: [SPARK-35483][TESTS] Add sql dependency to docker_integration_tests

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #33125: URL: https://github.com/apache/spark/pull/33125#discussion_r660295086 ## File path: dev/sparktestsupport/modules.py ## @@ -769,7 +769,7 @@ def __hash__(self): docker_integration_tests = Module(

[GitHub] [spark] dongjoon-hyun commented on pull request #33124: [SPARK-34302][SQL][TEST] Update jdbc.v2.*IntegrationSuite

2021-06-28 Thread GitBox
dongjoon-hyun commented on pull request #33124: URL: https://github.com/apache/spark/pull/33124#issuecomment-870251058 Here is the follow-up for SPARK-35483. - https://github.com/apache/spark/pull/33125/files

[GitHub] [spark] dongjoon-hyun opened a new pull request #33125: [SPARK-35483][TESTS] Add sql dependency to docker_integration_tests

2021-06-28 Thread GitBox
dongjoon-hyun opened a new pull request #33125: URL: https://github.com/apache/spark/pull/33125 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] HyukjinKwon commented on pull request #32902: [SPARK-35754][CORE] Add config to put migrating blocks on disk only

2021-06-28 Thread GitBox
HyukjinKwon commented on pull request #32902: URL: https://github.com/apache/spark/pull/32902#issuecomment-870250749 @q2w, can you rebase and create a new PR? Seems like GA in this PR is somehow messed up.

[GitHub] [spark] sarutak commented on pull request #33124: [SPARK-34302][SQL][TEST] Update jdbc.v2.*IntegrationSuite

2021-06-28 Thread GitBox
sarutak commented on pull request #33124: URL: https://github.com/apache/spark/pull/33124#issuecomment-870249724 @dongjoon-hyun I think it's because docker-integration-tests doesn't depend on the sql module in `dev/sparktestsupport/modules.py`. We intentionally let the dependency

[GitHub] [spark] aokolnychyi commented on pull request #33120: [SPARK-35899][SQL][FOLLOWUP] Utility to convert connector expressions to Catalyst

2021-06-28 Thread GitBox
aokolnychyi commented on pull request #33120: URL: https://github.com/apache/spark/pull/33120#issuecomment-870248919 Thanks, @dongjoon-hyun!

[GitHub] [spark] dongjoon-hyun commented on pull request #33124: [SPARK-34302][SQL][TEST] Update jdbc.v2.*IntegrationSuite

2021-06-28 Thread GitBox
dongjoon-hyun commented on pull request #33124: URL: https://github.com/apache/spark/pull/33124#issuecomment-870248556 I'll make a followup PR for SPARK-35483 to enable it for SQL part change, too.

[GitHub] [spark] dongjoon-hyun commented on pull request #33124: [SPARK-34302][SQL][TEST] Update jdbc.v2.*IntegrationSuite

2021-06-28 Thread GitBox
dongjoon-hyun commented on pull request #33124: URL: https://github.com/apache/spark/pull/33124#issuecomment-870246910 This seems to be disabled by `ENABLE_DOCKER_INTEGRATION_TESTS` intentionally and designed to be turned on for docker module changes only at the original commit

[GitHub] [spark] q2w removed a comment on pull request #32902: [SPARK-35754][CORE] Add config to put migrating blocks on disk only

2021-06-28 Thread GitBox
q2w removed a comment on pull request #32902: URL: https://github.com/apache/spark/pull/32902#issuecomment-870246460 > Could you check the UT failure, @q2w ? If it passes in your environment, please re-trigger GitHub Action. > > ``` > [info] *** 1 TEST FAILED *** > [error]

[GitHub] [spark] q2w opened a new pull request #32902: [SPARK-35754][CORE] Add config to put migrating blocks on disk only

2021-06-28 Thread GitBox
q2w opened a new pull request #32902: URL: https://github.com/apache/spark/pull/32902 ### What changes were proposed in this pull request? This PR adds a config which makes block manager decommissioner to migrate block data on disk only. ### Why are the changes

[GitHub] [spark] q2w closed pull request #32902: [SPARK-35754][CORE] Add config to put migrating blocks on disk only

2021-06-28 Thread GitBox
q2w closed pull request #32902: URL: https://github.com/apache/spark/pull/32902

[GitHub] [spark] q2w commented on pull request #32902: [SPARK-35754][CORE] Add config to put migrating blocks on disk only

2021-06-28 Thread GitBox
q2w commented on pull request #32902: URL: https://github.com/apache/spark/pull/32902#issuecomment-870246460 > Could you check the UT failure, @q2w ? If it passes in your environment, please re-trigger GitHub Action. > > ``` > [info] *** 1 TEST FAILED *** > [error] Failed:

[GitHub] [spark] dongjoon-hyun closed pull request #33120: [SPARK-35899][SQL][FOLLOWUP] Utility to convert connector expressions to Catalyst

2021-06-28 Thread GitBox
dongjoon-hyun closed pull request #33120: URL: https://github.com/apache/spark/pull/33120

[GitHub] [spark] cloud-fan commented on pull request #33124: [SPARK-34302][SQL][TEST] Update jdbc.v2.*IntegrationSuite

2021-06-28 Thread GitBox
cloud-fan commented on pull request #33124: URL: https://github.com/apache/spark/pull/33124#issuecomment-870245406 thanks for fixing it! BTW is there a way to capture it in PR builders? Otherwise, it's too easy to forget checking...

[GitHub] [spark] dongjoon-hyun commented on pull request #33120: [SPARK-35899][SQL][FOLLOWUP] Utility to convert connector expressions to Catalyst

2021-06-28 Thread GitBox
dongjoon-hyun commented on pull request #33120: URL: https://github.com/apache/spark/pull/33120#issuecomment-870245168 Merged to master. Thank you, @aokolnychyi and @HyukjinKwon .

[GitHub] [spark] sunchao commented on a change in pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-28 Thread GitBox
sunchao commented on a change in pull request #32753: URL: https://github.com/apache/spark/pull/32753#discussion_r660286736 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedRleValuesReader.java ## @@ -156,55 +156,81 @@ public

[GitHub] [spark] HyukjinKwon commented on pull request #33121: [SPARK-35921][BUILD] ${spark.yarn.isHadoopProvided} in config.properties is not edited if build with SBT

2021-06-28 Thread GitBox
HyukjinKwon commented on pull request #33121: URL: https://github.com/apache/spark/pull/33121#issuecomment-870242513 LGTM

[GitHub] [spark] dongjoon-hyun commented on pull request #33113: [SPARK-34302][SQL] Migrate ALTER TABLE ... CHANGE COLUMN command to use UnresolvedTable to resolve the identifier

2021-06-28 Thread GitBox
dongjoon-hyun commented on pull request #33113: URL: https://github.com/apache/spark/pull/33113#issuecomment-870240863 This seems to break JDBC v2 suite. I made a follow-up. Could you review that, @imback82 and @cloud-fan ? - https://github.com/apache/spark/pull/33124

[GitHub] [spark] dongjoon-hyun closed pull request #33122: [SPARK-35922][BUILD] Upgrade maven-shade-plugin to 3.2.4

2021-06-28 Thread GitBox
dongjoon-hyun closed pull request #33122: URL: https://github.com/apache/spark/pull/33122

[GitHub] [spark] dongjoon-hyun commented on pull request #33122: [SPARK-35922][BUILD] Upgrade maven-shade-plugin to 3.2.4

2021-06-28 Thread GitBox
dongjoon-hyun commented on pull request #33122: URL: https://github.com/apache/spark/pull/33122#issuecomment-870239724 Thank you, @HyukjinKwon . Merged to master. For the docker integration test failure, it's irrelevant to this. I made a PR for that.

[GitHub] [spark] dongjoon-hyun closed pull request #33119: [SPARK-35920][BUILD] Upgrade to Chill 0.10.0

2021-06-28 Thread GitBox
dongjoon-hyun closed pull request #33119: URL: https://github.com/apache/spark/pull/33119

[GitHub] [spark] dongjoon-hyun commented on pull request #33119: [SPARK-35920][BUILD] Upgrade to Chill 0.10.0

2021-06-28 Thread GitBox
dongjoon-hyun commented on pull request #33119: URL: https://github.com/apache/spark/pull/33119#issuecomment-870238738 Thank you, @HyukjinKwon . Merged to master.

[GitHub] [spark] Ngone51 edited a comment on pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-28 Thread GitBox
Ngone51 edited a comment on pull request #32140: URL: https://github.com/apache/spark/pull/32140#issuecomment-870235923 Sorry for the delay. I'll do a review today. BTW, are there any other necessary magnet PRs that have to be merged for the 3.2 cut/release?

[GitHub] [spark] dongjoon-hyun opened a new pull request #33124: [SPARK-34302][SQL][TEST] Update jdbc.v2.*IntegrationSuite

2021-06-28 Thread GitBox
dongjoon-hyun opened a new pull request #33124: URL: https://github.com/apache/spark/pull/33124 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] Ngone51 edited a comment on pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-28 Thread GitBox
Ngone51 edited a comment on pull request #32140: URL: https://github.com/apache/spark/pull/32140#issuecomment-870235923 Sorry for the delay. I'll do a review today. BTW, are there any other necessary magnet PRs that have to be merged for the 3.2 release?

[GitHub] [spark] Ngone51 commented on pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-28 Thread GitBox
Ngone51 commented on pull request #32140: URL: https://github.com/apache/spark/pull/32140#issuecomment-870235923 Sorry for the delay. I'll do a review today. BTW, are there any other necessary magnet PRs that have to be merged for the 3.2 release?

[GitHub] [spark] HyukjinKwon edited a comment on pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
HyukjinKwon edited a comment on pull request #32365: URL: https://github.com/apache/spark/pull/32365#issuecomment-870235430 > BTW why do we add a new API in Column? That's my question (https://github.com/apache/spark/pull/32365#discussion_r660245915) ... `df.show` is already

[GitHub] [spark] HyukjinKwon commented on pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
HyukjinKwon commented on pull request #32365: URL: https://github.com/apache/spark/pull/32365#issuecomment-870235430 > BTW why do we add a new API in Column? That's my question (https://github.com/apache/spark/pull/32365#discussion_r660245915) ... `df.show` is already

[GitHub] [spark] sarutak commented on a change in pull request #33121: [SPARK-35921][BUILD] ${spark.yarn.isHadoopProvided} in config.properties is not edited if build with SBT

2021-06-28 Thread GitBox
sarutak commented on a change in pull request #33121: URL: https://github.com/apache/spark/pull/33121#discussion_r660273331 ## File path: project/SparkBuild.scala ## @@ -802,11 +805,29 @@ object Hive { } object YARN { + val genConfigProperties =

[GitHub] [spark] ulysses-you opened a new pull request #33123: [SPARK-35923][SQL] Coalesce empty partition with mixed CoalescedPartitionSpec and PartialReducerPartitionSpec

2021-06-28 Thread GitBox
ulysses-you opened a new pull request #33123: URL: https://github.com/apache/spark/pull/33123 ### What changes were proposed in this pull request? Skip empty partitions in `ShufflePartitionsUtil.coalescePartitionsWithSkew`. ### Why are the changes needed?

[GitHub] [spark] sarutak commented on a change in pull request #33121: [SPARK-35921][BUILD] ${spark.yarn.isHadoopProvided} in config.properties is not edited if build with SBT

2021-06-28 Thread GitBox
sarutak commented on a change in pull request #33121: URL: https://github.com/apache/spark/pull/33121#discussion_r660276243 ## File path: project/SparkBuild.scala ## @@ -802,11 +802,30 @@ object Hive { } object YARN { + val genConfigProperties =

[GitHub] [spark] cloud-fan commented on a change in pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
cloud-fan commented on a change in pull request #32365: URL: https://github.com/apache/spark/pull/32365#discussion_r660276267 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Column.scala ## @@ -1190,6 +1190,22 @@ class Column(val expr: Expression) extends Logging {

[GitHub] [spark] otterc commented on a change in pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-28 Thread GitBox
otterc commented on a change in pull request #32140: URL: https://github.com/apache/spark/pull/32140#discussion_r660275905 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -347,35 +355,56 @@ final class

[GitHub] [spark] sarutak commented on a change in pull request #33121: [SPARK-35921][BUILD] ${spark.yarn.isHadoopProvided} in config.properties is not edited if build with SBT

2021-06-28 Thread GitBox
sarutak commented on a change in pull request #33121: URL: https://github.com/apache/spark/pull/33121#discussion_r660273233 ## File path: project/SparkBuild.scala ## @@ -413,6 +413,9 @@ object SparkBuild extends PomBuild {

[GitHub] [spark] cloud-fan commented on pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
cloud-fan commented on pull request #32365: URL: https://github.com/apache/spark/pull/32365#issuecomment-870230165 > If we want to switch to Hive format strings we should add a configuration and apply to Cast, result strings, and everywhere without adding a new API like toPrettyString.

[GitHub] [spark] sarutak commented on a change in pull request #33121: [SPARK-35921][BUILD] ${spark.yarn.isHadoopProvided} in config.properties is not edited if build with SBT

2021-06-28 Thread GitBox
sarutak commented on a change in pull request #33121: URL: https://github.com/apache/spark/pull/33121#discussion_r660275536 ## File path: project/SparkBuild.scala ## @@ -802,11 +805,29 @@ object Hive { } object YARN { + val genConfigProperties =

[GitHub] [spark] cloud-fan commented on a change in pull request #32883: [SPARK-35725][SQL] Support repartition expand partitions in AQE

2021-06-28 Thread GitBox
cloud-fan commented on a change in pull request #32883: URL: https://github.com/apache/spark/pull/32883#discussion_r660273640 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedPartitions.scala ## @@ -0,0 +1,73 @@ +/* + * Licensed to

[GitHub] [spark] cloud-fan commented on a change in pull request #32883: [SPARK-35725][SQL] Support repartition expand partitions in AQE

2021-06-28 Thread GitBox
cloud-fan commented on a change in pull request #32883: URL: https://github.com/apache/spark/pull/32883#discussion_r660273178 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/ShufflePartitionsUtil.scala ## @@ -296,4 +296,63 @@ object

[GitHub] [spark] cloud-fan commented on a change in pull request #32883: [SPARK-35725][SQL] Support repartition expand partitions in AQE

2021-06-28 Thread GitBox
cloud-fan commented on a change in pull request #32883: URL: https://github.com/apache/spark/pull/32883#discussion_r660272066 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/ShufflePartitionsUtil.scala ## @@ -296,4 +296,63 @@ object

[GitHub] [spark] cloud-fan commented on a change in pull request #32883: [SPARK-35725][SQL] Support repartition expand partitions in AQE

2021-06-28 Thread GitBox
cloud-fan commented on a change in pull request #32883: URL: https://github.com/apache/spark/pull/32883#discussion_r660271729 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -642,6 +642,15 @@ object SQLConf {

[GitHub] [spark] allisonwang-db commented on a change in pull request #33070: [SPARK-35551][SQL] Handle the COUNT bug for lateral subqueries

2021-06-28 Thread GitBox
allisonwang-db commented on a change in pull request #33070: URL: https://github.com/apache/spark/pull/33070#discussion_r660271014 ## File path: sql/core/src/test/resources/sql-tests/inputs/join-lateral.sql ## @@ -83,8 +83,65 @@ SELECT * FROM t1 WHERE c1 = (SELECT MIN(a) FROM

[GitHub] [spark] Yikun commented on pull request #32867: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-06-28 Thread GitBox
Yikun commented on pull request #32867: URL: https://github.com/apache/spark/pull/32867#issuecomment-870221373 > Just wondering we can only verify it manually? If by any chance, some tests are not found, can we easily know it? It's just the same as before: if we forgot to add the path of

[GitHub] [spark] dongjoon-hyun commented on pull request #33122: [SPARK-35922][BUILD] Upgrade maven-shade-plugin to 3.2.4

2021-06-28 Thread GitBox
dongjoon-hyun commented on pull request #33122: URL: https://github.com/apache/spark/pull/33122#issuecomment-870214726 Could you review this please, @HyukjinKwon ? This is only upgrading `maven-shade-plugin` and it passed here. -

[GitHub] [spark] dongjoon-hyun commented on pull request #33119: [SPARK-35920][BUILD] Upgrade to Chill 0.10.0

2021-06-28 Thread GitBox
dongjoon-hyun commented on pull request #33119: URL: https://github.com/apache/spark/pull/33119#issuecomment-870212565 Could you review this, @srowen ?

[GitHub] [spark] HyukjinKwon commented on pull request #33121: [SPARK-35921][BUILD] ${spark.yarn.isHadoopProvided} in config.properties is not edited if build with SBT

2021-06-28 Thread GitBox
HyukjinKwon commented on pull request #33121: URL: https://github.com/apache/spark/pull/33121#issuecomment-870209510 cc @dbtsai FYI. It's more a followup of https://github.com/apache/spark/pull/28788

[GitHub] [spark] LuciferYang commented on a change in pull request #31517: [SPARK-34309][BUILD][CORE][SQL][K8S]Use Caffeine instead of Guava Cache

2021-06-28 Thread GitBox
LuciferYang commented on a change in pull request #31517: URL: https://github.com/apache/spark/pull/31517#discussion_r660260370 ## File path: core/src/test/scala/org/apache/spark/executor/ExecutorSuite.scala ## @@ -452,7 +452,8 @@ class ExecutorSuite extends SparkFunSuite

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33121: [SPARK-35921][BUILD] ${spark.yarn.isHadoopProvided} in config.properties is not edited if build with SBT

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #33121: URL: https://github.com/apache/spark/pull/33121#discussion_r660260157 ## File path: project/SparkBuild.scala ## @@ -802,11 +805,29 @@ object Hive { } object YARN { + val genConfigProperties =

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33121: [SPARK-35921][BUILD] ${spark.yarn.isHadoopProvided} in config.properties is not edited if build with SBT

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #33121: URL: https://github.com/apache/spark/pull/33121#discussion_r660259806 ## File path: project/SparkBuild.scala ## @@ -413,6 +413,9 @@ object SparkBuild extends PomBuild {

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33121: [SPARK-35921][BUILD] ${spark.yarn.isHadoopProvided} in config.properties is not edited if build with SBT

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #33121: URL: https://github.com/apache/spark/pull/33121#discussion_r660259216 ## File path: project/SparkBuild.scala ## @@ -802,11 +805,29 @@ object Hive { } object YARN { + val genConfigProperties =

[GitHub] [spark] HyukjinKwon commented on pull request #33110: [SPARK-35911][SQL] Update exprId for IN subquery in DPP

2021-06-28 Thread GitBox
HyukjinKwon commented on pull request #33110: URL: https://github.com/apache/spark/pull/33110#issuecomment-870204065 @Swinky did you face something like https://github.com/apache/spark/pull/32400#issuecomment-831051189? otherwise rebasing would retrigger the build properly

[GitHub] [spark] HyukjinKwon closed pull request #33106: [SPARK-35876][SQL] ArraysZip should retain field names to avoid being re-written by analyzer/optimizer

2021-06-28 Thread GitBox
HyukjinKwon closed pull request #33106: URL: https://github.com/apache/spark/pull/33106 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] HyukjinKwon commented on pull request #33106: [SPARK-35876][SQL] ArraysZip should retain field names to avoid being re-written by analyzer/optimizer

2021-06-28 Thread GitBox
HyukjinKwon commented on pull request #33106: URL: https://github.com/apache/spark/pull/33106#issuecomment-870202197 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] viirya commented on pull request #32867: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-06-28 Thread GitBox
viirya commented on pull request #32867: URL: https://github.com/apache/spark/pull/32867#issuecomment-870201584 Just wondering we can only verify it manually? If by any chance, some tests are not found, can we easily know it? -- This is an automated message from the Apache Git Service.

[GitHub] [spark] yaooqinn commented on a change in pull request #33114: [SPARK-35913][SQL] Create hive permanent function with owner name

2021-06-28 Thread GitBox
yaooqinn commented on a change in pull request #33114: URL: https://github.com/apache/spark/pull/33114#discussion_r660252922 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -925,19 +925,19 @@ private[hive] class

[GitHub] [spark] HyukjinKwon commented on pull request #32867: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-06-28 Thread GitBox
HyukjinKwon commented on pull request #32867: URL: https://github.com/apache/spark/pull/32867#issuecomment-870200640 Looks pretty good to me too. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32867: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #32867: URL: https://github.com/apache/spark/pull/32867#discussion_r660252198 ## File path: dev/sparktestsupport/modules.py ## @@ -19,10 +19,67 @@ import itertools import os import re +import unittest +import sys + +from

[GitHub] [spark] HyukjinKwon commented on pull request #32867: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-06-28 Thread GitBox
HyukjinKwon commented on pull request #32867: URL: https://github.com/apache/spark/pull/32867#issuecomment-870200303 @Yikun sorry but mind resolving the conflicts please? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33097: [SPARK-35901][PYTHON] Refine type hints in pyspark.pandas.window

2021-06-28 Thread GitBox
AmplabJenkins removed a comment on pull request #33097: URL: https://github.com/apache/spark/pull/33097#issuecomment-870198591 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140352/

[GitHub] [spark] AmplabJenkins commented on pull request #33097: [SPARK-35901][PYTHON] Refine type hints in pyspark.pandas.window

2021-06-28 Thread GitBox
AmplabJenkins commented on pull request #33097: URL: https://github.com/apache/spark/pull/33097#issuecomment-870198591 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140352/ -- This

[GitHub] [spark] HyukjinKwon commented on pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
HyukjinKwon commented on pull request #32365: URL: https://github.com/apache/spark/pull/32365#issuecomment-870198100 Currently, it's sort of mixed. - If we want to switch to Hive format strings we should add a configuration and apply to Cast, result strings, and everywhere without

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
AngersZhuuuu commented on a change in pull request #32365: URL: https://github.com/apache/spark/pull/32365#discussion_r660250195 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Column.scala ## @@ -1190,6 +1190,22 @@ class Column(val expr: Expression) extends Logging

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #32365: URL: https://github.com/apache/spark/pull/32365#discussion_r660249520 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -3167,6 +3167,16 @@ object SQLConf { .booleanConf

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #32365: URL: https://github.com/apache/spark/pull/32365#discussion_r660249201 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Column.scala ## @@ -1190,6 +1190,22 @@ class Column(val expr: Expression) extends Logging

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #32365: URL: https://github.com/apache/spark/pull/32365#discussion_r660249045 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -3167,6 +3167,16 @@ object SQLConf { .booleanConf

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #32365: URL: https://github.com/apache/spark/pull/32365#discussion_r660248663 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Column.scala ## @@ -1190,6 +1190,22 @@ class Column(val expr: Expression) extends Logging

[GitHub] [spark] AmplabJenkins commented on pull request #33114: [SPARK-35913][SQL] Create hive permanent function with owner name

2021-06-28 Thread GitBox
AmplabJenkins commented on pull request #33114: URL: https://github.com/apache/spark/pull/33114#issuecomment-870196000 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] dongjoon-hyun opened a new pull request #33122: [SPARK-35922][BUILD] Upgrade maven-shade-plugin to 3.2.4

2021-06-28 Thread GitBox
dongjoon-hyun opened a new pull request #33122: URL: https://github.com/apache/spark/pull/33122 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
AngersZhuuuu commented on a change in pull request #32365: URL: https://github.com/apache/spark/pull/32365#discussion_r660247448 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Column.scala ## @@ -1190,6 +1190,22 @@ class Column(val expr: Expression) extends Logging

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #32365: URL: https://github.com/apache/spark/pull/32365#discussion_r660245441 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -3167,6 +3167,16 @@ object SQLConf { .booleanConf

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #32365: URL: https://github.com/apache/spark/pull/32365#discussion_r660246398 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -3167,6 +3167,16 @@ object SQLConf { .booleanConf

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #32365: URL: https://github.com/apache/spark/pull/32365#discussion_r660246290 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -3167,6 +3167,16 @@ object SQLConf { .booleanConf

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33065: [SPARK-35880][SS] Track the duplicates dropped count in dedupe operator

2021-06-28 Thread GitBox
AmplabJenkins removed a comment on pull request #33065: URL: https://github.com/apache/spark/pull/33065#issuecomment-870192624 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140346/

[GitHub] [spark] AmplabJenkins commented on pull request #33065: [SPARK-35880][SS] Track the duplicates dropped count in dedupe operator

2021-06-28 Thread GitBox
AmplabJenkins commented on pull request #33065: URL: https://github.com/apache/spark/pull/33065#issuecomment-870192624 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140346/ -- This

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #32365: URL: https://github.com/apache/spark/pull/32365#discussion_r660245915 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Column.scala ## @@ -1190,6 +1190,22 @@ class Column(val expr: Expression) extends Logging

[GitHub] [spark] viirya commented on pull request #32928: [SPARK-35784][SS] Implementation for RocksDB instance

2021-06-28 Thread GitBox
viirya commented on pull request #32928: URL: https://github.com/apache/spark/pull/32928#issuecomment-870192493 Thanks @HeartSaVioR. I will take another look. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-28 Thread GitBox
HyukjinKwon commented on a change in pull request #32365: URL: https://github.com/apache/spark/pull/32365#discussion_r660245441 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -3167,6 +3167,16 @@ object SQLConf { .booleanConf

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32928: [SPARK-35784][SS] Implementation for RocksDB instance

2021-06-28 Thread GitBox
dongjoon-hyun commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r660244934 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,451 @@ +/* + * Licensed to the
