[GitHub] [spark] allisonwang-db commented on a change in pull request #33070: [SPARK-35551][SQL] Handle the COUNT bug for lateral subqueries

2021-06-29 Thread GitBox
allisonwang-db commented on a change in pull request #33070: URL: https://github.com/apache/spark/pull/33070#discussion_r660271014 ## File path: sql/core/src/test/resources/sql-tests/inputs/join-lateral.sql ## @@ -83,8 +83,65 @@ SELECT * FROM t1 WHERE c1 = (SELECT MIN(a) FROM

[GitHub] [spark] cloud-fan commented on a change in pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-29 Thread GitBox
cloud-fan commented on a change in pull request #32365: URL: https://github.com/apache/spark/pull/32365#discussion_r660276267 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Column.scala ## @@ -1190,6 +1190,22 @@ class Column(val expr: Expression) extends Logging {

[GitHub] [spark] HyukjinKwon commented on pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-29 Thread GitBox
HyukjinKwon commented on pull request #32365: URL: https://github.com/apache/spark/pull/32365#issuecomment-870198100 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] q2w removed a comment on pull request #32902: [SPARK-35754][CORE] Add config to put migrating blocks on disk only

2021-06-29 Thread GitBox
q2w removed a comment on pull request #32902: URL: https://github.com/apache/spark/pull/32902#issuecomment-870246460 > Could you check the UT failure, @q2w ? If it passes in your environment, please re-trigger GitHub Action. > > ``` > [info] *** 1 TEST FAILED *** > [error]

[GitHub] [spark] cloud-fan commented on pull request #32850: [SPARK-34920][CORE][SQL] Add error classes with SQLSTATE

2021-06-29 Thread GitBox
cloud-fan commented on pull request #32850: URL: https://github.com/apache/spark/pull/32850#issuecomment-870297019 retest this please

[GitHub] [spark] ulysses-you commented on pull request #33123: [SPARK-35923][SQL] Coalesce empty partition with mixed CoalescedPartitionSpec and PartialReducerPartitionSpec

2021-06-29 Thread GitBox
ulysses-you commented on pull request #33123: URL: https://github.com/apache/spark/pull/33123#issuecomment-870266260 cc @maropu @cloud-fan @viirya @JkSelf @yaooqinn

[GitHub] [spark] yaooqinn commented on pull request #32931: [SPARK-33898][SQL] Support SHOW CREATE TABLE In V2

2021-06-29 Thread GitBox
yaooqinn commented on pull request #32931: URL: https://github.com/apache/spark/pull/32931#issuecomment-870174108 thanks @Peng-Lei and all. merged to master

[GitHub] [spark] ueshin closed pull request #32955: [SPARK-35344][PYTHON] Support creating a Column of numpy literals in pandas API on Spark

2021-06-29 Thread GitBox
ueshin closed pull request #32955: URL: https://github.com/apache/spark/pull/32955

[GitHub] [spark] AngersZhuuuu commented on pull request #32943: [SPARK-35735][SQL] Take into account day-time interval fields in cast

2021-06-29 Thread GitBox
AngersZhuuuu commented on pull request #32943: URL: https://github.com/apache/spark/pull/32943#issuecomment-870585568 ping @MaxGekk Could you help re-trigger the Jenkins test?

[GitHub] [spark] dongjoon-hyun commented on pull request #33119: [SPARK-35920][BUILD] Upgrade to Chill 0.10.0

2021-06-29 Thread GitBox
dongjoon-hyun commented on pull request #33119: URL: https://github.com/apache/spark/pull/33119#issuecomment-870212565

[GitHub] [spark] dongjoon-hyun commented on pull request #33122: [SPARK-35922][BUILD] Upgrade maven-shade-plugin to 3.2.4

2021-06-29 Thread GitBox
dongjoon-hyun commented on pull request #33122: URL: https://github.com/apache/spark/pull/33122#issuecomment-870214726

[GitHub] [spark] gengliangwang commented on pull request #33115: [SPARK-35916][SQL] Support subtraction among Date/Timestamp/TimestampWithoutTZ

2021-06-29 Thread GitBox
gengliangwang commented on pull request #33115: URL: https://github.com/apache/spark/pull/33115#issuecomment-870284954 Merging to master

[GitHub] [spark] HyukjinKwon commented on pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-29 Thread GitBox
HyukjinKwon commented on pull request #33096: URL: https://github.com/apache/spark/pull/33096#issuecomment-870168855 Thanks man!

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33065: [SPARK-35880][SS] Track the duplicates dropped count in dedupe operator

2021-06-29 Thread GitBox
AmplabJenkins removed a comment on pull request #33065: URL: https://github.com/apache/spark/pull/33065#issuecomment-870192624 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140346/

[GitHub] [spark] vkorukanti commented on a change in pull request #33091: [SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress

2021-06-29 Thread GitBox
vkorukanti commented on a change in pull request #33091: URL: https://github.com/apache/spark/pull/33091#discussion_r660639490 ## File path: sql/core/src/test/scala/org/apache/spark/sql/streaming/FlatMapGroupsWithStateSuite.scala ## @@ -1122,6 +1106,43 @@ class

[GitHub] [spark] AmplabJenkins commented on pull request #33114: [SPARK-35913][SQL] Create hive permanent function with owner name

2021-06-29 Thread GitBox
AmplabJenkins commented on pull request #33114: URL: https://github.com/apache/spark/pull/33114#issuecomment-870196000 Can one of the admins verify this patch?

[GitHub] [spark] MaxGekk commented on pull request #32943: [SPARK-35735][SQL] Take into account day-time interval fields in cast

2021-06-29 Thread GitBox
MaxGekk commented on pull request #32943: URL: https://github.com/apache/spark/pull/32943#issuecomment-870618586 @AngersZhuuuu You can re-trigger yourself by making an empty commit: ``` git commit --allow-empty -m "Trigger build" ```

[GitHub] [spark] cloud-fan commented on a change in pull request #32980: [SPARK-35829][SQL] Clean up evaluates subexpressions and add more flexibility to evaluate particular subexpression

2021-06-29 Thread GitBox
cloud-fan commented on a change in pull request #32980: URL: https://github.com/apache/spark/pull/32980#discussion_r660694807 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ## @@ -1071,14 +1133,13 @@ class

[GitHub] [spark] otterc commented on a change in pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-29 Thread GitBox
otterc commented on a change in pull request #32140: URL: https://github.com/apache/spark/pull/32140#discussion_r660704248 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -767,6 +878,83 @@ final class

[GitHub] [spark] cloud-fan closed pull request #33123: [SPARK-35923][SQL] Coalesce empty partition with mixed CoalescedPartitionSpec and PartialReducerPartitionSpec

2021-06-29 Thread GitBox
cloud-fan closed pull request #33123: URL: https://github.com/apache/spark/pull/33123

[GitHub] [spark] cloud-fan commented on pull request #33123: [SPARK-35923][SQL] Coalesce empty partition with mixed CoalescedPartitionSpec and PartialReducerPartitionSpec

2021-06-29 Thread GitBox
cloud-fan commented on pull request #33123: URL: https://github.com/apache/spark/pull/33123#issuecomment-870675458 thanks, merging to master!

[GitHub] [spark] Ngone51 commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the state

2021-06-29 Thread GitBox
Ngone51 commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r660724062 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -567,7 +598,8 @@ public void

[GitHub] [spark] xinrong-databricks commented on pull request #32955: [SPARK-35344][PYTHON] Support creating a Column of numpy literals in pandas API on Spark

2021-06-29 Thread GitBox
xinrong-databricks commented on pull request #32955: URL: https://github.com/apache/spark/pull/32955#issuecomment-870743798 Thanks @ueshin! I will file follow-up tickets.

[GitHub] [spark] Yikun commented on pull request #32867: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-06-29 Thread GitBox
Yikun commented on pull request #32867: URL: https://github.com/apache/spark/pull/32867#issuecomment-870221373 > Just wondering we can only verify it manually? If by any chance, some tests are not found, can we easily know it? It's just the same as before: if we forgot to add the path of

[GitHub] [spark] dongjoon-hyun closed pull request #33127: [SPARK-35483][FOLLOWUP][TESTS] Update run-tests.py doctest

2021-06-29 Thread GitBox
dongjoon-hyun closed pull request #33127: URL: https://github.com/apache/spark/pull/33127

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32928: [SPARK-35784] Implementation for RocksDB instance

2021-06-29 Thread GitBox
dongjoon-hyun commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r660235291 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,451 @@ +/* + * Licensed to the

[GitHub] [spark] sarutak commented on a change in pull request #33121: [SPARK-35921][BUILD] ${spark.yarn.isHadoopProvided} in config.properties is not edited if build with SBT

2021-06-29 Thread GitBox
sarutak commented on a change in pull request #33121: URL: https://github.com/apache/spark/pull/33121#discussion_r660273233 ## File path: project/SparkBuild.scala ## @@ -413,6 +413,9 @@ object SparkBuild extends PomBuild {

[GitHub] [spark] HyukjinKwon edited a comment on pull request #32365: [SPARK-35228][SQL] Add expression ToPrettyString for keep consistent between hive/spark format in df.show and transform

2021-06-29 Thread GitBox
HyukjinKwon edited a comment on pull request #32365: URL: https://github.com/apache/spark/pull/32365#issuecomment-870235430 > BTW why do we add a new API in Column? That's my question (https://github.com/apache/spark/pull/32365#discussion_r660245915) ... `df.show` is already

[GitHub] [spark] ueshin commented on pull request #33117: [SPARK-35859][PYTHON] Cleanup type hints in pandas-on-Spark

2021-06-29 Thread GitBox
ueshin commented on pull request #33117: URL: https://github.com/apache/spark/pull/33117#issuecomment-870124113 cc @HyukjinKwon @itholic @xinrong-databricks

[GitHub] [spark] ulysses-you commented on a change in pull request #33123: [SPARK-35923][SQL] Coalesce empty partition with mixed CoalescedPartitionSpec and PartialReducerPartitionSpec

2021-06-29 Thread GitBox
ulysses-you commented on a change in pull request #33123: URL: https://github.com/apache/spark/pull/33123#discussion_r660387330 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/ShufflePartitionsUtil.scala ## @@ -173,7 +176,11 @@ object

[GitHub] [spark] aokolnychyi commented on a change in pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-29 Thread GitBox
aokolnychyi commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r660233482 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala ## @@ -66,10 +71,33 @@ object

[GitHub] [spark] akshatb1 opened a new pull request #33134: [SPARK-35931][CORE][YARN] Ability to override Yarn Cluster Submit Class with Configuration

2021-06-29 Thread GitBox
akshatb1 opened a new pull request #33134: URL: https://github.com/apache/spark/pull/33134 ### What changes were proposed in this pull request? - This PR allows adding a custom implementation of YARN_CLUSTER_SUBMIT_CLASS as a configuration. - This is useful when there is a

[GitHub] [spark] otterc commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the state

2021-06-29 Thread GitBox
otterc commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r660749709 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/protocol/PushBlockStream.java ## @@ -99,10 +110,11 @@ public void

[GitHub] [spark] viirya commented on a change in pull request #32980: [SPARK-35829][SQL] Clean up evaluates subexpressions and add more flexibility to evaluate particular subexpression

2021-06-29 Thread GitBox
viirya commented on a change in pull request #32980: URL: https://github.com/apache/spark/pull/32980#discussion_r660768427 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ## @@ -76,24 +76,34 @@ object ExprCode {

[GitHub] [spark] viirya commented on a change in pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-29 Thread GitBox
viirya commented on a change in pull request #32753: URL: https://github.com/apache/spark/pull/32753#discussion_r660168778 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetReadState.java ## @@ -33,31 +51,104 @@ /** The

[GitHub] [spark] attilapiros commented on pull request #32793: [WIP][SPARK-35430] Switch on "PVs with local storage" integration test on Docker driver

2021-06-29 Thread GitBox
attilapiros commented on pull request #32793: URL: https://github.com/apache/spark/pull/32793#issuecomment-870188248 @dongjoon-hyun @shaneknapp to summarize my findings for running the "PVs with local storage": 1. The Jenkins job should be extended to create the Spark user on

[GitHub] [spark] cloud-fan commented on a change in pull request #32787: [SPARK-35618][SQL] Resolve star expressions in subqueries using outer query plans

2021-06-29 Thread GitBox
cloud-fan commented on a change in pull request #32787: URL: https://github.com/apache/spark/pull/32787#discussion_r660349216 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisErrorSuite.scala ## @@ -807,4 +807,48 @@ class

[GitHub] [spark] dongjoon-hyun commented on pull request #33125: [SPARK-35483][TESTS] Enable docker_integration_tests for catalyst/sql module changes too

2021-06-29 Thread GitBox
dongjoon-hyun commented on pull request #33125: URL: https://github.com/apache/spark/pull/33125#issuecomment-870255930

[GitHub] [spark] yaooqinn commented on a change in pull request #33114: [SPARK-35913][SQL] Create hive permanent function with owner name

2021-06-29 Thread GitBox
yaooqinn commented on a change in pull request #33114: URL: https://github.com/apache/spark/pull/33114#discussion_r660252922 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -925,19 +925,19 @@ private[hive] class

[GitHub] [spark] HyukjinKwon edited a comment on pull request #32991: [SPARK-35678][ML][FOLLOWUP] softmax support offset and step

2021-06-29 Thread GitBox
HyukjinKwon edited a comment on pull request #32991: URL: https://github.com/apache/spark/pull/32991#issuecomment-870124789 Oh, yeah it was reverted partially already at https://github.com/apache/spark/pull/33049

[GitHub] [spark] HyukjinKwon closed pull request #32867: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-06-29 Thread GitBox
HyukjinKwon closed pull request #32867: URL: https://github.com/apache/spark/pull/32867

[GitHub] [spark] HeartSaVioR commented on a change in pull request #32928: [SPARK-35784][SS] Implementation for RocksDB instance

2021-06-29 Thread GitBox
HeartSaVioR commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r660365487 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,451 @@ +/* + * Licensed to the

[GitHub] [spark] dongjoon-hyun closed pull request #33119: [SPARK-35920][BUILD] Upgrade to Chill 0.10.0

2021-06-29 Thread GitBox
dongjoon-hyun closed pull request #33119: URL: https://github.com/apache/spark/pull/33119

[GitHub] [spark] aokolnychyi commented on pull request #33120: [SPARK-35899][SQL][FOLLOWUP] Utility to convert connector expressions to Catalyst

2021-06-29 Thread GitBox
aokolnychyi commented on pull request #33120: URL: https://github.com/apache/spark/pull/33120#issuecomment-870176026

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33125: [SPARK-35483][TESTS] Add sql dependency to docker_integration_tests

2021-06-29 Thread GitBox
dongjoon-hyun commented on a change in pull request #33125: URL: https://github.com/apache/spark/pull/33125#discussion_r660296765 ## File path: dev/sparktestsupport/modules.py ## @@ -769,7 +769,7 @@ def __hash__(self): docker_integration_tests = Module(

[GitHub] [spark] q2w commented on pull request #32902: [SPARK-35754][CORE] Add config to put migrating blocks on disk only

2021-06-29 Thread GitBox
q2w commented on pull request #32902: URL: https://github.com/apache/spark/pull/32902#issuecomment-870246460 > Could you check the UT failure, @q2w ? If it passes in your environment, please re-trigger GitHub Action. > > ``` > [info] *** 1 TEST FAILED *** > [error] Failed:

[GitHub] [spark] otterc commented on pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-29 Thread GitBox
otterc commented on pull request #32140: URL: https://github.com/apache/spark/pull/32140#issuecomment-870101000

[GitHub] [spark] gao634209276 commented on pull request #33119: [SPARK-35920][BUILD] Upgrade to Chill 0.10.0

2021-06-29 Thread GitBox
gao634209276 commented on pull request #33119: URL: https://github.com/apache/spark/pull/33119#issuecomment-870318756 Is this tested? When I was compiling the unsafe module, I found that esotericsoftware was missing a dependency and reported an error

[GitHub] [spark] ueshin commented on pull request #32955: [SPARK-35344][PYTHON] Support creating a Column of numpy literals in pandas API on Spark

2021-06-29 Thread GitBox
ueshin commented on pull request #32955: URL: https://github.com/apache/spark/pull/32955#issuecomment-870169030

[GitHub] [spark] gengliangwang closed pull request #33115: [SPARK-35916][SQL] Support subtraction among Date/Timestamp/TimestampWithoutTZ

2021-06-29 Thread GitBox
gengliangwang closed pull request #33115: URL: https://github.com/apache/spark/pull/33115

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32955: [SPARK-35344][PYTHON] Support creating a Column of numpy literals in pandas API on Spark

2021-06-29 Thread GitBox
HyukjinKwon commented on a change in pull request #32955: URL: https://github.com/apache/spark/pull/32955#discussion_r660207643 ## File path: python/pyspark/pandas/spark/functions.py ## @@ -32,6 +41,22 @@ def repeat(col: Column, n: Union[int, Column]) -> Column: return

[GitHub] [spark] cloud-fan commented on pull request #33124: [SPARK-34302][SQL][TEST] Update jdbc.v2.*IntegrationSuite

2021-06-29 Thread GitBox
cloud-fan commented on pull request #33124: URL: https://github.com/apache/spark/pull/33124#issuecomment-870245406 thanks for fixing it! BTW is there a way to capture it in PR builders? Otherwise, it's too easy to forget checking...

[GitHub] [spark] sarutak commented on pull request #33124: [SPARK-34302][SQL][TEST] Update jdbc.v2.*IntegrationSuite

2021-06-29 Thread GitBox
sarutak commented on pull request #33124: URL: https://github.com/apache/spark/pull/33124#issuecomment-870249724 @dongjoon-hyun I think it's because docker-integration-tests doesn't depend on the sql module in `dev/sparktestsupport/modules.py`. We intentionally let the dependency

[GitHub] [spark] HyukjinKwon commented on pull request #33110: [SPARK-35911][SQL] Update exprId for IN subquery in DPP

2021-06-29 Thread GitBox
HyukjinKwon commented on pull request #33110: URL: https://github.com/apache/spark/pull/33110#issuecomment-870204065 @Swinky did you face something like https://github.com/apache/spark/pull/32400#issuecomment-831051189? otherwise rebasing would retrigger the build properly

[GitHub] [spark] imback82 commented on pull request #33124: [SPARK-34302][FOLLOWUP][SQL][TESTS] Update jdbc.v2.*IntegrationSuite

2021-06-29 Thread GitBox
imback82 commented on pull request #33124: URL: https://github.com/apache/spark/pull/33124#issuecomment-870285494 Thanks @dongjoon-hyun!

[GitHub] [spark] github-actions[bot] commented on pull request #31870: [SPARK-32288][UI] Add exception summary for failed tasks in stage page

2021-06-29 Thread GitBox
github-actions[bot] commented on pull request #31870: URL: https://github.com/apache/spark/pull/31870#issuecomment-870129068 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue

[GitHub] [spark] mridulm commented on a change in pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-29 Thread GitBox
mridulm commented on a change in pull request #32140: URL: https://github.com/apache/spark/pull/32140#discussion_r660221758 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -347,35 +355,56 @@ final class

[GitHub] [spark] Ngone51 commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the state

2021-06-29 Thread GitBox
Ngone51 commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r660699353 ## File path: core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala ## @@ -57,6 +61,11 @@ private[spark] class DiskBlockManager(conf:

[GitHub] [spark] AmplabJenkins commented on pull request #33135: [SPARK-35931][CORE][YARN] Ability to override Yarn Cluster Submit Class with Configuration

2021-06-29 Thread GitBox
AmplabJenkins commented on pull request #33135: URL: https://github.com/apache/spark/pull/33135#issuecomment-870689349 Can one of the admins verify this patch?

[GitHub] [spark] Ngone51 commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the state

2021-06-29 Thread GitBox
Ngone51 commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r660719801 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -403,38 +394,78 @@ public

[GitHub] [spark] otterc commented on a change in pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-29 Thread GitBox
otterc commented on a change in pull request #32140: URL: https://github.com/apache/spark/pull/32140#discussion_r660721217 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -386,40 +415,53 @@ final class

[GitHub] [spark] gengliangwang commented on a change in pull request #33136: [SPARK-35932][SQL] Support extracting hour/minute/second from timestamp without time zone

2021-06-29 Thread GitBox
gengliangwang commented on a change in pull request #33136: URL: https://github.com/apache/spark/pull/33136#discussion_r660729422 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ## @@ -968,6 +968,7 @@ object TypeCoercion

[GitHub] [spark] Ngone51 commented on a change in pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-29 Thread GitBox
Ngone51 commented on a change in pull request #32140: URL: https://github.com/apache/spark/pull/32140#discussion_r660766903 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -767,6 +878,83 @@ final class

[GitHub] [spark] akshatb1 commented on pull request #33135: [SPARK-35931][CORE][YARN] Ability to override Yarn Cluster Submit Class with Configuration

2021-06-29 Thread GitBox
akshatb1 commented on pull request #33135: URL: https://github.com/apache/spark/pull/33135#issuecomment-870772962 @srowen @HyukjinKwon @tgravescs Could you kindly help in reviewing this PR?

[GitHub] [spark] sunchao commented on a change in pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-29 Thread GitBox
sunchao commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r660321387 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsRuntimeFiltering.java ## @@ -0,0 +1,61 @@ +/* + * Licensed to the

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33130: [SPARK-35928][BUILD] Upgrade ASM to 9.1

2021-06-29 Thread GitBox
dongjoon-hyun commented on a change in pull request #33130: URL: https://github.com/apache/spark/pull/33130#discussion_r660822494 ## File path: pom.xml ## @@ -2858,6 +2864,18 @@ org.apache.maven.plugins maven-shade-plugin 3.2.4 + +

[GitHub] [spark] dongjoon-hyun closed pull request #33130: [SPARK-35928][BUILD] Upgrade ASM to 9.1

2021-06-29 Thread GitBox
dongjoon-hyun closed pull request #33130: URL: https://github.com/apache/spark/pull/33130

[GitHub] [spark] cloud-fan commented on pull request #33113: [SPARK-34302][SQL] Migrate ALTER TABLE ... CHANGE COLUMN command to use UnresolvedTable to resolve the identifier

2021-06-29 Thread GitBox
cloud-fan commented on pull request #33113: URL: https://github.com/apache/spark/pull/33113#issuecomment-870189467 thanks, merging to master!

[GitHub] [spark] viirya commented on pull request #32867: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-06-29 Thread GitBox
viirya commented on pull request #32867: URL: https://github.com/apache/spark/pull/32867#issuecomment-870201584 Just wondering we can only verify it manually? If by any chance, some tests are not found, can we easily know it?

[GitHub] [spark] Ngone51 edited a comment on pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-29 Thread GitBox
Ngone51 edited a comment on pull request #32140: URL: https://github.com/apache/spark/pull/32140#issuecomment-870235923

[GitHub] [spark] github-actions[bot] closed pull request #31774: [SPARK-34659] Fix that Web UI always correctly get appId

2021-06-29 Thread GitBox
github-actions[bot] closed pull request #31774: URL: https://github.com/apache/spark/pull/31774

[GitHub] [spark] dongjoon-hyun commented on pull request #33120: [SPARK-35899][SQL][FOLLOWUP] Utility to convert connector expressions to Catalyst

2021-06-29 Thread GitBox
dongjoon-hyun commented on pull request #33120: URL: https://github.com/apache/spark/pull/33120#issuecomment-870245168 Merged to master. Thank you, @aokolnychyi and @HyukjinKwon .

[GitHub] [spark] cloud-fan commented on a change in pull request #32850: [SPARK-34920][CORE][SQL] Add error classes with SQLSTATE

2021-06-29 Thread GitBox
cloud-fan commented on a change in pull request #32850: URL: https://github.com/apache/spark/pull/32850#discussion_r660333239 ## File path: core/src/main/resources/error/README.md ## @@ -0,0 +1,79 @@ +# Guidelines + +To throw a standardized exception, developers should use an

[GitHub] [spark] viirya commented on a change in pull request #32928: [SPARK-35784][SS] Implementation for RocksDB instance

2021-06-29 Thread GitBox
viirya commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r660295530 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,451 @@ +/* + * Licensed to the Apache

[GitHub] [spark] HyukjinKwon commented on pull request #32902: [SPARK-35754][CORE] Add config to put migrating blocks on disk only

2021-06-29 Thread GitBox
HyukjinKwon commented on pull request #32902: URL: https://github.com/apache/spark/pull/32902#issuecomment-870250749 @q2w, can you rebase and create a new PR? Seems like GA in this PR is somehow messed up.

[GitHub] [spark] HyukjinKwon commented on pull request #32583: [SPARK-35437][SQL] Hive partition filtering client optimization

2021-06-29 Thread GitBox
HyukjinKwon commented on pull request #32583: URL: https://github.com/apache/spark/pull/32583#issuecomment-870140501 How is it different from `spark.sql.optimizer.metadataOnly`?

[GitHub] [spark] sarutak commented on pull request #32943: [SPARK-35735][SQL] Take into account day-time interval fields in cast

2021-06-29 Thread GitBox
sarutak commented on pull request #32943: URL: https://github.com/apache/spark/pull/32943#issuecomment-870589175

[GitHub] [spark] cloud-fan commented on a change in pull request #33123: [SPARK-35923][SQL] Coalesce empty partition with mixed CoalescedPartitionSpec and PartialReducerPartitionSpec

2021-06-29 Thread GitBox
cloud-fan commented on a change in pull request #33123: URL: https://github.com/apache/spark/pull/33123#discussion_r660656779 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/ShufflePartitionsUtil.scala ## @@ -172,7 +172,7 @@ object

[GitHub] [spark] akshatb1 opened a new pull request #33135: [SPARK-35931][CORE][YARN] Ability to override Yarn Cluster Submit Class with Configuration

2021-06-29 Thread GitBox
akshatb1 opened a new pull request #33135: URL: https://github.com/apache/spark/pull/33135 ### What changes were proposed in this pull request? - This PR allows adding a custom implementation of YARN_CLUSTER_SUBMIT_CLASS as a configuration. - This is useful when there is a

[GitHub] [spark] Ngone51 commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the state

2021-06-29 Thread GitBox
Ngone51 commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r660713731 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -403,38 +394,78 @@ public

[GitHub] [spark] otterc commented on a change in pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-29 Thread GitBox
otterc commented on a change in pull request #32140: URL: https://github.com/apache/spark/pull/32140#discussion_r660721217 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -386,40 +415,53 @@ final class

[GitHub] [spark] Ngone51 commented on a change in pull request #33034: WIP: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-06-29 Thread GitBox
Ngone51 commented on a change in pull request #33034: URL: https://github.com/apache/spark/pull/33034#discussion_r660733183 ## File path: core/src/main/scala/org/apache/spark/Dependency.scala ## @@ -122,6 +119,14 @@ class ShuffleDependency[K: ClassTag, V: ClassTag, C:

[GitHub] [spark] viirya commented on a change in pull request #32980: [SPARK-35829][SQL] Clean up evaluates subexpressions and add more flexibility to evaluate particular subexpressoin

2021-06-29 Thread GitBox
viirya commented on a change in pull request #32980: URL: https://github.com/apache/spark/pull/32980#discussion_r660744333 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ## @@ -1071,14 +1133,13 @@ class

[GitHub] [spark] otterc commented on a change in pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-29 Thread GitBox
otterc commented on a change in pull request #32140: URL: https://github.com/apache/spark/pull/32140#discussion_r660771829 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -767,6 +878,83 @@ final class

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #33131: [SPARK-35920][FOLLOWUP][BUILD] Fix Kryo Shaded dependency

2021-06-29 Thread GitBox
dongjoon-hyun edited a comment on pull request #33131: URL: https://github.com/apache/spark/pull/33131#issuecomment-870742689 FYI, in the master branch, I can test the module without this PR. Given that, the `unsafe` module is not broken at least. In which module did you hit the failure? ```

[GitHub] [spark] dongjoon-hyun commented on pull request #33131: [SPARK-35920][FOLLOWUP][BUILD] Fix Kryo Shaded dependency

2021-06-29 Thread GitBox
dongjoon-hyun commented on pull request #33131: URL: https://github.com/apache/spark/pull/33131#issuecomment-870742689 FYI, in the master branch, I can test the module without this PR. ``` $ build/sbt "unsafe/test" ... [info] Run completed in 1 second, 56 milliseconds. [info]

[GitHub] [spark] xuanyuanking commented on a change in pull request #32928: [SPARK-35784][SS] Implementation for RocksDB instance

2021-06-29 Thread GitBox
xuanyuanking commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r660815271 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,451 @@ +/* + * Licensed to the

[GitHub] [spark] srowen commented on pull request #33130: [SPARK-35928][BUILD] Upgrade ASM to 9.1

2021-06-29 Thread GitBox
srowen commented on pull request #33130: URL: https://github.com/apache/spark/pull/33130#issuecomment-870789567 I'm probably missing something - we don't have / need Jenkins tests, just the GitHub Actions? I just couldn't see test results here, or for the Chill change.

[GitHub] [spark] HeartSaVioR commented on a change in pull request #33091: [SPARK-35896][SS] Include more granular metrics for stateful operators in StreamingQueryProgress

2021-06-29 Thread GitBox
HeartSaVioR commented on a change in pull request #33091: URL: https://github.com/apache/spark/pull/33091#discussion_r660321229 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FlatMapGroupsWithStateExec.scala ## @@ -127,12 +132,26 @@ case class

[GitHub] [spark] Ngone51 commented on pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-29 Thread GitBox
Ngone51 commented on pull request #32140: URL: https://github.com/apache/spark/pull/32140#issuecomment-870235923 Sorry for the delay. I'll do a review today. BTW, are there any other necessary magnet PRs that have to be merged for the 3.2 release?

[GitHub] [spark] HyukjinKwon commented on pull request #33106: [SPARK-35876][SQL] ArraysZip should retain field names to avoid being re-written by analyzer/optimizer

2021-06-29 Thread GitBox
HyukjinKwon commented on pull request #33106: URL: https://github.com/apache/spark/pull/33106#issuecomment-870202197 Merged to master.

[GitHub] [spark] cloud-fan commented on a change in pull request #33123: [SPARK-35923][SQL] Coalesce empty partition with mixed CoalescedPartitionSpec and PartialReducerPartitionSpec

2021-06-29 Thread GitBox
cloud-fan commented on a change in pull request #33123: URL: https://github.com/apache/spark/pull/33123#discussion_r660368966 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/ShufflePartitionsUtil.scala ## @@ -173,7 +176,11 @@ object

[GitHub] [spark] HyukjinKwon closed pull request #33106: [SPARK-35876][SQL] ArraysZip should retain field names to avoid being re-written by analyzer/optimizer

2021-06-29 Thread GitBox
HyukjinKwon closed pull request #33106: URL: https://github.com/apache/spark/pull/33106

[GitHub] [spark] ulysses-you commented on a change in pull request #32883: [SPARK-35725][SQL] Support repartition expand partitions in AQE

2021-06-29 Thread GitBox
ulysses-you commented on a change in pull request #32883: URL: https://github.com/apache/spark/pull/32883#discussion_r660214183 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/ExpandShufflePartitions.scala ## @@ -0,0 +1,98 @@ +/* + * Licensed to

[GitHub] [spark] xuanyuanking commented on a change in pull request #32928: [SPARK-35784][SS] Implementation for RocksDB instance

2021-06-29 Thread GitBox
xuanyuanking commented on a change in pull request #32928: URL: https://github.com/apache/spark/pull/32928#discussion_r660314758 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala ## @@ -0,0 +1,451 @@ +/* + * Licensed to the

[GitHub] [spark] dongjoon-hyun commented on pull request #33124: [SPARK-34302][SQL][TEST] Update jdbc.v2.*IntegrationSuite

2021-06-29 Thread GitBox
dongjoon-hyun commented on pull request #33124: URL: https://github.com/apache/spark/pull/33124#issuecomment-870246910

[GitHub] [spark] Shockang edited a comment on pull request #33101: [SPARK-35907][CORE] Instead of File#mkdirs, Files#createDirectories is expected.

2021-06-29 Thread GitBox
Shockang edited a comment on pull request #33101: URL: https://github.com/apache/spark/pull/33101#issuecomment-870278232 @dongjoon-hyun It has been revised. Could you please verify this patch?

[GitHub] [spark] sunchao commented on a change in pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-29 Thread GitBox
sunchao commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r660175084 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala ## @@ -66,10 +71,33 @@ object

[GitHub] [spark] viirya commented on a change in pull request #33108: [SPARK-35898][SQL] Fix arrays and maps in RowToColumnConverter

2021-06-29 Thread GitBox
viirya commented on a change in pull request #33108: URL: https://github.com/apache/spark/pull/33108#discussion_r660164517 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala ## @@ -264,12 +264,12 @@ private object RowToColumnConverter {

[GitHub] [spark] Yikun commented on a change in pull request #32139: [SPARK-35032][PYTHON] Port Koalas Index unit tests into PySpark

2021-06-29 Thread GitBox
Yikun commented on a change in pull request #32139: URL: https://github.com/apache/spark/pull/32139#discussion_r660422105 ## File path: dev/sparktestsupport/modules.py ## @@ -611,43 +611,47 @@ def __hash__(self): "pyspark.pandas.spark.utils",
