[GitHub] [spark] AngersZhuuuu closed pull request #33082: [SPARK-35886][SQL] CodeGenerator.getLocalInputVariableValues should handle case that expression match SubExprEliminationState but not Variab

2021-06-26 Thread GitBox
AngersZhuuuu closed pull request #33082: URL: https://github.com/apache/spark/pull/33082 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] mridulm commented on pull request #32286: [SPARK-35181][CORE] Use zstd for spark.io.compression.codec by default

2021-06-26 Thread GitBox
mridulm commented on pull request #32286: URL: https://github.com/apache/spark/pull/32286#issuecomment-869102082 I am guessing this is use of `spark.sql.adaptive.advisoryPartitionSizeInBytes` ? Sounds good to continue using lz4 to preserve current behavior. We can always modify the

[GitHub] [spark] AngersZhuuuu commented on pull request #33082: [SPARK-35886][SQL] CodeGenerator.getLocalInputVariableValues should handle case that expression match SubExprEliminationState but not

2021-06-26 Thread GitBox
AngersZhuuuu commented on pull request #33082: URL: https://github.com/apache/spark/pull/33082#issuecomment-869102125 Close this according to https://github.com/apache/spark/pull/33103

[GitHub] [spark] mridulm edited a comment on pull request #32286: [SPARK-35181][CORE] Use zstd for spark.io.compression.codec by default

2021-06-26 Thread GitBox
mridulm edited a comment on pull request #32286: URL: https://github.com/apache/spark/pull/32286#issuecomment-869102082 Thanks for clarifying @dongjoon-hyun ! I am guessing this is use of `spark.sql.adaptive.advisoryPartitionSizeInBytes` ? Sounds good to continue using lz4 to

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #31790: [SPARK-34509][K8S] Make dynamic allocation upscaling more progressive on K8S

2021-06-26 Thread GitBox
dongjoon-hyun commented on a change in pull request #31790: URL: https://github.com/apache/spark/pull/31790#discussion_r659263233 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala ## @@ -292,6 +292,14 @@ private[spark]

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #31790: [SPARK-34509][K8S] Make dynamic allocation upscaling more progressive on K8S

2021-06-26 Thread GitBox
dongjoon-hyun commented on a change in pull request #31790: URL: https://github.com/apache/spark/pull/31790#discussion_r659263493 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala ## @@ -216,10

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #31790: [SPARK-34509][K8S] Make dynamic allocation upscaling more progressive on K8S

2021-06-26 Thread GitBox
dongjoon-hyun commented on a change in pull request #31790: URL: https://github.com/apache/spark/pull/31790#discussion_r659263109 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala ## @@ -292,6 +292,14 @@ private[spark]

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #31790: [SPARK-34509][K8S] Make dynamic allocation upscaling more progressive on K8S

2021-06-26 Thread GitBox
dongjoon-hyun commented on a change in pull request #31790: URL: https://github.com/apache/spark/pull/31790#discussion_r659262994 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala ## @@ -292,6 +292,14 @@ private[spark]

[GitHub] [spark] dongjoon-hyun commented on pull request #31790: [SPARK-34509][K8S] Make dynamic allocation upscaling more progressive on K8S

2021-06-26 Thread GitBox
dongjoon-hyun commented on pull request #31790: URL: https://github.com/apache/spark/pull/31790#issuecomment-869101113 Oh, I missed your ping here. Sorry for being late. I'll review right now.

[GitHub] [spark] dongjoon-hyun commented on pull request #32801: [SPARK-12567][SQL] Add aes_encrypt and aes_decrypt builtin functions

2021-06-26 Thread GitBox
dongjoon-hyun commented on pull request #32801: URL: https://github.com/apache/spark/pull/32801#issuecomment-869100942 cc @rxin

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32801: [SPARK-12567][SQL] Add aes_encrypt and aes_decrypt builtin functions

2021-06-26 Thread GitBox
dongjoon-hyun commented on a change in pull request #32801: URL: https://github.com/apache/spark/pull/32801#discussion_r659262486 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ## @@ -622,6 +622,8 @@ object

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32801: [SPARK-12567][SQL] Add aes_encrypt and aes_decrypt builtin functions

2021-06-26 Thread GitBox
dongjoon-hyun commented on a change in pull request #32801: URL: https://github.com/apache/spark/pull/32801#discussion_r659262138 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala ## @@ -301,3 +305,137 @@ case class CurrentUser()

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #32902: [SPARK-35754][CORE] Add config to put migrating blocks on disk only

2021-06-26 Thread GitBox
dongjoon-hyun edited a comment on pull request #32902: URL: https://github.com/apache/spark/pull/32902#issuecomment-869100131 Could you check the UT failure, @q2w ? If it passes in your environment, please re-trigger GitHub Action. ``` [info] *** 1 TEST FAILED *** [error] Failed:

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32902: [SPARK-35754][CORE] Add config to put migrating blocks on disk only

2021-06-26 Thread GitBox
AmplabJenkins removed a comment on pull request #32902: URL: https://github.com/apache/spark/pull/32902#issuecomment-860592758 Can one of the admins verify this patch?

[GitHub] [spark] dongjoon-hyun commented on pull request #32902: [SPARK-35754][CORE] Add config to put migrating blocks on disk only

2021-06-26 Thread GitBox
dongjoon-hyun commented on pull request #32902: URL: https://github.com/apache/spark/pull/32902#issuecomment-869100131 Could you check the UT failure, @q2w ? ``` [info] *** 1 TEST FAILED *** [error] Failed: Total 3061, Failed 1, Errors 0, Passed 3060, Ignored 7, Canceled 1

[GitHub] [spark] dongjoon-hyun commented on pull request #32902: [SPARK-35754][CORE] Add config to put migrating blocks on disk only

2021-06-26 Thread GitBox
dongjoon-hyun commented on pull request #32902: URL: https://github.com/apache/spark/pull/32902#issuecomment-869100038 ok to test

[GitHub] [spark] dongjoon-hyun commented on pull request #32902: [SPARK-35754][CORE] Add config to put migrating blocks on disk only

2021-06-26 Thread GitBox
dongjoon-hyun commented on pull request #32902: URL: https://github.com/apache/spark/pull/32902#issuecomment-869100012 cc @holdenk

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33070: [SPARK-35551][SQL] Handle the COUNT bug for lateral subqueries

2021-06-26 Thread GitBox
dongjoon-hyun commented on a change in pull request #33070: URL: https://github.com/apache/spark/pull/33070#discussion_r659261525 ## File path: sql/catalyst/src/main/scala-2.12/org/apache/spark/sql/catalyst/expressions/AttributeMap.scala ## @@ -50,4 +50,6 @@ class

[GitHub] [spark] dongjoon-hyun closed pull request #33087: [SPARK-35893][TESTS] Add unit test case for MySQLDialect.getCatalystType

2021-06-26 Thread GitBox
dongjoon-hyun closed pull request #33087: URL: https://github.com/apache/spark/pull/33087

[GitHub] [spark] dongjoon-hyun closed pull request #33099: [SPARK-35904][SQL] Collapse above RebalancePartitions

2021-06-26 Thread GitBox
dongjoon-hyun closed pull request #33099: URL: https://github.com/apache/spark/pull/33099

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33099: [SPARK-35904][SQL] Collapse above RebalancePartitions

2021-06-26 Thread GitBox
dongjoon-hyun commented on a change in pull request #33099: URL: https://github.com/apache/spark/pull/33099#discussion_r659259391 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ## @@ -1373,16 +1373,18 @@

[GitHub] [spark] viirya commented on a change in pull request #33008: [WIP][SPARK-35801][SQL] Support DELETE operations that require rewriting data

2021-06-26 Thread GitBox
viirya commented on a change in pull request #33008: URL: https://github.com/apache/spark/pull/33008#discussion_r659253107 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/SupportsDelta.java ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache

[GitHub] [spark] venkata91 commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the sta

2021-06-26 Thread GitBox
venkata91 commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r659246692 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -417,24 +476,75 @@ public

[GitHub] [spark] venkata91 commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the sta

2021-06-26 Thread GitBox
venkata91 commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r659246152 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -417,24 +476,75 @@ public

[GitHub] [spark] venkata91 commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the sta

2021-06-26 Thread GitBox
venkata91 commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r659246107 ## File path: core/src/main/scala/org/apache/spark/SparkContext.scala ## @@ -583,6 +583,7 @@ class SparkContext(config: SparkConf) extends Logging {

[GitHub] [spark] wangyum opened a new pull request #33105: [SPARK-35908][SQL] Remove repartition if the child maximum number of rows less than or equal to 1

2021-06-26 Thread GitBox
wangyum opened a new pull request #33105: URL: https://github.com/apache/spark/pull/33105 ### What changes were proposed in this pull request? Enhance `OptimizeRepartition` to remove the repartition if the child's maximum number of rows is less than or equal to 1. For example: ```scala

[GitHub] [spark] zhouyejoe commented on pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the state in a bett

2021-06-26 Thread GitBox
zhouyejoe commented on pull request #33078: URL: https://github.com/apache/spark/pull/33078#issuecomment-869081628 Thanks @mridulm @otterc @venkata91 for reviewing the PR. Addressed majority of the comments. @otterc The unit test to simulate the concurrency control is yet to be

[GitHub] [spark] zhouyejoe commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the sta

2021-06-26 Thread GitBox
zhouyejoe commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r659244241 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -417,24 +476,75 @@ public

[GitHub] [spark] zhouyejoe commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the sta

2021-06-26 Thread GitBox
zhouyejoe commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r659244070 ## File path: core/src/main/scala/org/apache/spark/SparkContext.scala ## @@ -583,6 +583,7 @@ class SparkContext(config: SparkConf) extends Logging {

[GitHub] [spark] viirya commented on a change in pull request #33103: [SPARK-35886][SQL] PromotePrecision should not overwrite genCode

2021-06-26 Thread GitBox
viirya commented on a change in pull request #33103: URL: https://github.com/apache/spark/pull/33103#discussion_r659243561 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/decimalExpressions.scala ## @@ -111,9 +111,8 @@ case class

[GitHub] [spark] zhouyejoe commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the sta

2021-06-26 Thread GitBox
zhouyejoe commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r659243498 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -417,24 +476,75 @@ public

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33103: [SPARK-35886][SQL] PromotePrecision should not overwrite genCode

2021-06-26 Thread GitBox
AngersZhuuuu commented on a change in pull request #33103: URL: https://github.com/apache/spark/pull/33103#discussion_r659243432 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/decimalExpressions.scala ## @@ -111,9 +111,8 @@ case class

[GitHub] [spark] zhouyejoe commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the sta

2021-06-26 Thread GitBox
zhouyejoe commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r659243367 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -73,27 +76,26 @@ public class

[GitHub] [spark] zhouyejoe commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the sta

2021-06-26 Thread GitBox
zhouyejoe commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r659243311 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -217,61 +237,74 @@ public

[GitHub] [spark] AmplabJenkins commented on pull request #33101: [SPARK-35907][CORE] Instead of File#mkdirs, Files#createDirectories is expected.

2021-06-26 Thread GitBox
AmplabJenkins commented on pull request #33101: URL: https://github.com/apache/spark/pull/33101#issuecomment-869080424 Can one of the admins verify this patch?

[GitHub] [spark] c21 closed pull request #32881: [SPARK-33298][CORE] Decouple file naming from FileCommitProtocol

2021-06-26 Thread GitBox
c21 closed pull request #32881: URL: https://github.com/apache/spark/pull/32881

[GitHub] [spark] c21 commented on pull request #32881: [SPARK-33298][CORE] Decouple file naming from FileCommitProtocol

2021-06-26 Thread GitBox
c21 commented on pull request #32881: URL: https://github.com/apache/spark/pull/32881#issuecomment-869078476 Update: we decided to go with https://github.com/apache/spark/pull/33012 instead of this PR, as we know [some other

[GitHub] [spark] AmplabJenkins commented on pull request #33104: [SPARK-35902][Core] spark.driver.log.dfsDir with hdfs scheme failed

2021-06-26 Thread GitBox
AmplabJenkins commented on pull request #33104: URL: https://github.com/apache/spark/pull/33104#issuecomment-869077758 Can one of the admins verify this patch?

[GitHub] [spark] github-actions[bot] closed pull request #30763: [SPARK-31801][API][SHUFFLE] Register map output metadata

2021-06-26 Thread GitBox
github-actions[bot] closed pull request #30763: URL: https://github.com/apache/spark/pull/30763

[GitHub] [spark] github-actions[bot] closed pull request #29113: [SPARK-32314][SHS] Add config to control whether log old format of stacktrace

2021-06-26 Thread GitBox
github-actions[bot] closed pull request #29113: URL: https://github.com/apache/spark/pull/29113

[GitHub] [spark] github-actions[bot] commented on pull request #31601: [SPARK-34484][SQL] Rename `map` to `mapAttr` in Catalyst DSL

2021-06-26 Thread GitBox
github-actions[bot] commented on pull request #31601: URL: https://github.com/apache/spark/pull/31601#issuecomment-869077048 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue

[GitHub] [spark] viirya commented on a change in pull request #33103: [SPARK-35886][SQL] PromotePrecision should not overwrite genCode

2021-06-26 Thread GitBox
viirya commented on a change in pull request #33103: URL: https://github.com/apache/spark/pull/33103#discussion_r659240059 ## File path: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ## @@ -2914,6 +2914,21 @@ class DataFrameSuite extends QueryTest val

[GitHub] [spark] viirya commented on a change in pull request #33103: [SPARK-35886][SQL] PromotePrecision should not overwrite genCode

2021-06-26 Thread GitBox
viirya commented on a change in pull request #33103: URL: https://github.com/apache/spark/pull/33103#discussion_r659240038 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/decimalExpressions.scala ## @@ -111,9 +111,8 @@ case class

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33103: [SPARK-35886][SQL] PromotePrecision should not overwrite genCode

2021-06-26 Thread GitBox
dongjoon-hyun commented on a change in pull request #33103: URL: https://github.com/apache/spark/pull/33103#discussion_r659239936 ## File path: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala ## @@ -2914,6 +2914,21 @@ class DataFrameSuite extends QueryTest

[GitHub] [spark] fhygh opened a new pull request #33104: spark.driver.log.dfsDir with hdfs scheme failed

2021-06-26 Thread GitBox
fhygh opened a new pull request #33104: URL: https://github.com/apache/spark/pull/33104 ### What changes were proposed in this pull request? When persisting driver logs to DFS in client mode, support a log directory path that includes a scheme. ### Why are the changes needed? when

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33103: [SPARK-35886][SQL] PromotePrecision should not overwrite genCode

2021-06-26 Thread GitBox
dongjoon-hyun commented on a change in pull request #33103: URL: https://github.com/apache/spark/pull/33103#discussion_r659239901 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/decimalExpressions.scala ## @@ -111,9 +111,8 @@ case class

[GitHub] [spark] attilapiros commented on pull request #31790: [SPARK-34509][K8S] Make dynamic allocation upscaling more progressive on K8S

2021-06-26 Thread GitBox
attilapiros commented on pull request #31790: URL: https://github.com/apache/spark/pull/31790#issuecomment-869068279 jenkins retest this please

[GitHub] [spark] viirya commented on pull request #33103: [SPARK-35886][SQL] PromotePrecision should not overwrite genCode

2021-06-26 Thread GitBox
viirya commented on pull request #33103: URL: https://github.com/apache/spark/pull/33103#issuecomment-869068250 cc @cloud-fan @wangyum @AngersZhuuuu

[GitHub] [spark] viirya opened a new pull request #33103: [SPARK-35886][SQL] PromotePrecision should not overwrite genCode

2021-06-26 Thread GitBox
viirya opened a new pull request #33103: URL: https://github.com/apache/spark/pull/33103 ### What changes were proposed in this pull request? This patch fixes `PromotePrecision`, which overwrites `genCode`, where subexpression elimination should happen. ### Why

[GitHub] [spark] bersprockets commented on pull request #32969: [SPARK-35817][SQL] Restore performance of queries against wide Avro tables

2021-06-26 Thread GitBox
bersprockets commented on pull request #32969: URL: https://github.com/apache/spark/pull/32969#issuecomment-869062066 @gengliangwang It's no longer reproducible. branch-3.1 (at the HEAD and even at older commits) runs similar to master. I won't file a Jira at this time.

[GitHub] [spark] dongjoon-hyun commented on pull request #32286: [SPARK-35181][CORE] Use zstd for spark.io.compression.codec by default

2021-06-26 Thread GitBox
dongjoon-hyun commented on pull request #32286: URL: https://github.com/apache/spark/pull/32286#issuecomment-869060397 Thank you for review, @mridulm . The change is required because the UT depends on the results based on the intermediate statistics of the query.

[GitHub] [spark] dongjoon-hyun commented on pull request #33100: [SPARK-35906][SQL] Remove order by if the maximum number of rows less than or equal to 1

2021-06-26 Thread GitBox
dongjoon-hyun commented on pull request #33100: URL: https://github.com/apache/spark/pull/33100#issuecomment-869059372 Could you check the UT failure? ``` SPARK-24556: always rewrite output partitioning in ReusedExchangeExec and InMemoryTableScanExec *** FAILED *** (40 milliseconds)

[GitHub] [spark] viirya commented on a change in pull request #33082: [SPARK-35886][SQL] CodeGenerator.getLocalInputVariableValues should handle case that expression match SubExprEliminationState but

2021-06-26 Thread GitBox
viirya commented on a change in pull request #33082: URL: https://github.com/apache/spark/pull/33082#discussion_r659221432 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ## @@ -1766,19 +1766,20 @@ object

[GitHub] [spark] venkata91 commented on a change in pull request #33078: [SPARK-35546][Shuffle] Enable push-based shuffle when multiple app attempts are enabled and manage concurrent access to the sta

2021-06-26 Thread GitBox
venkata91 commented on a change in pull request #33078: URL: https://github.com/apache/spark/pull/33078#discussion_r659213951 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java ## @@ -417,24 +476,75 @@ public

[GitHub] [spark] attilapiros commented on a change in pull request #31790: [SPARK-34509][K8S] Make dynamic allocation upscaling more progressive on K8S

2021-06-26 Thread GitBox
attilapiros commented on a change in pull request #31790: URL: https://github.com/apache/spark/pull/31790#discussion_r659207058 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala ## @@ -292,6 +292,14 @@ private[spark]

[GitHub] [spark] dongjoon-hyun closed pull request #33102: Test Apache ORC 1.7.0-SNAPSHOT

2021-06-26 Thread GitBox
dongjoon-hyun closed pull request #33102: URL: https://github.com/apache/spark/pull/33102

[GitHub] [spark] dongjoon-hyun opened a new pull request #33102: Test Apache ORC 1.7.0-SNAPSHOT

2021-06-26 Thread GitBox
dongjoon-hyun opened a new pull request #33102: URL: https://github.com/apache/spark/pull/33102

[GitHub] [spark] dongjoon-hyun closed pull request #33062: [SPARK-35877][BUILD] Upgrade protypebuf to 3.2.0

2021-06-26 Thread GitBox
dongjoon-hyun closed pull request #33062: URL: https://github.com/apache/spark/pull/33062

[GitHub] [spark] dongjoon-hyun commented on pull request #33062: [SPARK-35877][BUILD] Upgrade protypebuf to 3.2.0

2021-06-26 Thread GitBox
dongjoon-hyun commented on pull request #33062: URL: https://github.com/apache/spark/pull/33062#issuecomment-869035233 According to the above discussion, I'll close this PR, @PavithraRamachandran . Please feel free to reopen this if there is any update from you.

[GitHub] [spark] premsagarreddy removed a comment on pull request #29178: [SPARK-32380][SQL] fixed spark3.0 access hive table while data in hbase problem

2021-06-26 Thread GitBox
premsagarreddy removed a comment on pull request #29178: URL: https://github.com/apache/spark/pull/29178#issuecomment-825436315 @HyukjinKwon could you pls share the steps to resolve the spark3.0 access hive table while data in hbase problem

[GitHub] [spark] dongjoon-hyun commented on pull request #33101: [SPARK-35907][CORE] Instead of File#mkdirs, Files#createDirectories is expected.

2021-06-26 Thread GitBox
dongjoon-hyun commented on pull request #33101: URL: https://github.com/apache/spark/pull/33101#issuecomment-869030317 ok to test

[GitHub] [spark] dongjoon-hyun commented on pull request #33101: [SPARK-35907][CORE] Instead of File#mkdirs, Files#createDirectories is expected.

2021-06-26 Thread GitBox
dongjoon-hyun commented on pull request #33101: URL: https://github.com/apache/spark/pull/33101#issuecomment-869030227 Thank you for making a PR, @Shockang . Could you enable GitHub Action on your Spark fork? - https://spark.apache.org/developer-tools.html (Testing with GitHub Actions

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #33092: [SPARK-35905][SQL][TESTS] Fix UT to clean up table/view in SQLQuerySuite

2021-06-26 Thread GitBox
dongjoon-hyun edited a comment on pull request #33092: URL: https://github.com/apache/spark/pull/33092#issuecomment-869029145 Thank you for fixing all of them, @AngersZhuuuu . Merged to master for Apache Spark 3.2.

[GitHub] [spark] dongjoon-hyun closed pull request #33092: [SPARK-35905][SQL][TESTS] Fix UT to clean up table/view in SQLQuerySuite

2021-06-26 Thread GitBox
dongjoon-hyun closed pull request #33092: URL: https://github.com/apache/spark/pull/33092

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-06-26 Thread GitBox
AmplabJenkins removed a comment on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-868889338 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44880/

[GitHub] [spark] mridulm commented on pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-26 Thread GitBox
mridulm commented on pull request #33063: URL: https://github.com/apache/spark/pull/33063#issuecomment-869029335 Thanks @yaooqinn !

[GitHub] [spark] dongjoon-hyun commented on pull request #33092: [SPARK-35905][SQL][TESTS] Fix UT not drop table/view in SQLQuerySuite

2021-06-26 Thread GitBox
dongjoon-hyun commented on pull request #33092: URL: https://github.com/apache/spark/pull/33092#issuecomment-869029145 Thank you for fixing all of them, @AngersZhuuuu . Merged to master.

[GitHub] [spark] sunchao commented on a change in pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-26 Thread GitBox
sunchao commented on a change in pull request #32753: URL: https://github.com/apache/spark/pull/32753#discussion_r659191689 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetReadState.java ## @@ -33,31 +51,107 @@ /** The

[GitHub] [spark] otterc commented on a change in pull request #32140: [SPARK-32922][SHUFFLE][CORE] Adds support for executors to fetch local and remote merged shuffle data

2021-06-26 Thread GitBox
otterc commented on a change in pull request #32140: URL: https://github.com/apache/spark/pull/32140#discussion_r652951329 ## File path: core/src/main/scala/org/apache/spark/storage/PushBasedFetchHelper.scala ## @@ -0,0 +1,336 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] dongjoon-hyun commented on pull request #33098: [SPARK-35903][TESTS] Parameterize 'master' in TPCDSQueryBenchmark

2021-06-26 Thread GitBox
dongjoon-hyun commented on pull request #33098: URL: https://github.com/apache/spark/pull/33098#issuecomment-869027217 Merged to master/3.1/3.0. I backported this test patch to help the release comparison.

[GitHub] [spark] dongjoon-hyun closed pull request #33098: [SPARK-35903][TESTS] Parameterize 'master' in TPCDSQueryBenchmark

2021-06-26 Thread GitBox
dongjoon-hyun closed pull request #33098: URL: https://github.com/apache/spark/pull/33098

[GitHub] [spark] dongjoon-hyun commented on pull request #33098: [SPARK-35903][TESTS] Parameterize 'master' in TPCDSQueryBenchmark

2021-06-26 Thread GitBox
dongjoon-hyun commented on pull request #33098: URL: https://github.com/apache/spark/pull/33098#issuecomment-869026960 Thank you, @wangyum !

[GitHub] [spark] AngersZhuuuu commented on pull request #33082: [SPARK-35886][SQL] CodeGenerator.getLocalInputVariableValues should handle case that expression match SubExprEliminationState but not

2021-06-26 Thread GitBox
AngersZh commented on pull request #33082: URL: https://github.com/apache/spark/pull/33082#issuecomment-869025920 > Looks reasonable, although the description and the title confuse. Updated, how about current?

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33082: [SPARK-35886][SQL] CodeGenerator.getLocalInputVariableValues should handle case that expression match SubExprEliminationSta

2021-06-26 Thread GitBox
AngersZh commented on a change in pull request #33082: URL: https://github.com/apache/spark/pull/33082#discussion_r659189132 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ## @@ -1766,19 +1766,20 @@ object

[GitHub] [spark] AngersZhuuuu commented on pull request #33082: [SPARK-35886][SQL] CodeGenerator.getLocalInputVariableValues should handle case that expression match SubExprEliminationState but not

2021-06-26 Thread GitBox
AngersZh commented on pull request #33082: URL: https://github.com/apache/spark/pull/33082#issuecomment-869025416 > Do we have "subquery" here? Please update the description if it is not "subquery". My mistake, should be SubExprEliminationState

[GitHub] [spark] viirya commented on pull request #33082: [SPARK-35886][SQL] CodeGenerator.getLocalInputVariableValues should handle matched subQuery but not VariableValue

2021-06-26 Thread GitBox
viirya commented on pull request #33082: URL: https://github.com/apache/spark/pull/33082#issuecomment-869022499 Do we have "subquery" here? Please update the description if it is not "subquery".

[GitHub] [spark] viirya commented on a change in pull request #33082: [SPARK-35886][SQL] CodeGenerator.getLocalInputVariableValues should handle matched subQuery but not VariableValue

2021-06-26 Thread GitBox
viirya commented on a change in pull request #33082: URL: https://github.com/apache/spark/pull/33082#discussion_r659183793 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ## @@ -1766,19 +1766,20 @@ object

[GitHub] [spark] viirya commented on a change in pull request #33082: [SPARK-35886][SQL] CodeGenerator.getLocalInputVariableValues should handle matched subQuery but not VariableValue

2021-06-26 Thread GitBox
viirya commented on a change in pull request #33082: URL: https://github.com/apache/spark/pull/33082#discussion_r659183587 ## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ## @@ -4042,6 +4042,22 @@ class SQLQuerySuite extends QueryTest with

[GitHub] [spark] wangyum commented on pull request #33082: [SPARK-35886][SQL] CodeGenerator.getLocalInputVariableValues should handle matched subQuery but not VariableValue

2021-06-26 Thread GitBox
wangyum commented on pull request #33082: URL: https://github.com/apache/spark/pull/33082#issuecomment-869020556 Also cc @viirya

[GitHub] [spark] Shockang opened a new pull request #33101: [SPARK-35907][CORE] Instead of File#mkdirs, Files#createDirectories is expected.

2021-06-26 Thread GitBox
Shockang opened a new pull request #33101: URL: https://github.com/apache/spark/pull/33101 ### What changes were proposed in this pull request? The `createDirectory` method in `org.apache.spark.util.Utils` is modified. ### Why are the changes needed? To
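
The PR body is cut off above, so its stated rationale is not reproduced here. As general background on the two JDK calls named in the subject line: `File#mkdirs` reports failure only through a `false` return value, while `Files#createDirectories` throws an `IOException` that describes the cause. A minimal Java sketch of that difference follows; the class and method names (`CreateDirDemo`, `viaMkdirs`, `viaCreateDirectories`) are hypothetical and are not taken from the Spark patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; not part of Spark or of PR #33101.
public class CreateDirDemo {

    // File#mkdirs only returns a boolean, so the cause of a failure is lost.
    public static boolean viaMkdirs(Path target) {
        return target.toFile().mkdirs();
    }

    // Files#createDirectories throws an IOException on failure.
    // Returns null on success, otherwise the simple class name of the exception.
    public static String viaCreateDirectories(Path target) {
        try {
            Files.createDirectories(target);
            return null;
        } catch (IOException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) throws IOException {
        // A regular file occupying the path where a directory is wanted.
        Path occupied = Files.createTempFile("occupied", ".tmp");
        try {
            System.out.println("mkdirs returned: " + viaMkdirs(occupied));
            System.out.println("createDirectories failed with: " + viaCreateDirectories(occupied));
        } finally {
            Files.deleteIfExists(occupied);
        }
    }
}
```

When a regular file already sits at the target path, `viaMkdirs` returns `false` with no further detail, whereas `viaCreateDirectories` surfaces a `FileAlreadyExistsException`, which is the documented behavior of `Files.createDirectories` for a path that exists but is not a directory.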

[GitHub] [spark] wangyum opened a new pull request #33100: [SPARK-35906][SQL] Remove order by if the maximum number of rows less than or equal to 1

2021-06-26 Thread GitBox
wangyum opened a new pull request #33100: URL: https://github.com/apache/spark/pull/33100 ### What changes were proposed in this pull request? This PR removes order by if the maximum number of rows is less than or equal to 1. For example: ```scala spark.sql("select count(*) from

[GitHub] [spark] AmplabJenkins commented on pull request #33093: [SPARK-35897][SS][WIP] Support user defined initial state with flatMapGroupsWithState in Structured Streaming

2021-06-26 Thread GitBox
AmplabJenkins commented on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-868995436 Can one of the admins verify this patch?

[GitHub] [spark] SparkQA removed a comment on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-26 Thread GitBox
SparkQA removed a comment on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868930686 **[Test build #140353 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140353/testReport)** for PR 32921 at commit

[GitHub] [spark] SparkQA commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-26 Thread GitBox
SparkQA commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-868983433 **[Test build #140353 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140353/testReport)** for PR 32921 at commit

[GitHub] [spark] AngersZhuuuu commented on pull request #33092: [SPARK-35905][SQL][TESTS] Fix UT not drop table/view in SQLQuerySuite

2021-06-26 Thread GitBox
AngersZh commented on pull request #33092: URL: https://github.com/apache/spark/pull/33092#issuecomment-868981070 retest this please

[GitHub] [spark] MaxGekk closed pull request #33086: [SPARK-35895][SQL] Support subtracting Intervals from TimestampWithoutTZ

2021-06-26 Thread GitBox
MaxGekk closed pull request #33086: URL: https://github.com/apache/spark/pull/33086

[GitHub] [spark] MaxGekk commented on pull request #33086: [SPARK-35895][SQL] Support subtracting Intervals from TimestampWithoutTZ

2021-06-26 Thread GitBox
MaxGekk commented on pull request #33086: URL: https://github.com/apache/spark/pull/33086#issuecomment-868980563 +1, LGTM. Merging to master. Thank you, @gengliangwang .

[GitHub] [spark] MaxGekk commented on pull request #33086: [SPARK-35895][SQL] Support subtracting Intervals from TimestampWithoutTZ

2021-06-26 Thread GitBox
MaxGekk commented on pull request #33086: URL: https://github.com/apache/spark/pull/33086#issuecomment-868980476 The build failure by Scala 2.13: ``` [error] /home/runner/work/spark/spark/core/src/main/scala/org/apache/spark/SparkContext.scala:428:30: type mismatch; [error]

[GitHub] [spark] SparkQA removed a comment on pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-26 Thread GitBox
SparkQA removed a comment on pull request #33096: URL: https://github.com/apache/spark/pull/33096#issuecomment-868886081 **[Test build #140350 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140350/testReport)** for PR 33096 at commit

[GitHub] [spark] SparkQA commented on pull request #33096: [SPARK-35899][SQL] Utility to convert connector expressions to Catalyst

2021-06-26 Thread GitBox
SparkQA commented on pull request #33096: URL: https://github.com/apache/spark/pull/33096#issuecomment-868965600 **[Test build #140350 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140350/testReport)** for PR 33096 at commit

[GitHub] [spark] AngersZhuuuu commented on pull request #33092: [SPARK-35905][SQL][TESTS] Fix UT not drop table/view in SQLQuerySuite

2021-06-26 Thread GitBox
AngersZh commented on pull request #33092: URL: https://github.com/apache/spark/pull/33092#issuecomment-868959359 > It's okay to add that, but you should remove '[FOLLOWUP]'. BTW, is this the only case in that test suite file? Check again, should have fix all table/view in this

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #33092: [SPARK-35905][SQL][FOLLOWUP][TESTS] Adding withTable in SQLQuerySuite about test("SPARK-33338: GROUP BY using literal map should

2021-06-26 Thread GitBox
dongjoon-hyun edited a comment on pull request #33092: URL: https://github.com/apache/spark/pull/33092#issuecomment-868956083 It's okay to add that, but you should remove '[FOLLOWUP]'. BTW, is this the only case in that test suite file?

[GitHub] [spark] dongjoon-hyun commented on pull request #33092: [SPARK-35905][SQL][FOLLOWUP][TESTS] Adding withTable in SQLQuerySuite about test("SPARK-33338: GROUP BY using literal map should not fa

2021-06-26 Thread GitBox
dongjoon-hyun commented on pull request #33092: URL: https://github.com/apache/spark/pull/33092#issuecomment-868956083 It's okay, but you should remove '[FOLLOWUP]'. BTW, is this the only case in that test suite file?

[GitHub] [spark] dongjoon-hyun commented on pull request #33063: [SPARK-35879][Core][Shuffle] Fix performance regression caused by collectFetchRequests

2021-06-26 Thread GitBox
dongjoon-hyun commented on pull request #33063: URL: https://github.com/apache/spark/pull/33063#issuecomment-868955270 Thank you, @yaooqinn and @mridulm !