[GitHub] [spark] AmplabJenkins removed a comment on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
AmplabJenkins removed a comment on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727161445 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] c21 commented on pull request #30347: [SPARK-33209][SS] Refactor unit test of stream-stream join in UnsupportedOperationsSuite

2020-11-13 Thread GitBox
c21 commented on pull request #30347: URL: https://github.com/apache/spark/pull/30347#issuecomment-727162216 Thanks @HeartSaVioR for heads up, please take your time. Thanks. This is an automated message from the Apache Git Se

[GitHub] [spark] SparkQA commented on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
SparkQA commented on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727161442 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35691/ ---

[GitHub] [spark] AmplabJenkins commented on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
AmplabJenkins commented on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727161445 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] viirya commented on a change in pull request #30368: [SPARK-33442][SQL] Change Combine Limit to Eliminate limit using max row

2020-11-13 Thread GitBox
viirya commented on a change in pull request #30368: URL: https://github.com/apache/spark/pull/30368#discussion_r523389535 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ## @@ -1452,11 +1452,21 @@ object PushPredicateThroughJo

[GitHub] [spark] Ngone51 commented on a change in pull request #30164: [SPARK-32919][SHUFFLE][test-maven][test-hadoop2.7] Driver side changes for coordinating push based shuffle by selecting external

2020-11-13 Thread GitBox
Ngone51 commented on a change in pull request #30164: URL: https://github.com/apache/spark/pull/30164#discussion_r523388191 ## File path: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala ## @@ -657,6 +688,38 @@ class BlockManagerMasterEndpoint(

[GitHub] [spark] maropu commented on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
maropu commented on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727158698 > I think it should be added as a separate PR and merge the test PR first? Then we can run against with it to check if the result is the same. @maropu Yea, it looks fine.

[GitHub] [spark] SparkQA commented on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
SparkQA commented on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727158547 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35691/ -

[GitHub] [spark] viirya commented on a change in pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
viirya commented on a change in pull request #30341: URL: https://github.com/apache/spark/pull/30341#discussion_r523295714 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EvaluationRuntime.scala ## @@ -0,0 +1,88 @@ +/* + * Licensed to the Ap

[GitHub] [spark] HeartSaVioR commented on pull request #30347: [SPARK-33209][SS] Refactor unit test of stream-stream join in UnsupportedOperationsSuite

2020-11-13 Thread GitBox
HeartSaVioR commented on pull request #30347: URL: https://github.com/apache/spark/pull/30347#issuecomment-727156640 Hey, sorry. Please give me some more time to start looking into this, as it's already Saturday here. I'll probably take this up early next week. Thanks for understanding. ---

[GitHub] [spark] viirya commented on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
viirya commented on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727155028 Hmm, actually for https://github.com/apache/spark/pull/30341#discussion_r522904967, that is to add some tests in `SQLQueryTestSuite` with `--CONFIG_DIM spark.sql.codegen.wholeSt

[GitHub] [spark] viirya commented on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
viirya commented on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727154709 I addressed most of the comments except for https://github.com/apache/spark/pull/30341#discussion_r522822589 and https://github.com/apache/spark/pull/30341#discussion_r522904967.

[GitHub] [spark] viirya commented on a change in pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
viirya commented on a change in pull request #30341: URL: https://github.com/apache/spark/pull/30341#discussion_r523384442 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/EvaluationRuntimeSuite.scala ## @@ -0,0 +1,79 @@ +/* + * Licensed to t

[GitHub] [spark] viirya commented on a change in pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
viirya commented on a change in pull request #30341: URL: https://github.com/apache/spark/pull/30341#discussion_r523384132 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EvaluationRuntime.scala ## @@ -0,0 +1,88 @@ +/* + * Licensed to the Ap

[GitHub] [spark] viirya commented on a change in pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
viirya commented on a change in pull request #30341: URL: https://github.com/apache/spark/pull/30341#discussion_r523384163 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpressionProxy.scala ## @@ -0,0 +1,57 @@ +/* + * Licensed to the Apac

[GitHub] [spark] viirya commented on a change in pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
viirya commented on a change in pull request #30341: URL: https://github.com/apache/spark/pull/30341#discussion_r523384138 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpressionProxy.scala ## @@ -0,0 +1,57 @@ +/* + * Licensed to the Apac

[GitHub] [spark] viirya commented on a change in pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
viirya commented on a change in pull request #30341: URL: https://github.com/apache/spark/pull/30341#discussion_r523384143 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpressionProxy.scala ## @@ -0,0 +1,57 @@ +/* + * Licensed to the Apac

[GitHub] [spark] SparkQA commented on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
SparkQA commented on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727154090 **[Test build #131088 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131088/testReport)** for PR 30341 at commit [`4780b65`](https://github.com

[GitHub] [spark] viirya commented on a change in pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
viirya commented on a change in pull request #30341: URL: https://github.com/apache/spark/pull/30341#discussion_r523383340 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EvaluationRuntime.scala ## @@ -0,0 +1,88 @@ +/* + * Licensed to the Ap

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30358: [SPARK-33394][SQL] Throw `NoSuchNamespaceException` for not existing namespace in `InMemoryTableCatalog.listTables()`

2020-11-13 Thread GitBox
dongjoon-hyun commented on a change in pull request #30358: URL: https://github.com/apache/spark/pull/30358#discussion_r523383165 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/ShowTablesSuiteBase.scala ## @@ -57,6 +58,13 @@ trait ShowTablesSuiteB

[GitHub] [spark] MaxGekk commented on a change in pull request #30358: [SPARK-33394][SQL] Throw `NoSuchNamespaceException` for not existing namespace in `InMemoryTableCatalog.listTables()`

2020-11-13 Thread GitBox
MaxGekk commented on a change in pull request #30358: URL: https://github.com/apache/spark/pull/30358#discussion_r523382673 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/ShowTablesSuiteBase.scala ## @@ -57,6 +58,13 @@ trait ShowTablesSuiteBase ex

[GitHub] [spark] c21 commented on pull request #30347: [SPARK-33209][SS] Refactor unit test of stream-stream join in UnsupportedOperationsSuite

2020-11-13 Thread GitBox
c21 commented on pull request #30347: URL: https://github.com/apache/spark/pull/30347#issuecomment-727151821 @HeartSaVioR - could you also help take a look when you have time? Thanks. This is an automated message from the Apa

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30358: [SPARK-33394][SQL] Throw `NoSuchNamespaceException` for not existing namespace in `InMemoryTableCatalog.listTables()`

2020-11-13 Thread GitBox
dongjoon-hyun commented on a change in pull request #30358: URL: https://github.com/apache/spark/pull/30358#discussion_r523382095 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/ShowTablesSuiteBase.scala ## @@ -57,6 +58,13 @@ trait ShowTablesSuiteB

[GitHub] [spark] viirya commented on a change in pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-13 Thread GitBox
viirya commented on a change in pull request #30341: URL: https://github.com/apache/spark/pull/30341#discussion_r523381340 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExpressionProxy.scala ## @@ -0,0 +1,57 @@ +/* + * Licensed to the Apac

[GitHub] [spark] BryanCutler commented on a change in pull request #30309: [SPARK-33407][PYTHON] Simplify the exception message from Python UDFs (disabled by default)

2020-11-13 Thread GitBox
BryanCutler commented on a change in pull request #30309: URL: https://github.com/apache/spark/pull/30309#discussion_r523380710 ## File path: python/pyspark/util.py ## @@ -75,6 +79,144 @@ def wrapper(*args, **kwargs): return wrapper +def walk_tb(tb): +while tb is n

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30139: [SPARK-31069][CORE] Avoid repeat compute `chunksBeingTransferred` cause hight cpu cost in external shuffle service when `maxCh

2020-11-13 Thread GitBox
AmplabJenkins removed a comment on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-727147876 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] viirya commented on a change in pull request #30309: [SPARK-33407][PYTHON] Simplify the exception message from Python UDFs (disabled by default)

2020-11-13 Thread GitBox
viirya commented on a change in pull request #30309: URL: https://github.com/apache/spark/pull/30309#discussion_r523380970 ## File path: python/pyspark/util.py ## @@ -177,6 +319,8 @@ def __del__(self): if __name__ == "__main__": import doctest -(failure_count, test_

[GitHub] [spark] HeartSaVioR commented on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HeartSaVioR commented on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727149697 I just initiated the discussion on the dev@ mailing list, which I should have done instead. https://lists.apache.org/thread.html/r30069e17f59e8d29267ae296d568409709054760

[GitHub] [spark] dongjoon-hyun removed a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
dongjoon-hyun removed a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727125565 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] HeartSaVioR removed a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HeartSaVioR removed a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727138220 I was feeling odd and became upset because my intention hadn't changed since the first comment and that intention was disregarded (at least that's what I felt l

[GitHub] [spark] HeartSaVioR removed a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HeartSaVioR removed a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727120756 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] HyukjinKwon removed a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HyukjinKwon removed a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-726476089 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] HeartSaVioR removed a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HeartSaVioR removed a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-726418896 That's not my point. There's no indication I have produced all review comments (while actually I produced all review comments), and review comment author would be th

[GitHub] [spark] dongjoon-hyun removed a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
dongjoon-hyun removed a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-726811877 @HeartSaVioR . It seems that you tend to frame the Apache Spark community in negative ways. For your examples, it clearly looks like a straw-man attack. All of us

[GitHub] [spark] dongjoon-hyun removed a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
dongjoon-hyun removed a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-726413507 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] HeartSaVioR removed a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HeartSaVioR removed a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-726410128 (I hope the PR can wait for reviewers' approvals who left valid review comments. I understand the PR can be merged "technically", but doesn't seem to be a good pract

[GitHub] [spark] HeartSaVioR removed a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HeartSaVioR removed a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-726428939 No that's not also my point. I don't claim about the domain owner (but in practice I see there's implicit domain owner). You can merge the PR in SS area without me i

[GitHub] [spark] HeartSaVioR removed a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HeartSaVioR removed a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-726569827 I'd rather avoid the chance of "post-review" whenever possible, but I'd admit everyone has different thoughts. I'm OK with it, and if that's considered here (and no

[GitHub] [spark] HeartSaVioR removed a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HeartSaVioR removed a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-726433238 What do you say? I don't say you're missing the comment. I say the confirmation is better to be done by the reviewers who review, not from some other one. The confir

[GitHub] [spark] HeartSaVioR commented on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HeartSaVioR commented on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727149286 Sorry to all for all the noise. Please disregard the whole conversation. I'll remove the comments now. This is an automated

[GitHub] [spark] AmplabJenkins commented on pull request #30139: [SPARK-31069][CORE] Avoid repeat compute `chunksBeingTransferred` cause hight cpu cost in external shuffle service when `maxChunksBein

2020-11-13 Thread GitBox
AmplabJenkins commented on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-727147876 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA removed a comment on pull request #30139: [SPARK-31069][CORE] Avoid repeat compute `chunksBeingTransferred` cause hight cpu cost in external shuffle service when `maxChunksBe

2020-11-13 Thread GitBox
SparkQA removed a comment on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-727132757 **[Test build #131087 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131087/testReport)** for PR 30139 at commit [`fab5557`](https://gi

[GitHub] [spark] SparkQA commented on pull request #30139: [SPARK-31069][CORE] Avoid repeat compute `chunksBeingTransferred` cause hight cpu cost in external shuffle service when `maxChunksBeingTrans

2020-11-13 Thread GitBox
SparkQA commented on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-727147710 **[Test build #131087 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131087/testReport)** for PR 30139 at commit [`fab5557`](https://github.co

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30374: [WIP][SPARK-33444][ML] Added support for Initial model in Gaussian Mixture Model in ML

2020-11-13 Thread GitBox
AmplabJenkins removed a comment on pull request #30374: URL: https://github.com/apache/spark/pull/30374#issuecomment-727147056 Can one of the admins verify this patch? This is an automated message from the Apache Git Service.

[GitHub] [spark] AmplabJenkins commented on pull request #30374: [WIP][SPARK-33444][ML] Added support for Initial model in Gaussian Mixture Model in ML

2020-11-13 Thread GitBox
AmplabJenkins commented on pull request #30374: URL: https://github.com/apache/spark/pull/30374#issuecomment-727147154 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To resp

[GitHub] [spark] shahar603 opened a new pull request #30374: [WIP][SPARK-33444][ML] Added support for Initial model in Gaussian Mixture Model in ML

2020-11-13 Thread GitBox
shahar603 opened a new pull request #30374: URL: https://github.com/apache/spark/pull/30374 ### What changes were proposed in this pull request? Added an optional `initialModel` in the GaussianMixture class. ### Why are the changes needed? To allow for non random initial

[GitHub] [spark] AmplabJenkins commented on pull request #30374: [WIP][SPARK-33444][ML] Added support for Initial model in Gaussian Mixture Model in ML

2020-11-13 Thread GitBox
AmplabJenkins commented on pull request #30374: URL: https://github.com/apache/spark/pull/30374#issuecomment-727147056 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To resp

[GitHub] [spark] dongjoon-hyun commented on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
dongjoon-hyun commented on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727146418 Can we talk over the phone? It would be better for us. This is an automated message from the Apache Git Se

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-11-13 Thread GitBox
AmplabJenkins removed a comment on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-727143194 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-11-13 Thread GitBox
AmplabJenkins commented on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-727143194 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA removed a comment on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-11-13 Thread GitBox
SparkQA removed a comment on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-727093834 **[Test build #131086 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131086/testReport)** for PR 29414 at commit [`5b41fbc`](https://gi

[GitHub] [spark] SparkQA commented on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-11-13 Thread GitBox
SparkQA commented on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-727143021 **[Test build #131086 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131086/testReport)** for PR 29414 at commit [`5b41fbc`](https://github.co

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30373: [SPARK-33183][SQL][FOLLOW-UP] Adjust RemoveRedundantSorts rule order and update config version

2020-11-13 Thread GitBox
AmplabJenkins removed a comment on pull request #30373: URL: https://github.com/apache/spark/pull/30373#issuecomment-727142229 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #30373: [SPARK-33183][SQL][FOLLOW-UP] Adjust RemoveRedundantSorts rule order and update config version

2020-11-13 Thread GitBox
AmplabJenkins commented on pull request #30373: URL: https://github.com/apache/spark/pull/30373#issuecomment-727142229 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA removed a comment on pull request #30373: [SPARK-33183][SQL][FOLLOW-UP] Adjust RemoveRedundantSorts rule order and update config version

2020-11-13 Thread GitBox
SparkQA removed a comment on pull request #30373: URL: https://github.com/apache/spark/pull/30373#issuecomment-727092021 **[Test build #131085 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131085/testReport)** for PR 30373 at commit [`e6b0bb3`](https://gi

[GitHub] [spark] SparkQA commented on pull request #30373: [SPARK-33183][SQL][FOLLOW-UP] Adjust RemoveRedundantSorts rule order and update config version

2020-11-13 Thread GitBox
SparkQA commented on pull request #30373: URL: https://github.com/apache/spark/pull/30373#issuecomment-727142042 **[Test build #131085 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131085/testReport)** for PR 30373 at commit [`e6b0bb3`](https://github.co

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30139: [SPARK-31069][CORE] Avoid repeat compute `chunksBeingTransferred` cause hight cpu cost in external shuffle service when `maxCh

2020-11-13 Thread GitBox
AmplabJenkins removed a comment on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-727140308 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30139: [SPARK-31069][CORE] Avoid repeat compute `chunksBeingTransferred` cause hight cpu cost in external shuffle service when `maxCh

2020-11-13 Thread GitBox
AmplabJenkins removed a comment on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-727140306 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To r

[GitHub] [spark] AmplabJenkins commented on pull request #30139: [SPARK-31069][CORE] Avoid repeat compute `chunksBeingTransferred` cause hight cpu cost in external shuffle service when `maxChunksBein

2020-11-13 Thread GitBox
AmplabJenkins commented on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-727140306 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA commented on pull request #30139: [SPARK-31069][CORE] Avoid repeat compute `chunksBeingTransferred` cause hight cpu cost in external shuffle service when `maxChunksBeingTrans

2020-11-13 Thread GitBox
SparkQA commented on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-727140293 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35690/ ---

[GitHub] [spark] HeartSaVioR edited a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HeartSaVioR edited a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727138220 I was feeling odd and became upset because my intention hadn't changed since the first comment and that intention was disregarded (at least that's what I felt li

[GitHub] [spark] HeartSaVioR commented on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HeartSaVioR commented on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727138220 I was feeling odd and became upset because my intention hadn't changed since the first comment and that intention was disregarded (at least that's what I felt like) for

[GitHub] [spark] SparkQA commented on pull request #30139: [SPARK-31069][CORE] Avoid repeat compute `chunksBeingTransferred` cause hight cpu cost in external shuffle service when `maxChunksBeingTrans

2020-11-13 Thread GitBox
SparkQA commented on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-727137606 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35690/ -

[GitHub] [spark] mridulm commented on pull request #30370: [SPARK-33446][CORE] Add config spark.executor.coresOverhead

2020-11-13 Thread GitBox
mridulm commented on pull request #30370: URL: https://github.com/apache/spark/pull/30370#issuecomment-727133352 > I want an executor with 2 cores and 6 gb, but only 1 core used for task allocation, which means at most 1 task could be running on this executor. If I use existing configs, th

[GitHub] [spark] manuzhang commented on pull request #29797: [SPARK-32932][SQL] Do not use local shuffle reader at final stage on write command

2020-11-13 Thread GitBox
manuzhang commented on pull request #29797: URL: https://github.com/apache/spark/pull/29797#issuecomment-727132781 @maryannxue thanks for pointing out the fundamental problems while I'm not familiar with the design behind. When starting out, I was trying to solve the most urgent issue in o

[GitHub] [spark] SparkQA commented on pull request #30139: [SPARK-31069][CORE] Avoid repeat compute `chunksBeingTransferred` cause hight cpu cost in external shuffle service when `maxChunksBeingTrans

2020-11-13 Thread GitBox
SparkQA commented on pull request #30139: URL: https://github.com/apache/spark/pull/30139#issuecomment-727132757 **[Test build #131087 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131087/testReport)** for PR 30139 at commit [`fab5557`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30164: [SPARK-32919][SHUFFLE][test-maven][test-hadoop2.7] Driver side changes for coordinating push based shuffle by selecting externa

2020-11-13 Thread GitBox
AmplabJenkins removed a comment on pull request #30164: URL: https://github.com/apache/spark/pull/30164#issuecomment-727123181 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/131

[GitHub] [spark] SparkQA removed a comment on pull request #30164: [SPARK-32919][SHUFFLE][test-maven][test-hadoop2.7] Driver side changes for coordinating push based shuffle by selecting external shuf

2020-11-13 Thread GitBox
SparkQA removed a comment on pull request #30164: URL: https://github.com/apache/spark/pull/30164#issuecomment-726987157 **[Test build #131082 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131082/testReport)** for PR 30164 at commit [`9ba4dfb`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30164: [SPARK-32919][SHUFFLE][test-maven][test-hadoop2.7] Driver side changes for coordinating push based shuffle by selecting externa

2020-11-13 Thread GitBox
AmplabJenkins removed a comment on pull request #30164: URL: https://github.com/apache/spark/pull/30164#issuecomment-727122853 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30139: [SPARK-31069][CORE] Avoid repeat compute `chunksBeingTransferred` cause hight cpu cost in external shuffle service when `ma

2020-11-13 Thread GitBox
AngersZhuuuu commented on a change in pull request #30139: URL: https://github.com/apache/spark/pull/30139#discussion_r523364609 ## File path: common/network-common/src/main/java/org/apache/spark/network/server/ChunkFetchRequestHandler.java ## @@ -88,12 +88,14 @@ public void p

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
dongjoon-hyun edited a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727130032 You are repeatedly saying that I had taken your right away here. In this PR's context, it doesn't make sense to me either.

[GitHub] [spark] dongjoon-hyun commented on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
dongjoon-hyun commented on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727130032 You are saying that I had taken your right away repeatedly. In this PR's context, it doesn't make sense to me either.

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
dongjoon-hyun edited a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727125565 Please remember that I also replied politely at your first comment like this. I apologized first and tried to understand your request and to fix it immediately bec

[GitHub] [spark] dongjoon-hyun commented on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
dongjoon-hyun commented on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727125565 Please remember that I also replied politely at your first comment like this. I apologized first and tried to understand your request and to fix it immediately because I

[GitHub] [spark] ueshin commented on pull request #30242: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.

2020-11-13 Thread GitBox
ueshin commented on pull request #30242: URL: https://github.com/apache/spark/pull/30242#issuecomment-727124941 gentle ping This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] HeartSaVioR commented on pull request #30366: [WIP][SPARK-33440][CORE] Separate the calculation for the next renewal date for each delegation token

2020-11-13 Thread GitBox
HeartSaVioR commented on pull request #30366: URL: https://github.com/apache/spark/pull/30366#issuecomment-727124742 I just changed the PR to draft, as I would make sure we consider about the better change (than the current approach of the PR) accounting my comments above. Reviewers

[GitHub] [spark] otterc commented on a change in pull request #30139: [SPARK-31069][CORE] Avoid repeat compute `chunksBeingTransferred` cause hight cpu cost in external shuffle service when `maxChunk

2020-11-13 Thread GitBox
otterc commented on a change in pull request #30139: URL: https://github.com/apache/spark/pull/30139#discussion_r523353481 ## File path: common/network-common/src/main/java/org/apache/spark/network/server/ChunkFetchRequestHandler.java ## @@ -88,12 +88,14 @@ public void process

[GitHub] [spark] HeartSaVioR edited a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HeartSaVioR edited a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727121487 And I also admit I have different voice on post-review. I agree post-review would open the possibility for reviewers to review later who weren't active during the rev

[GitHub] [spark] AmplabJenkins commented on pull request #30164: [SPARK-32919][SHUFFLE][test-maven][test-hadoop2.7] Driver side changes for coordinating push based shuffle by selecting external shuffl

2020-11-13 Thread GitBox
AmplabJenkins commented on pull request #30164: URL: https://github.com/apache/spark/pull/30164#issuecomment-727123177 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA commented on pull request #30164: [SPARK-32919][SHUFFLE][test-maven][test-hadoop2.7] Driver side changes for coordinating push based shuffle by selecting external shuffle serv

2020-11-13 Thread GitBox
SparkQA commented on pull request #30164: URL: https://github.com/apache/spark/pull/30164#issuecomment-727122995 **[Test build #131082 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131082/testReport)** for PR 30164 at commit [`9ba4dfb`](https://github.co

[GitHub] [spark] AmplabJenkins commented on pull request #30164: [SPARK-32919][SHUFFLE][test-maven][test-hadoop2.7] Driver side changes for coordinating push based shuffle by selecting external shuffl

2020-11-13 Thread GitBox
AmplabJenkins commented on pull request #30164: URL: https://github.com/apache/spark/pull/30164#issuecomment-727122853 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] HeartSaVioR edited a comment on pull request #30366: [SPARK-33440][CORE] Separate the calculation for the next renewal date for each delegation token

2020-11-13 Thread GitBox
HeartSaVioR edited a comment on pull request #30366: URL: https://github.com/apache/spark/pull/30366#issuecomment-727107472 @tgravescs Thanks for the comment. > I thought the way we did it was just got the earliest renewal so we didn't have to track all them individually beca

[GitHub] [spark] HeartSaVioR edited a comment on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HeartSaVioR edited a comment on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727120756 I was writing a wall of text and Chrome happily (?) killed itself. Rewriting one. What I really asked you to do is exactly this. The practice is also happened

[GitHub] [spark] HeartSaVioR commented on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HeartSaVioR commented on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727121487 And I also admit I have different voice on post-review. I agree post-review would open the possibility for reviewers to review later who weren't active during the review per

[GitHub] [spark] HeartSaVioR commented on pull request #30210: [SPARK-33259][SS] Disable streaming query with possible correctness issue by default

2020-11-13 Thread GitBox
HeartSaVioR commented on pull request #30210: URL: https://github.com/apache/spark/pull/30210#issuecomment-727120756 I was writing a wall of text and Chrome happily (?) killed itself. Rewriting one. What I really asked you to do is exactly this. The practice is also happened for the

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30371: [SPARK-33337][SQL][FOLLOWUP] Prevent possible flakyness in SubexpressionEliminationSuite

2020-11-13 Thread GitBox
AmplabJenkins removed a comment on pull request #30371: URL: https://github.com/apache/spark/pull/30371#issuecomment-727107898 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] HeartSaVioR commented on pull request #30366: [SPARK-33440][CORE] Separate the calculation for the next renewal date for each delegation token

2020-11-13 Thread GitBox
HeartSaVioR commented on pull request #30366: URL: https://github.com/apache/spark/pull/30366#issuecomment-727108543 @dongjoon-hyun Thanks for the review. Given @tgravescs is revisiting the logic (and considering the above comments as well), I expect the code change would occur afterwa

[GitHub] [spark] AmplabJenkins commented on pull request #30371: [SPARK-33337][SQL][FOLLOWUP] Prevent possible flakyness in SubexpressionEliminationSuite

2020-11-13 Thread GitBox
AmplabJenkins commented on pull request #30371: URL: https://github.com/apache/spark/pull/30371#issuecomment-727107898 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA removed a comment on pull request #30371: [SPARK-33337][SQL][FOLLOWUP] Prevent possible flakyness in SubexpressionEliminationSuite

2020-11-13 Thread GitBox
SparkQA removed a comment on pull request #30371: URL: https://github.com/apache/spark/pull/30371#issuecomment-727018690 **[Test build #131084 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131084/testReport)** for PR 30371 at commit [`9816402`](https://gi

[GitHub] [spark] SparkQA commented on pull request #30371: [SPARK-33337][SQL][FOLLOWUP] Prevent possible flakyness in SubexpressionEliminationSuite

2020-11-13 Thread GitBox
SparkQA commented on pull request #30371: URL: https://github.com/apache/spark/pull/30371#issuecomment-727107533 **[Test build #131084 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131084/testReport)** for PR 30371 at commit [`9816402`](https://github.co

[GitHub] [spark] HeartSaVioR edited a comment on pull request #30366: [SPARK-33440][CORE] Separate the calculation for the next renewal date for each delegation token

2020-11-13 Thread GitBox
HeartSaVioR edited a comment on pull request #30366: URL: https://github.com/apache/spark/pull/30366#issuecomment-727107472 @tgravescs Thanks for the comment. > I thought the way we did it was just got the earliest renewal so we didn't have to track all them individually beca

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-11-13 Thread GitBox
AmplabJenkins removed a comment on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-727107235 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] HeartSaVioR commented on pull request #30366: [SPARK-33440][CORE] Separate the calculation for the next renewal date for each delegation token

2020-11-13 Thread GitBox
HeartSaVioR commented on pull request #30366: URL: https://github.com/apache/spark/pull/30366#issuecomment-727107472 @tgravescs Thanks for the comment. > I thought the way we did it was just got the earliest renewal so we didn't have to track all them individually because the

[GitHub] [spark] SparkQA commented on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-11-13 Thread GitBox
SparkQA commented on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-727107227 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35689/ ---

[GitHub] [spark] AmplabJenkins commented on pull request #29414: [SPARK-32106][SQL] Implement script transform in sql/core

2020-11-13 Thread GitBox
AmplabJenkins commented on pull request #29414: URL: https://github.com/apache/spark/pull/29414#issuecomment-727107235 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30373: [SPARK-33183][SQL][FOLLOW-UP] Adjust RemoveRedundantSorts rule order and update config version

2020-11-13 Thread GitBox
AmplabJenkins removed a comment on pull request #30373: URL: https://github.com/apache/spark/pull/30373#issuecomment-727106355 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #30373: [SPARK-33183][SQL][FOLLOW-UP] Adjust RemoveRedundantSorts rule order and update config version

2020-11-13 Thread GitBox
AmplabJenkins commented on pull request #30373: URL: https://github.com/apache/spark/pull/30373#issuecomment-727106355 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA commented on pull request #30373: [SPARK-33183][SQL][FOLLOW-UP] Adjust RemoveRedundantSorts rule order and update config version

2020-11-13 Thread GitBox
SparkQA commented on pull request #30373: URL: https://github.com/apache/spark/pull/30373#issuecomment-727106353 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35688/ ---

[GitHub] [spark] warrenzhu25 commented on pull request #30370: [SPARK-33446][CORE] Add config spark.executor.coresOverhead

2020-11-13 Thread GitBox
warrenzhu25 commented on pull request #30370: URL: https://github.com/apache/spark/pull/30370#issuecomment-727105908 > I am not sure I follow. > > * If you want an executor with 2 cores and 6 gb, you can allocate them with existing configs. > * If you want an executor with 1 core

[GitHub] [spark] mridulm edited a comment on pull request #30370: [SPARK-33446][CORE] Add config spark.executor.coresOverhead

2020-11-13 Thread GitBox
mridulm edited a comment on pull request #30370: URL: https://github.com/apache/spark/pull/30370#issuecomment-727104096 I am not sure I follow. * If you want an executor with 2 cores and 6 gb, you can allocate them with existing configs. * If you want an executor with 1 core and 6
