[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-20 Thread GitBox
dongjoon-hyun edited a comment on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-661344352 @hvanhovell. The following is completely wrong because the above optimization was one of the recommendations for many Hortonworks customers to save their HDFS usage.

[GitHub] [spark] dongjoon-hyun removed a comment on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-20 Thread GitBox
dongjoon-hyun removed a comment on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-661345335 For the following, I added SPARK-32318 added a test coverage at master/3.0/2.4. Are you suggesting that's not enough? > Finally I do want to point out that there

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29159: [SPARK-32310][ML][PySpark] ML params default value parity

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #29159: URL: https://github.com/apache/spark/pull/29159#issuecomment-661345145 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] dongjoon-hyun commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-20 Thread GitBox
dongjoon-hyun commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-661345335 For the following, I added SPARK-32318 added a test coverage at master/3.0/2.4. Are you suggesting that's not enough? > Finally I do want to point out that there is no m

[GitHub] [spark] imback82 commented on pull request #29167: [SPARK-32374][SQL] Disallow setting properties when creating temporary views

2020-07-20 Thread GitBox
imback82 commented on pull request #29167: URL: https://github.com/apache/spark/pull/29167#issuecomment-661347555 > Hmm, does it mean the specified properties are not used by Spark currently? If users possibly already use it? Correct, they are not being used for temporary views. Thus

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-20 Thread GitBox
dongjoon-hyun edited a comment on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-661344352 @hvanhovell . Thank you for your feedback. The following looks a little wrong to me because the above optimization was one of the recommendations for many Hortonwor

[GitHub] [spark] dongjoon-hyun commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-20 Thread GitBox
dongjoon-hyun commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-661348318 For the following, I'd like to ask your help if you are interested. I believe we want to build the better Apache Spark in the community together. > If you generalize the

[GitHub] [spark] holdenk commented on a change in pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning

2020-07-20 Thread GitBox
holdenk commented on a change in pull request #29014: URL: https://github.com/apache/spark/pull/29014#discussion_r457705293 ## File path: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ## @@ -912,13 +914,39 @@ private[spark] class TaskSchedulerImpl(

[GitHub] [spark] huaxingao commented on pull request #29159: [SPARK-32310][ML][PySpark] ML params default value parity

2020-07-20 Thread GitBox
huaxingao commented on pull request #29159: URL: https://github.com/apache/spark/pull/29159#issuecomment-661352641 @viirya It's backporting #29122 and #29153. I haven't finished #29153 yet. I want to finish 3.0 first because seems the plan is to cut 3.0.1 pretty soon.

[GitHub] [spark] SparkQA commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-20 Thread GitBox
SparkQA commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-661355673 **[Test build #126204 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126204/testReport)** for PR 29067 at commit [`38ca741`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-661356780

[GitHub] [spark] revans2 commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-20 Thread GitBox
revans2 commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-661355943 @cloud-fan I just added in the API changes that you requested. Please let me know what you think, and if the changes to `QueryExecution` are acceptable. I will work on some tes

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-661356780

[GitHub] [spark] dongjoon-hyun commented on pull request #25870: [SPARK-27936][K8S] Support python deps

2020-07-20 Thread GitBox
dongjoon-hyun commented on pull request #25870: URL: https://github.com/apache/spark/pull/25870#issuecomment-661368864 +1 for @holdenk 's comment. Also, in general, we cannot backport the improvement JIRA.

[GitHub] [spark] mukulmurthy commented on pull request #29107: [SPARK-32308][SQL] Move by-name resolution logic of unionByName from API code to analysis phase

2020-07-20 Thread GitBox
mukulmurthy commented on pull request #29107: URL: https://github.com/apache/spark/pull/29107#issuecomment-661372139 Awesome, I just created https://issues.apache.org/jira/browse/SPARK-32376. We've implemented a utility to do this before; I'll sync with my team and see if it's easy t

[GitHub] [spark] SparkQA commented on pull request #29159: [SPARK-32310][ML][PySpark][3.0] ML params default value parity

2020-07-20 Thread GitBox
SparkQA commented on pull request #29159: URL: https://github.com/apache/spark/pull/29159#issuecomment-661404799 **[Test build #126203 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126203/testReport)** for PR 29159 at commit [`16abf9c`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #29159: [SPARK-32310][ML][PySpark][3.0] ML params default value parity

2020-07-20 Thread GitBox
SparkQA removed a comment on pull request #29159: URL: https://github.com/apache/spark/pull/29159#issuecomment-661344542 **[Test build #126203 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126203/testReport)** for PR 29159 at commit [`16abf9c`](https://gi

[GitHub] [spark] AmplabJenkins commented on pull request #29159: [SPARK-32310][ML][PySpark][3.0] ML params default value parity

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #29159: URL: https://github.com/apache/spark/pull/29159#issuecomment-661405808

[GitHub] [spark] viirya commented on pull request #29107: [SPARK-32308][SQL] Move by-name resolution logic of unionByName from API code to analysis phase

2020-07-20 Thread GitBox
viirya commented on pull request #29107: URL: https://github.com/apache/spark/pull/29107#issuecomment-661406076 Thanks @mukulmurthy. I've not worked on it yet. Feel free to open a PR if you are ready to port the code.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29159: [SPARK-32310][ML][PySpark][3.0] ML params default value parity

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #29159: URL: https://github.com/apache/spark/pull/29159#issuecomment-661405808

[GitHub] [spark] ulysses-you closed pull request #29142: [SPARK-32360][SQL] Add MaxMinBy to support eliminate sorts

2020-07-20 Thread GitBox
ulysses-you closed pull request #29142: URL: https://github.com/apache/spark/pull/29142

[GitHub] [spark] ulysses-you commented on pull request #29142: [SPARK-32360][SQL] Add MaxMinBy to support eliminate sorts

2020-07-20 Thread GitBox
ulysses-you commented on pull request #29142: URL: https://github.com/apache/spark/pull/29142#issuecomment-661418274 Thanks for the negative case, seems I missed something.

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29160: [SPARK-32364][SQL] `path` argument of DataFrame.load/save should override the existing options

2020-07-20 Thread GitBox
dongjoon-hyun commented on a change in pull request #29160: URL: https://github.com/apache/spark/pull/29160#discussion_r457747220 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ## @@ -211,6 +211,7 @@ class DataFrameReader private[sql](sparkSess

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29160: [SPARK-32364][SQL] `path` argument of DataFrame.load/save should override the existing options

2020-07-20 Thread GitBox
dongjoon-hyun commented on a change in pull request #29160: URL: https://github.com/apache/spark/pull/29160#discussion_r457749747 ## File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ## @@ -211,6 +211,7 @@ class DataFrameReader private[sql](sparkSess

[GitHub] [spark] SparkQA commented on pull request #29169: [SPARK-32357][INFRA] Add a step in GitHub Actions to show failed tests

2020-07-20 Thread GitBox
SparkQA commented on pull request #29169: URL: https://github.com/apache/spark/pull/29169#issuecomment-661447178 **[Test build #126200 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126200/testReport)** for PR 29169 at commit [`ba22a2e`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #29169: [SPARK-32357][INFRA] Add a step in GitHub Actions to show failed tests

2020-07-20 Thread GitBox
SparkQA removed a comment on pull request #29169: URL: https://github.com/apache/spark/pull/29169#issuecomment-661316285 **[Test build #126200 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126200/testReport)** for PR 29169 at commit [`ba22a2e`](https://gi

[GitHub] [spark] AmplabJenkins commented on pull request #29169: [SPARK-32357][INFRA] Add a step in GitHub Actions to show failed tests

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #29169: URL: https://github.com/apache/spark/pull/29169#issuecomment-661448336

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29169: [SPARK-32357][INFRA] Add a step in GitHub Actions to show failed tests

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #29169: URL: https://github.com/apache/spark/pull/29169#issuecomment-661448336 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29169: [SPARK-32357][INFRA] Add a step in GitHub Actions to show failed tests

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #29169: URL: https://github.com/apache/spark/pull/29169#issuecomment-661448350 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/126

[GitHub] [spark] SparkQA commented on pull request #29169: [SPARK-32357][INFRA] Add a step in GitHub Actions to show failed tests

2020-07-20 Thread GitBox
SparkQA commented on pull request #29169: URL: https://github.com/apache/spark/pull/29169#issuecomment-661450980 **[Test build #126205 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126205/testReport)** for PR 29169 at commit [`7173918`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #29138: [SPARK-32338] [SQL] Overload slice to accept Column for start and length

2020-07-20 Thread GitBox
SparkQA commented on pull request #29138: URL: https://github.com/apache/spark/pull/29138#issuecomment-661451919 **[Test build #126194 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126194/testReport)** for PR 29138 at commit [`43d4f18`](https://github.co

[GitHub] [spark] AmplabJenkins commented on pull request #29169: [SPARK-32357][INFRA] Add a step in GitHub Actions to show failed tests

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #29169: URL: https://github.com/apache/spark/pull/29169#issuecomment-661451918

[GitHub] [spark] SparkQA removed a comment on pull request #29138: [SPARK-32338] [SQL] Overload slice to accept Column for start and length

2020-07-20 Thread GitBox
SparkQA removed a comment on pull request #29138: URL: https://github.com/apache/spark/pull/29138#issuecomment-661251953 **[Test build #126194 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126194/testReport)** for PR 29138 at commit [`43d4f18`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29169: [SPARK-32357][INFRA] Add a step in GitHub Actions to show failed tests

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #29169: URL: https://github.com/apache/spark/pull/29169#issuecomment-661451918

[GitHub] [spark] xuanyuanking commented on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-20 Thread GitBox
xuanyuanking commented on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-661452326 retest this please

[GitHub] [spark] AmplabJenkins commented on pull request #29138: [SPARK-32338] [SQL] Overload slice to accept Column for start and length

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #29138: URL: https://github.com/apache/spark/pull/29138#issuecomment-661453394

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29138: [SPARK-32338] [SQL] Overload slice to accept Column for start and length

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #29138: URL: https://github.com/apache/spark/pull/29138#issuecomment-661453394

[GitHub] [spark] navinvishy opened a new pull request #29170: [SPARK-30876][SQL]: Optimizer fails to infer constraints within join

2020-07-20 Thread GitBox
navinvishy opened a new pull request #29170: URL: https://github.com/apache/spark/pull/29170 ### What changes were proposed in this pull request? For a 3-way join of the kind described below, the optimizer fails to infer the constraint a=1. This appears to be because of the i

[GitHub] [spark] SparkQA commented on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-20 Thread GitBox
SparkQA commented on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-661457578 **[Test build #126206 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126206/testReport)** for PR 28977 at commit [`0556b4b`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #29170: [SPARK-30876][SQL]: Optimizer fails to infer constraints within join

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #29170: URL: https://github.com/apache/spark/pull/29170#issuecomment-661457209 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins commented on pull request #29170: [SPARK-30876][SQL]: Optimizer fails to infer constraints within join

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #29170: URL: https://github.com/apache/spark/pull/29170#issuecomment-661458294 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-661458773

[GitHub] [spark] AmplabJenkins commented on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-661458773

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29170: [SPARK-30876][SQL]: Optimizer fails to infer constraints within join

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #29170: URL: https://github.com/apache/spark/pull/29170#issuecomment-661457209 Can one of the admins verify this patch?

[GitHub] [spark] SparkQA commented on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-20 Thread GitBox
SparkQA commented on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-661464775 **[Test build #126207 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126207/testReport)** for PR 28977 at commit [`38e5ff4`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-661465707

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-661465707 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-661465722 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8

[GitHub] [spark] AmplabJenkins commented on pull request #29152: [SPARK-32356][SQL] Forbid create view with null type

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #29152: URL: https://github.com/apache/spark/pull/29152#issuecomment-661472688

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29152: [SPARK-32356][SQL] Forbid create view with null type

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #29152: URL: https://github.com/apache/spark/pull/29152#issuecomment-661472688

[GitHub] [spark] ulysses-you commented on a change in pull request #29152: [SPARK-32356][SQL] Forbid create view with null type

2020-07-20 Thread GitBox
ulysses-you commented on a change in pull request #29152: URL: https://github.com/apache/spark/pull/29152#discussion_r457762258 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ## @@ -96,6 +97,7 @@ case class CreateViewCommand( qe.a

[GitHub] [spark] ulysses-you commented on a change in pull request #29152: [SPARK-32356][SQL] Forbid create view with null type

2020-07-20 Thread GitBox
ulysses-you commented on a change in pull request #29152: URL: https://github.com/apache/spark/pull/29152#discussion_r457762298 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/PlanResolutionSuite.scala ## @@ -1557,6 +1557,19 @@ class PlanResolution

[GitHub] [spark] SparkQA commented on pull request #29152: [SPARK-32356][SQL] Forbid create view with null type

2020-07-20 Thread GitBox
SparkQA commented on pull request #29152: URL: https://github.com/apache/spark/pull/29152#issuecomment-661478293 **[Test build #126208 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126208/testReport)** for PR 29152 at commit [`94fb546`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #29152: [SPARK-32356][SQL] Forbid create view with null type

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #29152: URL: https://github.com/apache/spark/pull/29152#issuecomment-661479072

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29152: [SPARK-32356][SQL] Forbid create view with null type

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #29152: URL: https://github.com/apache/spark/pull/29152#issuecomment-661479072

[GitHub] [spark] agrawaldevesh commented on a change in pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-20 Thread GitBox
agrawaldevesh commented on a change in pull request #29032: URL: https://github.com/apache/spark/pull/29032#discussion_r457764133 ## File path: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ## @@ -871,7 +871,7 @@ private[deploy] class Master( logInfo(

[GitHub] [spark] agrawaldevesh commented on a change in pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-20 Thread GitBox
agrawaldevesh commented on a change in pull request #29032: URL: https://github.com/apache/spark/pull/29032#discussion_r457764260 ## File path: core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala ## @@ -166,11 +166,12 @@ private[spark] class Coarse

[GitHub] [spark] agrawaldevesh commented on a change in pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-20 Thread GitBox
agrawaldevesh commented on a change in pull request #29032: URL: https://github.com/apache/spark/pull/29032#discussion_r457764507 ## File path: core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala ## @@ -166,11 +166,12 @@ private[spark] class Coarse

[GitHub] [spark] agrawaldevesh commented on a change in pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-20 Thread GitBox
agrawaldevesh commented on a change in pull request #29032: URL: https://github.com/apache/spark/pull/29032#discussion_r457765945 ## File path: core/src/main/scala/org/apache/spark/scheduler/ExecutorDecommissionInfo.scala ## @@ -0,0 +1,28 @@ +/* + * Licensed to the Apache Soft

[GitHub] [spark] SparkQA removed a comment on pull request #29066: [WIP][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-07-20 Thread GitBox
SparkQA removed a comment on pull request #29066: URL: https://github.com/apache/spark/pull/29066#issuecomment-661271487 **[Test build #126195 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126195/testReport)** for PR 29066 at commit [`98fb788`](https://gi

[GitHub] [spark] SparkQA commented on pull request #29066: [WIP][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-07-20 Thread GitBox
SparkQA commented on pull request #29066: URL: https://github.com/apache/spark/pull/29066#issuecomment-661493617 **[Test build #126195 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126195/testReport)** for PR 29066 at commit [`98fb788`](https://github.co

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29066: [WIP][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #29066: URL: https://github.com/apache/spark/pull/29066#issuecomment-661495215

[GitHub] [spark] AmplabJenkins commented on pull request #29066: [WIP][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #29066: URL: https://github.com/apache/spark/pull/29066#issuecomment-661495215

[GitHub] [spark] ramrock2008 opened a new pull request #29171: Spark works despite SSL certificate in keystore has expired

2020-07-20 Thread GitBox
ramrock2008 opened a new pull request #29171: URL: https://github.com/apache/spark/pull/29171 Hello Spark developers, I'm running the following basic spark job on YARN with SSL enabled: ``` spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode clien

[GitHub] [spark] AmplabJenkins commented on pull request #29171: Spark works despite SSL certificate in keystore has expired

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #29171: URL: https://github.com/apache/spark/pull/29171#issuecomment-661497286 Can one of the admins verify this patch?

[GitHub] [spark] dongjoon-hyun opened a new pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition

2020-07-20 Thread GitBox
dongjoon-hyun opened a new pull request #29172: URL: https://github.com/apache/spark/pull/29172 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[GitHub] [spark] AmplabJenkins commented on pull request #29171: Spark works despite SSL certificate in keystore has expired

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #29171: URL: https://github.com/apache/spark/pull/29171#issuecomment-661498315 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29171: Spark works despite SSL certificate in keystore has expired

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #29171: URL: https://github.com/apache/spark/pull/29171#issuecomment-661497286 Can one of the admins verify this patch?

[GitHub] [spark] agrawaldevesh commented on a change in pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-20 Thread GitBox
agrawaldevesh commented on a change in pull request #29032: URL: https://github.com/apache/spark/pull/29032#discussion_r457770003 ## File path: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ## @@ -871,7 +871,7 @@ private[deploy] class Master( logInfo(

[GitHub] [spark] ueshin commented on pull request #29138: [SPARK-32338] [SQL] Overload slice to accept Column for start and length

2020-07-20 Thread GitBox
ueshin commented on pull request #29138: URL: https://github.com/apache/spark/pull/29138#issuecomment-661501964 Thanks! merging to master.

[GitHub] [spark] ueshin closed pull request #29138: [SPARK-32338] [SQL] Overload slice to accept Column for start and length

2020-07-20 Thread GitBox
ueshin closed pull request #29138: URL: https://github.com/apache/spark/pull/29138

[GitHub] [spark] SparkQA commented on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition

2020-07-20 Thread GitBox
SparkQA commented on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661503811 **[Test build #126209 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126209/testReport)** for PR 29172 at commit [`27bd6d5`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661504894

[GitHub] [spark] dongjoon-hyun commented on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition

2020-07-20 Thread GitBox
dongjoon-hyun commented on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661504976 We need to fix this fundamental bug first.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #29172: URL: https://github.com/apache/spark/pull/29172#issuecomment-661504894

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29172: [SPARK-32377][SQL] CaseInsensitiveMap should be deterministic for addition

2020-07-20 Thread GitBox
dongjoon-hyun commented on a change in pull request #29172: URL: https://github.com/apache/spark/pull/29172#discussion_r457773370 ## File path: sql/catalyst/src/main/scala-2.12/org/apache/spark/sql/catalyst/util/CaseInsensitiveMap.scala ## @@ -40,7 +40,7 @@ class CaseInsensiti

[GitHub] [spark] SparkQA commented on pull request #29168: [WIP][SPARK-32375][SQL] Basic functionality of table catalog v2 for JDBC

2020-07-20 Thread GitBox
SparkQA commented on pull request #29168: URL: https://github.com/apache/spark/pull/29168#issuecomment-661514577 **[Test build #126197 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126197/testReport)** for PR 29168 at commit [`43c35fc`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #29168: [WIP][SPARK-32375][SQL] Basic functionality of table catalog v2 for JDBC

2020-07-20 Thread GitBox
SparkQA removed a comment on pull request #29168: URL: https://github.com/apache/spark/pull/29168#issuecomment-661305415 **[Test build #126197 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/126197/testReport)** for PR 29168 at commit [`43c35fc`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29168: [WIP][SPARK-32375][SQL] Basic functionality of table catalog v2 for JDBC

2020-07-20 Thread GitBox
AmplabJenkins removed a comment on pull request #29168: URL: https://github.com/apache/spark/pull/29168#issuecomment-661516112

[GitHub] [spark] AmplabJenkins commented on pull request #29168: [WIP][SPARK-32375][SQL] Basic functionality of table catalog v2 for JDBC

2020-07-20 Thread GitBox
AmplabJenkins commented on pull request #29168: URL: https://github.com/apache/spark/pull/29168#issuecomment-661516112

[GitHub] [spark] dongjoon-hyun commented on pull request #29142: [SPARK-32360][SQL] Add MaxMinBy to support eliminate sorts

2020-07-20 Thread GitBox
dongjoon-hyun commented on pull request #29142: URL: https://github.com/apache/spark/pull/29142#issuecomment-661517034 Thank you, @ulysses-you .

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29142: [SPARK-32360][SQL] Add MaxMinBy to support eliminate sorts

2020-07-20 Thread GitBox
dongjoon-hyun edited a comment on pull request #29142: URL: https://github.com/apache/spark/pull/29142#issuecomment-661517034 Thank you for closing, @ulysses-you .

[GitHub] [spark] agrawaldevesh commented on a change in pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning

2020-07-20 Thread GitBox
agrawaldevesh commented on a change in pull request #29014: URL: https://github.com/apache/spark/pull/29014#discussion_r457776437 ## File path: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ## @@ -912,13 +914,39 @@ private[spark] class TaskSchedulerImp

[GitHub] [spark] agrawaldevesh commented on a change in pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning

2020-07-20 Thread GitBox
agrawaldevesh commented on a change in pull request #29014: URL: https://github.com/apache/spark/pull/29014#discussion_r457776625 ## File path: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ## @@ -1007,6 +1035,8 @@ private[spark] class TaskSchedulerImp

[GitHub] [spark] agrawaldevesh commented on a change in pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning

2020-07-20 Thread GitBox
agrawaldevesh commented on a change in pull request #29014: URL: https://github.com/apache/spark/pull/29014#discussion_r457778416 ## File path: core/src/test/scala/org/apache/spark/deploy/client/AppClientSuite.scala ## @@ -245,8 +249,8 @@ class AppClientSuite execRemove

[GitHub] [spark] agrawaldevesh commented on a change in pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning

2020-07-20 Thread GitBox
agrawaldevesh commented on a change in pull request #29014: URL: https://github.com/apache/spark/pull/29014#discussion_r457778500 ## File path: core/src/test/scala/org/apache/spark/deploy/DecommissionWorkerSuite.scala ## @@ -0,0 +1,388 @@ +/* + * Licensed to the Apache Softwar

[GitHub] [spark] xuanyuanking commented on a change in pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-20 Thread GitBox
xuanyuanking commented on a change in pull request #28904: URL: https://github.com/apache/spark/pull/28904#discussion_r457778586 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CompactibleFileStreamLog.scala ## @@ -106,10 +106,8 @@ abstract class

[GitHub] [spark] HeartSaVioR commented on pull request #29149: [SPARK-32350][CORE] Add batch-write on LevelDB to improve performance of HybridStore

2020-07-20 Thread GitBox
HeartSaVioR commented on pull request #29149: URL: https://github.com/apache/spark/pull/29149#issuecomment-661527330 @mridulm @tgravescs I'm planning to merge this tomorrow. Please comment if you'd like more time to review. Thanks!

[GitHub] [spark] xuanyuanking commented on pull request #24173: [SPARK-27237][SS] Introduce State schema validation among query restart

2020-07-20 Thread GitBox
xuanyuanking commented on pull request #24173: URL: https://github.com/apache/spark/pull/24173#issuecomment-661528439 Sorry for the delay; I'll review this one this week. I think it's the other necessary guard for the state store format, together with #28707.

[GitHub] [spark] leanken commented on a change in pull request #29104: [SPARK-32290][SQL] SingleColumn Null Aware Anti Join Optimize

2020-07-20 Thread GitBox
leanken commented on a change in pull request #29104: URL: https://github.com/apache/spark/pull/29104#discussion_r457784068 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -2671,6 +2671,25 @@ object SQLConf { .checkValue(_ > 0

[GitHub] [spark] leanken commented on a change in pull request #29104: [SPARK-32290][SQL] SingleColumn Null Aware Anti Join Optimize

2020-07-20 Thread GitBox
leanken commented on a change in pull request #29104: URL: https://github.com/apache/spark/pull/29104#discussion_r457786215 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala ## @@ -44,7 +45,7 @@ case class BroadcastHashJoinE

[GitHub] [spark] leanken commented on a change in pull request #29104: [SPARK-32290][SQL] SingleColumn Null Aware Anti Join Optimize

2020-07-20 Thread GitBox
leanken commented on a change in pull request #29104: URL: https://github.com/apache/spark/pull/29104#discussion_r457786737 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -2671,6 +2671,25 @@ object SQLConf { .checkValue(_ > 0

[GitHub] [spark] viirya commented on a change in pull request #29139: [SPARK-32339][ML][DOC] Improve MLlib BLAS native acceleration docs

2020-07-20 Thread GitBox
viirya commented on a change in pull request #29139: URL: https://github.com/apache/spark/pull/29139#discussion_r457788794 ## File path: docs/ml-linalg-guide.md ## @@ -0,0 +1,105 @@ +--- +layout: global +title: MLlib Linear Algebra Acceleration Guide +displayTitle: MLlib Linear

[GitHub] [spark] agrawaldevesh commented on a change in pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning

2020-07-20 Thread GitBox
agrawaldevesh commented on a change in pull request #29014: URL: https://github.com/apache/spark/pull/29014#discussion_r457791246 ## File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ## @@ -1767,10 +1767,18 @@ private[spark] class DAGScheduler(

[GitHub] [spark] leanken commented on a change in pull request #29104: [SPARK-32290][SQL] SingleColumn Null Aware Anti Join Optimize

2020-07-20 Thread GitBox
leanken commented on a change in pull request #29104: URL: https://github.com/apache/spark/pull/29104#discussion_r457791849 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala ## @@ -385,55 +408,101 @@ case class BroadcastHash

[GitHub] [spark] agrawaldevesh commented on a change in pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning

2020-07-20 Thread GitBox
agrawaldevesh commented on a change in pull request #29014: URL: https://github.com/apache/spark/pull/29014#discussion_r457791845 ## File path: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ## @@ -912,13 +914,39 @@ private[spark] class TaskSchedulerImp
