[GitHub] [spark] AmplabJenkins commented on pull request #29130: [SPARK-32330][SQL] Preserve shuffled hash join build side partitioning

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29130: URL: https://github.com/apache/spark/pull/29130#issuecomment-659176704 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] jiangxb1987 commented on a change in pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
jiangxb1987 commented on a change in pull request #29032: URL: https://github.com/apache/spark/pull/29032#discussion_r455525039 ## File path: core/src/main/scala/org/apache/spark/deploy/client/StandaloneAppClient.scala ## @@ -181,7 +182,8 @@ private[spark] class

[GitHub] [spark] c21 commented on pull request #29130: [SPARK-32330][SQL] Preserve shuffled hash join build side partitioning

2020-07-15 Thread GitBox
c21 commented on pull request #29130: URL: https://github.com/apache/spark/pull/29130#issuecomment-659174967 cc @maropu, @cloud-fan, @gatorsmile and @sameeragarwal if you guys can help take a look. Thanks!

[GitHub] [spark] jiangxb1987 commented on a change in pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
jiangxb1987 commented on a change in pull request #29032: URL: https://github.com/apache/spark/pull/29032#discussion_r455524790 ## File path: core/src/main/scala/org/apache/spark/deploy/client/StandaloneAppClient.scala ## @@ -181,7 +182,8 @@ private[spark] class

[GitHub] [spark] c21 opened a new pull request #29130: [SPARK-32330][SQL] Preserve shuffled hash join build side partitioning

2020-07-15 Thread GitBox
c21 opened a new pull request #29130: URL: https://github.com/apache/spark/pull/29130 ### What changes were proposed in this pull request? Currently `ShuffledHashJoin.outputPartitioning` inherits from `HashJoin.outputPartitioning`, which only preserves stream side

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-659172455

[GitHub] [spark] SparkQA commented on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits

2020-07-15 Thread GitBox
SparkQA commented on pull request #28961: URL: https://github.com/apache/spark/pull/28961#issuecomment-659172738 **[Test build #125937 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125937/testReport)** for PR 28961 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-659172455

[GitHub] [spark] AmplabJenkins commented on pull request #29021: [WIP][SPARK-32201][SQL] More general skew join pattern matching

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29021: URL: https://github.com/apache/spark/pull/29021#issuecomment-659171827

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29021: [WIP][SPARK-32201][SQL] More general skew join pattern matching

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29021: URL: https://github.com/apache/spark/pull/29021#issuecomment-659171827

[GitHub] [spark] SparkQA removed a comment on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-659112300 **[Test build #125925 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125925/testReport)** for PR 29032 at commit

[GitHub] [spark] zhengruifeng commented on pull request #29095: [SPARK-32298][ML] tree models prediction optimization

2020-07-15 Thread GitBox
zhengruifeng commented on pull request #29095: URL: https://github.com/apache/spark/pull/29095#issuecomment-659171262 @viirya This PR is only on cpu.

[GitHub] [spark] SparkQA commented on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
SparkQA commented on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-659171509 **[Test build #125925 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125925/testReport)** for PR 29032 at commit

[GitHub] [spark] viirya commented on a change in pull request #29107: [SPARK-32308][SQL] Move by-name resolution logic of unionByName from API code to analysis phase

2020-07-15 Thread GitBox
viirya commented on a change in pull request #29107: URL: https://github.com/apache/spark/pull/29107#discussion_r455521016 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ## @@ -1099,6 +1101,64 @@ object TypeCoercion {

[GitHub] [spark] viirya commented on pull request #29095: [SPARK-32298][ML] tree models prediction optimization

2020-07-15 Thread GitBox
viirya commented on pull request #29095: URL: https://github.com/apache/spark/pull/29095#issuecomment-659170171 Is this also a memory optimization? It looks like a CPU time optimization from the description.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29021: [WIP][SPARK-32201][SQL] More general skew join pattern matching

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29021: URL: https://github.com/apache/spark/pull/29021#issuecomment-659169320

[GitHub] [spark] AmplabJenkins commented on pull request #29021: [WIP][SPARK-32201][SQL] More general skew join pattern matching

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29021: URL: https://github.com/apache/spark/pull/29021#issuecomment-659169320

[GitHub] [spark] jiangxb1987 commented on a change in pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
jiangxb1987 commented on a change in pull request #29015: URL: https://github.com/apache/spark/pull/29015#discussion_r455517786 ## File path: core/src/main/scala/org/apache/spark/deploy/master/ui/MasterWebUI.scala ## @@ -49,6 +55,26 @@ class MasterWebUI( "/app/kill",

[GitHub] [spark] viirya commented on a change in pull request #29112: [SPARK-32310][ML][PySpark] ML params default value parity in classification, regression, clustering and fpm

2020-07-15 Thread GitBox
viirya commented on a change in pull request #29112: URL: https://github.com/apache/spark/pull/29112#discussion_r455519563 ## File path: mllib/src/main/scala/org/apache/spark/ml/classification/FMClassifier.scala ## @@ -85,7 +85,6 @@ class FMClassifier @Since("3.0.0") ( */

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-659168327

[GitHub] [spark] AmplabJenkins commented on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-659168327

[GitHub] [spark] SparkQA commented on pull request #29126: [SPARK-32324][SQL]Fix error messages during using PIVOT and lateral view

2020-07-15 Thread GitBox
SparkQA commented on pull request #29126: URL: https://github.com/apache/spark/pull/29126#issuecomment-659168426 **[Test build #125936 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125936/testReport)** for PR 29126 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-659118412 **[Test build #125926 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125926/testReport)** for PR 29032 at commit

[GitHub] [spark] SparkQA commented on pull request #29032: [SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

2020-07-15 Thread GitBox
SparkQA commented on pull request #29032: URL: https://github.com/apache/spark/pull/29032#issuecomment-659167771 **[Test build #125926 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125926/testReport)** for PR 29032 at commit

[GitHub] [spark] zhengruifeng commented on a change in pull request #29112: [SPARK-32310][ML][PySpark] ML params default value parity in classification, regression, clustering and fpm

2020-07-15 Thread GitBox
zhengruifeng commented on a change in pull request #29112: URL: https://github.com/apache/spark/pull/29112#discussion_r455517498 ## File path: mllib/src/main/scala/org/apache/spark/ml/clustering/BisectingKMeans.scala ## @@ -68,6 +68,12 @@ private[clustering] trait

[GitHub] [spark] zhengruifeng commented on pull request #29095: [SPARK-32298][ML] tree models prediction optimization

2020-07-15 Thread GitBox
zhengruifeng commented on pull request #29095: URL: https://github.com/apache/spark/pull/29095#issuecomment-659165588 friendly ping @huaxingao @srowen @viirya Different from another attempt to save RAM, this should be a clear optimization. I found that those methods cannot be marked

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29045: URL: https://github.com/apache/spark/pull/29045#issuecomment-659162582

[GitHub] [spark] AmplabJenkins commented on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29045: URL: https://github.com/apache/spark/pull/29045#issuecomment-659162582

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-659162307 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-659162291 Merged build finished. Test PASSed.

[GitHub] [spark] SparkQA commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
SparkQA commented on pull request #29015: URL: https://github.com/apache/spark/pull/29015#issuecomment-659162409 **[Test build #125935 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125935/testReport)** for PR 29015 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-659162291

[GitHub] [spark] SparkQA removed a comment on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-659048554 **[Test build #125912 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125912/testReport)** for PR 27366 at commit

[GitHub] [spark] SparkQA commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-15 Thread GitBox
SparkQA commented on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-659161699 **[Test build #125912 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125912/testReport)** for PR 27366 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-659160481

[GitHub] [spark] AmplabJenkins commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-659160481

[GitHub] [spark] venkata91 edited a comment on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due to spa

2020-07-15 Thread GitBox
venkata91 edited a comment on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-658977388 > you can make a common function that has most of the code that gets called from 2 separate tests. one test passes with dynamic allocation on, the other with it off.

[GitHub] [spark] HyukjinKwon commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-15 Thread GitBox
HyukjinKwon commented on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-659159126 retest this please

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29014: URL: https://github.com/apache/spark/pull/29014#issuecomment-659158284

[GitHub] [spark] AmplabJenkins commented on pull request #29129: [SPARK-31831] [SQL] [TESTS] put mocks in hive version subdirectory

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29129: URL: https://github.com/apache/spark/pull/29129#issuecomment-659158107 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins commented on pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29014: URL: https://github.com/apache/spark/pull/29014#issuecomment-659158284

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29129: [SPARK-31831] [SQL] [TESTS] put mocks in hive version subdirectory

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29129: URL: https://github.com/apache/spark/pull/29129#issuecomment-659157810 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins commented on pull request #29129: [SPARK-31831] [SQL] [TESTS] put mocks in hive version subdirectory

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29129: URL: https://github.com/apache/spark/pull/29129#issuecomment-659157810 Can one of the admins verify this patch?

[GitHub] [spark] frankyin-factual opened a new pull request #29129: [SPARK-31831] [SQL] [TESTS] put mocks in hive version subdirectory

2020-07-15 Thread GitBox
frankyin-factual opened a new pull request #29129: URL: https://github.com/apache/spark/pull/29129 ### What changes were proposed in this pull request? Put version-dependent Hive mocks into their own subdirectories. ### Why are the changes needed? Fix broken hive

[GitHub] [spark] frankyin-factual commented on pull request #29069: [SPARK-31831][SQL][TESTS] Use subclasses for mock in HiveSessionImplSuite

2020-07-15 Thread GitBox
frankyin-factual commented on pull request #29069: URL: https://github.com/apache/spark/pull/29069#issuecomment-659157488 @HeartSaVioR @dongjoon-hyun https://github.com/apache/spark/pull/29129

[GitHub] [spark] agrawaldevesh commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
agrawaldevesh commented on pull request #29015: URL: https://github.com/apache/spark/pull/29015#issuecomment-659156968 Retest this please.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29014: URL: https://github.com/apache/spark/pull/29014#issuecomment-654385667 Can one of the admins verify this patch?

[GitHub] [spark] cloud-fan commented on pull request #29014: [SPARK-32199][SPARK-32198] Reduce job failures during decommissioning

2020-07-15 Thread GitBox
cloud-fan commented on pull request #29014: URL: https://github.com/apache/spark/pull/29014#issuecomment-659156705 ok to test

[GitHub] [spark] agrawaldevesh commented on a change in pull request #28708: [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

2020-07-15 Thread GitBox
agrawaldevesh commented on a change in pull request #28708: URL: https://github.com/apache/spark/pull/28708#discussion_r455424422 ## File path: core/src/main/scala/org/apache/spark/network/netty/NettyBlockTransferService.scala ## @@ -168,7 +168,10 @@ private[spark] class

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29128: [SPARK-32329][TESTS] Rename HADOOP2_MODULE_PROFILES to HADOOP_MODULE_PROFILES

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29128: URL: https://github.com/apache/spark/pull/29128#issuecomment-659155710 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins commented on pull request #29128: [SPARK-32329][TESTS] Rename HADOOP2_MODULE_PROFILES to HADOOP_MODULE_PROFILES

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29128: URL: https://github.com/apache/spark/pull/29128#issuecomment-659156074 Can one of the admins verify this patch?

[GitHub] [spark] agrawaldevesh commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
agrawaldevesh commented on pull request #29015: URL: https://github.com/apache/spark/pull/29015#issuecomment-659155613 jenkins retest this please

[GitHub] [spark] AmplabJenkins commented on pull request #29128: [SPARK-32329][TESTS] Rename HADOOP2_MODULE_PROFILES to HADOOP_MODULE_PROFILES

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29128: URL: https://github.com/apache/spark/pull/29128#issuecomment-659155710 Can one of the admins verify this patch?

[GitHub] [spark] williamhyun opened a new pull request #29128: [SPARK-XXX][TESTS] Rename HADOOP2_MODULE_PROFILES to HADOOP_MODULE_PROFILES

2020-07-15 Thread GitBox
williamhyun opened a new pull request #29128: URL: https://github.com/apache/spark/pull/29128 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[GitHub] [spark] frankyin-factual commented on pull request #29069: [SPARK-31831][SQL][TESTS] Use subclasses for mock in HiveSessionImplSuite

2020-07-15 Thread GitBox
frankyin-factual commented on pull request #29069: URL: https://github.com/apache/spark/pull/29069#issuecomment-659154829 I am working on a combination of 1) and 2). Will push shortly.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #28676: URL: https://github.com/apache/spark/pull/28676#issuecomment-659154211

[GitHub] [spark] AmplabJenkins commented on pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #28676: URL: https://github.com/apache/spark/pull/28676#issuecomment-659154211

[GitHub] [spark] imback82 commented on a change in pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning

2020-07-15 Thread GitBox
imback82 commented on a change in pull request #28676: URL: https://github.com/apache/spark/pull/28676#discussion_r455505839 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala ## @@ -60,6 +62,67 @@ case class

[GitHub] [spark] dongjoon-hyun commented on pull request #29069: [SPARK-31831][SQL][TESTS] Use subclasses for mock in HiveSessionImplSuite

2020-07-15 Thread GitBox
dongjoon-hyun commented on pull request #29069: URL: https://github.com/apache/spark/pull/29069#issuecomment-659151759 Thank you. I'm fine with any combination (including Hive 2.3-only testing). Please feel free to choose an option. From my side, this also looks not urgent since this is

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659150287

[GitHub] [spark] AmplabJenkins commented on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659150287

[GitHub] [spark] HeartSaVioR commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-15 Thread GitBox
HeartSaVioR commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-659149320 Technically it's a private API, not even tagged as developer API; that said, it doesn't break anything from Spark's perspective. If we have confusion with availability of

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due t

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-659148768 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] SparkQA commented on pull request #27694: [SPARK-30946][SS] Serde entry via DataInputStream/DataOutputStream with LZ4 compression on FileStream(Source/Sink)Log

2020-07-15 Thread GitBox
SparkQA commented on pull request #27694: URL: https://github.com/apache/spark/pull/27694#issuecomment-659148901 **[Test build #125934 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125934/testReport)** for PR 27694 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due t

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-659148755 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins commented on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due to spark'

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-659148755

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-659148369

[GitHub] [spark] AmplabJenkins commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-659148369

[GitHub] [spark] SparkQA removed a comment on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due to spar

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-659083280 **[Test build #125918 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125918/testReport)** for PR 28287 at commit

[GitHub] [spark] SparkQA commented on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due to spark's blac

2020-07-15 Thread GitBox
SparkQA commented on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-659148089 **[Test build #125918 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125918/testReport)** for PR 28287 at commit

[GitHub] [spark] dongjoon-hyun commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-15 Thread GitBox
dongjoon-hyun commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-659147381 Thank you for the quick update, @aokolnychyi. Also, thank you all for your opinions.

[GitHub] [spark] HyukjinKwon commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-15 Thread GitBox
HyukjinKwon commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-659146783 retest this please

[GitHub] [spark] HeartSaVioR edited a comment on pull request #29069: [SPARK-31831][SQL][TESTS] Use subclasses for mock in HiveSessionImplSuite

2020-07-15 Thread GitBox
HeartSaVioR edited a comment on pull request #29069: URL: https://github.com/apache/spark/pull/29069#issuecomment-659144203 I guess we have several possible approaches here: 1. place the suite to the Hive-version specific directory (with new config on pom.xml to add the test source

[GitHub] [spark] HeartSaVioR commented on pull request #29069: [SPARK-31831][SQL][TESTS] Use subclasses for mock in HiveSessionImplSuite

2020-07-15 Thread GitBox
HeartSaVioR commented on pull request #29069: URL: https://github.com/apache/spark/pull/29069#issuecomment-659144203 I guess we have several possible approaches here: 1. place the suite to the Hive-version specific directory (with new config on pom.xml to add the test source based

[GitHub] [spark] leanken commented on pull request #29104: [SPARK-32290][SQL] NotInSubquery SingleColumn Optimize

2020-07-15 Thread GitBox
leanken commented on pull request #29104: URL: https://github.com/apache/spark/pull/29104#issuecomment-659144171 > For example. > -- Case 4 > -- (one column null, other column matches a row in the subquery result -> row not returned) > SELECT * > FROM m > WHERE b = 1.0 --

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29125: URL: https://github.com/apache/spark/pull/29125#issuecomment-659142626 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due t

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-659142963

[GitHub] [spark] AmplabJenkins commented on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due to spark'

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-659142963 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-15 Thread GitBox
SparkQA commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-659143085 **[Test build #125933 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125933/testReport)** for PR 28904 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29125: URL: https://github.com/apache/spark/pull/29125#issuecomment-659142621 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] SparkQA commented on pull request #29002: [SPARK-32175][CORE] Fix the order between initialization for ExecutorPlugin and starting heartbeat thread

2020-07-15 Thread GitBox
SparkQA commented on pull request #29002: URL: https://github.com/apache/spark/pull/29002#issuecomment-659142838 **[Test build #125932 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125932/testReport)** for PR 29002 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due to spar

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-659088466 **[Test build #125921 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125921/testReport)** for PR 28287 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29125: URL: https://github.com/apache/spark/pull/29125#issuecomment-659142621 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #28287: [SPARK-31418][SCHEDULER] Request more executors in case of dynamic allocation is enabled and a task becomes unschedulable due to spark's blac

2020-07-15 Thread GitBox
SparkQA commented on pull request #28287: URL: https://github.com/apache/spark/pull/28287#issuecomment-659142331 **[Test build #125921 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125921/testReport)** for PR 28287 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #29125: URL: https://github.com/apache/spark/pull/29125#issuecomment-659075914 **[Test build #125916 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125916/testReport)** for PR 29125 at commit

[GitHub] [spark] SparkQA commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

2020-07-15 Thread GitBox
SparkQA commented on pull request #29125: URL: https://github.com/apache/spark/pull/29125#issuecomment-659142220 **[Test build #125916 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125916/testReport)** for PR 29125 at commit

[GitHub] [spark] xuanyuanking commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-15 Thread GitBox
xuanyuanking commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-659141316 `Well, I guess I already explained why compactLogs is the culprit of the memory issue, right? (#28904 (comment))` Yep that's right. I'm also looking at the code in

[GitHub] [spark] AmplabJenkins commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-659141023 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-659141023 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] HyukjinKwon commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-15 Thread GitBox
HyukjinKwon commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-659141249 @maropu, per the documentation [Spark Project Improvement Proposals (SPIP)](http://spark.apache.org/improvement-proposals.html), if you feel like it needs an SPIP, it does.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29015: URL: https://github.com/apache/spark/pull/29015#issuecomment-659139670 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] HyukjinKwon commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-15 Thread GitBox
HyukjinKwon commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-659139846 I just saw the comment. Thanks for summarizing @revans2. This is an automated message from the Apache Git

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #29015: URL: https://github.com/apache/spark/pull/29015#issuecomment-659139663 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To

[GitHub] [spark] frankyin-factual commented on pull request #29069: [SPARK-31831][SQL][TESTS] Use subclasses for mock in HiveSessionImplSuite

2020-07-15 Thread GitBox
frankyin-factual commented on pull request #29069: URL: https://github.com/apache/spark/pull/29069#issuecomment-659139939 I will also take a look. This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] SparkQA commented on pull request #29089: [SPARK-32276][SQL] Remove redundant sorts before repartition nodes

2020-07-15 Thread GitBox
SparkQA commented on pull request #29089: URL: https://github.com/apache/spark/pull/29089#issuecomment-659139822 **[Test build #125931 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125931/testReport)** for PR 29089 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #29015: URL: https://github.com/apache/spark/pull/29015#issuecomment-659139663 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA removed a comment on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
SparkQA removed a comment on pull request #29015: URL: https://github.com/apache/spark/pull/29015#issuecomment-659093134 **[Test build #125922 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125922/testReport)** for PR 29015 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-15 Thread GitBox
AmplabJenkins removed a comment on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-659139079 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #29015: [SPARK-32215] Expose a (protected) /workers/kill endpoint on the MasterWebUI

2020-07-15 Thread GitBox
SparkQA commented on pull request #29015: URL: https://github.com/apache/spark/pull/29015#issuecomment-659139325 **[Test build #125922 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125922/testReport)** for PR 29015 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-15 Thread GitBox
AmplabJenkins commented on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-659139079 This is an automated message from the Apache Git Service. To respond to the message, please log on to
