[GitHub] [spark] SparkQA commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-24 Thread GitBox
SparkQA commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698758315 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33725/

[GitHub] [spark] cloud-fan commented on pull request #29756: [SPARK-32885][SS] Add DataStreamReader.table API

2020-09-24 Thread GitBox
cloud-fan commented on pull request #29756: URL: https://github.com/apache/spark/pull/29756#issuecomment-698755963 thanks, merging to master! This is an automated message from the Apache Git Service. To respond to the message

[GitHub] [spark] cloud-fan closed pull request #29756: [SPARK-32885][SS] Add DataStreamReader.table API

2020-09-24 Thread GitBox
cloud-fan closed pull request #29756: URL: https://github.com/apache/spark/pull/29756

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698750504

[GitHub] [spark] AmplabJenkins commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-24 Thread GitBox
AmplabJenkins commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698750504

[GitHub] [spark] SparkQA removed a comment on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-24 Thread GitBox
SparkQA removed a comment on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698725725 **[Test build #129103 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129103/testReport)** for PR 29850 at commit [`57b3f98`](https://gi

[GitHub] [spark] SparkQA commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-24 Thread GitBox
SparkQA commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698750147 **[Test build #129103 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129103/testReport)** for PR 29850 at commit [`57b3f98`](https://github.co

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698747734 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/129

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698747726 Merged build finished. Test FAILed.

[GitHub] [spark] SparkQA commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-24 Thread GitBox
SparkQA commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698748122 **[Test build #129105 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129105/testReport)** for PR 29850 at commit [`57b3f98`](https://github.com

[GitHub] [spark] SparkQA removed a comment on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-24 Thread GitBox
SparkQA removed a comment on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698711104 **[Test build #129102 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129102/testReport)** for PR 29869 at commit [`ba76918`](https://gi

[GitHub] [spark] AmplabJenkins commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-24 Thread GitBox
AmplabJenkins commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698747726

[GitHub] [spark] SparkQA commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-24 Thread GitBox
SparkQA commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698747241 **[Test build #129102 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129102/testReport)** for PR 29869 at commit [`ba76918`](https://github.co

[GitHub] [spark] zhengruifeng commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-24 Thread GitBox
zhengruifeng commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698747256 retest this please

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698746748 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698746743 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-24 Thread GitBox
AmplabJenkins commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698746743

[GitHub] [spark] SparkQA commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-24 Thread GitBox
SparkQA commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698746730 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33724/

[GitHub] [spark] AmplabJenkins commented on pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-24 Thread GitBox
AmplabJenkins commented on pull request #29795: URL: https://github.com/apache/spark/pull/29795#issuecomment-698746139

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29795: URL: https://github.com/apache/spark/pull/29795#issuecomment-698746139

[GitHub] [spark] SparkQA removed a comment on pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-24 Thread GitBox
SparkQA removed a comment on pull request #29795: URL: https://github.com/apache/spark/pull/29795#issuecomment-698674623 **[Test build #129098 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129098/testReport)** for PR 29795 at commit [`53d83b6`](https://gi

[GitHub] [spark] SparkQA commented on pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-24 Thread GitBox
SparkQA commented on pull request #29795: URL: https://github.com/apache/spark/pull/29795#issuecomment-698745385 **[Test build #129098 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129098/testReport)** for PR 29795 at commit [`53d83b6`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external heavy accumulators before they entering into listener event loop

2020-09-24 Thread GitBox
SparkQA commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698742548 **[Test build #129104 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129104/testReport)** for PR 29869 at commit [`13b0eac`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-24 Thread GitBox
SparkQA commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698739307 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33724/

[GitHub] [spark] cloud-fan commented on a change in pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-24 Thread GitBox
cloud-fan commented on a change in pull request #29795: URL: https://github.com/apache/spark/pull/29795#discussion_r494764849 ## File path: sql/core/src/test/scala/org/apache/spark/sql/UpdateFieldsBenchmark.scala ## @@ -0,0 +1,310 @@ +/* + * Licensed to the Apache Software Fou

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29843: [WIP][SPARK-29250][test-maven][test-hadoop2.7] Upgrade to Hadoop 3.2.1 and move to shaded client

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29843: URL: https://github.com/apache/spark/pull/29843#issuecomment-698737847 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins commented on pull request #29843: [WIP][SPARK-29250][test-maven][test-hadoop2.7] Upgrade to Hadoop 3.2.1 and move to shaded client

2020-09-24 Thread GitBox
AmplabJenkins commented on pull request #29843: URL: https://github.com/apache/spark/pull/29843#issuecomment-698737847

[GitHub] [spark] cloud-fan commented on a change in pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-24 Thread GitBox
cloud-fan commented on a change in pull request #29795: URL: https://github.com/apache/spark/pull/29795#discussion_r494764069 ## File path: sql/core/src/test/scala/org/apache/spark/sql/UpdateFieldsBenchmark.scala ## @@ -0,0 +1,310 @@ +/* + * Licensed to the Apache Software Fou

[GitHub] [spark] cloud-fan commented on a change in pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-24 Thread GitBox
cloud-fan commented on a change in pull request #29795: URL: https://github.com/apache/spark/pull/29795#discussion_r494763126 ## File path: sql/core/src/test/scala/org/apache/spark/sql/UpdateFieldsBenchmark.scala ## @@ -0,0 +1,310 @@ +/* + * Licensed to the Apache Software Fou

[GitHub] [spark] cloud-fan commented on a change in pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-24 Thread GitBox
cloud-fan commented on a change in pull request #29795: URL: https://github.com/apache/spark/pull/29795#discussion_r494762975 ## File path: sql/core/src/test/scala/org/apache/spark/sql/UpdateFieldsBenchmark.scala ## @@ -0,0 +1,310 @@ +/* + * Licensed to the Apache Software Fou

[GitHub] [spark] MLnick commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-24 Thread GitBox
MLnick commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698736348 > > @MLnick I just do a simple test, and it shows that we can obtain about 11% speedup. This seems a worthwhile optimization then, thanks!

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29828: [SPARK-32948][SQL] Optimize to_json and from_json expression chain

2020-09-24 Thread GitBox
dongjoon-hyun commented on a change in pull request #29828: URL: https://github.com/apache/spark/pull/29828#discussion_r494754892 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/OptimizeJsonExprs.scala ## @@ -0,0 +1,38 @@ +/* + * Licensed to t

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29869: [WIP][SPARK-32994][CORE] Update external accumulators before they entering into Spark listener event loop

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698726138

[GitHub] [spark] AmplabJenkins commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external accumulators before they entering into Spark listener event loop

2020-09-24 Thread GitBox
AmplabJenkins commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698726138

[GitHub] [spark] SparkQA commented on pull request #29869: [WIP][SPARK-32994][CORE] Update external accumulators before they entering into Spark listener event loop

2020-09-24 Thread GitBox
SparkQA commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698726125 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33723/

[GitHub] [spark] SparkQA commented on pull request #29850: [SPARK-32974][ML] FeatureHasher transform optimization

2020-09-24 Thread GitBox
SparkQA commented on pull request #29850: URL: https://github.com/apache/spark/pull/29850#issuecomment-698725725 **[Test build #129103 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129103/testReport)** for PR 29850 at commit [`57b3f98`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698724874

[GitHub] [spark] SparkQA commented on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-24 Thread GitBox
SparkQA commented on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698724864 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33722/

[GitHub] [spark] AmplabJenkins commented on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-24 Thread GitBox
AmplabJenkins commented on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698724874

[GitHub] [spark] dongjoon-hyun closed pull request #29863: [SPARK-32877][SQL][TEST] Add test for Hive UDF complex decimal type

2020-09-24 Thread GitBox
dongjoon-hyun closed pull request #29863: URL: https://github.com/apache/spark/pull/29863

[GitHub] [spark] zhengruifeng commented on pull request #29868: [SPARK-32973][ML][DOC] FeatureHasher does not check categoricalCols in inputCols

2020-09-24 Thread GitBox
zhengruifeng commented on pull request #29868: URL: https://github.com/apache/spark/pull/29868#issuecomment-698722302 @huaxingao Great catch! Yes, I need to modify https://github.com/apache/spark/pull/29850 to make sure only columns in `inputCols` can be taken into account. Thanks!

[GitHub] [spark] SparkQA commented on pull request #29869: [SPARK-32994][CORE] Update external accumulators before they entering into Spark listener event loop

2020-09-24 Thread GitBox
SparkQA commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698721444 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33723/

[GitHub] [spark] SparkQA commented on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-24 Thread GitBox
SparkQA commented on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698718868 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33722/

[GitHub] [spark] cloud-fan commented on a change in pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-24 Thread GitBox
cloud-fan commented on a change in pull request #29795: URL: https://github.com/apache/spark/pull/29795#discussion_r494745836 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Column.scala ## @@ -901,39 +901,125 @@ class Column(val expr: Expression) extends Logging {

[GitHub] [spark] stczwd commented on a change in pull request #29339: [Spark-32512][SQL] add alter table add/drop partition command for datasourcev2

2020-09-24 Thread GitBox
stczwd commented on a change in pull request #29339: URL: https://github.com/apache/spark/pull/29339#discussion_r494743750 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala ## @@ -551,3 +552,31 @@ case class ShowFunctions(

[GitHub] [spark] cloud-fan commented on pull request #29339: [Spark-32512][SQL] add alter table add/drop partition command for datasourcev2

2020-09-24 Thread GitBox
cloud-fan commented on pull request #29339: URL: https://github.com/apache/spark/pull/29339#issuecomment-698715577 Hi @stczwd , can you follow https://github.com/apache/spark/pull/29866 and implement the commands using `UnresolvedTableOrView`? cc @imback82

[GitHub] [spark] huaxingao commented on pull request #29868: [SPARK-32973][ML][DOC] FeatureHasher does not check categoricalCols in inputCols

2020-09-24 Thread GitBox
huaxingao commented on pull request #29868: URL: https://github.com/apache/spark/pull/29868#issuecomment-698714611 The change looks good to me, but do you need to change the implementation in https://github.com/apache/spark/pull/29850? It seems that in that PR you are assuming `categoricalCols`

[GitHub] [spark] cloud-fan closed pull request #29866: [SPARK-32990][SQL] Migrate REFRESH TABLE to use UnresolvedTableOrView to resolve the identifier

2020-09-24 Thread GitBox
cloud-fan closed pull request #29866: URL: https://github.com/apache/spark/pull/29866

[GitHub] [spark] cloud-fan commented on pull request #29866: [SPARK-32990][SQL] Migrate REFRESH TABLE to use UnresolvedTableOrView to resolve the identifier

2020-09-24 Thread GitBox
cloud-fan commented on pull request #29866: URL: https://github.com/apache/spark/pull/29866#issuecomment-698713409 thanks, merging to master!

[GitHub] [spark] cloud-fan commented on a change in pull request #27604: [SPARK-30849][CORE][SHUFFLE]Fix application failed due to failed to get MapStatuses broadcast block

2020-09-24 Thread GitBox
cloud-fan commented on a change in pull request #27604: URL: https://github.com/apache/spark/pull/27604#discussion_r494738627 ## File path: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ## @@ -851,8 +852,16 @@ private[spark] class MapOutputTrackerWorker(conf: Spa

[GitHub] [spark] SparkQA commented on pull request #29869: [SPARK-32994][CORE] Update external accumulators before they entering into Spark listener event loop

2020-09-24 Thread GitBox
SparkQA commented on pull request #29869: URL: https://github.com/apache/spark/pull/29869#issuecomment-698711104 **[Test build #129102 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129102/testReport)** for PR 29869 at commit [`ba76918`](https://github.com

[GitHub] [spark] LantaoJin opened a new pull request #29869: [SPARK-32994][CORE] Update external accumulators before they entering into Spark listener event loop

2020-09-24 Thread GitBox
LantaoJin opened a new pull request #29869: URL: https://github.com/apache/spark/pull/29869 ### What changes were proposed in this pull request? Add a configuration to `LISTENER_BUS_ALLOW_EXTERNAL_ACCUMULATORS_ENTER_EVENT` and update the external accumulators which name is not started w

[GitHub] [spark] SparkQA commented on pull request #29800: [SPARK-32934][SQL] Improve the performance for NTH_VALUE and reactor the OffsetWindowFunction

2020-09-24 Thread GitBox
SparkQA commented on pull request #29800: URL: https://github.com/apache/spark/pull/29800#issuecomment-698709228 **[Test build #129101 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129101/testReport)** for PR 29800 at commit [`875b92b`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29868: [SPARK-32973][ML][DOC] FeatureHasher does not check categoricalCols in inputCols

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29868: URL: https://github.com/apache/spark/pull/29868#issuecomment-698708262

[GitHub] [spark] SparkQA removed a comment on pull request #29868: [SPARK-32973][ML][DOC] FeatureHasher does not check categoricalCols in inputCols

2020-09-24 Thread GitBox
SparkQA removed a comment on pull request #29868: URL: https://github.com/apache/spark/pull/29868#issuecomment-698693481 **[Test build #129100 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129100/testReport)** for PR 29868 at commit [`4317664`](https://gi

[GitHub] [spark] AmplabJenkins commented on pull request #29868: [SPARK-32973][ML][DOC] FeatureHasher does not check categoricalCols in inputCols

2020-09-24 Thread GitBox
AmplabJenkins commented on pull request #29868: URL: https://github.com/apache/spark/pull/29868#issuecomment-698708262

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29868: [SPARK-32973][ML][DOC] FeatureHasher does not check categoricalCols in inputCols

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29868: URL: https://github.com/apache/spark/pull/29868#issuecomment-698707878

[GitHub] [spark] AmplabJenkins commented on pull request #29868: [SPARK-32973][ML][DOC] FeatureHasher does not check categoricalCols in inputCols

2020-09-24 Thread GitBox
AmplabJenkins commented on pull request #29868: URL: https://github.com/apache/spark/pull/29868#issuecomment-698707878

[GitHub] [spark] SparkQA commented on pull request #29868: [SPARK-32973][ML][DOC] FeatureHasher does not check categoricalCols in inputCols

2020-09-24 Thread GitBox
SparkQA commented on pull request #29868: URL: https://github.com/apache/spark/pull/29868#issuecomment-698708074 **[Test build #129100 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129100/testReport)** for PR 29868 at commit [`4317664`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #29868: [SPARK-32973][ML][DOC] FeatureHasher does not check categoricalCols in inputCols

2020-09-24 Thread GitBox
SparkQA commented on pull request #29868: URL: https://github.com/apache/spark/pull/29868#issuecomment-698707870 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33721/

[GitHub] [spark] CodingCat commented on a change in pull request #29831: [SPARK-32351][SQL] show pushed down partition filters in explain()

2020-09-24 Thread GitBox
CodingCat commented on a change in pull request #29831: URL: https://github.com/apache/spark/pull/29831#discussion_r494734806 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/PrunePartitionSuiteBase.scala ## @@ -46,30 +46,39 @@ abstract class PrunePart

[GitHub] [spark] imback82 commented on pull request #29866: [SPARK-32990][SQL] Migrate REFRESH TABLE to use UnresolvedTableOrView to resolve the identifier

2020-09-24 Thread GitBox
imback82 commented on pull request #29866: URL: https://github.com/apache/spark/pull/29866#issuecomment-698705568 cc @cloud-fan

[GitHub] [spark] imback82 commented on a change in pull request #29866: [SPARK-32990][SQL] Migrate REFRESH TABLE to use UnresolvedTableOrView to resolve the identifier

2020-09-24 Thread GitBox
imback82 commented on a change in pull request #29866: URL: https://github.com/apache/spark/pull/29866#discussion_r494632218 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ## @@ -1381,3 +1381,22 @@ case class ShowCreateTableAsSerdeCom

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29860: [SPARK-32984][TESTS][SQL] Improve showing the differences between approved and actual plans of PlanStabilitySuite

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29860: URL: https://github.com/apache/spark/pull/29860#issuecomment-698704837

[GitHub] [spark] SparkQA commented on pull request #29860: [SPARK-32984][TESTS][SQL] Improve showing the differences between approved and actual plans of PlanStabilitySuite

2020-09-24 Thread GitBox
SparkQA commented on pull request #29860: URL: https://github.com/apache/spark/pull/29860#issuecomment-698704828 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33720/

[GitHub] [spark] AmplabJenkins commented on pull request #29860: [SPARK-32984][TESTS][SQL] Improve showing the differences between approved and actual plans of PlanStabilitySuite

2020-09-24 Thread GitBox
AmplabJenkins commented on pull request #29860: URL: https://github.com/apache/spark/pull/29860#issuecomment-698704837

[GitHub] [spark] SparkQA commented on pull request #29868: [SPARK-32973][ML][DOC] FeatureHasher does not check categoricalCols in inputCols

2020-09-24 Thread GitBox
SparkQA commented on pull request #29868: URL: https://github.com/apache/spark/pull/29868#issuecomment-698703643 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33721/

[GitHub] [spark] SparkQA commented on pull request #29860: [SPARK-32984][TESTS][SQL] Improve showing the differences between approved and actual plans of PlanStabilitySuite

2020-09-24 Thread GitBox
SparkQA commented on pull request #29860: URL: https://github.com/apache/spark/pull/29860#issuecomment-698699484 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33720/

[GitHub] [spark] holdenk commented on pull request #29854: [SPARK-32937][SPARK-32980][K8S] Fix decom & launcher tests and add some comments to reduce chance of breakage

2020-09-24 Thread GitBox
holdenk commented on pull request #29854: URL: https://github.com/apache/spark/pull/29854#issuecomment-698693708 So the launcher tests were broken by a new version of minikube being released; they weren’t broken by any specific PR (except maybe arguably the original PR adding integration tes

[GitHub] [spark] SparkQA commented on pull request #29868: [SPARK-32973][ML][DOC] FeatureHasher does not check categoricalCols in inputCols

2020-09-24 Thread GitBox
SparkQA commented on pull request #29868: URL: https://github.com/apache/spark/pull/29868#issuecomment-698693481 **[Test build #129100 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129100/testReport)** for PR 29868 at commit [`4317664`](https://github.com

[GitHub] [spark] zhengruifeng commented on pull request #29868: [SPARK-32973][ML][DOC] FeatureHasher does not check categoricalCols in inputCols

2020-09-24 Thread GitBox
zhengruifeng commented on pull request #29868: URL: https://github.com/apache/spark/pull/29868#issuecomment-698693361 friendly ping @srowen @huaxingao

[GitHub] [spark] zhengruifeng commented on pull request #29868: [SPARK-32973][ML][DOC] FeatureHasher does not check categoricalCols in inputCols

2020-09-24 Thread GitBox
zhengruifeng commented on pull request #29868: URL: https://github.com/apache/spark/pull/29868#issuecomment-698693191 repl: ``` import org.apache.spark.ml.feature._ import org.apache.spark.ml.linalg.{Vector, Vectors} val df = Seq((2.0, 1, "foo"),(3.0, 2, "bar")).toDF("real",

[GitHub] [spark] zhengruifeng opened a new pull request #29868: [SPARK-32973][ML][DOC] FeatureHasher does not check categoricalCols in inputCols

2020-09-24 Thread GitBox
zhengruifeng opened a new pull request #29868: URL: https://github.com/apache/spark/pull/29868 ### What changes were proposed in this pull request? 1, remove the comment: `Note, the relevant columns must also be set in inputCols`; 2, add a check, and if there are `categoricalCols` not

[GitHub] [spark] HyukjinKwon commented on pull request #29854: [SPARK-32937][SPARK-32980][K8S] Fix decom & launcher tests and add some comments to reduce chance of breakage

2020-09-24 Thread GitBox
HyukjinKwon commented on pull request #29854: URL: https://github.com/apache/spark/pull/29854#issuecomment-698689697 @holdenk, BTW which PR caused this failure? It's a bit odd that we fix a test as a followup here whereas you ask to revert at https://github.com/apache/spark/pull/29722#issu

[GitHub] [spark] SparkQA commented on pull request #29860: [SPARK-32984][TESTS][SQL] Improve showing the differences between approved and actual plans of PlanStabilitySuite

2020-09-24 Thread GitBox
SparkQA commented on pull request #29860: URL: https://github.com/apache/spark/pull/29860#issuecomment-698689552 **[Test build #129099 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129099/testReport)** for PR 29860 at commit [`826a2e9`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29795: URL: https://github.com/apache/spark/pull/29795#issuecomment-698688261 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29795: URL: https://github.com/apache/spark/pull/29795#issuecomment-698688257 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To r

[GitHub] [spark] AmplabJenkins commented on pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-24 Thread GitBox
AmplabJenkins commented on pull request #29795: URL: https://github.com/apache/spark/pull/29795#issuecomment-698688257 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA commented on pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-24 Thread GitBox
SparkQA commented on pull request #29795: URL: https://github.com/apache/spark/pull/29795#issuecomment-698688247 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33719/ ---

[GitHub] [spark] HyukjinKwon commented on pull request #29860: [SPARK-32984][TESTS][SQL] Improve showing the differences between approved and actual plans of PlanStabilitySuite

2020-09-24 Thread GitBox
HyukjinKwon commented on pull request #29860: URL: https://github.com/apache/spark/pull/29860#issuecomment-698687917 retest this please This is an automated message from the Apache Git Service. To respond to the message, plea

[GitHub] [spark] zhengruifeng commented on a change in pull request #29852: [SPARK-21481][ML][FOLLOWUP][Trivial] HashingTF use util.collection.OpenHashMap instead of mutable.HashMap

2020-09-24 Thread GitBox
zhengruifeng commented on a change in pull request #29852: URL: https://github.com/apache/spark/pull/29852#discussion_r494713601 ## File path: mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala ## @@ -91,20 +90,13 @@ class HashingTF @Since("3.0.0") private[ml] (

[GitHub] [spark] beliefer commented on pull request #29747: [SPARK-31848][CORE][TEST] DAGSchedulerSuite: Break down the very huge test file

2020-09-24 Thread GitBox
beliefer commented on pull request #29747: URL: https://github.com/apache/spark/pull/29747#issuecomment-698685083 cc @jiangxb1987 This is an automated message from the Apache Git Service. To respond to the message, please lo

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29866: [SPARK-32990][SQL] Migrate REFRESH TABLE to use UnresolvedTableOrView to resolve the identifier

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29866: URL: https://github.com/apache/spark/pull/29866#issuecomment-698684577 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29866: [SPARK-32990][SQL] Migrate REFRESH TABLE to use UnresolvedTableOrView to resolve the identifier

2020-09-24 Thread GitBox
AmplabJenkins commented on pull request #29866: URL: https://github.com/apache/spark/pull/29866#issuecomment-698684577 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA removed a comment on pull request #29866: [SPARK-32990][SQL] Migrate REFRESH TABLE to use UnresolvedTableOrView to resolve the identifier

2020-09-24 Thread GitBox
SparkQA removed a comment on pull request #29866: URL: https://github.com/apache/spark/pull/29866#issuecomment-698612010 **[Test build #129090 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129090/testReport)** for PR 29866 at commit [`d1051c6`](https://gi

[GitHub] [spark] SparkQA commented on pull request #29866: [SPARK-32990][SQL] Migrate REFRESH TABLE to use UnresolvedTableOrView to resolve the identifier

2020-09-24 Thread GitBox
SparkQA commented on pull request #29866: URL: https://github.com/apache/spark/pull/29866#issuecomment-698684084 **[Test build #129090 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129090/testReport)** for PR 29866 at commit [`d1051c6`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-24 Thread GitBox
SparkQA commented on pull request #29795: URL: https://github.com/apache/spark/pull/29795#issuecomment-698682551 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33719/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29863: [SPARK-32877][SQL][TEST][test-hive1.2][test-hadoop2.7] Add test for Hive UDF complex decimal type

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29863: URL: https://github.com/apache/spark/pull/29863#issuecomment-698679229 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29863: [SPARK-32877][SQL][TEST][test-hive1.2][test-hadoop2.7] Add test for Hive UDF complex decimal type

2020-09-24 Thread GitBox
AmplabJenkins commented on pull request #29863: URL: https://github.com/apache/spark/pull/29863#issuecomment-698679229 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] LuciferYang edited a comment on pull request #29857: [SPARK-32972][ML] Fix UTs of `mllib` module in Scala 2.13 except RandomForestRegressorSuite

2020-09-24 Thread GitBox
LuciferYang edited a comment on pull request #29857: URL: https://github.com/apache/spark/pull/29857#issuecomment-698679002 > If you like you can try making the ordering of the Map in this code deterministic to see if that does it: > val topNodesForGroup: Map[Int, LearningNode] =

[GitHub] [spark] SparkQA removed a comment on pull request #29863: [SPARK-32877][SQL][TEST][test-hive1.2][test-hadoop2.7] Add test for Hive UDF complex decimal type

2020-09-24 Thread GitBox
SparkQA removed a comment on pull request #29863: URL: https://github.com/apache/spark/pull/29863#issuecomment-698650576 **[Test build #129095 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129095/testReport)** for PR 29863 at commit [`063ea6d`](https://gi

[GitHub] [spark] LuciferYang commented on pull request #29857: [SPARK-32972][ML] Fix UTs of `mllib` module in Scala 2.13 except RandomForestRegressorSuite

2020-09-24 Thread GitBox
LuciferYang commented on pull request #29857: URL: https://github.com/apache/spark/pull/29857#issuecomment-698679002 > If you like you can try making the ordering of the Map in this code deterministic to see if that does it: val topNodesForGroup: Map[Int, LearningNode] = node

[GitHub] [spark] SparkQA commented on pull request #29863: [SPARK-32877][SQL][TEST][test-hive1.2][test-hadoop2.7] Add test for Hive UDF complex decimal type

2020-09-24 Thread GitBox
SparkQA commented on pull request #29863: URL: https://github.com/apache/spark/pull/29863#issuecomment-698678880 **[Test build #129095 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129095/testReport)** for PR 29863 at commit [`063ea6d`](https://github.co

[GitHub] [spark] sarutak commented on pull request #29677: [SPARK-32820][SQL] Remove redundant shuffle exchanges inserted by EnsureRequirements

2020-09-24 Thread GitBox
sarutak commented on pull request #29677: URL: https://github.com/apache/spark/pull/29677#issuecomment-698678194 @c21 @imback82 @maropu @HyukjinKwon Any other feedback for this change? This is an automated message from th

[GitHub] [spark] sarutak commented on pull request #29827: [SPARK-32957][INFRA] Add a GitHub Actions job to run WebUI tests with Chrome

2020-09-24 Thread GitBox
sarutak commented on pull request #29827: URL: https://github.com/apache/spark/pull/29827#issuecomment-698677232 @dongjoon-hyun I agree. This is an automated message from the Apache Git Service. To respond to the message, ple

[GitHub] [spark] sarutak commented on a change in pull request #29827: [SPARK-32957][INFRA] Add a GitHub Actions job to run WebUI tests with Chrome

2020-09-24 Thread GitBox
sarutak commented on a change in pull request #29827: URL: https://github.com/apache/spark/pull/29827#discussion_r494703217 ## File path: .github/workflows/build_and_test.yml ## @@ -273,6 +273,44 @@ jobs: cd docs jekyll build + webui-tests-with-chrome: +

[GitHub] [spark] fqaiser94 commented on a change in pull request #29795: [SPARK-32511][SQL] Add dropFields method to Column class

2020-09-24 Thread GitBox
fqaiser94 commented on a change in pull request #29795: URL: https://github.com/apache/spark/pull/29795#discussion_r494701237 ## File path: sql/core/src/test/scala/org/apache/spark/sql/UpdateFieldsPerformanceSuite.scala ## @@ -0,0 +1,229 @@ +/* + * Licensed to the Apache Softw

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29866: [SPARK-32990][SQL] Migrate REFRESH TABLE to use UnresolvedTableOrView to resolve the identifier

2020-09-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29866: URL: https://github.com/apache/spark/pull/29866#issuecomment-698675212 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29866: [SPARK-32990][SQL] Migrate REFRESH TABLE to use UnresolvedTableOrView to resolve the identifier

2020-09-24 Thread GitBox
AmplabJenkins commented on pull request #29866: URL: https://github.com/apache/spark/pull/29866#issuecomment-698675212 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA removed a comment on pull request #29866: [SPARK-32990][SQL] Migrate REFRESH TABLE to use UnresolvedTableOrView to resolve the identifier

2020-09-24 Thread GitBox
SparkQA removed a comment on pull request #29866: URL: https://github.com/apache/spark/pull/29866#issuecomment-698592724 **[Test build #129086 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129086/testReport)** for PR 29866 at commit [`e6aa194`](https://gi
