[GitHub] [spark] SparkQA commented on pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
SparkQA commented on pull request #31120: URL: https://github.com/apache/spark/pull/31120#issuecomment-757683070 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38495/

[GitHub] [spark] SparkQA commented on pull request #31118: [SPARK-33084][CORE][SQL] Rename Unit test file and use fake ivy link

2021-01-10 Thread GitBox
SparkQA commented on pull request #31118: URL: https://github.com/apache/spark/pull/31118#issuecomment-757681272 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38497/

[GitHub] [spark] cloud-fan commented on pull request #31024: [SPARK-33979][SQL] Reorder predicate

2021-01-10 Thread GitBox
cloud-fan commented on pull request #31024: URL: https://github.com/apache/spark/pull/31024#issuecomment-757679807 > Move UDF, LikeAny, LikeAll and CaseWhen to the end because these Expressions are always CPU intensive. I'm not sure about it. CaseWhen can be fast if it has only one

[GitHub] [spark] cloud-fan commented on a change in pull request #31034: [SPARK-33989][SQL] Strip auto-generated cast when using Cast.sql

2021-01-10 Thread GitBox
cloud-fan commented on a change in pull request #31034: URL: https://github.com/apache/spark/pull/31034#discussion_r554846502 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -298,6 +300,30 @@ class Analyzer(override val

[GitHub] [spark] HyukjinKwon commented on pull request #31121: [SPARK-34065][INFRA] Cancel the duplicated jobs only in PRs at GitHub Actions

2021-01-10 Thread GitBox
HyukjinKwon commented on pull request #31121: URL: https://github.com/apache/spark/pull/31121#issuecomment-757678385 Seems working fine: ![Screen Shot 2021-01-11 at 4 38 33 PM](https://user-images.githubusercontent.com/6477701/104155948-8795cb00-542b-11eb-9b26-7ca10bb39574.png)

[GitHub] [spark] HyukjinKwon closed pull request #31121: [SPARK-34065][INFRA] Cancel the duplicated jobs only in PRs at GitHub Actions

2021-01-10 Thread GitBox
HyukjinKwon closed pull request #31121: URL: https://github.com/apache/spark/pull/31121 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] HyukjinKwon commented on pull request #31121: [SPARK-34065][INFRA] Cancel the duplicated jobs only in PRs at GitHub Actions

2021-01-10 Thread GitBox
HyukjinKwon commented on pull request #31121: URL: https://github.com/apache/spark/pull/31121#issuecomment-757677380 Merged to master.

[GitHub] [spark] HyukjinKwon commented on pull request #31121: [SPARK-34065][INFRA] Cancel the duplicated jobs only in PRs at GitHub Actions

2021-01-10 Thread GitBox
HyukjinKwon commented on pull request #31121: URL: https://github.com/apache/spark/pull/31121#issuecomment-757677286 Let me merge this in to recover the test coverage. Let's see if it works out of the box in the main repo too.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30829: [SPARK-33832][SQL] Add an option in AQE to mitigate skew even if it c…

2021-01-10 Thread GitBox
AmplabJenkins removed a comment on pull request #30829: URL: https://github.com/apache/spark/pull/30829#issuecomment-753645929 Can one of the admins verify this patch?

[GitHub] [spark] SparkQA commented on pull request #31119: [SPARK-34064][SQL] Cancel the running broadcast sub-jobs when SQL statement is cancelled

2021-01-10 Thread GitBox
SparkQA commented on pull request #31119: URL: https://github.com/apache/spark/pull/31119#issuecomment-757676749 **[Test build #133912 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133912/testReport)** for PR 31119 at commit

[GitHub] [spark] SparkQA commented on pull request #30829: [SPARK-33832][SQL] Add an option in AQE to mitigate skew even if it c…

2021-01-10 Thread GitBox
SparkQA commented on pull request #30829: URL: https://github.com/apache/spark/pull/30829#issuecomment-757676704 **[Test build #133913 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133913/testReport)** for PR 30829 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #31122: [SPARK-34067][SQL] PartitionPruning push down pruningHasBenefit function into insertPredicate function to decrease calculate time

2021-01-10 Thread GitBox
AmplabJenkins commented on pull request #31122: URL: https://github.com/apache/spark/pull/31122#issuecomment-757674950 Can one of the admins verify this patch?

[GitHub] [spark] SparkQA commented on pull request #30775: [SPARK-33778][SQL] Allow typesafe join for LeftSemi and LeftAnti

2021-01-10 Thread GitBox
SparkQA commented on pull request #30775: URL: https://github.com/apache/spark/pull/30775#issuecomment-757674532 **[Test build #133914 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133914/testReport)** for PR 30775 at commit

[GitHub] [spark] SparkQA commented on pull request #31121: [SPARK-34065][INFRA] Cancel the duplicated jobs only in PRs at GitHub Actions

2021-01-10 Thread GitBox
SparkQA commented on pull request #31121: URL: https://github.com/apache/spark/pull/31121#issuecomment-757674364 **[Test build #133911 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133911/testReport)** for PR 31121 at commit

[GitHub] [spark] monkeyboy123 opened a new pull request #31122: [SPARK-34067][SQL] PartitionPruning push down pruningHasBenefit function into insertPredicate function to decrease calculate time

2021-01-10 Thread GitBox
monkeyboy123 opened a new pull request #31122: URL: https://github.com/apache/spark/pull/31122 ### What changes were proposed in this pull request? PartitionPruning push down pruningHasBenefit function into insertPredicate function to decrease calculate time ###

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31119: [SPARK-34064][SQL] Cancel the running broadcast sub-jobs when SQL statement is cancelled

2021-01-10 Thread GitBox
AmplabJenkins removed a comment on pull request #31119: URL: https://github.com/apache/spark/pull/31119#issuecomment-757673065 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/38496/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
AmplabJenkins removed a comment on pull request #31120: URL: https://github.com/apache/spark/pull/31120#issuecomment-757673062 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133906/

[GitHub] [spark] AmplabJenkins commented on pull request #31119: [SPARK-34064][SQL] Cancel the running broadcast sub-jobs when SQL statement is cancelled

2021-01-10 Thread GitBox
AmplabJenkins commented on pull request #31119: URL: https://github.com/apache/spark/pull/31119#issuecomment-757673065 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/38496/

[GitHub] [spark] AmplabJenkins commented on pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
AmplabJenkins commented on pull request #31120: URL: https://github.com/apache/spark/pull/31120#issuecomment-757673062 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133906/

[GitHub] [spark] SparkQA commented on pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
SparkQA commented on pull request #31120: URL: https://github.com/apache/spark/pull/31120#issuecomment-757671600 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38495/

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
AngersZhuuuu commented on a change in pull request #31120: URL: https://github.com/apache/spark/pull/31120#discussion_r554828871 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ## @@ -509,9 +509,9 @@ case class

[GitHub] [spark] AngersZhuuuu closed pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
AngersZhuuuu closed pull request #31120: URL: https://github.com/apache/spark/pull/31120

[GitHub] [spark] cloud-fan commented on pull request #31035: [SPARK-31952][SQL] Fix incorrect memory spill metric when doing Aggregate

2021-01-10 Thread GitBox
cloud-fan commented on pull request #31035: URL: https://github.com/apache/spark/pull/31035#issuecomment-757669676 @Ngone51 can you also open a PR for 3.0? thanks!

[GitHub] [spark] cloud-fan closed pull request #31035: [SPARK-31952][SQL] Fix incorrect memory spill metric when doing Aggregate

2021-01-10 Thread GitBox
cloud-fan closed pull request #31035: URL: https://github.com/apache/spark/pull/31035

[GitHub] [spark] SparkQA commented on pull request #31118: [SPARK-33084][CORE][SQL] Rename Unit test file and use fake ivy link

2021-01-10 Thread GitBox
SparkQA commented on pull request #31118: URL: https://github.com/apache/spark/pull/31118#issuecomment-757669271 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38497/

[GitHub] [spark] cloud-fan commented on pull request #31035: [SPARK-31952][SQL] Fix incorrect memory spill metric when doing Aggregate

2021-01-10 Thread GitBox
cloud-fan commented on pull request #31035: URL: https://github.com/apache/spark/pull/31035#issuecomment-757669155 thanks, merging to master/3.1!

[GitHub] [spark] SparkQA removed a comment on pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
SparkQA removed a comment on pull request #31120: URL: https://github.com/apache/spark/pull/31120#issuecomment-757653678 **[Test build #133906 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133906/testReport)** for PR 31120 at commit

[GitHub] [spark] SparkQA commented on pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
SparkQA commented on pull request #31120: URL: https://github.com/apache/spark/pull/31120#issuecomment-757665513 **[Test build #133906 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133906/testReport)** for PR 31120 at commit

[GitHub] [spark] cloud-fan commented on pull request #31112: [SPARK-34060][SQL] Fix Hive table caching while updating stats by `ALTER TABLE .. DROP PARTITION`

2021-01-10 Thread GitBox
cloud-fan commented on pull request #31112: URL: https://github.com/apache/spark/pull/31112#issuecomment-757665048 @MaxGekk please open backport PRs for 3.1/3.0, thanks!

[GitHub] [spark] cloud-fan closed pull request #31112: [SPARK-34060][SQL] Fix Hive table caching while updating stats by `ALTER TABLE .. DROP PARTITION`

2021-01-10 Thread GitBox
cloud-fan closed pull request #31112: URL: https://github.com/apache/spark/pull/31112

[GitHub] [spark] cloud-fan commented on pull request #31112: [SPARK-34060][SQL] Fix Hive table caching while updating stats by `ALTER TABLE .. DROP PARTITION`

2021-01-10 Thread GitBox
cloud-fan commented on pull request #31112: URL: https://github.com/apache/spark/pull/31112#issuecomment-757664750 thanks, merging to master!

[GitHub] [spark] cloud-fan closed pull request #31117: [SPARK-34055][SQL][TESTS][FOLLOWUP] Check partition adding to cached Hive table

2021-01-10 Thread GitBox
cloud-fan closed pull request #31117: URL: https://github.com/apache/spark/pull/31117

[GitHub] [spark] HyukjinKwon commented on a change in pull request #31121: [SPARK-34065][INFRA] Cancel the duplicated jobs only in PRs at GitHub Actions

2021-01-10 Thread GitBox
HyukjinKwon commented on a change in pull request #31121: URL: https://github.com/apache/spark/pull/31121#discussion_r554815437 ## File path: .github/workflows/cancel_duplicate_workflow_runs.yml ## @@ -7,6 +7,7 @@ on: jobs: cancel-duplicate-workflow-runs: +if:

[GitHub] [spark] cloud-fan commented on pull request #31117: [SPARK-34055][SQL][TESTS][FOLLOWUP] Check partition adding to cached Hive table

2021-01-10 Thread GitBox
cloud-fan commented on pull request #31117: URL: https://github.com/apache/spark/pull/31117#issuecomment-757664403 thanks, merging to master!

[GitHub] [spark] cloud-fan commented on a change in pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
cloud-fan commented on a change in pull request #31120: URL: https://github.com/apache/spark/pull/31120#discussion_r554814398 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ## @@ -509,9 +509,9 @@ case class

[GitHub] [spark] MaxGekk commented on pull request #31117: [SPARK-34055][SQL][TESTS][FOLLOWUP] Check partition adding to cached Hive table

2021-01-10 Thread GitBox
MaxGekk commented on pull request #31117: URL: https://github.com/apache/spark/pull/31117#issuecomment-757663052 @HyukjinKwon @cloud-fan Please, take a look at this.

[GitHub] [spark] MaxGekk commented on pull request #31112: [SPARK-34060][SQL] Fix Hive table caching while updating stats by `ALTER TABLE .. DROP PARTITION`

2021-01-10 Thread GitBox
MaxGekk commented on pull request #31112: URL: https://github.com/apache/spark/pull/31112#issuecomment-757662204 > which version do we start to have this perf issue? I have checked 3.0, it has the issue.

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30775: [SPARK-33778][SQL] Allow typesafe join for LeftSemi and LeftAnti

2021-01-10 Thread GitBox
AngersZhuuuu commented on a change in pull request #30775: URL: https://github.com/apache/spark/pull/30775#discussion_r554810061 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -1240,6 +1240,40 @@ class Dataset[T] private[sql](

[GitHub] [spark] cloud-fan commented on pull request #30829: [SPARK-33832][SQL] Add an option in AQE to mitigate skew even if it c…

2021-01-10 Thread GitBox
cloud-fan commented on pull request #30829: URL: https://github.com/apache/spark/pull/30829#issuecomment-757660828 It's hard to calculate the cost of a shuffle and compare it with the benefit of skew join handling. We need some ways to tune it manually. But I don't understand why this

[GitHub] [spark] cloud-fan commented on pull request #30829: [SPARK-33832][SQL] Add an option in AQE to mitigate skew even if it c…

2021-01-10 Thread GitBox
cloud-fan commented on pull request #30829: URL: https://github.com/apache/spark/pull/30829#issuecomment-757659604 ok to test

[GitHub] [spark] cloud-fan commented on a change in pull request #31103: [SPARK-34002][SQL] Fix the usage of encoder in ScalaUDF

2021-01-10 Thread GitBox
cloud-fan commented on a change in pull request #31103: URL: https://github.com/apache/spark/pull/31103#discussion_r554804917 ## File path: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ## @@ -1121,3 +1123,17 @@ class UDFRegistration private[sql]

[GitHub] [spark] viirya commented on a change in pull request #31103: [SPARK-34002][SQL] Fix the usage of encoder in ScalaUDF

2021-01-10 Thread GitBox
viirya commented on a change in pull request #31103: URL: https://github.com/apache/spark/pull/31103#discussion_r554801933 ## File path: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ## @@ -1121,3 +1123,17 @@ class UDFRegistration private[sql]

[GitHub] [spark] cloud-fan commented on a change in pull request #31103: [SPARK-34002][SQL] Fix the usage of encoder in ScalaUDF

2021-01-10 Thread GitBox
cloud-fan commented on a change in pull request #31103: URL: https://github.com/apache/spark/pull/31103#discussion_r554800605 ## File path: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ## @@ -1121,3 +1123,17 @@ class UDFRegistration private[sql]

[GitHub] [spark] AngersZhuuuu edited a comment on pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
AngersZhuuuu edited a comment on pull request #31120: URL: https://github.com/apache/spark/pull/31120#issuecomment-757655581 > hmm, the test > >

[GitHub] [spark] AngersZhuuuu commented on pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
AngersZhuuuu commented on pull request #31120: URL: https://github.com/apache/spark/pull/31120#issuecomment-757655581 > hmm, the test > >

[GitHub] [spark] wangyum commented on a change in pull request #31119: [SPARK-34064][SQL] Cancel the running broadcast sub-jobs when SQL statement is cancelled

2021-01-10 Thread GitBox
wangyum commented on a change in pull request #31119: URL: https://github.com/apache/spark/pull/31119#discussion_r554795409 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala ## @@ -74,7 +74,9 @@ case class

[GitHub] [spark] SparkQA commented on pull request #30775: [SPARK-33778][SQL] Allow typesafe join for LeftSemi and LeftAnti

2021-01-10 Thread GitBox
SparkQA commented on pull request #30775: URL: https://github.com/apache/spark/pull/30775#issuecomment-757654374 **[Test build #133910 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133910/testReport)** for PR 30775 at commit

[GitHub] [spark] SparkQA commented on pull request #31121: [SPARK-34065][INFRA] Cancel the duplicated jobs only in PRs at GitHub Actions

2021-01-10 Thread GitBox
SparkQA commented on pull request #31121: URL: https://github.com/apache/spark/pull/31121#issuecomment-757654199 **[Test build #133909 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133909/testReport)** for PR 31121 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #30775: [SPARK-33778][SQL] Allow typesafe join for LeftSemi and LeftAnti

2021-01-10 Thread GitBox
cloud-fan commented on a change in pull request #30775: URL: https://github.com/apache/spark/pull/30775#discussion_r554794140 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -1240,6 +1240,40 @@ class Dataset[T] private[sql]( joinWith(other,

[GitHub] [spark] zhengruifeng commented on a change in pull request #30999: [SPARK-33971][SQL] Eliminate distinct from more aggregates

2021-01-10 Thread GitBox
zhengruifeng commented on a change in pull request #30999: URL: https://github.com/apache/spark/pull/30999#discussion_r554793464 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ## @@ -349,11 +349,19 @@ abstract class

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31119: [SPARK-34064][SQL] Cancel the running broadcast sub-jobs when SQL statement is cancelled

2021-01-10 Thread GitBox
AmplabJenkins removed a comment on pull request #31119: URL: https://github.com/apache/spark/pull/31119#issuecomment-757653535 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133907/

[GitHub] [spark] SparkQA commented on pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
SparkQA commented on pull request #31120: URL: https://github.com/apache/spark/pull/31120#issuecomment-757653678 **[Test build #133906 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133906/testReport)** for PR 31120 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #31119: [SPARK-34064][SQL] Cancel the running broadcast sub-jobs when SQL statement is cancelled

2021-01-10 Thread GitBox
SparkQA removed a comment on pull request #31119: URL: https://github.com/apache/spark/pull/31119#issuecomment-757651407 **[Test build #133907 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133907/testReport)** for PR 31119 at commit

[GitHub] [spark] SparkQA commented on pull request #31119: [SPARK-34064][SQL] Cancel the running broadcast sub-jobs when SQL statement is cancelled

2021-01-10 Thread GitBox
SparkQA commented on pull request #31119: URL: https://github.com/apache/spark/pull/31119#issuecomment-757653509 **[Test build #133907 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133907/testReport)** for PR 31119 at commit

[GitHub] [spark] cloud-fan commented on pull request #30775: [SPARK-33778][SQL] Allow typesafe join for LeftSemi and LeftAnti

2021-01-10 Thread GitBox
cloud-fan commented on pull request #30775: URL: https://github.com/apache/spark/pull/30775#issuecomment-757653562 retest this please

[GitHub] [spark] HyukjinKwon commented on pull request #31121: [SPARK-34065][INFRA] Cancel the duplicated jobs only in PRs at GitHub Actions

2021-01-10 Thread GitBox
HyukjinKwon commented on pull request #31121: URL: https://github.com/apache/spark/pull/31121#issuecomment-757653539 cc @mik-laj and @dongjoon-hyun FYI

[GitHub] [spark] AmplabJenkins commented on pull request #31119: [SPARK-34064][SQL] Cancel the running broadcast sub-jobs when SQL statement is cancelled

2021-01-10 Thread GitBox
AmplabJenkins commented on pull request #31119: URL: https://github.com/apache/spark/pull/31119#issuecomment-757653535 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133907/

[GitHub] [spark] HyukjinKwon opened a new pull request #31121: [SPARK-34065][INFRA] Cancel the duplicated jobs only in PRs at GitHub Actions

2021-01-10 Thread GitBox
HyukjinKwon opened a new pull request #31121: URL: https://github.com/apache/spark/pull/31121 ### What changes were proposed in this pull request? Currently the jobs are being canceled in main repo branches. If a commit is merged, for example, to master branch before the test

[GitHub] [spark] cloud-fan commented on pull request #31112: [SPARK-34060][SQL] Fix Hive table caching while updating stats by `ALTER TABLE .. DROP PARTITION`

2021-01-10 Thread GitBox
cloud-fan commented on pull request #31112: URL: https://github.com/apache/spark/pull/31112#issuecomment-757652448 @MaxGekk which version do we start to have this perf issue?

[GitHub] [spark] AngersZhuuuu commented on pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
AngersZhuuuu commented on pull request #31120: URL: https://github.com/apache/spark/pull/31120#issuecomment-757651508 Checked the original PR; there seems to be no discussion about this, so I believe it was missed: https://github.com/apache/spark/pull/17320

[GitHub] [spark] SparkQA commented on pull request #31118: [SPARK-33084][CORE][SQL] Rename Unit test file and use fake ivy link

2021-01-10 Thread GitBox
SparkQA commented on pull request #31118: URL: https://github.com/apache/spark/pull/31118#issuecomment-757651432 **[Test build #133908 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133908/testReport)** for PR 31118 at commit

[GitHub] [spark] SparkQA commented on pull request #31119: [SPARK-34064][SQL] Cancel the running broadcast sub-jobs when SQL statement is cancelled

2021-01-10 Thread GitBox
SparkQA commented on pull request #31119: URL: https://github.com/apache/spark/pull/31119#issuecomment-757651407 **[Test build #133907 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133907/testReport)** for PR 31119 at commit

[GitHub] [spark] AngersZhuuuu edited a comment on pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
AngersZhuuuu edited a comment on pull request #31120: URL: https://github.com/apache/spark/pull/31120#issuecomment-757650635 gentle ping @HyukjinKwon @MaxGekk @cloud-fan @maropu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29695: [SPARK-22390][SPARK-32833][SQL] [WIP]JDBC V2 Datasource aggregate push down

2021-01-10 Thread GitBox
AmplabJenkins removed a comment on pull request #29695: URL: https://github.com/apache/spark/pull/29695#issuecomment-757650608 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/38494/

[GitHub] [spark] AngersZhuuuu commented on pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
AngersZhuuuu commented on pull request #31120: URL: https://github.com/apache/spark/pull/31120#issuecomment-757650635 gentle ping @HyukjinKwon @MaxGekk @cloud-fan

[GitHub] [spark] AmplabJenkins commented on pull request #29695: [SPARK-22390][SPARK-32833][SQL] [WIP]JDBC V2 Datasource aggregate push down

2021-01-10 Thread GitBox
AmplabJenkins commented on pull request #29695: URL: https://github.com/apache/spark/pull/29695#issuecomment-757650608 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/38494/

[GitHub] [spark] pan3793 commented on pull request #30701: [DO-NOT-MERGE][test-maven] Test compatibility against Hadoop 3.2.2

2021-01-10 Thread GitBox
pan3793 commented on pull request #30701: URL: https://github.com/apache/spark/pull/30701#issuecomment-757650444 Hadoop 3.2.2-rc5 became 3.2.2 on Jan 9, does this PR have a chance to merge into 3.1?

[GitHub] [spark] AngersZhuuuu opened a new pull request #31120: [34066][SQL] Fix misleading json function `from_json` 's example result

2021-01-10 Thread GitBox
AngersZhuuuu opened a new pull request #31120: URL: https://github.com/apache/spark/pull/31120 ### What changes were proposed in this pull request? Fix misleading json function example result, the true result is ``` +---+ |from_json({"a":1,

[GitHub] [spark] LantaoJin edited a comment on pull request #31119: [SPARK-34064][SQL] Cancel the running broadcast sub-jobs when SQL statement is cancelled

2021-01-10 Thread GitBox
LantaoJin edited a comment on pull request #31119: URL: https://github.com/apache/spark/pull/31119#issuecomment-757647778 Without the patch, the driver log shows that Job 0 and Job 1 are still executing. ``` 21/01/10 21:02:33,406 INFO [HiveServer2-Handler-Pool: Thread-255]

[GitHub] [spark] LantaoJin commented on pull request #31119: [SPARK-34064][SQL] Cancel the running broadcast sub-jobs when SQL statement is cancelled

2021-01-10 Thread GitBox
LantaoJin commented on pull request #31119: URL: https://github.com/apache/spark/pull/31119#issuecomment-757647778 Without the patch, the driver log ``` 21/01/10 21:02:33,406 INFO [HiveServer2-Handler-Pool: Thread-255] thriftserver.SparkExecuteStatementOperation:57 : Submitting query

[GitHub] [spark] LantaoJin opened a new pull request #31119: [SPARK-34064][SQL] Cancel the running broadcast sub-jobs when SQL statement is cancelled

2021-01-10 Thread GitBox
LantaoJin opened a new pull request #31119: URL: https://github.com/apache/spark/pull/31119 ### What changes were proposed in this pull request? #24595 introduced `private val runId: UUID = UUID.randomUUID` in `BroadcastExchangeExec` to cancel the broadcast execution in the Future when

[GitHub] [spark] SparkQA commented on pull request #29695: [SPARK-22390][SPARK-32833][SQL] [WIP]JDBC V2 Datasource aggregate push down

2021-01-10 Thread GitBox
SparkQA commented on pull request #29695: URL: https://github.com/apache/spark/pull/29695#issuecomment-757641483 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38494/

[GitHub] [spark] yikf commented on pull request #31015: [SPARK-33991][CORE][WEBUI] Repair enumeration conversion error for AllJobsPage

2021-01-10 Thread GitBox
yikf commented on pull request #31015: URL: https://github.com/apache/spark/pull/31015#issuecomment-757634965 > The change looks OK to me. Hi, thank you for your response. Can you merge it into the master branch for me? Thank you so much!

[GitHub] [spark] tanelk commented on pull request #31024: [SPARK-33979][SQL] Reorder predicate

2021-01-10 Thread GitBox
tanelk commented on pull request #31024: URL: https://github.com/apache/spark/pull/31024#issuecomment-757633291 That's strange, I ran the same test and got this optimized plan for my example: ``` == Optimized Logical Plan == Filter (((b#3653L > 1) AND (a#3652L > 10)) OR

[GitHub] [spark] SparkQA commented on pull request #29695: [SPARK-22390][SPARK-32833][SQL] [WIP]JDBC V2 Datasource aggregate push down

2021-01-10 Thread GitBox
SparkQA commented on pull request #29695: URL: https://github.com/apache/spark/pull/29695#issuecomment-757631609 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38494/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31113: [SPARK-34061][SQL] DISTINCT the INTERSECT children

2021-01-10 Thread GitBox
AmplabJenkins removed a comment on pull request #31113: URL: https://github.com/apache/spark/pull/31113#issuecomment-757630823 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133901/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31118: [SPARK-33084][CORE][SQL] Rename Unit test file and use fake ivy link

2021-01-10 Thread GitBox
AmplabJenkins removed a comment on pull request #31118: URL: https://github.com/apache/spark/pull/31118#issuecomment-757630827 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133903/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31004: [MINOR] Improve flaky NaiveBayes test

2021-01-10 Thread GitBox
AmplabJenkins removed a comment on pull request #31004: URL: https://github.com/apache/spark/pull/31004#issuecomment-757630825 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #31004: [MINOR] Improve flaky NaiveBayes test

2021-01-10 Thread GitBox
AmplabJenkins commented on pull request #31004: URL: https://github.com/apache/spark/pull/31004#issuecomment-757630825

[GitHub] [spark] AmplabJenkins commented on pull request #31113: [SPARK-34061][SQL] DISTINCT the INTERSECT children

2021-01-10 Thread GitBox
AmplabJenkins commented on pull request #31113: URL: https://github.com/apache/spark/pull/31113#issuecomment-757630823 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133901/

[GitHub] [spark] AmplabJenkins commented on pull request #31118: [SPARK-33084][CORE][SQL] Rename Unit test file and use fake ivy link

2021-01-10 Thread GitBox
AmplabJenkins commented on pull request #31118: URL: https://github.com/apache/spark/pull/31118#issuecomment-757630827 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/133903/

[GitHub] [spark] SparkQA removed a comment on pull request #31113: [SPARK-34061][SQL] DISTINCT the INTERSECT children

2021-01-10 Thread GitBox
SparkQA removed a comment on pull request #31113: URL: https://github.com/apache/spark/pull/31113#issuecomment-757572700 **[Test build #133901 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133901/testReport)** for PR 31113 at commit

[GitHub] [spark] SparkQA commented on pull request #31113: [SPARK-34061][SQL] DISTINCT the INTERSECT children

2021-01-10 Thread GitBox
SparkQA commented on pull request #31113: URL: https://github.com/apache/spark/pull/31113#issuecomment-757629585 **[Test build #133901 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133901/testReport)** for PR 31113 at commit

[GitHub] [spark] HyukjinKwon closed pull request #31109: [SPARK-33970][SQL][TEST] Add test default partition in metastoredirectsql

2021-01-10 Thread GitBox
HyukjinKwon closed pull request #31109: URL: https://github.com/apache/spark/pull/31109

[GitHub] [spark] HyukjinKwon commented on pull request #31109: [SPARK-33970][SQL][TEST] Add test default partition in metastoredirectsql

2021-01-10 Thread GitBox
HyukjinKwon commented on pull request #31109: URL: https://github.com/apache/spark/pull/31109#issuecomment-757629241 Merged to master and branch-3.1 (since this is rather a followup of https://github.com/apache/spark/pull/30534).

[GitHub] [spark] SparkQA removed a comment on pull request #31118: [SPARK-33084][CORE][SQL] Rename Unit test file and use fake ivy link

2021-01-10 Thread GitBox
SparkQA removed a comment on pull request #31118: URL: https://github.com/apache/spark/pull/31118#issuecomment-757594512 **[Test build #133903 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133903/testReport)** for PR 31118 at commit

[GitHub] [spark] SparkQA commented on pull request #31118: [SPARK-33084][CORE][SQL] Rename Unit test file and use fake ivy link

2021-01-10 Thread GitBox
SparkQA commented on pull request #31118: URL: https://github.com/apache/spark/pull/31118#issuecomment-757626507 **[Test build #133903 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133903/testReport)** for PR 31118 at commit

[GitHub] [spark] cloud-fan commented on pull request #31095: [SPARK-33591][SQL][3.0] Recognize `null` in partition spec values

2021-01-10 Thread GitBox
cloud-fan commented on pull request #31095: URL: https://github.com/apache/spark/pull/31095#issuecomment-757624096 thanks, merging to 3.0!

[GitHub] [spark] cloud-fan closed pull request #31095: [SPARK-33591][SQL][3.0] Recognize `null` in partition spec values

2021-01-10 Thread GitBox
cloud-fan closed pull request #31095: URL: https://github.com/apache/spark/pull/31095

[GitHub] [spark] cloud-fan closed pull request #31094: [SPARK-33591][SQL][3.1] Recognize `null` in partition spec values

2021-01-10 Thread GitBox
cloud-fan closed pull request #31094: URL: https://github.com/apache/spark/pull/31094

[GitHub] [spark] cloud-fan commented on pull request #31094: [SPARK-33591][SQL][3.1] Recognize `null` in partition spec values

2021-01-10 Thread GitBox
cloud-fan commented on pull request #31094: URL: https://github.com/apache/spark/pull/31094#issuecomment-757623678 thanks, merging to 3.1!

[GitHub] [spark] cloud-fan commented on pull request #31063: [SPARK-33938][SQL][3.1] Optimize Like Any/All by LikeSimplification

2021-01-10 Thread GitBox
cloud-fan commented on pull request #31063: URL: https://github.com/apache/spark/pull/31063#issuecomment-757622207 It's a 3.1 perf regression: https://github.com/apache/spark/pull/30975#issuecomment-755158004

[GitHub] [spark] SparkQA commented on pull request #31004: [MINOR] Improve flaky NaiveBayes test

2021-01-10 Thread GitBox
SparkQA commented on pull request #31004: URL: https://github.com/apache/spark/pull/31004#issuecomment-757622028 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38493/

[GitHub] [spark] SparkQA removed a comment on pull request #31004: [MINOR] Improve flaky NaiveBayes test

2021-01-10 Thread GitBox
SparkQA removed a comment on pull request #31004: URL: https://github.com/apache/spark/pull/31004#issuecomment-757606075 **[Test build #133904 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133904/testReport)** for PR 31004 at commit

[GitHub] [spark] SparkQA commented on pull request #31004: [MINOR] Improve flaky NaiveBayes test

2021-01-10 Thread GitBox
SparkQA commented on pull request #31004: URL: https://github.com/apache/spark/pull/31004#issuecomment-757619252 **[Test build #133904 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133904/testReport)** for PR 31004 at commit

[GitHub] [spark] SparkQA commented on pull request #29695: [SPARK-22390][SPARK-32833][SQL] [WIP]JDBC V2 Datasource aggregate push down

2021-01-10 Thread GitBox
SparkQA commented on pull request #29695: URL: https://github.com/apache/spark/pull/29695#issuecomment-757618774 **[Test build #133905 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/133905/testReport)** for PR 29695 at commit

[GitHub] [spark] tedyu edited a comment on pull request #30984: [SPARK-33915][SQL] Allow json expression to be pushable column

2021-01-10 Thread GitBox
tedyu edited a comment on pull request #30984: URL: https://github.com/apache/spark/pull/30984#issuecomment-757617064 @viirya As I mentioned in https://github.com/apache/spark/pull/30984#issuecomment-757436363, this approach is consistent with how existing implementation handles the

[GitHub] [spark] cloud-fan closed pull request #31106: [SPARK-34057][SQL] UnresolvedTableOrView should retain SQL text position for DDL commands

2021-01-10 Thread GitBox
cloud-fan closed pull request #31106: URL: https://github.com/apache/spark/pull/31106

[GitHub] [spark] cloud-fan commented on pull request #31106: [SPARK-34057][SQL] UnresolvedTableOrView should retain SQL text position for DDL commands

2021-01-10 Thread GitBox
cloud-fan commented on pull request #31106: URL: https://github.com/apache/spark/pull/31106#issuecomment-757617587 thanks, merging to master!

[GitHub] [spark] tanelk commented on a change in pull request #30999: [SPARK-33971][SQL] Eliminate distinct from more aggregates

2021-01-10 Thread GitBox
tanelk commented on a change in pull request #30999: URL: https://github.com/apache/spark/pull/30999#discussion_r554702109 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ## @@ -349,11 +349,19 @@ abstract class
