[GitHub] [spark] EnricoMi commented on a diff in pull request #38223: [SPARK-40770][PYTHON] Improved error messages for applyInPandas for schema mismatch

2022-11-06 Thread GitBox
EnricoMi commented on code in PR #38223: URL: https://github.com/apache/spark/pull/38223#discussion_r1015075283 ## python/pyspark/worker.py: ## @@ -188,22 +241,7 @@ def wrapped(key_series, value_series): elif len(argspec.args) == 2: key = tuple(s[0] for s

[GitHub] [spark] EnricoMi commented on a diff in pull request #38223: [SPARK-40770][PYTHON] Improved error messages for applyInPandas for schema mismatch

2022-11-06 Thread GitBox
EnricoMi commented on code in PR #38223: URL: https://github.com/apache/spark/pull/38223#discussion_r1015075103 ## python/pyspark/sql/tests/pandas/test_pandas_cogrouped_map.py: ## @@ -165,100 +148,191 @@ def merge_pandas(lft, _): ) def

[GitHub] [spark] EnricoMi commented on pull request #38509: [SPARK-41014][PYTHON][DOC] Improve documentation and typing of groupby and cogroup applyInPandas

2022-11-06 Thread GitBox
EnricoMi commented on PR #38509: URL: https://github.com/apache/spark/pull/38509#issuecomment-1305160983 @HyukjinKwon @zhengruifeng these are minor improvements for documentation and typing of PySpark applyInPandas methods. -- This is an automated message from the Apache Git Service. To

[GitHub] [spark] EnricoMi commented on pull request #38312: [SPARK-40819][SQL] Timestamp nanos behaviour regression

2022-11-06 Thread GitBox
EnricoMi commented on PR #38312: URL: https://github.com/apache/spark/pull/38312#issuecomment-1305155414 @cloud-fan where do we stand with this? Is this a regression? How do we proceed? -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] LuciferYang commented on pull request #38519: [MINOR][SQL] Remove unused an error class and query error methods

2022-11-06 Thread GitBox
LuciferYang commented on PR #38519: URL: https://github.com/apache/spark/pull/38519#issuecomment-1305147588 late LGTM -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] MaxGekk closed pull request #38519: [MINOR][SQL] Remove unused an error class and query error methods

2022-11-06 Thread GitBox
MaxGekk closed pull request #38519: [MINOR][SQL] Remove unused an error class and query error methods URL: https://github.com/apache/spark/pull/38519 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] MaxGekk commented on pull request #38519: [MINOR][SQL] Remove unused an error class and query error methods

2022-11-06 Thread GitBox
MaxGekk commented on PR #38519: URL: https://github.com/apache/spark/pull/38519#issuecomment-1305144565 Merging to master. Thank you, @itholic, for the review. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] panbingkun opened a new pull request, #38531: [SPARK-40755][SQL] Migrate type check failures of number formatting onto error classes

2022-11-06 Thread GitBox
panbingkun opened a new pull request, #38531: URL: https://github.com/apache/spark/pull/38531 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38318: [SPARK-40852][CONNECT][PYTHON] Introduce `StatFunction` in proto and implement `DataFrame.summary`

2022-11-06 Thread GitBox
zhengruifeng commented on code in PR #38318: URL: https://github.com/apache/spark/pull/38318#discussion_r1015041619 ## python/pyspark/sql/connect/dataframe.py: ## @@ -323,6 +323,14 @@ def unionByName(self, other: "DataFrame", allowMissingColumns: bool = False) -> def

[GitHub] [spark] zhengruifeng commented on pull request #38318: [SPARK-40852][CONNECT][PYTHON] Introduce `StatFunction` in proto and implement `DataFrame.summary`

2022-11-06 Thread GitBox
zhengruifeng commented on PR #38318: URL: https://github.com/apache/spark/pull/38318#issuecomment-1305136349 cc @cloud-fan @HyukjinKwon @amaliujia -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] cloud-fan commented on pull request #38513: [SPARK-40903][SQL][FOLLOWUP] Cast canonicalized Add as its original data type if necessary

2022-11-06 Thread GitBox
cloud-fan commented on PR #38513: URL: https://github.com/apache/spark/pull/38513#issuecomment-1305100077 I think canonicalization should not change the data type in the first place. Adding cast only hides the bug. What's worse, it doesn't help with the goal of canonicalization: match

[GitHub] [spark] LuciferYang commented on pull request #38530: [SPARK-41027][SQL] Use `UNEXPECTED_INPUT_TYPE` instead of `MAP_FROM_ENTRIES_WRONG_TYPE`

2022-11-06 Thread GitBox
LuciferYang commented on PR #38530: URL: https://github.com/apache/spark/pull/38530#issuecomment-1305095279 Yes, if the exception test lacks the corresponding UT, I will add one -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] cxzl25 commented on pull request #38489: [SPARK-41003][SQL] BHJ LeftAnti does not update numOutputRows when codegen is disabled

2022-11-06 Thread GitBox
cxzl25 commented on PR #38489: URL: https://github.com/apache/spark/pull/38489#issuecomment-1305078019 Help review. Thanks. @leanken-zz @cloud-fan @wangyum -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] ulysses-you commented on a diff in pull request #38513: [SPARK-40903][SQL][FOLLOWUP] Cast canonicalized Add as its original data type if necessary

2022-11-06 Thread GitBox
ulysses-you commented on code in PR #38513: URL: https://github.com/apache/spark/pull/38513#discussion_r1015003195 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -481,12 +481,12 @@ case class Add( // TODO: do not reorder

[GitHub] [spark] gengliangwang commented on a diff in pull request #38513: [SPARK-40903][SQL][FOLLOWUP] Cast canonicalized Add as its original data type if necessary

2022-11-06 Thread GitBox
gengliangwang commented on code in PR #38513: URL: https://github.com/apache/spark/pull/38513#discussion_r101535 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -481,12 +481,12 @@ case class Add( // TODO: do not reorder

[GitHub] [spark] gengliangwang commented on a diff in pull request #38513: [SPARK-40903][SQL][FOLLOWUP] Cast canonicalized Add as its original data type if necessary

2022-11-06 Thread GitBox
gengliangwang commented on code in PR #38513: URL: https://github.com/apache/spark/pull/38513#discussion_r1014999188 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -481,12 +481,12 @@ case class Add( // TODO: do not reorder

[GitHub] [spark] itholic commented on a diff in pull request #38447: [SPARK-40973][SQL] Rename `_LEGACY_ERROR_TEMP_0055` to `UNCLOSED_BRACKETED_COMMENT`

2022-11-06 Thread GitBox
itholic commented on code in PR #38447: URL: https://github.com/apache/spark/pull/38447#discussion_r1014998869 ## sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala: ## @@ -608,8 +608,12 @@ private[sql] object QueryParsingErrors extends

[GitHub] [spark] itholic commented on pull request #38422: [SPARK-40948][SQL] Introduce new error class: PATH_NOT_FOUND

2022-11-06 Thread GitBox
itholic commented on PR #38422: URL: https://github.com/apache/spark/pull/38422#issuecomment-1305067447 Thanks, @MaxGekk ! Just adjusted the comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] itholic commented on pull request #38530: [SPARK-41027][SQL] Use `UNEXPECTED_INPUT_TYPE` instead of `MAP_FROM_ENTRIES_WRONG_TYPE`

2022-11-06 Thread GitBox
itholic commented on PR #38530: URL: https://github.com/apache/spark/pull/38530#issuecomment-1305066532 It would be great to add an example to the PR description showing how the error message changes, with `Before` and `After` examples. -- This is an automated message from the Apache Git

[GitHub] [spark] LuciferYang commented on pull request #38530: [SPARK-41027][SQL] Use `UNEXPECTED_INPUT_TYPE ` instead of `MAP_FROM_ENTRIES_WRONG_TYPE`

2022-11-06 Thread GitBox
LuciferYang commented on PR #38530: URL: https://github.com/apache/spark/pull/38530#issuecomment-1305030163 cc @MaxGekk -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] LuciferYang commented on a diff in pull request #38507: [SPARK-40372][SQL] Migrate failures of array type checks onto error classes

2022-11-06 Thread GitBox
LuciferYang commented on code in PR #38507: URL: https://github.com/apache/spark/pull/38507#discussion_r1014970870 ## core/src/main/resources/error/error-classes.json: ## @@ -225,14 +230,14 @@ "The should all be of type map, but it's ." ] }, -

[GitHub] [spark] LuciferYang opened a new pull request, #38530: [SPARK-41027][SQL] Use `UNEXPECTED_INPUT_TYPE ` instead of `MAP_FROM_ENTRIES_WRONG_TYPE`

2022-11-06 Thread GitBox
LuciferYang opened a new pull request, #38530: URL: https://github.com/apache/spark/pull/38530 ### What changes were proposed in this pull request? This PR aims to use `UNEXPECTED_INPUT_TYPE ` instead of `MAP_FROM_ENTRIES_WRONG_TYPE` and remove definition of

[GitHub] [spark] LuciferYang commented on a diff in pull request #38507: [SPARK-40372][SQL] Migrate failures of array type checks onto error classes

2022-11-06 Thread GitBox
LuciferYang commented on code in PR #38507: URL: https://github.com/apache/spark/pull/38507#discussion_r1014968186 ## core/src/main/resources/error/error-classes.json: ## @@ -225,14 +230,14 @@ "The should all be of type map, but it's ." ] }, -

[GitHub] [spark] jackylee-ch commented on a diff in pull request #38496: [WIP][SPARK-40708][SQL] Auto update table statistics based on write metrics

2022-11-06 Thread GitBox
jackylee-ch commented on code in PR #38496: URL: https://github.com/apache/spark/pull/38496#discussion_r1014966247 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/CommandUtils.scala: ## @@ -53,19 +53,41 @@ class PathFilterIgnoreNonData(stagingDir: String)

[GitHub] [spark] ulysses-you commented on a diff in pull request #38513: [SPARK-40903][SQL][FOLLOWUP] Cast canonicalized Add as its original data type if necessary

2022-11-06 Thread GitBox
ulysses-you commented on code in PR #38513: URL: https://github.com/apache/spark/pull/38513#discussion_r1014959493 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -481,12 +481,12 @@ case class Add( // TODO: do not reorder

[GitHub] [spark] LuciferYang commented on a diff in pull request #38507: [SPARK-40372][SQL] Migrate failures of array type checks onto error classes

2022-11-06 Thread GitBox
LuciferYang commented on code in PR #38507: URL: https://github.com/apache/spark/pull/38507#discussion_r1014954543 ## core/src/main/resources/error/error-classes.json: ## @@ -225,14 +230,14 @@ "The should all be of type map, but it's ." ] }, -

[GitHub] [spark] zhengruifeng commented on pull request #38395: [SPARK-40917][SQL] Add a dedicated logical plan for `Summary`

2022-11-06 Thread GitBox
zhengruifeng commented on PR #38395: URL: https://github.com/apache/spark/pull/38395#issuecomment-1304998340 I am going to close this PR and continue work on https://github.com/apache/spark/pull/38318 -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] zhengruifeng closed pull request #38395: [SPARK-40917][SQL] Add a dedicated logical plan for `Summary`

2022-11-06 Thread GitBox
zhengruifeng closed pull request #38395: [SPARK-40917][SQL] Add a dedicated logical plan for `Summary` URL: https://github.com/apache/spark/pull/38395 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] beliefer commented on pull request #37630: [SPARK-40193][SQL] Merge subquery plans with different filters

2022-11-06 Thread GitBox
beliefer commented on PR #37630: URL: https://github.com/apache/spark/pull/37630#issuecomment-1304985806 @peter-toth Could you fix these conflicts? I want to test this PR. Thank you! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] github-actions[bot] commented on pull request #36253: [SPARK-38932][SQL] Datasource v2 support report distinct keys

2022-11-06 Thread GitBox
github-actions[bot] commented on PR #36253: URL: https://github.com/apache/spark/pull/36253#issuecomment-1304942022 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] closed pull request #37065: [SPARK-38699][SQL] Use error classes in the execution errors of dictionary encoding

2022-11-06 Thread GitBox
github-actions[bot] closed pull request #37065: [SPARK-38699][SQL] Use error classes in the execution errors of dictionary encoding URL: https://github.com/apache/spark/pull/37065 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] github-actions[bot] closed pull request #37317: [SPARK-39894][SQL] Combine the similar binary comparison in boolean expression.

2022-11-06 Thread GitBox
github-actions[bot] closed pull request #37317: [SPARK-39894][SQL] Combine the similar binary comparison in boolean expression. URL: https://github.com/apache/spark/pull/37317 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] github-actions[bot] closed pull request #37235: [SPARK-39824][PYTHON][PS] Introduce index where and putmask func in pyspark

2022-11-06 Thread GitBox
github-actions[bot] closed pull request #37235: [SPARK-39824][PYTHON][PS] Introduce index where and putmask func in pyspark URL: https://github.com/apache/spark/pull/37235 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] github-actions[bot] commented on pull request #37316: [SPARK-39893][SQL] Push limit 1 to the aggregate's child plan if grouping expressions and aggregate expressions are foldable

2022-11-06 Thread GitBox
github-actions[bot] commented on PR #37316: URL: https://github.com/apache/spark/pull/37316#issuecomment-1304941982 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] srowen commented on pull request #38510: [MINOR][DOC] revisions for spark sql performance tuning to improve readability and grammar

2022-11-06 Thread GitBox
srowen commented on PR #38510: URL: https://github.com/apache/spark/pull/38510#issuecomment-1304934181 Merged to master -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] srowen closed pull request #38510: [MINOR][DOC] revisions for spark sql performance tuning to improve readability and grammar

2022-11-06 Thread GitBox
srowen closed pull request #38510: [MINOR][DOC] revisions for spark sql performance tuning to improve readability and grammar URL: https://github.com/apache/spark/pull/38510 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] LuciferYang commented on pull request #38507: [SPARK-40372][SQL] Migrate failures of array type checks onto error classes

2022-11-06 Thread GitBox
LuciferYang commented on PR #38507: URL: https://github.com/apache/spark/pull/38507#issuecomment-1304929641 will update this one today -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] LuciferYang commented on pull request #38523: [SPARK-41023][BUILD] Upgrade Jackson to 2.14.0

2022-11-06 Thread GitBox
LuciferYang commented on PR #38523: URL: https://github.com/apache/spark/pull/38523#issuecomment-1304926628 > Shouldn't guava changes be done in a separate PR? Currently, I am not considering upgrading Guava, for the following reasons: 1. Although Spark 3.4.0 will no longer

[GitHub] [spark] LuciferYang commented on pull request #38523: [SPARK-41023][BUILD] Upgrade Jackson to 2.14.0

2022-11-06 Thread GitBox
LuciferYang commented on PR #38523: URL: https://github.com/apache/spark/pull/38523#issuecomment-1304926354 > @LuciferYang feel free to continue with your PR. I marked [SPARK-41023](https://issues.apache.org/jira/browse/SPARK-41023) as a duplicate of

[GitHub] [spark] LuciferYang commented on pull request #38523: [SPARK-41023][BUILD] Upgrade Jackson to 2.14.0

2022-11-06 Thread GitBox
LuciferYang commented on PR #38523: URL: https://github.com/apache/spark/pull/38523#issuecomment-1304922578 @srowen test passed -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] amaliujia commented on a diff in pull request #38488: [SPARK-41002][CONNECT][PYTHON] Compatible `take`, `head` and `first` API in Python client

2022-11-06 Thread GitBox
amaliujia commented on code in PR #38488: URL: https://github.com/apache/spark/pull/38488#discussion_r1013499840 ## python/pyspark/sql/connect/dataframe.py: ## @@ -211,14 +212,66 @@ def filter(self, condition: Expression) -> "DataFrame":

[GitHub] [spark] amaliujia commented on pull request #38529: [SPARK-41026][CONNECT] Support Repartition in Connect Proto

2022-11-06 Thread GitBox
amaliujia commented on PR #38529: URL: https://github.com/apache/spark/pull/38529#issuecomment-1304913709 R: @cloud-fan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] amaliujia opened a new pull request, #38529: [SPARK-41026][CONNECT] Support Repartition in Connect Proto

2022-11-06 Thread GitBox
amaliujia opened a new pull request, #38529: URL: https://github.com/apache/spark/pull/38529 ### What changes were proposed in this pull request? Support `Repartition` in Connect proto, which supports two APIs: `repartition` (shuffle=true) and `coalesce` (shuffle=false).
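
The mapping described in this PR summary (one `Repartition` relation, distinguished only by a shuffle flag) can be illustrated roughly as follows. This is a minimal sketch, not the Connect proto or the PR's code: the `Repartition` dataclass, its field names, and the `_build` helper are hypothetical stand-ins chosen for illustration; only the pairing of `repartition` with shuffle=true and `coalesce` with shuffle=false comes from the message above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repartition:
    """Hypothetical stand-in for the relation; NOT the real proto message."""

    num_partitions: int
    shuffle: bool


def _build(num_partitions: int, shuffle: bool) -> Repartition:
    # Assumed validation: a partition count must be a positive integer.
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return Repartition(num_partitions=num_partitions, shuffle=shuffle)


def repartition(num_partitions: int) -> Repartition:
    # `repartition` requests a full shuffle (shuffle=true in the PR text).
    return _build(num_partitions, shuffle=True)


def coalesce(num_partitions: int) -> Repartition:
    # `coalesce` avoids a shuffle (shuffle=false in the PR text).
    return _build(num_partitions, shuffle=False)
```

Modelling both calls as one relation with a boolean keeps the client API familiar while the server only has to handle a single plan node.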

[GitHub] [spark] HeartSaVioR commented on pull request #38528: [SPARK-41025][SS] Introduce ComparableOffset to support offset range validation

2022-11-06 Thread GitBox
HeartSaVioR commented on PR #38528: URL: https://github.com/apache/spark/pull/38528#issuecomment-1304913137 cc. @zsxwing @jerrypeng Appreciate your review. Thanks in advance! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] HeartSaVioR opened a new pull request, #38528: [SPARK-41025][SS] Introduce ComparableOffset to support offset range validation

2022-11-06 Thread GitBox
HeartSaVioR opened a new pull request, #38528: URL: https://github.com/apache/spark/pull/38528 ### What changes were proposed in this pull request? This PR proposes to introduce a new interface ComparableOffset, which is a mix-in of streaming Offset interface to enable comparison

[GitHub] [spark] ljfgem commented on a diff in pull request #35636: [SPARK-31357][SQL][WIP] Catalog API for view metadata

2022-11-06 Thread GitBox
ljfgem commented on code in PR #35636: URL: https://github.com/apache/spark/pull/35636#discussion_r1014902247 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/V2ViewDescription.scala: ## @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] jzhuge commented on a diff in pull request #35636: [SPARK-31357][SQL][WIP] Catalog API for view metadata

2022-11-06 Thread GitBox
jzhuge commented on code in PR #35636: URL: https://github.com/apache/spark/pull/35636#discussion_r1014901610 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/V2ViewDescription.scala: ## @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] ljfgem commented on a diff in pull request #35636: [SPARK-31357][SQL][WIP] Catalog API for view metadata

2022-11-06 Thread GitBox
ljfgem commented on code in PR #35636: URL: https://github.com/apache/spark/pull/35636#discussion_r1014899090 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/V2ViewDescription.scala: ## @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] jzhuge commented on a diff in pull request #35636: [SPARK-31357][SQL][WIP] Catalog API for view metadata

2022-11-06 Thread GitBox
jzhuge commented on code in PR #35636: URL: https://github.com/apache/spark/pull/35636#discussion_r1014897627 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/V2ViewDescription.scala: ## @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] jzhuge commented on a diff in pull request #35636: [SPARK-31357][SQL][WIP] Catalog API for view metadata

2022-11-06 Thread GitBox
jzhuge commented on code in PR #35636: URL: https://github.com/apache/spark/pull/35636#discussion_r1014897367 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/V2ViewDescription.scala: ## @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] sunchao commented on pull request #38352: [SPARK-40801][BUILD][3.2] Upgrade `Apache commons-text` to 1.10

2022-11-06 Thread GitBox
sunchao commented on PR #38352: URL: https://github.com/apache/spark/pull/38352#issuecomment-1304899326 I'm going to start working on it this week. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] AmplabJenkins commented on pull request #38527: [SPARK-40875][CONNECT] Improve aggregate in Connect DSL

2022-11-06 Thread GitBox
AmplabJenkins commented on PR #38527: URL: https://github.com/apache/spark/pull/38527#issuecomment-1304898323 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] amaliujia commented on pull request #38527: [SPARK-40875][CONNECT] Improve aggregate in Connect DSL

2022-11-06 Thread GitBox
amaliujia commented on PR #38527: URL: https://github.com/apache/spark/pull/38527#issuecomment-1304893285 R: @cloud-fan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] amaliujia opened a new pull request, #38527: [SPARK-40875][CONNECT] Improve aggregate in Connect DSL

2022-11-06 Thread GitBox
amaliujia opened a new pull request, #38527: URL: https://github.com/apache/spark/pull/38527 ### What changes were proposed in this pull request? This PR adds the aggregate expressions (or named result expressions) for Aggregate in Connect proto and DSL. On the server side,

[GitHub] [spark] SandishKumarHN commented on a diff in pull request #38515: [SPARK-41015][SQL][PROTOBUF] UnitTest null check for data generator

2022-11-06 Thread GitBox
SandishKumarHN commented on code in PR #38515: URL: https://github.com/apache/spark/pull/38515#discussion_r1014891059 ## sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala: ## @@ -3344,7 +3344,7 @@ private[sql] object QueryCompilationErrors

[GitHub] [spark] dwsmith1983 commented on pull request #38510: [MINOR][DOC] revisions for spark sql performance tuning to improve readability and grammar

2022-11-06 Thread GitBox
dwsmith1983 commented on PR #38510: URL: https://github.com/apache/spark/pull/38510#issuecomment-1304882120 @srowen the streaming test suite passes now, so it should be fine. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] srowen commented on pull request #38510: [MINOR][DOC] revisions for spark sql performance tuning to improve readability and grammar

2022-11-06 Thread GitBox
srowen commented on PR #38510: URL: https://github.com/apache/spark/pull/38510#issuecomment-1304879526 Weird, the error is on code that isn't in the repo (now). Can you fully rebase and force-push your changes? -- This is an automated message from the Apache Git Service. To respond to

[GitHub] [spark] dwsmith1983 commented on pull request #38510: [MINOR][DOC] revisions for spark sql performance tuning to improve readability and grammar

2022-11-06 Thread GitBox
dwsmith1983 commented on PR #38510: URL: https://github.com/apache/spark/pull/38510#issuecomment-1304878701 @srowen is the failure related to the bot tagging of connect? There is some connect file that was added after this tag. -- This is an automated message from the Apache Git Service.

[GitHub] [spark] ljfgem commented on a diff in pull request #35636: [SPARK-31357][SQL][WIP] Catalog API for view metadata

2022-11-06 Thread GitBox
ljfgem commented on code in PR #35636: URL: https://github.com/apache/spark/pull/35636#discussion_r1014877861 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/V2ViewDescription.scala: ## @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] ljfgem commented on a diff in pull request #35636: [SPARK-31357][SQL][WIP] Catalog API for view metadata

2022-11-06 Thread GitBox
ljfgem commented on code in PR #35636: URL: https://github.com/apache/spark/pull/35636#discussion_r955031916 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/V2ViewDescription.scala: ## @@ -0,0 +1,55 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] MaxGekk commented on pull request #38519: [MINOR][SQL] Remove unused an error class and query error methods

2022-11-06 Thread GitBox
MaxGekk commented on PR #38519: URL: https://github.com/apache/spark/pull/38519#issuecomment-1304866695 @panbingkun @LuciferYang @itholic @cloud-fan @srielau Could you review this PR, please? -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] MaxGekk commented on a diff in pull request #38490: [SPARK-41009][SQL] Rename the error class `_LEGACY_ERROR_TEMP_1070` to `LOCATION_ALREADY_EXISTS`

2022-11-06 Thread GitBox
MaxGekk commented on code in PR #38490: URL: https://github.com/apache/spark/pull/38490#discussion_r1014873100 ## core/src/main/resources/error/error-classes.json: ## @@ -668,6 +668,24 @@ } } }, + "LOCATION_ALREADY_EXISTS" : { +"message" : [ + "Cannot

[GitHub] [spark] MaxGekk commented on a diff in pull request #38521: [SPARK-41020][SQL] Rename the error class `_LEGACY_ERROR_TEMP_1019` to `STAR_GROUP_BY_POS`

2022-11-06 Thread GitBox
MaxGekk commented on code in PR #38521: URL: https://github.com/apache/spark/pull/38521#discussion_r1014871999 ## core/src/main/resources/error/error-classes.json: ## @@ -912,6 +912,11 @@ ], "sqlState" : "22023" }, + "STAR_GROUP_BY_POS" : { Review Comment: It

[GitHub] [spark] MaxGekk closed pull request #38522: [SPARK-41022][SQL][TESTS] Test the error class: DEFAULT_DATABASE_NOT_EXISTS, INDEX_ALREADY_EXISTS, INDEX_NOT_FOUND, ROUTINE_NOT_FOUND

2022-11-06 Thread GitBox
MaxGekk closed pull request #38522: [SPARK-41022][SQL][TESTS] Test the error class: DEFAULT_DATABASE_NOT_EXISTS, INDEX_ALREADY_EXISTS, INDEX_NOT_FOUND, ROUTINE_NOT_FOUND URL: https://github.com/apache/spark/pull/38522 -- This is an automated message from the Apache Git Service. To respond

[GitHub] [spark] MaxGekk commented on pull request #38522: [SPARK-41022][SQL][TESTS] Test the error class: DEFAULT_DATABASE_NOT_EXISTS, INDEX_ALREADY_EXISTS, INDEX_NOT_FOUND, ROUTINE_NOT_FOUND

2022-11-06 Thread GitBox
MaxGekk commented on PR #38522: URL: https://github.com/apache/spark/pull/38522#issuecomment-1304859631 +1, LGTM. Merging to master. Thank you, @panbingkun. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] srielau commented on a diff in pull request #38521: [SPARK-41020][SQL] Rename the error class `_LEGACY_ERROR_TEMP_1019` to `STAR_GROUP_BY_POS`

2022-11-06 Thread GitBox
srielau commented on code in PR #38521: URL: https://github.com/apache/spark/pull/38521#discussion_r1014863229 ## core/src/main/resources/error/error-classes.json: ## @@ -912,6 +912,11 @@ ], "sqlState" : "22023" }, + "STAR_GROUP_BY_POS" : { Review Comment:

[GitHub] [spark] srielau commented on a diff in pull request #38490: [SPARK-41009][SQL] Rename the error class `_LEGACY_ERROR_TEMP_1070` to `LOCATION_ALREADY_EXISTS`

2022-11-06 Thread GitBox
srielau commented on code in PR #38490: URL: https://github.com/apache/spark/pull/38490#discussion_r1014862790 ## core/src/main/resources/error/error-classes.json: ## @@ -668,6 +668,24 @@ } } }, + "LOCATION_ALREADY_EXISTS" : { +"message" : [ + "Cannot

[GitHub] [spark] aokolnychyi commented on a diff in pull request #36304: [SPARK-38959][SQL] DS V2: Support runtime group filtering in row-level commands

2022-11-06 Thread GitBox
aokolnychyi commented on code in PR #36304: URL: https://github.com/apache/spark/pull/36304#discussion_r1014855632 ## sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/RowLevelOperationRuntimeGroupFiltering.scala: ## @@ -0,0 +1,98 @@ +/* + * Licensed to the

[GitHub] [spark] aokolnychyi commented on pull request #38526: [SPARK-38959][SQL][FOLLOW-UP] Address feedback for RowLevelOperationRuntimeGroupFiltering

2022-11-06 Thread GitBox
aokolnychyi commented on PR #38526: URL: https://github.com/apache/spark/pull/38526#issuecomment-1304834781 @cloud-fan, could you take a look to see if that's what you meant in PR #36304? There is also one open question

[GitHub] [spark] aokolnychyi opened a new pull request, #38526: [SPARK-38959][SQL][FOLLOW-UP] Address feedback for RowLevelOperationRuntimeGroupFiltering

2022-11-06 Thread GitBox
aokolnychyi opened a new pull request, #38526: URL: https://github.com/apache/spark/pull/38526 ### What changes were proposed in this pull request? This PR is to address the feedback on PR #36304 after that change was merged. ### Why are the changes needed?

[GitHub] [spark] aokolnychyi commented on a diff in pull request #36304: [SPARK-38959][SQL] DS V2: Support runtime group filtering in row-level commands

2022-11-06 Thread GitBox
aokolnychyi commented on code in PR #36304: URL: https://github.com/apache/spark/pull/36304#discussion_r1014850793 ## sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala: ## @@ -50,7 +50,8 @@ class SparkOptimizer( override def defaultBatches:

[GitHub] [spark] aokolnychyi commented on a diff in pull request #36304: [SPARK-38959][SQL] DS V2: Support runtime group filtering in row-level commands

2022-11-06 Thread GitBox
aokolnychyi commented on code in PR #36304: URL: https://github.com/apache/spark/pull/36304#discussion_r1014850793 ## sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala: ## @@ -50,7 +50,8 @@ class SparkOptimizer( override def defaultBatches:

[GitHub] [spark] aokolnychyi commented on a diff in pull request #36304: [SPARK-38959][SQL] DS V2: Support runtime group filtering in row-level commands

2022-11-06 Thread GitBox
aokolnychyi commented on code in PR #36304: URL: https://github.com/apache/spark/pull/36304#discussion_r1014850975 ## sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/RowLevelOperationRuntimeGroupFiltering.scala: ## @@ -0,0 +1,98 @@ +/* + * Licensed to the

[GitHub] [spark] aokolnychyi commented on a diff in pull request #36304: [SPARK-38959][SQL] DS V2: Support runtime group filtering in row-level commands

2022-11-06 Thread GitBox
aokolnychyi commented on code in PR #36304: URL: https://github.com/apache/spark/pull/36304#discussion_r1014850975 ## sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/RowLevelOperationRuntimeGroupFiltering.scala: ## @@ -0,0 +1,98 @@ +/* + * Licensed to the

[GitHub] [spark] aokolnychyi commented on a diff in pull request #36304: [SPARK-38959][SQL] DS V2: Support runtime group filtering in row-level commands

2022-11-06 Thread GitBox
aokolnychyi commented on code in PR #36304: URL: https://github.com/apache/spark/pull/36304#discussion_r1014850793 ## sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala: ## @@ -50,7 +50,8 @@ class SparkOptimizer( override def defaultBatches:

[GitHub] [spark] aokolnychyi commented on a diff in pull request #36304: [SPARK-38959][SQL] DS V2: Support runtime group filtering in row-level commands

2022-11-06 Thread GitBox
aokolnychyi commented on code in PR #36304: URL: https://github.com/apache/spark/pull/36304#discussion_r1014850614 ## sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/RowLevelOperationRuntimeGroupFiltering.scala: ## @@ -0,0 +1,98 @@ +/* + * Licensed to the

[GitHub] [spark] aokolnychyi commented on a diff in pull request #36304: [SPARK-38959][SQL] DS V2: Support runtime group filtering in row-level commands

2022-11-06 Thread GitBox
aokolnychyi commented on code in PR #36304: URL: https://github.com/apache/spark/pull/36304#discussion_r1014850115 ## sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/RowLevelOperationRuntimeGroupFiltering.scala: ## @@ -0,0 +1,98 @@ +/* + * Licensed to the

[GitHub] [spark] srowen commented on pull request #38523: [SPARK-41023][BUILD] Upgrade Jackson to 2.14.0

2022-11-06 Thread GitBox
srowen commented on PR #38523: URL: https://github.com/apache/spark/pull/38523#issuecomment-1304827262 Looks OK if tests pass. I think we need to update the two together, so this resolves both JIRAs -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] bsikander commented on pull request #38352: [SPARK-40801][BUILD][3.2] Upgrade `Apache commons-text` to 1.10

2022-11-06 Thread GitBox
bsikander commented on PR #38352: URL: https://github.com/apache/spark/pull/38352#issuecomment-1304825343 @sunchao thank you for your efforts. When can we expect the release of 3.2.3? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] srowen commented on pull request #38499: [MINOR][DOC] updated some grammar and a missed period in the tuning doc

2022-11-06 Thread GitBox
srowen commented on PR #38499: URL: https://github.com/apache/spark/pull/38499#issuecomment-1304822113 Merged to master -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] srowen closed pull request #38499: [MINOR][DOC] updated some grammar and a missed period in the tuning doc

2022-11-06 Thread GitBox
srowen closed pull request #38499: [MINOR][DOC] updated some grammar and a missed period in the tuning doc URL: https://github.com/apache/spark/pull/38499 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] srowen commented on pull request #38525: [SPARK-40950][BUILD][FOLLOWUP] Fix Scala 2.13 Mima check

2022-11-06 Thread GitBox
srowen commented on PR #38525: URL: https://github.com/apache/spark/pull/38525#issuecomment-1304811696 Merged to master -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] srowen closed pull request #38525: [SPARK-40950][BUILD][FOLLOWUP] Fix Scala 2.13 Mima check

2022-11-06 Thread GitBox
srowen closed pull request #38525: [SPARK-40950][BUILD][FOLLOWUP] Fix Scala 2.13 Mima check URL: https://github.com/apache/spark/pull/38525 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] bjornjorgensen commented on pull request #38523: [SPARK-41023][BUILD] Upgrade Jackson to 2.14.0

2022-11-06 Thread GitBox
bjornjorgensen commented on PR #38523: URL: https://github.com/apache/spark/pull/38523#issuecomment-1304801289 @pjfanning Yes, but I just wanna ask @LuciferYang if he has been looking at it. -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] pjfanning commented on pull request #38523: [SPARK-41023][BUILD] Upgrade Jackson to 2.14.0

2022-11-06 Thread GitBox
pjfanning commented on PR #38523: URL: https://github.com/apache/spark/pull/38523#issuecomment-1304800274 Shouldn't guava changes be done in a separate PR? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] bjornjorgensen commented on pull request #38523: [SPARK-41023][BUILD] Upgrade Jackson to 2.14.0

2022-11-06 Thread GitBox
bjornjorgensen commented on PR #38523: URL: https://github.com/apache/spark/pull/38523#issuecomment-1304798904 @LuciferYang Have you tried to upgrade guava to 31.1-jre? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] pjfanning commented on pull request #38523: [SPARK-41023][BUILD] Upgrade Jackson to 2.14.0

2022-11-06 Thread GitBox
pjfanning commented on PR #38523: URL: https://github.com/apache/spark/pull/38523#issuecomment-1304798867 @LuciferYang feel free to continue with your PR. I marked SPARK-41023 as a duplicate of SPARK-40911. -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] bjornjorgensen commented on pull request #38523: [SPARK-41023][BUILD] Upgrade Jackson to 2.14.0

2022-11-06 Thread GitBox
bjornjorgensen commented on PR #38523: URL: https://github.com/apache/spark/pull/38523#issuecomment-1304797597 https://issues.apache.org/jira/browse/SPARK-40911 I think that @pjfanning will work on this one. -- This is an automated message from the Apache Git Service. To respond

[GitHub] [spark] panbingkun commented on a diff in pull request #38520: [SPARK-41021][SQL][TESTS] Test some subclasses of error class DATATYPE_MISMATCH

2022-11-06 Thread GitBox
panbingkun commented on code in PR #38520: URL: https://github.com/apache/spark/pull/38520#discussion_r1014823662 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala: ## @@ -581,6 +591,19 @@ class JsonExpressionsSuite extends

[GitHub] [spark] panbingkun commented on a diff in pull request #38520: [SPARK-41021][SQL][TESTS] Test some subclasses of error class DATATYPE_MISMATCH

2022-11-06 Thread GitBox
panbingkun commented on code in PR #38520: URL: https://github.com/apache/spark/pull/38520#discussion_r1014823446 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala: ## @@ -406,6 +407,15 @@ class JsonExpressionsSuite extends

[GitHub] [spark] panbingkun commented on a diff in pull request #38520: [SPARK-41021][SQL][TESTS] Test some subclasses of error class DATATYPE_MISMATCH

2022-11-06 Thread GitBox
panbingkun commented on code in PR #38520: URL: https://github.com/apache/spark/pull/38520#discussion_r1014823040 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala: ## @@ -346,6 +358,18 @@ class CollectionExpressionsSuite

[GitHub] [spark] panbingkun commented on a diff in pull request #38520: [SPARK-41021][SQL][TESTS] Test some subclasses of error class DATATYPE_MISMATCH

2022-11-06 Thread GitBox
panbingkun commented on code in PR #38520: URL: https://github.com/apache/spark/pull/38520#discussion_r1014823040 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala: ## @@ -346,6 +358,18 @@ class CollectionExpressionsSuite

[GitHub] [spark] HyukjinKwon closed pull request #38524: [SPARK-41024][BUILD] Upgrade scala-maven-plugin to 4.7.2

2022-11-06 Thread GitBox
HyukjinKwon closed pull request #38524: [SPARK-41024][BUILD] Upgrade scala-maven-plugin to 4.7.2 URL: https://github.com/apache/spark/pull/38524 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] HyukjinKwon commented on pull request #38524: [SPARK-41024][BUILD] Upgrade scala-maven-plugin to 4.7.2

2022-11-06 Thread GitBox
HyukjinKwon commented on PR #38524: URL: https://github.com/apache/spark/pull/38524#issuecomment-1304783719 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] mridulm commented on pull request #38525: [SPARK-40950][BUILD][FOLLOWUP] Fix Scala 2.13 Mima check

2022-11-06 Thread GitBox
mridulm commented on PR #38525: URL: https://github.com/apache/spark/pull/38525#issuecomment-1304775260 Will wait for tests to pass ... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] LuciferYang commented on pull request #38491: [MINOR][CONNECT] Remove unused import in commands.proto

2022-11-06 Thread GitBox
LuciferYang commented on PR #38491: URL: https://github.com/apache/spark/pull/38491#issuecomment-1304764271 looks fine -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] LuciferYang commented on pull request #38525: [SPARK-40950][BUILD][FOLLOWUP] Fix Scala 2.13 Mima check

2022-11-06 Thread GitBox
LuciferYang commented on PR #38525: URL: https://github.com/apache/spark/pull/38525#issuecomment-1304762343 cc @dongjoon-hyun @srowen @mridulm @HyukjinKwon -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL
