[GitHub] [spark] panbingkun opened a new pull request, #39218: [SPARK-41714][BUILD] Update maven-checkstyle-plugin from 3.1.2 to 3.2.0

2022-12-26 Thread GitBox
panbingkun opened a new pull request, #39218: URL: https://github.com/apache/spark/pull/39218 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

[GitHub] [spark] zhengruifeng closed pull request #39209: [SPARK-41703][CONNECT][PYTHON] Combine NullType and typed_null in Literal

2022-12-26 Thread GitBox
zhengruifeng closed pull request #39209: [SPARK-41703][CONNECT][PYTHON] Combine NullType and typed_null in Literal URL: https://github.com/apache/spark/pull/39209 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL ab

[GitHub] [spark] zhengruifeng commented on pull request #39209: [SPARK-41703][CONNECT][PYTHON] Combine NullType and typed_null in Literal

2022-12-26 Thread GitBox
zhengruifeng commented on PR #39209: URL: https://github.com/apache/spark/pull/39209#issuecomment-1364986470 merged into master, thanks @HyukjinKwon for the reviews -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

[GitHub] [spark] LuciferYang commented on a diff in pull request #39202: [WIP][SPARK-41685][UI] Support Protobuf serializer for the KVStore in History server

2022-12-26 Thread GitBox
LuciferYang commented on code in PR #39202: URL: https://github.com/apache/spark/pull/39202#discussion_r1057124135 ## core/src/main/scala/org/apache/spark/internal/config/History.scala: ## @@ -79,6 +79,21 @@ private[spark] object History { .stringConf .createOptional

[GitHub] [spark] LuciferYang commented on pull request #39202: [WIP][SPARK-41685][UI] Support Protobuf serializer for the KVStore in History server

2022-12-26 Thread GitBox
LuciferYang commented on PR #39202: URL: https://github.com/apache/spark/pull/39202#issuecomment-1364993175 > Make it possible for SHS to read the live UI rocksdb instance. Will this be supported in the future or has it been supported now? -- This is an automated message fro

[GitHub] [spark] LuciferYang commented on a diff in pull request #38865: [SPARK-41232][SQL][PYTHON] Adding array_append function

2022-12-26 Thread GitBox
LuciferYang commented on code in PR #38865: URL: https://github.com/apache/spark/pull/38865#discussion_r1057134901 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala: ## @@ -4600,3 +4600,111 @@ case class ArrayExcept(left: Express

[GitHub] [spark] HyukjinKwon commented on pull request #39210: [SPARK-41704][BUILD] Upgrade `sbt-assembly` from 2.0.0 to 2.1.0

2022-12-26 Thread GitBox
HyukjinKwon commented on PR #39210: URL: https://github.com/apache/spark/pull/39210#issuecomment-1364999051 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] HyukjinKwon closed pull request #39210: [SPARK-41704][BUILD] Upgrade `sbt-assembly` from 2.0.0 to 2.1.0

2022-12-26 Thread GitBox
HyukjinKwon closed pull request #39210: [SPARK-41704][BUILD] Upgrade `sbt-assembly` from 2.0.0 to 2.1.0 URL: https://github.com/apache/spark/pull/39210 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go t

[GitHub] [spark] uzadude opened a new pull request, #39219: [WIP][SPARK-41277] Auto infer bucketing info for shuffled actions

2022-12-26 Thread GitBox
uzadude opened a new pull request, #39219: URL: https://github.com/apache/spark/pull/39219 ### What changes were proposed in this pull request? This PR proposes to auto-infer bucketing information from actions that contain a shuffle. ### Why are the changes needed? Seems

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38865: [SPARK-41232][SQL][PYTHON] Adding array_append function

2022-12-26 Thread GitBox
zhengruifeng commented on code in PR #38865: URL: https://github.com/apache/spark/pull/38865#discussion_r1057139556 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala: ## @@ -4600,3 +4600,133 @@ case class ArrayExcept(left: Expres

[GitHub] [spark] ulysses-you opened a new pull request, #39220: [SPARK-41713][SQL] Make CTAS hold a nested execution for data writing

2022-12-26 Thread GitBox
ulysses-you opened a new pull request, #39220: URL: https://github.com/apache/spark/pull/39220 ### What changes were proposed in this pull request? This PR aims to make CTAS use a nested execution instead of running the data writing command. So, we can clean up CTAS itself t

[GitHub] [spark] ulysses-you commented on a diff in pull request #39220: [SPARK-41713][SQL] Make CTAS hold a nested execution for data writing

2022-12-26 Thread GitBox
ulysses-you commented on code in PR #39220: URL: https://github.com/apache/spark/pull/39220#discussion_r1057141344 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala: ## @@ -143,27 +141,7 @@ case class CreateDataSourceTableAsSelectComm

[GitHub] [spark] beliefer commented on pull request #39213: [SPARK-41706][CONNECT][PYTHON] `pyspark_types_to_proto_types` should supports `MapType`

2022-12-26 Thread GitBox
beliefer commented on PR #39213: URL: https://github.com/apache/spark/pull/39213#issuecomment-1365006353 The failure GA is unrelated to this PR. ping @HyukjinKwon @grundprinzip @zhengruifeng @amaliujia -- This is an automated message from the Apache Git Service. To respond to the messa

[GitHub] [spark] beliefer commented on pull request #38799: [SPARK-37099][SQL] Introduce the group limit of Window for rank-based filter to optimize top-k computation

2022-12-26 Thread GitBox
beliefer commented on PR #38799: URL: https://github.com/apache/spark/pull/38799#issuecomment-1365007428 ping @cloud-fan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] zhengruifeng commented on pull request #38867: [SPARK-41234][SQL][PYTHON] Add `array_insert` function

2022-12-26 Thread GitBox
zhengruifeng commented on PR #38867: URL: https://github.com/apache/spark/pull/38867#issuecomment-1365013704 @Daniel-Davies Sorry for the late reply. On `Input item parameter is null:` The issue of NULL handling was controversial, and we spent some time discussing it with some SQL e

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

2022-12-26 Thread GitBox
zhengruifeng commented on code in PR #38947: URL: https://github.com/apache/spark/pull/38947#discussion_r1057148245 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala: ## @@ -1840,6 +1840,47 @@ class CollectionExpressionsSui

[GitHub] [spark] itholic commented on pull request #39137: [SPARK-41586][SPARK-41598][PYTHON] Introduce PySpark errors package and error classes

2022-12-26 Thread GitBox
itholic commented on PR #39137: URL: https://github.com/apache/spark/pull/39137#issuecomment-1365018902 Thanks @grundprinzip for the review. I agree with your comments and feel they're pretty reasonable. Actually, I once submitted a PR that implemented the framework on PySpark-side (h

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39214: [WIP][SPARK-41707][CONNECT] Implement Catalog API in Spark Connect

2022-12-26 Thread GitBox
HyukjinKwon commented on code in PR #39214: URL: https://github.com/apache/spark/pull/39214#discussion_r1057150277 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -99,6 +100,47 @@ class SparkConnectPlanner(session:

[GitHub] [spark] zhengruifeng commented on pull request #38867: [SPARK-41234][SQL][PYTHON] Add `array_insert` function

2022-12-26 Thread GitBox
zhengruifeng commented on PR #38867: URL: https://github.com/apache/spark/pull/38867#issuecomment-1365028273 also cc @beliefer -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

[GitHub] [spark] lyy-pineapple commented on pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2022-12-26 Thread GitBox
lyy-pineapple commented on PR #38171: URL: https://github.com/apache/spark/pull/38171#issuecomment-1365029419 > https://user-images.githubusercontent.com/8748814/204439049-53f0bd4f-9ea0-4289-8268-d16aef5b4334.png > > @lyy-pineapple Would you share the test SQL pattern? I test some c

[GitHub] [spark] zhengruifeng commented on pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

2022-12-26 Thread GitBox
zhengruifeng commented on PR #38947: URL: https://github.com/apache/spark/pull/38947#issuecomment-1365029531 also cc @beliefer -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific commen

[GitHub] [spark] LuciferYang commented on pull request #38947: [SPARK-41233][SQL] Add `array_prepend` function

2022-12-26 Thread GitBox
LuciferYang commented on PR #38947: URL: https://github.com/apache/spark/pull/38947#issuecomment-1365031813 @navinvishy could you resolve the conflict? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL ab

[GitHub] [spark] LuciferYang commented on pull request #38867: [SPARK-41234][SQL][PYTHON] Add `array_insert` function

2022-12-26 Thread GitBox
LuciferYang commented on PR #38867: URL: https://github.com/apache/spark/pull/38867#issuecomment-1365032081 @Daniel-Davies could you resolve the conflict? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38874: [SPARK-41235][SQL][PYTHON]High-order function: array_compact implementation

2022-12-26 Thread GitBox
zhengruifeng commented on code in PR #38874: URL: https://github.com/apache/spark/pull/38874#discussion_r1057161110 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala: ## @@ -4600,3 +4600,51 @@ case class ArrayExcept(left: Express

[GitHub] [spark] lyy-pineapple commented on a diff in pull request #38171: [SPARK-9213] [SQL] Improve regular expression performance (via joni)

2022-12-26 Thread GitBox
lyy-pineapple commented on code in PR #38171: URL: https://github.com/apache/spark/pull/38171#discussion_r1057161457 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressionsJoni.scala: ## @@ -0,0 +1,471 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] zhengruifeng closed pull request #39216: [SPARK-41710][CONNECT][PYTHON] Implement `Column.between`

2022-12-26 Thread GitBox
zhengruifeng closed pull request #39216: [SPARK-41710][CONNECT][PYTHON] Implement `Column.between` URL: https://github.com/apache/spark/pull/39216 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] zhengruifeng commented on pull request #39216: [SPARK-41710][CONNECT][PYTHON] Implement `Column.between`

2022-12-26 Thread GitBox
zhengruifeng commented on PR #39216: URL: https://github.com/apache/spark/pull/39216#issuecomment-1365044421 merged into master, thanks @HyukjinKwon for the reviews -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

[GitHub] [spark] LuciferYang commented on a diff in pull request #39215: [SPARK-41709][CORE][SQL][UI] Explicitly define `Seq` as `collection.Seq` to avoid `toSeq` when create ui objects from protobuf

2022-12-26 Thread GitBox
LuciferYang commented on code in PR #39215: URL: https://github.com/apache/spark/pull/39215#discussion_r1057167898 ## core/src/main/scala/org/apache/spark/status/api/v1/api.scala: ## @@ -461,7 +461,7 @@ class ApplicationEnvironmentInfo private[spark] ( val systemProperties:

[GitHub] [spark] codecov-commenter commented on pull request #39215: [SPARK-41709][CORE][SQL][UI] Explicitly define `Seq` as `collection.Seq` to avoid `toSeq` when create ui objects from protobuf obje

2022-12-26 Thread GitBox
codecov-commenter commented on PR #39215: URL: https://github.com/apache/spark/pull/39215#issuecomment-1365070569 # [Codecov](https://codecov.io/gh/apache/spark/pull/39215?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Soft

[GitHub] [spark] grundprinzip commented on pull request #39137: [SPARK-41586][SPARK-41598][PYTHON] Introduce PySpark errors package and error classes

2022-12-26 Thread GitBox
grundprinzip commented on PR #39137: URL: https://github.com/apache/spark/pull/39137#issuecomment-1365099279 > * I worried that maybe it would not be easy to maintain when the rules on one side (PySpark vs JVM) were arbitrarily changed in the future. So, I wanted to manage all errors in

[GitHub] [spark] infoankitp commented on a diff in pull request #38865: [SPARK-41232][SQL][PYTHON] Adding array_append function

2022-12-26 Thread GitBox
infoankitp commented on code in PR #38865: URL: https://github.com/apache/spark/pull/38865#discussion_r1057196837 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala: ## @@ -2596,4 +2596,113 @@ class CollectionExpressionsSuit

[GitHub] [spark] infoankitp commented on a diff in pull request #38865: [SPARK-41232][SQL][PYTHON] Adding array_append function

2022-12-26 Thread GitBox
infoankitp commented on code in PR #38865: URL: https://github.com/apache/spark/pull/38865#discussion_r1057197385 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala: ## @@ -4600,3 +4600,133 @@ case class ArrayExcept(left: Expressi

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39214: [SPARK-41707][CONNECT] Implement Catalog API in Spark Connect

2022-12-26 Thread GitBox
HyukjinKwon commented on code in PR #39214: URL: https://github.com/apache/spark/pull/39214#discussion_r1057211943 ## python/pyspark/sql/connect/session.py: ## @@ -120,6 +121,11 @@ def __init__(self, connectionString: str, userId: Optional[str] = None): # Parse the con

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39214: [SPARK-41707][CONNECT] Implement Catalog API in Spark Connect

2022-12-26 Thread GitBox
HyukjinKwon commented on code in PR #39214: URL: https://github.com/apache/spark/pull/39214#discussion_r1057212456 ## python/pyspark/sql/tests/test_catalog.py: ## @@ -16,20 +16,21 @@ # Review Comment: The changes made in reused PySpark tests are two: 1. Do not check

[GitHub] [spark] HyukjinKwon commented on pull request #39214: [SPARK-41707][CONNECT] Implement Catalog API in Spark Connect

2022-12-26 Thread GitBox
HyukjinKwon commented on PR #39214: URL: https://github.com/apache/spark/pull/39214#issuecomment-1365124966 Should be ready for a look. cc @amaliujia @zhengruifeng @hvanhovell @grundprinzip -- This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [spark] cloud-fan commented on a diff in pull request #39133: [SPARK-41595][SQL] Support generator function explode/explode_outer in the FROM clause

2022-12-26 Thread GitBox
cloud-fan commented on code in PR #39133: URL: https://github.com/apache/spark/pull/39133#discussion_r1057233265 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreePatterns.scala: ## @@ -134,6 +134,7 @@ object TreePattern extends Enumeration { val UNRESOL

[GitHub] [spark] infoankitp commented on a diff in pull request #38865: [SPARK-41232][SQL][PYTHON] Adding array_append function

2022-12-26 Thread GitBox
infoankitp commented on code in PR #38865: URL: https://github.com/apache/spark/pull/38865#discussion_r1057236664 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala: ## @@ -4600,3 +4600,111 @@ case class ArrayExcept(left: Expressi

[GitHub] [spark] infoankitp commented on a diff in pull request #38865: [SPARK-41232][SQL][PYTHON] Adding array_append function

2022-12-26 Thread GitBox
infoankitp commented on code in PR #38865: URL: https://github.com/apache/spark/pull/38865#discussion_r1057236944 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala: ## @@ -4600,3 +4600,111 @@ case class ArrayExcept(left: Expressi

[GitHub] [spark] bjornjorgensen commented on pull request #39196: [SPARK-41695][BUILD][3.3] Upgrade netty to 4.1.86.Final

2022-12-26 Thread GitBox
bjornjorgensen commented on PR #39196: URL: https://github.com/apache/spark/pull/39196#issuecomment-1365190148 My intention is to explain to a new contributor how I do it and what tools I use. Sonar is built on best-practice rules; the problem is that not everything hits equally well. Her

[GitHub] [spark] shrprasa opened a new pull request, #39221: [SPARK-41719] [CORE]: SSLOptions sub settings should be set only when ssl is enabled

2022-12-26 Thread GitBox
shrprasa opened a new pull request, #39221: URL: https://github.com/apache/spark/pull/39221 ### What changes were proposed in this pull request? In SSLOptions, the rest of the settings should be set only when SSL is enabled. ### Why are the changes needed? If ${ns}.enabled is fals

[GitHub] [spark] cloud-fan opened a new pull request, #39222: [SPARK-41720][SQL] Rename UnresolvedFunc to UnresolvedFunctionName

2022-12-26 Thread GitBox
cloud-fan opened a new pull request, #39222: URL: https://github.com/apache/spark/pull/39222 ### What changes were proposed in this pull request? It's a bit confusing to have both `UnresolvedFunc` and `UnresolvedFunction`. This PR renames `UnresolvedFunc` to `UnresolvedFunctio

[GitHub] [spark] cloud-fan commented on pull request #39222: [SPARK-41720][SQL] Rename UnresolvedFunc to UnresolvedFunctionName

2022-12-26 Thread GitBox
cloud-fan commented on PR #39222: URL: https://github.com/apache/spark/pull/39222#issuecomment-1365195933 cc @rxin @viirya -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] HyukjinKwon commented on pull request #39214: [SPARK-41707][CONNECT] Implement Catalog API in Spark Connect

2022-12-26 Thread GitBox
HyukjinKwon commented on PR #39214: URL: https://github.com/apache/spark/pull/39214#issuecomment-1365200657 This PR supports everything that the current PySpark does for now. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] HyukjinKwon opened a new pull request, #39223: [WIP][SPARK-41717][CONNECT] Implement the command logic for print and _repr_html_

2022-12-26 Thread GitBox
HyukjinKwon opened a new pull request, #39223: URL: https://github.com/apache/spark/pull/39223 ### What changes were proposed in this pull request? TBD ### Why are the changes needed? TBD ### Does this PR introduce _any_ user-facing change? TBD ### How wa

[GitHub] [spark] bjornjorgensen commented on a diff in pull request #39180: [SPARK-41649][CONNECT] Deduplicate docstrings in pyspark.sql.connect.window

2022-12-26 Thread GitBox
bjornjorgensen commented on code in PR #39180: URL: https://github.com/apache/spark/pull/39180#discussion_r1057246975 ## python/pyspark/sql/connect/window.py: ## @@ -306,263 +217,27 @@ class Window: @staticmethod def partitionBy(*cols: Union["ColumnOrName", List["Col

[GitHub] [spark] beliefer commented on a diff in pull request #38865: [SPARK-41232][SQL][PYTHON] Adding array_append function

2022-12-26 Thread GitBox
beliefer commented on code in PR #38865: URL: https://github.com/apache/spark/pull/38865#discussion_r1057262332 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala: ## @@ -2596,4 +2596,113 @@ class CollectionExpressionsSuite

[GitHub] [spark] srowen commented on a diff in pull request #39215: [SPARK-41709][CORE][SQL][UI] Explicitly define `Seq` as `collection.Seq` to avoid `toSeq` when create ui objects from protobuf objec

2022-12-26 Thread GitBox
srowen commented on code in PR #39215: URL: https://github.com/apache/spark/pull/39215#discussion_r1057328506 ## sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala: ## @@ -486,7 +486,7 @@ private class LiveExecutionData(val executionId: Long) e

[GitHub] [spark] Daniel-Davies commented on pull request #38867: [SPARK-41234][SQL][PYTHON] Add `array_insert` function

2022-12-26 Thread GitBox
Daniel-Davies commented on PR #38867: URL: https://github.com/apache/spark/pull/38867#issuecomment-1365494699 @zhengruifeng Great direction provided by your message above, thank you. Thank you very much also for consulting with experts to resolve the conflict of ideas. I'll implement the be

[GitHub] [spark] HyukjinKwon commented on pull request #39214: [SPARK-41707][CONNECT] Implement Catalog API in Spark Connect

2022-12-26 Thread GitBox
HyukjinKwon commented on PR #39214: URL: https://github.com/apache/spark/pull/39214#issuecomment-1365495742 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] HyukjinKwon closed pull request #39214: [SPARK-41707][CONNECT] Implement Catalog API in Spark Connect

2022-12-26 Thread GitBox
HyukjinKwon closed pull request #39214: [SPARK-41707][CONNECT] Implement Catalog API in Spark Connect URL: https://github.com/apache/spark/pull/39214 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] HyukjinKwon commented on pull request #39217: [SPARK-41711][BUILD] Upgrade protobuf-java to 3.21.12

2022-12-26 Thread GitBox
HyukjinKwon commented on PR #39217: URL: https://github.com/apache/spark/pull/39217#issuecomment-1365501507 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] HyukjinKwon closed pull request #39217: [SPARK-41711][BUILD] Upgrade protobuf-java to 3.21.12

2022-12-26 Thread GitBox
HyukjinKwon closed pull request #39217: [SPARK-41711][BUILD] Upgrade protobuf-java to 3.21.12 URL: https://github.com/apache/spark/pull/39217 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the spec

[GitHub] [spark] HyukjinKwon commented on pull request #39222: [SPARK-41720][SQL] Rename UnresolvedFunc to UnresolvedFunctionName

2022-12-26 Thread GitBox
HyukjinKwon commented on PR #39222: URL: https://github.com/apache/spark/pull/39222#issuecomment-1365502129 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[GitHub] [spark] HyukjinKwon closed pull request #39222: [SPARK-41720][SQL] Rename UnresolvedFunc to UnresolvedFunctionName

2022-12-26 Thread GitBox
HyukjinKwon closed pull request #39222: [SPARK-41720][SQL] Rename UnresolvedFunc to UnresolvedFunctionName URL: https://github.com/apache/spark/pull/39222 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #39180: [SPARK-41649][CONNECT] Deduplicate docstrings in pyspark.sql.connect.window

2022-12-26 Thread GitBox
HyukjinKwon commented on code in PR #39180: URL: https://github.com/apache/spark/pull/39180#discussion_r1057361102 ## python/pyspark/sql/connect/window.py: ## @@ -306,263 +217,27 @@ class Window: @staticmethod def partitionBy(*cols: Union["ColumnOrName", List["Column

[GitHub] [spark] github-actions[bot] closed pull request #37721: [SPARK-40272][CORE]Support service port custom with range

2022-12-26 Thread GitBox
github-actions[bot] closed pull request #37721: [SPARK-40272][CORE]Support service port custom with range URL: https://github.com/apache/spark/pull/37721 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] github-actions[bot] commented on pull request #37910: [SPARK-40469][CORE] Avoid creating directory failures

2022-12-26 Thread GitBox
github-actions[bot] commented on PR #37910: URL: https://github.com/apache/spark/pull/37910#issuecomment-1365517957 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #37625: [SPARK-40177][SQL] Simplify condition of form (a==b) || (a==null&&b==null) to a<=>b

2022-12-26 Thread GitBox
github-actions[bot] commented on PR #37625: URL: https://github.com/apache/spark/pull/37625#issuecomment-1365517972 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] HyukjinKwon commented on pull request #39223: [SPARK-41717][CONNECT] Implement the command logic for print and _repr_html_

2022-12-26 Thread GitBox
HyukjinKwon commented on PR #39223: URL: https://github.com/apache/spark/pull/39223#issuecomment-1365522709 cc @amaliujia @zhengruifeng @grundprinzip FYI -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] HyukjinKwon opened a new pull request, #39224: [SPARK-41721][CONNECT][TESTS] Enable doctests in pyspark.sql.connect.catalog

2022-12-26 Thread GitBox
HyukjinKwon opened a new pull request, #39224: URL: https://github.com/apache/spark/pull/39224 ### What changes were proposed in this pull request? This PR proposes to enable doctests in `pyspark.sql.connect.catalog` that is virtually the same as `pyspark.sql.catalog`. ### Why

[GitHub] [spark] HyukjinKwon commented on pull request #39223: [SPARK-41717][CONNECT] Deduplicate print and _repr_html_ at LogicalPlan

2022-12-26 Thread GitBox
HyukjinKwon commented on PR #39223: URL: https://github.com/apache/spark/pull/39223#issuecomment-1365536717 Related tests passed. Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above t

[GitHub] [spark] HyukjinKwon closed pull request #39223: [SPARK-41717][CONNECT] Deduplicate print and _repr_html_ at LogicalPlan

2022-12-26 Thread GitBox
HyukjinKwon closed pull request #39223: [SPARK-41717][CONNECT] Deduplicate print and _repr_html_ at LogicalPlan URL: https://github.com/apache/spark/pull/39223 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] techaddict opened a new pull request, #39225: [WIP][SPARK-41654][CONNECT][TESTS] doctests for pyspark.sqlconnect.window

2022-12-26 Thread GitBox
techaddict opened a new pull request, #39225: URL: https://github.com/apache/spark/pull/39225 ### What changes were proposed in this pull request? This PR proposes to enable doctests in pyspark.sql.connect.window that is virtually the same as pyspark.sql.window. ### Why are the cha

[GitHub] [spark] techaddict commented on a diff in pull request #39225: [WIP][SPARK-41654][CONNECT][TESTS] doctests for pyspark.sqlconnect.window

2022-12-26 Thread GitBox
techaddict commented on code in PR #39225: URL: https://github.com/apache/spark/pull/39225#discussion_r1057398834 ## python/pyspark/sql/connect/window.py: ## @@ -201,7 +201,7 @@ def __repr__(self) -> str: return "WindowSpec(" + ", ".join(strs) + ")" -WindowSpec.__do

[GitHub] [spark] techaddict commented on a diff in pull request #39225: [WIP][SPARK-41654][CONNECT][TESTS] doctests for pyspark.sqlconnect.window

2022-12-26 Thread GitBox
techaddict commented on code in PR #39225: URL: https://github.com/apache/spark/pull/39225#discussion_r1057399255 ## python/pyspark/sql/connect/window.py: ## @@ -218,27 +218,201 @@ class Window: @staticmethod def partitionBy(*cols: Union["ColumnOrName", List["ColumnO

[GitHub] [spark] techaddict commented on pull request #39225: [WIP][SPARK-41654][CONNECT][TESTS] doctests for pyspark.sqlconnect.window

2022-12-26 Thread GitBox
techaddict commented on PR #39225: URL: https://github.com/apache/spark/pull/39225#issuecomment-1365550356 @HyukjinKwon can you take a look? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the s

[GitHub] [spark] LuciferYang commented on pull request #39217: [SPARK-41711][BUILD] Upgrade protobuf-java to 3.21.12

2022-12-26 Thread GitBox
LuciferYang commented on PR #39217: URL: https://github.com/apache/spark/pull/39217#issuecomment-1365557057 Thanks @srowen @HyukjinKwon -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

[GitHub] [spark] LuciferYang commented on a diff in pull request #39215: [SPARK-41709][CORE][SQL][UI] Explicitly define `Seq` as `collection.Seq` to avoid `toSeq` when create ui objects from protobuf

2022-12-26 Thread GitBox
LuciferYang commented on code in PR #39215: URL: https://github.com/apache/spark/pull/39215#discussion_r1057406919 ## sql/core/src/main/scala/org/apache/spark/sql/execution/ui/SQLAppStatusListener.scala: ## @@ -486,7 +486,7 @@ private class LiveExecutionData(val executionId: Lon

[GitHub] [spark] ulysses-you commented on a diff in pull request #39220: [SPARK-41713][SQL] Make CTAS hold a nested execution for data writing

2022-12-26 Thread GitBox
ulysses-you commented on code in PR #39220: URL: https://github.com/apache/spark/pull/39220#discussion_r1057410190 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala: ## @@ -143,29 +141,9 @@ case class CreateDataSourceTableAsSelectComm

[GitHub] [spark] ulysses-you commented on pull request #39220: [SPARK-41713][SQL] Make CTAS hold a nested execution for data writing

2022-12-26 Thread GitBox
ulysses-you commented on PR #39220: URL: https://github.com/apache/spark/pull/39220#issuecomment-1365567962 cc @cloud-fan

[GitHub] [spark] LuciferYang opened a new pull request, #39226: [SPARK-41694][CORE] Add new config to clean up `spark.ui.store.path` directory when `SparkContext.stop()`

2022-12-26 Thread GitBox
LuciferYang opened a new pull request, #39226: URL: https://github.com/apache/spark/pull/39226 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

[GitHub] [spark] zhengruifeng opened a new pull request, #39227: [SPARK-41722][CONNECT][PYTHON] Implement 3 missing time window functions

2022-12-26 Thread GitBox
zhengruifeng opened a new pull request, #39227: URL: https://github.com/apache/spark/pull/39227 ### What changes were proposed in this pull request? Implement 3 missing time window functions ### Why are the changes needed? For API coverage after this PR, following one

[GitHub] [spark] zhengruifeng commented on a diff in pull request #39227: [SPARK-41722][CONNECT][PYTHON] Implement 3 missing time window functions

2022-12-26 Thread GitBox
zhengruifeng commented on code in PR #39227: URL: https://github.com/apache/spark/pull/39227#discussion_r1057415979 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -691,6 +691,46 @@ class SparkConnectPlanner(sessio

[GitHub] [spark] cloud-fan commented on a diff in pull request #39220: [SPARK-41713][SQL] Make CTAS hold a nested execution for data writing

2022-12-26 Thread GitBox
cloud-fan commented on code in PR #39220: URL: https://github.com/apache/spark/pull/39220#discussion_r1057416904 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala: ## @@ -143,29 +141,9 @@ case class CreateDataSourceTableAsSelectComman

[GitHub] [spark] cloud-fan commented on a diff in pull request #39220: [SPARK-41713][SQL] Make CTAS hold a nested execution for data writing

2022-12-26 Thread GitBox
cloud-fan commented on code in PR #39220: URL: https://github.com/apache/spark/pull/39220#discussion_r1057417702 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala: ## @@ -21,43 +21,23 @@ import scala.util.control.NonFatal impo

[GitHub] [spark] cloud-fan commented on a diff in pull request #39182: [SPARK-41440][CONNECT][PYTHON] Create a new proto message for `RandomSplit`

2022-12-26 Thread GitBox
cloud-fan commented on code in PR #39182: URL: https://github.com/apache/spark/pull/39182#discussion_r1057418565 ## connector/connect/common/src/main/protobuf/spark/connect/relations.proto: ## @@ -378,10 +379,6 @@ message Sample { // (Optional) The random seed. optional

[GitHub] [spark] LuciferYang commented on pull request #39226: [SPARK-41694][CORE] Add new config to clean up `spark.ui.store.path` directory when `SparkContext.stop()`

2022-12-26 Thread GitBox
LuciferYang commented on PR #39226: URL: https://github.com/apache/spark/pull/39226#issuecomment-1365582194 cc @gengliangwang

[GitHub] [spark] LuciferYang commented on a diff in pull request #39226: [SPARK-41694][CORE] Add new config to clean up `spark.ui.store.path` directory when `SparkContext.stop()`

2022-12-26 Thread GitBox
LuciferYang commented on code in PR #39226: URL: https://github.com/apache/spark/pull/39226#discussion_r1057419363 ## core/src/main/scala/org/apache/spark/status/AppStatusStore.scala: ## @@ -36,7 +36,8 @@ import org.apache.spark.util.kvstore.KVStore */ private[spark] class Ap

[GitHub] [spark] cloud-fan commented on pull request #39186: [SPARK-41690][SQL][CONNECT] Agnostic Encoders

2022-12-26 Thread GitBox
cloud-fan commented on PR #39186: URL: https://github.com/apache/spark/pull/39186#issuecomment-1365585284 thanks, merging to master!

[GitHub] [spark] cloud-fan closed pull request #39186: [SPARK-41690][SQL][CONNECT] Agnostic Encoders

2022-12-26 Thread GitBox
cloud-fan closed pull request #39186: [SPARK-41690][SQL][CONNECT] Agnostic Encoders URL: https://github.com/apache/spark/pull/39186

[GitHub] [spark] ulysses-you commented on a diff in pull request #39220: [SPARK-41713][SQL] Make CTAS hold a nested execution for data writing

2022-12-26 Thread GitBox
ulysses-you commented on code in PR #39220: URL: https://github.com/apache/spark/pull/39220#discussion_r1057422003 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala: ## @@ -143,29 +141,9 @@ case class CreateDataSourceTableAsSelectComm

[GitHub] [spark] zhengruifeng opened a new pull request, #39228: [SPARK-41723][CONNECT][PYTHON] Implement `sequence` function

2022-12-26 Thread GitBox
zhengruifeng opened a new pull request, #39228: URL: https://github.com/apache/spark/pull/39228 ### What changes were proposed in this pull request? Implement `sequence` function ### Why are the changes needed? for API coverage ### Does this PR introduce _any_ user-fac

[GitHub] [spark] zhengruifeng commented on pull request #39227: [SPARK-41722][CONNECT][PYTHON] Implement 3 missing time window functions

2022-12-26 Thread GitBox
zhengruifeng commented on PR #39227: URL: https://github.com/apache/spark/pull/39227#issuecomment-1365586515 cc @HyukjinKwon

[GitHub] [spark] ulysses-you commented on a diff in pull request #39220: [SPARK-41713][SQL] Make CTAS hold a nested execution for data writing

2022-12-26 Thread GitBox
ulysses-you commented on code in PR #39220: URL: https://github.com/apache/spark/pull/39220#discussion_r1057422467 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala: ## @@ -21,43 +21,23 @@ import scala.util.control.NonFatal im

[GitHub] [spark] zhengruifeng commented on pull request #39228: [SPARK-41723][CONNECT][PYTHON] Implement `sequence` function

2022-12-26 Thread GitBox
zhengruifeng commented on PR #39228: URL: https://github.com/apache/spark/pull/39228#issuecomment-1365587203 cc @HyukjinKwon

[GitHub] [spark] LuciferYang commented on pull request #38874: [SPARK-41235][SQL][PYTHON]High-order function: array_compact implementation

2022-12-26 Thread GitBox
LuciferYang commented on PR #38874: URL: https://github.com/apache/spark/pull/38874#issuecomment-1365587749 Should we merge this one first? The other 3 new functions may need to resolve conflicts after this merge.

[GitHub] [spark] cloud-fan commented on a diff in pull request #39099: [SPARK-41554] fix changing of Decimal scale when scale decreased by m…

2022-12-26 Thread GitBox
cloud-fan commented on code in PR #39099: URL: https://github.com/apache/spark/pull/39099#discussion_r1057423980 ## sql/catalyst/src/test/scala/org/apache/spark/sql/types/DecimalSuite.scala: ## @@ -384,4 +387,51 @@ class DecimalSuite extends SparkFunSuite with PrivateMethodTest

[GitHub] [spark] cloud-fan commented on a diff in pull request #39099: [SPARK-41554] fix changing of Decimal scale when scale decreased by m…

2022-12-26 Thread GitBox
cloud-fan commented on code in PR #39099: URL: https://github.com/apache/spark/pull/39099#discussion_r1057424394 ## sql/catalyst/src/main/scala/org/apache/spark/sql/types/Decimal.scala: ## @@ -374,7 +374,7 @@ final class Decimal extends Ordered[Decimal] with Serializable {

[GitHub] [spark] amaliujia closed pull request #38807: [SPARK-41270][CONNECT] Add Catalog tableExists and databaseExists in Connect proto

2022-12-26 Thread GitBox
amaliujia closed pull request #38807: [SPARK-41270][CONNECT] Add Catalog tableExists and databaseExists in Connect proto URL: https://github.com/apache/spark/pull/38807

[GitHub] [spark] zhengruifeng opened a new pull request, #39229: [SPARK-41473][CONNECT][PYTHON] Implement `format_number` function

2022-12-26 Thread GitBox
zhengruifeng opened a new pull request, #39229: URL: https://github.com/apache/spark/pull/39229 ### What changes were proposed in this pull request? Implement `format_number` function ### Why are the changes needed? for API coverage ### Does this PR introduce _

[GitHub] [spark] cloud-fan commented on pull request #39062: [SPARK-41516] [SQL] Allow jdbc dialects to override the query used to create a table

2022-12-26 Thread GitBox
cloud-fan commented on PR #39062: URL: https://github.com/apache/spark/pull/39062#issuecomment-1365589357 cc @beliefer

[GitHub] [spark] cloud-fan commented on pull request #38874: [SPARK-41235][SQL][PYTHON]High-order function: array_compact implementation

2022-12-26 Thread GitBox
cloud-fan commented on PR #38874: URL: https://github.com/apache/spark/pull/38874#issuecomment-1365589473 thanks, merging to master!

[GitHub] [spark] zhengruifeng commented on a diff in pull request #39229: [SPARK-41473][CONNECT][PYTHON] Implement `format_number` function

2022-12-26 Thread GitBox
zhengruifeng commented on code in PR #39229: URL: https://github.com/apache/spark/pull/39229#discussion_r1057425058 ## python/pyspark/sql/connect/functions.py: ## @@ -1629,11 +1629,11 @@ def encode(col: "ColumnOrName", charset: str) -> Column: encode.__doc__ = pysparkfuncs.enco

[GitHub] [spark] cloud-fan closed pull request #38874: [SPARK-41235][SQL][PYTHON]High-order function: array_compact implementation

2022-12-26 Thread GitBox
cloud-fan closed pull request #38874: [SPARK-41235][SQL][PYTHON]High-order function: array_compact implementation URL: https://github.com/apache/spark/pull/38874

[GitHub] [spark] cloud-fan commented on a diff in pull request #39220: [SPARK-41713][SQL] Make CTAS hold a nested execution for data writing

2022-12-26 Thread GitBox
cloud-fan commented on code in PR #39220: URL: https://github.com/apache/spark/pull/39220#discussion_r1057425206 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala: ## @@ -143,29 +141,9 @@ case class CreateDataSourceTableAsSelectComman

[GitHub] [spark] zhengruifeng commented on pull request #39229: [SPARK-41473][CONNECT][PYTHON] Implement `format_number` function

2022-12-26 Thread GitBox
zhengruifeng commented on PR #39229: URL: https://github.com/apache/spark/pull/39229#issuecomment-1365589844 cc @HyukjinKwon
