Re: [PR] [SPARK-46378][SQL] Still remove Sort after converting Aggregate to Project [spark]

2023-12-11 Thread via GitHub
cloud-fan commented on PR #44310: URL: https://github.com/apache/spark/pull/44310#issuecomment-1851464587 cc @wangyum @ulysses-you -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

[PR] [SPARK-46378][SQL] Still remove Sort after converting Aggregate to Project [spark]

2023-12-11 Thread via GitHub
cloud-fan opened a new pull request, #44310: URL: https://github.com/apache/spark/pull/44310 ### What changes were proposed in this pull request? This is a follow-up of https://github.com/apache/spark/pull/33397 to avoid sub-optimal plans. After converting `Aggregate` to `Proj

Re: [PR] [SPARK-34888][SS] Introduce UpdatingSessionIterator adjusting session window on elements [spark]

2023-12-11 Thread via GitHub
MaxGekk commented on code in PR #31986: URL: https://github.com/apache/spark/pull/31986#discussion_r1423563160 ## sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/UpdatingSessionsIterator.scala: ## @@ -0,0 +1,218 @@ +/* + * Licensed to the Apache Software Foundat

[PR] [SPARK-46377][INFRA] Upgrade labeler action to v5 [spark]

2023-12-11 Thread via GitHub
panbingkun opened a new pull request, #44309: URL: https://github.com/apache/spark/pull/44309 ### What changes were proposed in this pull request? This PR aims to upgrade `labeler action` from v4 to v5. ### Why are the changes needed? The full release notes: https://github.com/ac

[PR] [SPARK-46360][PYTHON][FOLLOWUP][DOCS] Add `getMessage` to API reference [spark]

2023-12-11 Thread via GitHub
itholic opened a new pull request, #44308: URL: https://github.com/apache/spark/pull/44308 ### What changes were proposed in this pull request? This PR follows up on https://github.com/apache/spark/pull/44292 to add documentation. ### Why are the changes needed? To a

Re: [PR] [SPARK-46370][SQL] Fix bug when querying from table after changing column defaults [spark]

2023-12-11 Thread via GitHub
cloud-fan commented on PR #44302: URL: https://github.com/apache/spark/pull/44302#issuecomment-1851407507 Can you retrigger the test? The failed test seems unrelated.

[PR] [SPARK-46376][SQL][TESTS] Simplify the code to generate the Spark tarball `filename` in the `HiveExternalCatalogVersionsSuite`. [spark]

2023-12-11 Thread via GitHub
LuciferYang opened a new pull request, #44307: URL: https://github.com/apache/spark/pull/44307 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-46371][BUILD] Clean up outdated items in `.rat-excludes` [spark]

2023-12-11 Thread via GitHub
yaooqinn commented on PR #44293: URL: https://github.com/apache/spark/pull/44293#issuecomment-1851357727 Personally, I don't think it's necessary to clean them, as they have never bothered me during development.

Re: [PR] [SPARK-46371][BUILD] Clean up outdated items in `.rat-excludes` [spark]

2023-12-11 Thread via GitHub
yaooqinn commented on code in PR #44293: URL: https://github.com/apache/spark/pull/44293#discussion_r1423455231 ## dev/.rat-excludes: ## @@ -48,32 +44,17 @@ jquery.mustache.js pyspark-coverage-site/* cloudpickle/* join.py -SparkExprTyper.scala SparkILoop.scala -SparkILoopIni

Re: [PR] [SPARK-46090][SQL] Support plan fragment level SQL configs in AQE [spark]

2023-12-11 Thread via GitHub
beliefer commented on PR #44013: URL: https://github.com/apache/spark/pull/44013#issuecomment-1851348137 > @beliefer for example, change the initial shuffle partition number per plan fragment to avoid too big or too small, change the advisory partition size according to the feature of plan

Re: [PR] [Don't merge & review] Upgrade rocksdbjni 20231121 [spark]

2023-12-11 Thread via GitHub
LuciferYang commented on PR #43924: URL: https://github.com/apache/spark/pull/43924#issuecomment-1851346632 > > Has this test been completed? Can we upgrade to the new version now? > > Let's continue to verify until the end of this week? ok

Re: [PR] [Don't merge & review] Upgrade rocksdbjni 20231121 [spark]

2023-12-11 Thread via GitHub
panbingkun commented on PR #43924: URL: https://github.com/apache/spark/pull/43924#issuecomment-1851344687 > Has this test been completed? Can we upgrade to the new version now? Let's continue to verify until the end of this week?

Re: [PR] [SPARK-46253][PYTHON] Plan Python data source read using MapInArrow [spark]

2023-12-11 Thread via GitHub
HyukjinKwon commented on code in PR #44170: URL: https://github.com/apache/spark/pull/44170#discussion_r1423409926 ## python/pyspark/sql/worker/plan_data_source_read.py: ## @@ -146,16 +175,102 @@ def main(infile: IO, outfile: IO) -> None: message_parameters={"ty

Re: [PR] [SPARK-46253][PYTHON] Plan Python data source read using MapInArrow [spark]

2023-12-11 Thread via GitHub
HyukjinKwon commented on code in PR #44170: URL: https://github.com/apache/spark/pull/44170#discussion_r1423409469 ## python/pyspark/sql/worker/plan_data_source_read.py: ## @@ -146,16 +175,102 @@ def main(infile: IO, outfile: IO) -> None: message_parameters={"ty

Re: [PR] [SPARK-46360][PYTHON] Enhance error message debugging with new `getMessage` API [spark]

2023-12-11 Thread via GitHub
HyukjinKwon closed pull request #44292: [SPARK-46360][PYTHON] Enhance error message debugging with new `getMessage` API URL: https://github.com/apache/spark/pull/44292

Re: [PR] [SPARK-46360][PYTHON] Enhance error message debugging with new `getMessage` API [spark]

2023-12-11 Thread via GitHub
HyukjinKwon commented on PR #44292: URL: https://github.com/apache/spark/pull/44292#issuecomment-1851305909 Merged to master.

Re: [PR] [SPARK-43299][FOLLOWUP][CONNECT][SS] Followup on StreamingQueryExceptions [spark]

2023-12-11 Thread via GitHub
HyukjinKwon commented on PR #44306: URL: https://github.com/apache/spark/pull/44306#issuecomment-1851305429 Looks good but I think it'd be great if this can be signed off by @MaxGekk

Re: [PR] [Don't merge & review] Upgrade rocksdbjni 20231121 [spark]

2023-12-11 Thread via GitHub
LuciferYang commented on PR #43924: URL: https://github.com/apache/spark/pull/43924#issuecomment-1851302133 Has this test been completed? Can we upgrade to the new version now?

Re: [PR] [SPARK-34888][SS] Introduce UpdatingSessionIterator adjusting session window on elements [spark]

2023-12-11 Thread via GitHub
HeartSaVioR commented on code in PR #31986: URL: https://github.com/apache/spark/pull/31986#discussion_r1423400816 ## sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/UpdatingSessionsIterator.scala: ## @@ -0,0 +1,218 @@ +/* + * Licensed to the Apache Software Fou

Re: [PR] [SPARK-46358][CONNECT] Simplify the condition check in the `ResponseValidator#verifyResponse` [spark]

2023-12-11 Thread via GitHub
LuciferYang commented on PR #44291: URL: https://github.com/apache/spark/pull/44291#issuecomment-1851294879 thanks @beliefer

Re: [PR] [SPARK-46090][SQL] Support plan fragment level SQL configs in AQE [spark]

2023-12-11 Thread via GitHub
ulysses-you commented on PR #44013: URL: https://github.com/apache/spark/pull/44013#issuecomment-1851269758 @beliefer for example, change the initial shuffle partition number per plan fragment to avoid too big or too small, change the advisory partition size according to the feature of plan

Re: [PR] [SPARK-46240][SQL] Support inject executed plan prep rules in SparkSessionExtensions [spark]

2023-12-11 Thread via GitHub
jiang13021 commented on PR #44254: URL: https://github.com/apache/spark/pull/44254#issuecomment-1851260258 @ulysses-you It works. The ColumnarRule does indeed cover my use case, although the name seems a little bit strange. I will close this pull request. @cloud-fan @beliefer Thank you fo

Re: [PR] [SPARK-46240][SQL] Support inject executed plan prep rules in SparkSessionExtensions [spark]

2023-12-11 Thread via GitHub
jiang13021 closed pull request #44254: [SPARK-46240][SQL] Support inject executed plan prep rules in SparkSessionExtensions URL: https://github.com/apache/spark/pull/44254

Re: [PR] [SPARK-46253][PYTHON] Plan Python data source read using MapInArrow [spark]

2023-12-11 Thread via GitHub
allisonwang-db commented on PR #44170: URL: https://github.com/apache/spark/pull/44170#issuecomment-1851231052 cc @cloud-fan @HyukjinKwon

Re: [PR] [SPARK-46351][SQL] Require an error class in `AnalysisException` [spark]

2023-12-11 Thread via GitHub
LuciferYang commented on PR #44277: URL: https://github.com/apache/spark/pull/44277#issuecomment-1851229010 late LGTM

Re: [PR] [SPARK-46090][SQL] Support plan fragment level SQL configs in AQE [spark]

2023-12-11 Thread via GitHub
beliefer commented on PR #44013: URL: https://github.com/apache/spark/pull/44013#issuecomment-1851218457 @ulysses-you Could you explain what scenario would require adjusting the SQL configs segment by segment? I'm just curious.

Re: [PR] [SPARK-46240][SQL] Support inject executed plan prep rules in SparkSessionExtensions [spark]

2023-12-11 Thread via GitHub
beliefer commented on PR #44254: URL: https://github.com/apache/spark/pull/44254#issuecomment-1851207013 Based on some previous practical situations, the name of the injection interface always conflicts with its actual purpose.

Re: [PR] [SPARK-46358][CONNECT] Simplify the condition check in the `ResponseValidator#verifyResponse` [spark]

2023-12-11 Thread via GitHub
LuciferYang commented on PR #44291: URL: https://github.com/apache/spark/pull/44291#issuecomment-1851201866 Thanks @dongjoon-hyun ~

Re: [PR] [SPARK-46090][SQL] Support plan fragment level SQL configs in AQE [spark]

2023-12-11 Thread via GitHub
ulysses-you commented on code in PR #44013: URL: https://github.com/apache/spark/pull/44013#discussion_r1423311867 ## sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveRuleContext.scala: ## @@ -0,0 +1,88 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] [SPARK-46240][SQL] Support inject executed plan prep rules in SparkSessionExtensions [spark]

2023-12-11 Thread via GitHub
ulysses-you commented on PR #44254: URL: https://github.com/apache/spark/pull/44254#issuecomment-1851151312 I think columnar rule is enough if we want to apply rules without AQE. Say: - if the plan has exchange, then we will go to AQE - if the plan does not have exchange, then the Ensur

Re: [PR] [SPARK-46360][PYTHON] Enhance error message debugging with new `getMessage` API [spark]

2023-12-11 Thread via GitHub
itholic commented on code in PR #44292: URL: https://github.com/apache/spark/pull/44292#discussion_r1423294143 ## python/pyspark/errors/exceptions/base.py: ## @@ -89,13 +91,28 @@ def getSqlState(self) -> Optional[str]: See Also :meth:`PySparkE

Re: [PR] [SPARK-43299][FOLLOWUP][CONNECT][SS] Followup on StreamingQueryExceptions [spark]

2023-12-11 Thread via GitHub
WweiL commented on PR #44306: URL: https://github.com/apache/spark/pull/44306#issuecomment-1851145447 @MaxGekk @HyukjinKwon Tagging people that I know who have context : ) Could you take a look when you have time? Thank you!

Re: [PR] [SPARK-46361][WIP][PYTHON][CORE] Spark dataset chunk read api [spark]

2023-12-11 Thread via GitHub
WeichenXu123 commented on code in PR #44294: URL: https://github.com/apache/spark/pull/44294#discussion_r1423278530 ## sql/core/src/main/scala/org/apache/spark/sql/api/python/ChunkReadUtils.scala: ## @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] [SPARK-43427][PROTOBUF] spark protobuf: allow upcasting unsigned integer types [spark]

2023-12-11 Thread via GitHub
HyukjinKwon closed pull request #43773: [SPARK-43427][PROTOBUF] spark protobuf: allow upcasting unsigned integer types URL: https://github.com/apache/spark/pull/43773

Re: [PR] [SPARK-43427][PROTOBUF] spark protobuf: allow upcasting unsigned integer types [spark]

2023-12-11 Thread via GitHub
HyukjinKwon commented on PR #43773: URL: https://github.com/apache/spark/pull/43773#issuecomment-1851118725 Merged to master.

Re: [PR] [SPARK-46248][SQL] XML: Support for ignoreCorruptFiles and ignoreMissingFiles options [spark]

2023-12-11 Thread via GitHub
sandip-db commented on code in PR #44163: URL: https://github.com/apache/spark/pull/44163#discussion_r1423270340 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/StaxXmlParser.scala: ## @@ -604,6 +606,24 @@ class XmlTokenizer( return Some(str) }

Re: [PR] [SPARK-45019][BUILD] Make workflow scala213 on container & clean env [spark]

2023-12-11 Thread via GitHub
github-actions[bot] closed pull request #42733: [SPARK-45019][BUILD] Make workflow scala213 on container & clean env URL: https://github.com/apache/spark/pull/42733

[PR] [SPARK-43299][FOLLOWUP][CONNECT][SS] Followup on StreamingQueryExceptions [spark]

2023-12-11 Thread via GitHub
WweiL opened a new pull request, #44306: URL: https://github.com/apache/spark/pull/44306 ### What changes were proposed in this pull request? Follow up on streaming query exception in Scala connect client. 1. Change back API document, remove the "todo"s 2. Add a new test

Re: [PR] [SPARK-46370][SQL] Fix bug when querying from table after changing column defaults [spark]

2023-12-11 Thread via GitHub
dtenedor commented on code in PR #44302: URL: https://github.com/apache/spark/pull/44302#discussion_r1423244953 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala: ## @@ -374,6 +374,7 @@ case class AlterTableChangeColumnCommand( // TODO: support chang

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-11 Thread via GitHub
HyukjinKwon commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1423230523 ## sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvalPythonUDTFExec.scala: ## @@ -55,84 +55,102 @@ trait EvalPythonUDTFExec extends UnaryExecNode {

[PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-11 Thread via GitHub
HyukjinKwon opened a new pull request, #44305: URL: https://github.com/apache/spark/pull/44305 ### What changes were proposed in this pull request? This PR is same as https://github.com/apache/spark/pull/44233 but does not use `V1Table` but the original DSv2 interface by reusing UDTF

Re: [PR] [SPARK-46369][CORE] Remove `kill` link from `RELAUNCHING` drivers in `MasterPage` [spark]

2023-12-11 Thread via GitHub
dongjoon-hyun commented on PR #44301: URL: https://github.com/apache/spark/pull/44301#issuecomment-1851037197 Merged to master/3.5/3.4.

Re: [PR] [SPARK-46369][CORE] Remove `kill` link from `RELAUNCHING` drivers in `MasterPage` [spark]

2023-12-11 Thread via GitHub
dongjoon-hyun closed pull request #44301: [SPARK-46369][CORE] Remove `kill` link from `RELAUNCHING` drivers in `MasterPage` URL: https://github.com/apache/spark/pull/44301

Re: [PR] [SPARK-46369][CORE] Remove `kill` link from `RELAUNCHING` drivers in `MasterPage` [spark]

2023-12-11 Thread via GitHub
dongjoon-hyun commented on PR #44301: URL: https://github.com/apache/spark/pull/44301#issuecomment-1851035013 Thank you so much for your time, @MaxGekk and @HyukjinKwon !

Re: [PR] [WIP] [SPARK-38668] Split out watcher pod [spark]

2023-12-11 Thread via GitHub
asmitalim closed pull request #44304: [WIP] [SPARK-38668] Split out watcher pod URL: https://github.com/apache/spark/pull/44304

[PR] [WIP] [SPARK-38668] Split out watcher pod [spark]

2023-12-11 Thread via GitHub
asmitalim opened a new pull request, #44304: URL: https://github.com/apache/spark/pull/44304 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### H

Re: [PR] [SPARK-46369][CORE] Remove `kill` link from `RELAUNCHING` drivers in `MasterPage` [spark]

2023-12-11 Thread via GitHub
dongjoon-hyun commented on PR #44301: URL: https://github.com/apache/spark/pull/44301#issuecomment-1850998118 Could you review this PR, @MaxGekk ?

Re: [PR] [SPARK-46351][SQL] Require an error class in `AnalysisException` [spark]

2023-12-11 Thread via GitHub
MaxGekk closed pull request #44277: [SPARK-46351][SQL] Require an error class in `AnalysisException` URL: https://github.com/apache/spark/pull/44277

Re: [PR] [SPARK-46351][SQL] Require an error class in `AnalysisException` [spark]

2023-12-11 Thread via GitHub
MaxGekk commented on PR #44277: URL: https://github.com/apache/spark/pull/44277#issuecomment-1850993726 Merging to master. Thank you, @dongjoon-hyun for review.

[PR] [WIP][SPARK-46179][SQL] Add CrossDbmsQueryTestSuites, which allows generating golden files with Postgres/other DBMS [spark]

2023-12-11 Thread via GitHub
andylam-db opened a new pull request, #44303: URL: https://github.com/apache/spark/pull/44303 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] Assign SQL configs to groups for use in documentation [spark]

2023-12-11 Thread via GitHub
nchammas commented on code in PR #44300: URL: https://github.com/apache/spark/pull/44300#discussion_r1423186667 ## core/src/main/scala/org/apache/spark/internal/config/ConfigBuilder.scala: ## @@ -197,6 +199,13 @@ private[spark] case class ConfigBuilder(key: String) { this

Re: [PR] [SPARK-46369][CORE] Remove `kill` link from `RELAUNCHING` drivers in `MasterPage` [spark]

2023-12-11 Thread via GitHub
dongjoon-hyun commented on PR #44301: URL: https://github.com/apache/spark/pull/44301#issuecomment-1850980630 Could you review this one-line removal of MasterPage UI, @HyukjinKwon ?

Re: [PR] [SPARK-46370][SQL] Fix bug when querying from table after changing column defaults [spark]

2023-12-11 Thread via GitHub
cloud-fan commented on code in PR #44302: URL: https://github.com/apache/spark/pull/44302#discussion_r1423175769 ## sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala: ## @@ -2608,6 +2608,25 @@ class InsertSuite extends DataSourceTest with SharedSparkSession

Re: [PR] [SPARK-46370][SQL] Fix bug when querying from table after changing column defaults [spark]

2023-12-11 Thread via GitHub
cloud-fan commented on code in PR #44302: URL: https://github.com/apache/spark/pull/44302#discussion_r1423174851 ## sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala: ## @@ -374,6 +374,7 @@ case class AlterTableChangeColumnCommand( // TODO: support chan

Re: [PR] [SPARK-46370][SQL] Fix bug when querying from table after changing column defaults [spark]

2023-12-11 Thread via GitHub
dtenedor commented on PR #44302: URL: https://github.com/apache/spark/pull/44302#issuecomment-1850952795 cc @cloud-fan

[PR] [SPARK-46370][SQL] Fix bug when querying from table after changing column defaults [spark]

2023-12-11 Thread via GitHub
dtenedor opened a new pull request, #44302: URL: https://github.com/apache/spark/pull/44302 ### What changes were proposed in this pull request? This PR fixes a bug when querying from table after changing defaults: ``` drop table if exists t; create table t(i int, s string

Re: [PR] Assign SQL configs to groups for use in documentation [spark]

2023-12-11 Thread via GitHub
nchammas commented on code in PR #44300: URL: https://github.com/apache/spark/pull/44300#discussion_r1423147435 ## sql/core/src/main/scala/org/apache/spark/sql/api/python/PythonSQLUtils.scala: ## @@ -72,6 +72,14 @@ private[sql] object PythonSQLUtils extends Logging { Functi

Re: [PR] Assign SQL configs to groups for use in documentation [spark]

2023-12-11 Thread via GitHub
nchammas commented on code in PR #44300: URL: https://github.com/apache/spark/pull/44300#discussion_r1423145509 ## core/src/main/scala/org/apache/spark/internal/config/ConfigBuilder.scala: ## @@ -197,6 +199,13 @@ private[spark] case class ConfigBuilder(key: String) { this

Re: [PR] Assign SQL configs to groups for use in documentation [spark]

2023-12-11 Thread via GitHub
nchammas commented on code in PR #44300: URL: https://github.com/apache/spark/pull/44300#discussion_r1423145979 ## docs/README.md: ## @@ -46,8 +46,6 @@ $ cd docs $ bundle install ``` -Note: If you are on a system with both Ruby 1.9 and Ruby 2.0 you may need to replace gem w

Re: [PR] Assign SQL configs to groups for use in documentation [spark]

2023-12-11 Thread via GitHub
nchammas commented on code in PR #44300: URL: https://github.com/apache/spark/pull/44300#discussion_r1423139009 ## docs/sql-performance-tuning.md: ## @@ -34,30 +34,9 @@ memory usage and GC pressure. You can call `spark.catalog.uncacheTable("tableNam Configuration of in-memory

[PR] [SPARK-46369][CORE] Remove `kill` link from `RELAUNCHING` drivers in `MasterPage` [spark]

2023-12-11 Thread via GitHub
dongjoon-hyun opened a new pull request, #44301: URL: https://github.com/apache/spark/pull/44301 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[PR] Assign SQL configs to groups for use in documentation [spark]

2023-12-11 Thread via GitHub
nchammas opened a new pull request, #44300: URL: https://github.com/apache/spark/pull/44300 ### What changes were proposed in this pull request? Assign the various SQL configs to groups that can then be referenced in our documentation. This will replace the large number of manually ma

Re: [PR] [SPARK-46202][CONNECT] Expose new ArtifactManager APIs to support custom target directories [spark]

2023-12-11 Thread via GitHub
vicennial commented on code in PR #44109: URL: https://github.com/apache/spark/pull/44109#discussion_r1423128412 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/SparkSession.scala: ## @@ -590,6 +590,48 @@ class SparkSession private[sql] ( @Experimental

Re: [PR] [SPARK-46202][CONNECT] Expose new ArtifactManager APIs to support custom target directories [spark]

2023-12-11 Thread via GitHub
hvanhovell commented on code in PR #44109: URL: https://github.com/apache/spark/pull/44109#discussion_r1423125577 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/ArtifactManager.scala: ## @@ -108,6 +104,56 @@ class ArtifactManager( */ def ad

Re: [PR] [SPARK-43427][PROTOBUF] spark protobuf: allow upcasting unsigned integer types [spark]

2023-12-11 Thread via GitHub
justaparth commented on code in PR #43773: URL: https://github.com/apache/spark/pull/43773#discussion_r1423125231 ## connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/SchemaConverters.scala: ## @@ -67,9 +68,23 @@ object SchemaConverters extends Logging {

Re: [PR] [SPARK-46202][CONNECT] Expose new ArtifactManager APIs to support custom target directories [spark]

2023-12-11 Thread via GitHub
hvanhovell commented on code in PR #44109: URL: https://github.com/apache/spark/pull/44109#discussion_r1423124563 ## connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/SparkSession.scala: ## @@ -590,6 +590,48 @@ class SparkSession private[sql] ( @Experimental

Re: [PR] [SPARK-43427][PROTOBUF] spark protobuf: allow upcasting unsigned integer types [spark]

2023-12-11 Thread via GitHub
justaparth commented on code in PR #43773: URL: https://github.com/apache/spark/pull/43773#discussion_r1423122070 ## connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/SchemaConverters.scala: ## @@ -67,9 +68,23 @@ object SchemaConverters extends Logging {

Re: [PR] [SPARK-46272][SQL] Support CTAS using DSv2 sources [spark]

2023-12-11 Thread via GitHub
cloud-fan commented on code in PR #44190: URL: https://github.com/apache/spark/pull/44190#discussion_r1423115936 ## sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2Suite.scala: ## @@ -723,6 +724,158 @@ class DataSourceV2Suite extends QueryTest with SharedSpar

Re: [PR] [SPARK-45816][SQL] Return `NULL` when overflowing during casting from timestamp to integers [spark]

2023-12-11 Thread via GitHub
cloud-fan commented on code in PR #43694: URL: https://github.com/apache/spark/pull/43694#discussion_r1423111720 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala: ## @@ -1658,22 +1661,26 @@ case class Cast( integralType: String, f

Re: [PR] [SPARK-46351][SQL] Require an error class in `AnalysisException` [spark]

2023-12-11 Thread via GitHub
MaxGekk commented on code in PR #44277: URL: https://github.com/apache/spark/pull/44277#discussion_r1423086649 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -6705,6 +6705,259 @@ "Failed to get block , which is not a shuffle block" ] }, + "_LE

Re: [PR] [SPARK-46351][SQL] Require an error class in `AnalysisException` [spark]

2023-12-11 Thread via GitHub
dongjoon-hyun commented on code in PR #44277: URL: https://github.com/apache/spark/pull/44277#discussion_r1423084217 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -6705,6 +6705,259 @@ "Failed to get block , which is not a shuffle block" ] }, +

Re: [PR] [SPARK-46351][SQL] Require an error class in `AnalysisException` [spark]

2023-12-11 Thread via GitHub
dongjoon-hyun commented on code in PR #44277: URL: https://github.com/apache/spark/pull/44277#discussion_r1423081460 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteMergeIntoTable.scala: ## @@ -440,7 +442,9 @@ object RewriteMergeIntoTable extends Rew

Re: [PR] [SPARK-46351][SQL] Require an error class in `AnalysisException` [spark]

2023-12-11 Thread via GitHub
MaxGekk commented on code in PR #44277: URL: https://github.com/apache/spark/pull/44277#discussion_r1423078408 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -6705,6 +6705,259 @@ "Failed to get block , which is not a shuffle block" ] }, + "_LE

Re: [PR] [SPARK-46351][SQL] Require an error class in `AnalysisException` [spark]

2023-12-11 Thread via GitHub
dongjoon-hyun commented on code in PR #44277: URL: https://github.com/apache/spark/pull/44277#discussion_r1423079180 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -6705,6 +6705,259 @@ "Failed to get block , which is not a shuffle block" ] }, +

Re: [PR] [SPARK-46351][SQL] Require an error class in `AnalysisException` [spark]

2023-12-11 Thread via GitHub
dongjoon-hyun commented on code in PR #44277: URL: https://github.com/apache/spark/pull/44277#discussion_r1423073920 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -6705,6 +6705,259 @@ "Failed to get block , which is not a shuffle block" ] }, +

Re: [PR] [SPARK-46351][SQL] Require an error class in `AnalysisException` [spark]

2023-12-11 Thread via GitHub
dongjoon-hyun commented on code in PR #44277: URL: https://github.com/apache/spark/pull/44277#discussion_r1423068901 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -6705,6 +6705,259 @@ "Failed to get block , which is not a shuffle block" ] }, +

Re: [PR] [SPARK-46351][SQL] Require an error class in `AnalysisException` [spark]

2023-12-11 Thread via GitHub
dongjoon-hyun commented on PR #44277: URL: https://github.com/apache/spark/pull/44277#issuecomment-1850809094 Ack, @MaxGekk

Re: [PR] [SPARK-46351][SQL] Require an error class in `AnalysisException` [spark]

2023-12-11 Thread via GitHub
MaxGekk commented on PR #44277: URL: https://github.com/apache/spark/pull/44277#issuecomment-1850806303 @LuciferYang @beliefer @dongjoon-hyun @cloud-fan @panbingkun Please, review this PR. -- This is an automated message from the Apache Git Service. To respond to the message, please log o

Re: [PR] [SPARK-45816][SQL] Return `NULL` when overflowing during casting from timestamp to integers [spark]

2023-12-11 Thread via GitHub
viirya commented on code in PR #43694: URL: https://github.com/apache/spark/pull/43694#discussion_r1423059585 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala: ## @@ -1658,22 +1661,26 @@ case class Cast( integralType: String, from

Re: [PR] [SPARK-45816][SQL] Return `NULL` when overflowing during casting from timestamp to integers [spark]

2023-12-11 Thread via GitHub
cloud-fan commented on code in PR #43694: URL: https://github.com/apache/spark/pull/43694#discussion_r1423032246 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala: ## @@ -1658,22 +1661,26 @@ case class Cast( integralType: String, f

Re: [PR] [SPARK-43427][PROTOBUF] spark protobuf: allow upcasting unsigned integer types [spark]

2023-12-11 Thread via GitHub
rangadi commented on code in PR #43773: URL: https://github.com/apache/spark/pull/43773#discussion_r1422991198 ## connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/SchemaConverters.scala: ## @@ -67,9 +68,23 @@ object SchemaConverters extends Logging {

Re: [PR] [SPARK-44001][PROTOBUF] Add option to allow unwrapping protobuf well known wrapper types [spark]

2023-12-11 Thread via GitHub
rangadi commented on code in PR #43767: URL: https://github.com/apache/spark/pull/43767#discussion_r1401347343 ## connector/protobuf/src/test/scala/org/apache/spark/sql/protobuf/ProtobufFunctionsSuite.scala: ## @@ -1600,6 +1600,195 @@ class ProtobufFunctionsSuite extends QueryTe

Re: [PR] [SPARK-46248][SQL] XML: Support for ignoreCorruptFiles and ignoreMissingFiles options [spark]

2023-12-11 Thread via GitHub
sandip-db commented on code in PR #44163: URL: https://github.com/apache/spark/pull/44163#discussion_r1422984743 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/StaxXmlParser.scala: ## @@ -750,7 +752,20 @@ object StaxXmlParser { throw QueryExecutionErro

Re: [PR] [SPARK-46273][SQL] Support INSERT INTO/OVERWRITE using DSv2 sources [spark]

2023-12-11 Thread via GitHub
HyukjinKwon closed pull request #44213: [SPARK-46273][SQL] Support INSERT INTO/OVERWRITE using DSv2 sources URL: https://github.com/apache/spark/pull/44213 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

Re: [PR] [SPARK-46273][SQL] Support INSERT INTO/OVERWRITE using DSv2 sources [spark]

2023-12-11 Thread via GitHub
HyukjinKwon commented on PR #44213: URL: https://github.com/apache/spark/pull/44213#issuecomment-1850655560 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] [SPARK-46341][PS][TESTS][FOLLOWUP] Sort indices before dataframe comparison [spark]

2023-12-11 Thread via GitHub
HyukjinKwon closed pull request #44295: [SPARK-46341][PS][TESTS][FOLLOWUP] Sort indices before dataframe comparison URL: https://github.com/apache/spark/pull/44295 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

Re: [PR] [SPARK-46341][PS][TESTS][FOLLOWUP] Sort indices before dataframe comparison [spark]

2023-12-11 Thread via GitHub
HyukjinKwon commented on PR #44295: URL: https://github.com/apache/spark/pull/44295#issuecomment-1850653541 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] [SPARK-46347][PS][TESTS] Reorganize `RollingTests ` [spark]

2023-12-11 Thread via GitHub
HyukjinKwon closed pull request #44281: [SPARK-46347][PS][TESTS] Reorganize `RollingTests ` URL: https://github.com/apache/spark/pull/44281 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

Re: [PR] [SPARK-46347][PS][TESTS] Reorganize `RollingTests ` [spark]

2023-12-11 Thread via GitHub
HyukjinKwon commented on PR #44281: URL: https://github.com/apache/spark/pull/44281#issuecomment-1850647006 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] [SPARK-46355][SQL] XML: Close InputStreamReader on read completion [spark]

2023-12-11 Thread via GitHub
HyukjinKwon closed pull request #44287: [SPARK-46355][SQL] XML: Close InputStreamReader on read completion URL: https://github.com/apache/spark/pull/44287 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to g

Re: [PR] [SPARK-46355][SQL] XML: Close InputStreamReader on read completion [spark]

2023-12-11 Thread via GitHub
HyukjinKwon commented on PR #44287: URL: https://github.com/apache/spark/pull/44287#issuecomment-1850560469 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] [SPARK-46360][PYTHON] Enhance error message debugging with new `getMessage` API [spark]

2023-12-11 Thread via GitHub
HyukjinKwon commented on code in PR #44292: URL: https://github.com/apache/spark/pull/44292#discussion_r1422853846 ## python/pyspark/errors/exceptions/base.py: ## @@ -89,13 +91,28 @@ def getSqlState(self) -> Optional[str]: See Also :meth:`PySp

Re: [PR] [SPARK-46357] Replace incorrect documentation use of setConf with conf.set [spark]

2023-12-11 Thread via GitHub
nchammas commented on PR #44290: URL: https://github.com/apache/spark/pull/44290#issuecomment-1850508338 cc @gengliangwang for this minor documentation fix. (I just picked you since I see you recently worked on this file. Let me know if that's not appropriate.) -- This is an automated mes

Re: [PR] [SPARK-46358][CONNECT] Simplify the condition check in the `ResponseValidator#verifyResponse` [spark]

2023-12-11 Thread via GitHub
dongjoon-hyun commented on PR #44291: URL: https://github.com/apache/spark/pull/44291#issuecomment-1850476296 Merged to master for Apache Spark 4. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

Re: [PR] [SPARK-46358][CONNECT] Simplify the condition check in the `ResponseValidator#verifyResponse` [spark]

2023-12-11 Thread via GitHub
dongjoon-hyun closed pull request #44291: [SPARK-46358][CONNECT] Simplify the condition check in the `ResponseValidator#verifyResponse` URL: https://github.com/apache/spark/pull/44291 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Git

Re: [PR] Branch 3.4 [spark]

2023-12-11 Thread via GitHub
dongjoon-hyun closed pull request #44296: Branch 3.4 URL: https://github.com/apache/spark/pull/44296 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: rev
