[PR] [SPARK-48308][Core] Unify getting data schema without partition columns in FileSourceStrategy [spark]

2024-05-16 Thread via GitHub
johanl-db opened a new pull request, #46619: URL: https://github.com/apache/spark/pull/46619 ### What changes were proposed in this pull request? Compute the schema of the data without partition columns only once in FileSourceStrategy. ### Why are the changes needed? In File

Re: [PR] [SPARK-48300][SQL] Codegen Support for `from_xml` [spark]

2024-05-16 Thread via GitHub
panbingkun commented on code in PR #46609: URL: https://github.com/apache/spark/pull/46609#discussion_r1603226466 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/xmlExpressions.scala: ## @@ -104,7 +103,7 @@ case class XmlToStructs( if (mode != Permi

Re: [PR] [SPARK-48301][SQL] Rename `CREATE_FUNC_WITH_IF_NOT_EXISTS_AND_REPLACE` to `CREATE_ROUTINE_WITH_IF_NOT_EXISTS_AND_REPLACE` [spark]

2024-05-16 Thread via GitHub
srielau commented on code in PR #46608: URL: https://github.com/apache/spark/pull/46608#discussion_r1603214384 ## common/utils/src/main/resources/error/error-conditions.json: ## @@ -2675,9 +2675,9 @@ "ANALYZE TABLE(S) ... COMPUTE STATISTICS ... must be either NOSCAN

Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]

2024-05-16 Thread via GitHub
mihailom-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1603211716 ## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala: ## @@ -1584,6 +1584,240 @@ class CollationSQLExpressionsSuite }) } +

Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]

2024-05-16 Thread via GitHub
mihailom-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1603210909 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala: ## @@ -2938,20 +2940,20 @@ case class Extract(field: Expression,

Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]

2024-05-16 Thread via GitHub
mihailom-db commented on code in PR #46618: URL: https://github.com/apache/spark/pull/46618#discussion_r1603212139 ## sql/core/src/test/scala/org/apache/spark/sql/CollationSQLExpressionsSuite.scala: ## @@ -1584,6 +1584,240 @@ class CollationSQLExpressionsSuite }) } +

Re: [PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]

2024-05-16 Thread via GitHub
nebojsa-db commented on PR #46618: URL: https://github.com/apache/spark/pull/46618#issuecomment-2115033658 Please take a look @nikolamand-db @stefankandic @uros-db @mihailom-db @dbatomic -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[PR] [SPARK-48159][SQL] Extending support for collated strings on datetime expressions [spark]

2024-05-16 Thread via GitHub
nebojsa-db opened a new pull request, #46618: URL: https://github.com/apache/spark/pull/46618 ### What changes were proposed in this pull request? This PR introduces changes that will allow for collated strings to be passed to various datetime expressions or return value as collated strin

Re: [PR] [SPARK-48288] Add source data type for connector cast expression [spark]

2024-05-16 Thread via GitHub
cloud-fan closed pull request #46596: [SPARK-48288] Add source data type for connector cast expression URL: https://github.com/apache/spark/pull/46596

Re: [PR] [SPARK-48288] Add source data type for connector cast expression [spark]

2024-05-16 Thread via GitHub
cloud-fan commented on PR #46596: URL: https://github.com/apache/spark/pull/46596#issuecomment-2115022038 thanks, merging to master!

Re: [PR] [SPARK-48307][SQL] InlineCTE should keep not-inlined relations in the original WithCTE node [spark]

2024-05-16 Thread via GitHub
cloud-fan commented on PR #46617: URL: https://github.com/apache/spark/pull/46617#issuecomment-2115021011 cc @viirya @amaliujia @MaxGekk

[PR] [SPARK-48307][SQL] InlineCTE should keep not-inlined relations in the original WithCTE node [spark]

2024-05-16 Thread via GitHub
cloud-fan opened a new pull request, #46617: URL: https://github.com/apache/spark/pull/46617 ### What changes were proposed in this pull request? I noticed an outdated comment in the rule `InlineCTE` ``` // CTEs in SQL Commands have been inlined by `CTESubstitution`

Re: [PR] [SPARK-48175][SQL][PYTHON] Store collation information in metadata and not in type for SER/DE [spark]

2024-05-16 Thread via GitHub
olaky commented on code in PR #46280: URL: https://github.com/apache/spark/pull/46280#discussion_r1603195283 ## python/pyspark/sql/tests/test_types.py: ## @@ -549,6 +549,129 @@ def test_convert_list_to_str(self): self.assertEqual(df.count(), 1) self.assertEqual

Re: [PR] [SPARK-48175][SQL][PYTHON] Store collation information in metadata and not in type for SER/DE [spark]

2024-05-16 Thread via GitHub
olaky commented on code in PR #46280: URL: https://github.com/apache/spark/pull/46280#discussion_r1603191384 ## common/unsafe/src/main/java/org/apache/spark/sql/catalyst/util/CollationFactory.java: ## @@ -36,11 +36,45 @@ * Provides functionality to the UTF8String object which

Re: [PR] [SPARK-48297][SQL] Fix a regression TRANSFORM clause with char/varchar [spark]

2024-05-16 Thread via GitHub
cloud-fan commented on PR #46603: URL: https://github.com/apache/spark/pull/46603#issuecomment-2114945903 late LGTM

Re: [PR] [SPARK-48296][SQL] Codegen Support for `to_xml` [spark]

2024-05-16 Thread via GitHub
yaooqinn commented on PR #46591: URL: https://github.com/apache/spark/pull/46591#issuecomment-2114796080 Thanks @panbingkun, merged to master

Re: [PR] [SPARK-48296][SQL] Codegen Support for `to_xml` [spark]

2024-05-16 Thread via GitHub
yaooqinn closed pull request #46591: [SPARK-48296][SQL] Codegen Support for `to_xml` URL: https://github.com/apache/spark/pull/46591

[PR] [SPARK-48306][SQL] Improve UDT in error message [spark]

2024-05-16 Thread via GitHub
yaooqinn opened a new pull request, #46616: URL: https://github.com/apache/spark/pull/46616 ### What changes were proposed in this pull request? This PR improves how UDTs are shown in error messages. Currently, a UDT is displayed as its inner SQL type, which is ambiguo

Re: [PR] [SPARK-48213][SQL] Do not push down predicate if non-cheap expression exceed reused limit [spark]

2024-05-16 Thread via GitHub
zml1206 commented on PR #46499: URL: https://github.com/apache/spark/pull/46499#issuecomment-2114700517 cc @cloud-fan

Re: [PR] [SPARK-48297][SQL] Fix a regression TRANSFORM clause with char/varchar [spark]

2024-05-16 Thread via GitHub
yaooqinn commented on PR #46603: URL: https://github.com/apache/spark/pull/46603#issuecomment-2114698939 Merged to master and 3.5. Thank you @HyukjinKwon

Re: [PR] [SPARK-48297][SQL] Fix a regression TRANSFORM clause with char/varchar [spark]

2024-05-16 Thread via GitHub
yaooqinn closed pull request #46603: [SPARK-48297][SQL] Fix a regression TRANSFORM clause with char/varchar URL: https://github.com/apache/spark/pull/46603

Re: [PR] [SPARK-48031][SQL][FOLLOW-UP] Use ANSI-enabled cast in view lookup test [spark]

2024-05-16 Thread via GitHub
HyukjinKwon commented on PR #46614: URL: https://github.com/apache/spark/pull/46614#issuecomment-2114527010 Because we don't run the tests without ANSI in the PR builder. They only run in the daily build defined here: https://github.com/apache/spark/actions/workflows/build_non_ansi.yml

Re: [PR] [SPARK-48031][SQL][FOLLOW-UP] Use ANSI-enabled cast in view lookup test [spark]

2024-05-16 Thread via GitHub
srielau commented on PR #46614: URL: https://github.com/apache/spark/pull/46614#issuecomment-2114516109 @HyukjinKwon Any idea why this did not trip the original PR?

Re: [PR] [SPARK-48291][CORE][FOLLOWUP] Rename Java *LoggerSuite* as *SparkLoggerSuite* [spark]

2024-05-16 Thread via GitHub
panbingkun commented on PR #46615: URL: https://github.com/apache/spark/pull/46615#issuecomment-2114502403 cc @gengliangwang @dongjoon-hyun

[PR] [SPARK-48291][CORE][FOLLOWUP] Rename Java *LoggerSuite* as *SparkLoggerSuite* [spark]

2024-05-16 Thread via GitHub
panbingkun opened a new pull request, #46615: URL: https://github.com/apache/spark/pull/46615 ### What changes were proposed in this pull request? The pr is follow up https://github.com/apache/spark/pull/46600 to . Similarly, to maintain consistency, should be renamed to #

Re: [PR] [SPARK-48031][SQL] Support view schema evolution [spark]

2024-05-16 Thread via GitHub
HyukjinKwon commented on code in PR #46267: URL: https://github.com/apache/spark/pull/46267#discussion_r1602870592 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala: ## @@ -742,7 +742,7 @@ abstract class SessionCatalogSuite extends An

[PR] [SPARK-48031][SQL][FOLLOW-UP] Use ANSI-enabled cast in view lookup test [spark]

2024-05-16 Thread via GitHub
HyukjinKwon opened a new pull request, #46614: URL: https://github.com/apache/spark/pull/46614 ### What changes were proposed in this pull request? This PR is a followup of https://github.com/apache/spark/pull/46267 that uses ANSI-enabled cast in the tests. It intentionally uses ANSI-

[PR] [WIP][SQL] Add collation support for CurrentLike expressions [spark]

2024-05-16 Thread via GitHub
uros-db opened a new pull request, #46613: URL: https://github.com/apache/spark/pull/46613 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was

[PR] [SPARK-48303][CORE] Reorganize `LogKey` [spark]

2024-05-16 Thread via GitHub
panbingkun opened a new pull request, #46612: URL: https://github.com/apache/spark/pull/46612 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-48301][SQL] Rename `CREATE_FUNC_WITH_IF_NOT_EXISTS_AND_REPLACE` to `CREATE_ROUTINE_WITH_IF_NOT_EXISTS_AND_REPLACE` [spark]

2024-05-16 Thread via GitHub
zhengruifeng commented on PR #46608: URL: https://github.com/apache/spark/pull/46608#issuecomment-2114331791 cc @srielau and @allisonwang-db

[PR] [SPARK-48238][BUILD][YARN] Replace AmIpFilter with re-implemented YarnAMIpFilter [spark]

2024-05-16 Thread via GitHub
pan3793 opened a new pull request, #46611: URL: https://github.com/apache/spark/pull/46611 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was

Re: [PR] [SPARK-48036][DOCS][FOLLOWUP] Update sql-ref-ansi-compliance.md [spark]

2024-05-16 Thread via GitHub
yaooqinn commented on PR #46610: URL: https://github.com/apache/spark/pull/46610#issuecomment-2114276731 Merged to master. Thank you @ulysses-you

Re: [PR] [SPARK-48036][DOCS][FOLLOWUP] Update sql-ref-ansi-compliance.md [spark]

2024-05-16 Thread via GitHub
yaooqinn closed pull request #46610: [SPARK-48036][DOCS][FOLLOWUP] Update sql-ref-ansi-compliance.md URL: https://github.com/apache/spark/pull/46610

Re: [PR] [SPARK-42789][SQL] Rewrite multiple GetJsonObject that consumes same JSON to single JsonTuple [spark]

2024-05-16 Thread via GitHub
AngersZh commented on code in PR #45020: URL: https://github.com/apache/spark/pull/45020#discussion_r1602762575 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RewriteGetJsonObject.scala: ## @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software Foun

Re: [PR] [SPARK-46741][SQL] Cache Table with CTE won't work [spark]

2024-05-16 Thread via GitHub
AngersZh commented on PR #44767: URL: https://github.com/apache/spark/pull/44767#issuecomment-2114228690 ping @cloud-fan
