Re: [PR] [SPARK-46408][SQL] Support date_sub on V2ExpressionBuilder [spark]

2023-12-14 Thread via GitHub
caicancai commented on PR #44357: URL: https://github.com/apache/spark/pull/44357#issuecomment-1857434549 cc @beliefer -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

Re: [PR] [SPARK-46382][SQL] XML: Capture values interspersed between elements [spark]

2023-12-14 Thread via GitHub
sandip-db commented on code in PR #44318: URL: https://github.com/apache/spark/pull/44318#discussion_r1427373383 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/xml/StaxXmlParser.scala: ## @@ -567,6 +589,61 @@ class StaxXmlParser( castTo(data,

[PR] [WIP] Add rule context [spark]

2023-12-14 Thread via GitHub
ulysses-you opened a new pull request, #44367: URL: https://github.com/apache/spark/pull/44367 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

Re: [PR] [SPARK-40193][SQL] Merge subquery plans with different filters [spark]

2023-12-14 Thread via GitHub
unigof commented on PR #37630: URL: https://github.com/apache/spark/pull/37630#issuecomment-1857411012 @peter-toth I'm excited to see that you're still updating this PR!! Is this PR based on Spark 3.5? And does it support DataSource V2?

Re: [PR] [SPARK-46069][SQL] Support unwrap timestamp type to date type [spark]

2023-12-14 Thread via GitHub
wankunde commented on code in PR #43982: URL: https://github.com/apache/spark/pull/43982#discussion_r1427645442 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala: ## @@ -329,6 +334,48 @@ object UnwrapCastInBinaryComparison

[PR] [SPARK-46069][SQL][FOLLOWUP] Simplify the code and add more UT [spark]

2023-12-14 Thread via GitHub
wankunde opened a new pull request, #44366: URL: https://github.com/apache/spark/pull/44366 ### What changes were proposed in this pull request? * Use `falseIfNotNull` method to simplify the code * Add UT for `isStartOfDay = true` cases ### Why are the changes needed?

Re: [PR] [SPARK-46418][PS][TESTS] Reorganize `ReshapeTests` [spark]

2023-12-14 Thread via GitHub
zhengruifeng commented on PR #44365: URL: https://github.com/apache/spark/pull/44365#issuecomment-1857386000 cc @HyukjinKwon @itholic

[PR] [SPARK-46418][PS][TESTS] Reorganize `ReshapeTests` [spark]

2023-12-14 Thread via GitHub
zhengruifeng opened a new pull request, #44365: URL: https://github.com/apache/spark/pull/44365 ### What changes were proposed in this pull request? Break `ReshapeTests` into multiple small tests ### Why are the changes needed? 1. its parity test is slow, sometimes takes >5

Re: [PR] [SPARK-46417][SQL] Do not fail when calling hive.getTable and throwException is false [spark]

2023-12-14 Thread via GitHub
yaooqinn commented on code in PR #44364: URL: https://github.com/apache/spark/pull/44364#discussion_r1427622193 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala: ## @@ -923,7 +923,13 @@ private[client] class Shim_v2_0 extends Shim with Logging {

Re: [PR] [SPARK-45904][SQL][CONNECT] Mode function should supports sort with order direction [spark]

2023-12-14 Thread via GitHub
beliefer commented on PR #43786: URL: https://github.com/apache/spark/pull/43786#issuecomment-1857368908 I will close this because https://github.com/apache/spark/pull/44184 has been merged.

Re: [PR] [SPARK-45904][SQL][CONNECT] Mode function should supports sort with order direction [spark]

2023-12-14 Thread via GitHub
beliefer closed pull request #43786: [SPARK-45904][SQL][CONNECT] Mode function should supports sort with order direction URL: https://github.com/apache/spark/pull/43786

Re: [PR] [SPARK-46406][SQL] Assign a name to the error class _LEGACY_ERROR_TEMP_1023 [spark]

2023-12-14 Thread via GitHub
beliefer commented on PR #44355: URL: https://github.com/apache/spark/pull/44355#issuecomment-1857367375 The GA failure is unrelated. @MaxGekk Do you have time to review?

Re: [PR] [SPARK-28386][SQL] Cannot resolve ORDER BY columns with GROUP BY and HAVING [spark]

2023-12-14 Thread via GitHub
pan3793 commented on code in PR #44352: URL: https://github.com/apache/spark/pull/44352#discussion_r1427596518 ## sql/core/src/test/resources/sql-tests/analyzer-results/udf/postgreSQL/udf-select_having.sql.out: ## @@ -102,12 +102,11 @@ Project [udf(b)#x, udf(c)#x] SELECT

Re: [PR] [SPARK-28386][SQL] Cannot resolve ORDER BY columns with GROUP BY and HAVING [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on code in PR #44352: URL: https://github.com/apache/spark/pull/44352#discussion_r1427591478 ## sql/core/src/test/resources/sql-tests/analyzer-results/udf/postgreSQL/udf-select_having.sql.out: ## @@ -102,12 +102,11 @@ Project [udf(b)#x, udf(c)#x] SELECT

Re: [PR] [SPARK-46417][SQL] Do not fail when calling hive.getTable and throwException is false [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on PR #44364: URL: https://github.com/apache/spark/pull/44364#issuecomment-1857326100 @yaooqinn

[PR] [SPARK-46417][SQL] Do not fail when calling hive.getTable and throwException is false [spark]

2023-12-14 Thread via GitHub
cloud-fan opened a new pull request, #44364: URL: https://github.com/apache/spark/pull/44364 ### What changes were proposed in this pull request? Users can set up their own HMS and let Spark connect to it. We have no control over it and sometimes it's not even Hive but just a

Re: [PR] [SPARK-46402][PYTHON] Add getMessageParameters and getQueryContext support [spark]

2023-12-14 Thread via GitHub
HyukjinKwon closed pull request #44349: [SPARK-46402][PYTHON] Add getMessageParameters and getQueryContext support URL: https://github.com/apache/spark/pull/44349

Re: [PR] [SPARK-46402][PYTHON] Add getMessageParameters and getQueryContext support [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on PR #44349: URL: https://github.com/apache/spark/pull/44349#issuecomment-1857310013 Tests passed. Merged to master.

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1427568443 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala: ## @@ -105,13 +106,14 @@ case class DataSource( //

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1427568014 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala: ## @@ -105,13 +106,14 @@ case class DataSource( //

Re: [PR] [SPARK-28386][SQL] Cannot resolve ORDER BY columns with GROUP BY and HAVING [spark]

2023-12-14 Thread via GitHub
pan3793 commented on code in PR #44352: URL: https://github.com/apache/spark/pull/44352#discussion_r1427566432 ## sql/core/src/test/resources/sql-tests/analyzer-results/udf/postgreSQL/udf-select_having.sql.out: ## @@ -102,12 +102,11 @@ Project [udf(b)#x, udf(c)#x] SELECT

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1427567309 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala: ## @@ -105,13 +106,14 @@ case class DataSource( //

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1427566941 ## sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala: ## @@ -780,7 +780,7 @@ class SparkSession private( DataSource.lookupDataSource(runner,

Re: [PR] [SPARK-28386][SQL] Cannot resolve ORDER BY columns with GROUP BY and HAVING [spark]

2023-12-14 Thread via GitHub
pan3793 commented on code in PR #44352: URL: https://github.com/apache/spark/pull/44352#discussion_r1427566432 ## sql/core/src/test/resources/sql-tests/analyzer-results/udf/postgreSQL/udf-select_having.sql.out: ## @@ -102,12 +102,11 @@ Project [udf(b)#x, udf(c)#x] SELECT

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1427566069 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala: ## @@ -105,13 +106,14 @@ case class DataSource( //

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1427565917 ## sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala: ## @@ -780,7 +780,7 @@ class SparkSession private( DataSource.lookupDataSource(runner,

Re: [PR] [SPARK-46225][CONNECT] Collapse withColumns calls [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on code in PR #44162: URL: https://github.com/apache/spark/pull/44162#discussion_r1427563351 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala: ## @@ -872,6 +872,20 @@ case class TempResolvedColumn( final override val

Re: [PR] [SPARK-46225][CONNECT] Collapse withColumns calls [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on code in PR #44162: URL: https://github.com/apache/spark/pull/44162#discussion_r1427562850 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala: ## @@ -872,6 +872,20 @@ case class TempResolvedColumn( final override val

Re: [PR] [SPARK-46225][CONNECT] Collapse withColumns calls [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on code in PR #44162: URL: https://github.com/apache/spark/pull/44162#discussion_r1427562393 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala: ## @@ -872,6 +872,20 @@ case class TempResolvedColumn( final override val

Re: [PR] [SPARK-46225][CONNECT] Collapse withColumns calls [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on code in PR #44162: URL: https://github.com/apache/spark/pull/44162#discussion_r1427562140 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3469,6 +3470,52 @@ class Analyzer(override val catalogManager:

Re: [PR] [SPARK-45593][BUILD] Building a runnable distribution from master code running spark-sql raise error [spark]

2023-12-14 Thread via GitHub
Yikf commented on code in PR #43436: URL: https://github.com/apache/spark/pull/43436#discussion_r1427561278 ## connector/connect/client/jvm/pom.xml: ## @@ -124,6 +124,10 @@ io.grpc.** + Review Comment: yea

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1427560724 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala: ## @@ -708,17 +732,18 @@ object DataSource extends Logging { def

Re: [PR] [SPARK-46416][CORE] Add `@tailrec` to `HadoopFSUtils#shouldFilterOutPath` [spark]

2023-12-14 Thread via GitHub
LuciferYang commented on PR #44345: URL: https://github.com/apache/spark/pull/44345#issuecomment-1857296727 Merged into master. Thanks @yaooqinn

Re: [PR] [SPARK-46416][CORE] Add `@tailrec` to `HadoopFSUtils#shouldFilterOutPath` [spark]

2023-12-14 Thread via GitHub
LuciferYang closed pull request #44345: [SPARK-46416][CORE] Add `@tailrec` to `HadoopFSUtils#shouldFilterOutPath` URL: https://github.com/apache/spark/pull/44345

Re: [PR] [SPARK-46416][CORE] Add `@tailrec` to `HadoopFSUtils#shouldFilterOutPath` [spark]

2023-12-14 Thread via GitHub
LuciferYang commented on code in PR #44345: URL: https://github.com/apache/spark/pull/44345#discussion_r1427560138 ## core/src/main/scala/org/apache/spark/util/HadoopFSUtils.scala: ## @@ -349,6 +349,7 @@ private[spark] object HadoopFSUtils extends Logging { private val

Re: [PR] [SPARK-28386][SQL] Cannot resolve ORDER BY columns with GROUP BY and HAVING [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on code in PR #44352: URL: https://github.com/apache/spark/pull/44352#discussion_r1427552666 ## sql/core/src/test/resources/sql-tests/analyzer-results/udf/postgreSQL/udf-select_having.sql.out: ## @@ -102,12 +102,11 @@ Project [udf(b)#x, udf(c)#x] SELECT

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1427550280 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala: ## @@ -708,17 +732,18 @@ object DataSource extends Logging { def

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1426984033 ## sql/core/src/main/scala/org/apache/spark/sql/execution/python/MapInBatchEvaluatorFactory.scala: ## @@ -36,7 +36,7 @@ class MapInBatchEvaluatorFactory(

Re: [PR] [SPARK-45597][PYTHON][SQL] Support creating table using a Python data source in SQL (DSv2 exec) [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #44305: URL: https://github.com/apache/spark/pull/44305#discussion_r1426971568 ## sql/core/src/main/scala/org/apache/spark/sql/execution/python/UserDefinedPythonDataSource.scala: ## @@ -20,58 +20,200 @@ package

Re: [PR] [SPARK-46402][PYTHON] Add getMessageParameters and getQueryContext support [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #44349: URL: https://github.com/apache/spark/pull/44349#discussion_r1427536554 ## python/pyspark/errors/exceptions/captured.py: ## @@ -332,3 +345,37 @@ class UnknownException(CapturedException, BaseUnknownException): """ None of

Re: [PR] [SPARK-46402][PYTHON] Add getMessageParameters and getQueryContext support [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #44349: URL: https://github.com/apache/spark/pull/44349#discussion_r1427536459 ## python/pyspark/errors/exceptions/captured.py: ## @@ -332,3 +345,37 @@ class UnknownException(CapturedException, BaseUnknownException): """ None of

Re: [PR] [SPARK-46402][PYTHON] Add getMessageParameters and getQueryContext support [spark]

2023-12-14 Thread via GitHub
garlandz-db commented on code in PR #44349: URL: https://github.com/apache/spark/pull/44349#discussion_r1427535520 ## python/pyspark/errors/exceptions/captured.py: ## @@ -332,3 +345,37 @@ class UnknownException(CapturedException, BaseUnknownException): """ None of

Re: [PR] [SPARK-46402][PYTHON] Add getMessageParameters and getQueryContext support [spark]

2023-12-14 Thread via GitHub
zhengruifeng commented on code in PR #44349: URL: https://github.com/apache/spark/pull/44349#discussion_r1427526487 ## python/pyspark/errors/exceptions/connect.py: ## @@ -374,3 +425,37 @@ class SparkNoSuchElementException(SparkConnectGrpcException, BaseNoSuchElementEx """

Re: [PR] [SPARK-28386][SQL] Cannot resolve ORDER BY columns with GROUP BY and HAVING [spark]

2023-12-14 Thread via GitHub
pan3793 commented on code in PR #44352: URL: https://github.com/apache/spark/pull/44352#discussion_r1427525939 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala: ## @@ -323,12 +323,14 @@ trait ColumnResolutionHelper extends

Re: [PR] [SPARK-46414][UI] Use prependBaseUri to render javascript imports [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on PR #44363: URL: https://github.com/apache/spark/pull/44363#issuecomment-1857234989 cc @rednaxelafx too

Re: [PR] [SPARK-46253][PYTHON] Plan Python data source read using MapInArrow [spark]

2023-12-14 Thread via GitHub
zhengruifeng commented on code in PR #44170: URL: https://github.com/apache/spark/pull/44170#discussion_r1427498004 ## python/pyspark/sql/worker/plan_data_source_read.py: ## @@ -146,16 +176,94 @@ def main(infile: IO, outfile: IO) -> None:

Re: [PR] [SPARK-46414][UI] Use prependBaseUri to render javascript imports [spark]

2023-12-14 Thread via GitHub
yaooqinn commented on PR #44363: URL: https://github.com/apache/spark/pull/44363#issuecomment-1857186254 cc @HyukjinKwon

[PR] [SPARK-46414][UI] Use prependBaseUri to render javascript imports [spark]

2023-12-14 Thread via GitHub
yaooqinn opened a new pull request, #44363: URL: https://github.com/apache/spark/pull/44363 ### What changes were proposed in this pull request? Use prependBaseUri to render javascript imports ### Why are the changes needed? Fix a regression when a proxy

Re: [PR] [SPARK-46302][BUILD] Fix maven daily testing [spark]

2023-12-14 Thread via GitHub
panbingkun commented on PR #44208: URL: https://github.com/apache/spark/pull/44208#issuecomment-1857169015 In this PR, some version tests have been skipped through environment variables. After obtaining "Approve", I will remove some logic added for testing and observation.

Re: [PR] [SPARK-28386][SQL] Cannot resolve ORDER BY columns with GROUP BY and HAVING [spark]

2023-12-14 Thread via GitHub
ulysses-you commented on code in PR #44352: URL: https://github.com/apache/spark/pull/44352#discussion_r1427477125 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ColumnResolutionHelper.scala: ## @@ -323,12 +323,14 @@ trait ColumnResolutionHelper extends

Re: [PR] [SPARK-46003][UI][TESTS] Create an ui-test module with Jest to test ui javascript code [spark]

2023-12-14 Thread via GitHub
yaooqinn commented on code in PR #43903: URL: https://github.com/apache/spark/pull/43903#discussion_r1427469572 ## core/src/main/scala/org/apache/spark/ui/exec/ExecutorThreadDumpPage.scala: ## @@ -128,8 +128,15 @@ private[ui] class ExecutorThreadDumpPage( // scalastyle:off

Re: [PR] [SPARK-46384][SPARK-46404][SS][UI] Fix Operation Duration Stack Chart on Structured Streaming Page [spark]

2023-12-14 Thread via GitHub
yaooqinn commented on PR #44346: URL: https://github.com/apache/spark/pull/44346#issuecomment-1857142676 Thanks all. Merged to master

Re: [PR] [SPARK-46384][SPARK-46404][SS][UI] Fix Operation Duration Stack Chart on Structured Streaming Page [spark]

2023-12-14 Thread via GitHub
yaooqinn closed pull request #44346: [SPARK-46384][SPARK-46404][SS][UI] Fix Operation Duration Stack Chart on Structured Streaming Page URL: https://github.com/apache/spark/pull/44346

Re: [PR] [SPARK-46392][SQL] Push down Cast expression in DSv2 [spark]

2023-12-14 Thread via GitHub
monkeyboy123 commented on PR #44333: URL: https://github.com/apache/spark/pull/44333#issuecomment-1857130063 > And if I try another table, but this time `USING parquet`, I see `PartitionFilters` accepts the cast without issue: BTW, What is your Final Physical Plan? is

Re: [PR] [SPARK-46407][PS][TESTS] Reorganize `OpsOnDiffFramesDisabledTests` [spark]

2023-12-14 Thread via GitHub
zhengruifeng commented on PR #44354: URL: https://github.com/apache/spark/pull/44354#issuecomment-1857125553 merged to master

Re: [PR] [SPARK-46407][PS][TESTS] Reorganize `OpsOnDiffFramesDisabledTests` [spark]

2023-12-14 Thread via GitHub
zhengruifeng closed pull request #44354: [SPARK-46407][PS][TESTS] Reorganize `OpsOnDiffFramesDisabledTests` URL: https://github.com/apache/spark/pull/44354

Re: [PR] [SPARK-45005][PS][TESTS][FOLLOWUPS] Deduplicate `test_properties` [spark]

2023-12-14 Thread via GitHub
zhengruifeng commented on PR #44353: URL: https://github.com/apache/spark/pull/44353#issuecomment-1857124428 merged to master

Re: [PR] [SPARK-45005][PS][TESTS][FOLLOWUPS] Deduplicate `test_properties` [spark]

2023-12-14 Thread via GitHub
zhengruifeng closed pull request #44353: [SPARK-45005][PS][TESTS][FOLLOWUPS] Deduplicate `test_properties` URL: https://github.com/apache/spark/pull/44353

Re: [PR] [SPARK-45516][CONNECT] Include QueryContext in SparkThrowable proto message [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #43352: URL: https://github.com/apache/spark/pull/43352#discussion_r1427431997 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/utils/ErrorUtils.scala: ## @@ -115,6 +115,17 @@ private[connect] object ErrorUtils extends

Re: [PR] [SPARK-45516][CONNECT] Include QueryContext in SparkThrowable proto message [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #43352: URL: https://github.com/apache/spark/pull/43352#discussion_r1427431997 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/utils/ErrorUtils.scala: ## @@ -115,6 +115,17 @@ private[connect] object ErrorUtils extends

Re: [PR] [SPARK-46302][BUILD] Fix maven daily testing [spark]

2023-12-14 Thread via GitHub
panbingkun commented on code in PR #44208: URL: https://github.com/apache/spark/pull/44208#discussion_r1427454717 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala: ## @@ -66,6 +66,13 @@ class HiveExternalCatalogVersionsSuite extends

Re: [PR] [SPARK-45506][CONNECT] Add ivy URI support to SparkConnect addArtifact [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on code in PR #43354: URL: https://github.com/apache/spark/pull/43354#discussion_r1427451350 ## common/utils/src/main/scala/org/apache/spark/util/IvyTestUtils.scala: ## @@ -15,30 +15,26 @@ * limitations under the License. */ -package

Re: [PR] [SPARK-46392][SQL] Push down Cast expression in DSv2 [spark]

2023-12-14 Thread via GitHub
monkeyboy123 commented on PR #44333: URL: https://github.com/apache/spark/pull/44333#issuecomment-1857105731 > Anyway, I don't mean to waste your time. I was just trying to reproduce the issue, but it seems there are more details involved that I don't follow. Thanks for your review,

Re: [PR] [SPARK-46400][CORE][SQL] When there are corrupted files in the local maven repo, skip this cache and try again [spark]

2023-12-14 Thread via GitHub
panbingkun commented on code in PR #44343: URL: https://github.com/apache/spark/pull/44343#discussion_r1427444877 ## common/utils/src/main/scala/org/apache/spark/util/MavenUtils.scala: ## @@ -405,6 +415,7 @@ private[spark] object MavenUtils extends Logging { def

Re: [PR] [SPARK-45516][CONNECT] Include QueryContext in SparkThrowable proto message [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #43352: URL: https://github.com/apache/spark/pull/43352#discussion_r1427441146 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/utils/ErrorUtils.scala: ## @@ -115,6 +115,17 @@ private[connect] object ErrorUtils extends

Re: [PR] [SPARK-45516][CONNECT] Include QueryContext in SparkThrowable proto message [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on code in PR #43352: URL: https://github.com/apache/spark/pull/43352#discussion_r1427440309 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/utils/ErrorUtils.scala: ## @@ -115,6 +115,17 @@ private[connect] object ErrorUtils extends

Re: [PR] [SPARK-45516][CONNECT] Include QueryContext in SparkThrowable proto message [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #43352: URL: https://github.com/apache/spark/pull/43352#discussion_r1427431997 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/utils/ErrorUtils.scala: ## @@ -115,6 +115,17 @@ private[connect] object ErrorUtils extends

Re: [PR] [SPARK-46402][PYTHON] Add getMessageParameters and getQueryContext support [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #44349: URL: https://github.com/apache/spark/pull/44349#discussion_r1427432703 ## python/pyspark/errors/exceptions/captured.py: ## @@ -332,3 +345,37 @@ class UnknownException(CapturedException, BaseUnknownException): """ None of

Re: [PR] [SPARK-45516][CONNECT] Include QueryContext in SparkThrowable proto message [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #43352: URL: https://github.com/apache/spark/pull/43352#discussion_r1427431997 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/utils/ErrorUtils.scala: ## @@ -115,6 +115,17 @@ private[connect] object ErrorUtils extends

Re: [PR] [SPARK-36680][SQL] Supports Dynamic Table Options for Spark SQL [spark]

2023-12-14 Thread via GitHub
github-actions[bot] commented on PR #41683: URL: https://github.com/apache/spark/pull/41683#issuecomment-1857066500 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-43149][SQL] `CreateDataSourceTableCommand` should create metadata first [spark]

2023-12-14 Thread via GitHub
github-actions[bot] commented on PR #42574: URL: https://github.com/apache/spark/pull/42574#issuecomment-1857066405 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-46302][BUILD] Fix maven daily testing [spark]

2023-12-14 Thread via GitHub
panbingkun commented on code in PR #44208: URL: https://github.com/apache/spark/pull/44208#discussion_r1427424297 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala: ## @@ -212,45 +219,49 @@ class HiveExternalCatalogVersionsSuite extends

Re: [PR] [SPARK-45516][CONNECT] Include QueryContext in SparkThrowable proto message [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #43352: URL: https://github.com/apache/spark/pull/43352#discussion_r1427415118 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/utils/ErrorUtils.scala: ## @@ -115,6 +115,17 @@ private[connect] object ErrorUtils extends

Re: [PR] [SPARK-45516][CONNECT] Include QueryContext in SparkThrowable proto message [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #43352: URL: https://github.com/apache/spark/pull/43352#discussion_r1427412536 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/utils/ErrorUtils.scala: ## @@ -115,6 +115,17 @@ private[connect] object ErrorUtils extends

Re: [PR] [SPARK-46294][SQL] Clean up semantics of init vs zero value [spark]

2023-12-14 Thread via GitHub
cloud-fan closed pull request #44222: [SPARK-46294][SQL] Clean up semantics of init vs zero value URL: https://github.com/apache/spark/pull/44222

Re: [PR] [SPARK-46294][SQL] Clean up semantics of init vs zero value [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on PR #44222: URL: https://github.com/apache/spark/pull/44222#issuecomment-1856973646 thanks, merging to master!

Re: [PR] [SPARK-46003][UI][TESTS] Create an ui-test module with Jest to test ui javascript code [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on code in PR #43903: URL: https://github.com/apache/spark/pull/43903#discussion_r1427397294 ## core/src/main/scala/org/apache/spark/ui/exec/ExecutorThreadDumpPage.scala: ## @@ -128,8 +128,15 @@ private[ui] class ExecutorThreadDumpPage( //

[PR] [SPARK-46413][PYTHON] Validate returnType of Arrow Python UDF [spark]

2023-12-14 Thread via GitHub
xinrong-meng opened a new pull request, #44362: URL: https://github.com/apache/spark/pull/44362 ### What changes were proposed in this pull request? Validate returnType of Arrow Python UDF ### Why are the changes needed? Better error handling and consistency with other types

[PR] [SPARK-46289][SQL] Support ordering UDTs in interpreted mode [spark]

2023-12-14 Thread via GitHub
bersprockets opened a new pull request, #44361: URL: https://github.com/apache/spark/pull/44361 ### What changes were proposed in this pull request? When comparing two UDT values in interpreted mode, treat each value as an instance of the UDT's underlying type. ### Why are the

Re: [PR] [SPARK-40193][SQL] Merge subquery plans with different filters [spark]

2023-12-14 Thread via GitHub
peter-toth commented on PR #37630: URL: https://github.com/apache/spark/pull/37630#issuecomment-1856588974 > Hey, is this part of generalized subquery fusion? https://www.usenix.org/conference/osdi20/presentation/sarthi No, this PR is not based on the above paper but our goals seems

Re: [PR] [SPARK-40193][SQL] Merge subquery plans with different filters [spark]

2023-12-14 Thread via GitHub
benjamin-j-c commented on PR #37630: URL: https://github.com/apache/spark/pull/37630#issuecomment-1856547567 Hey, is this part of generalized subquery fusion? https://www.usenix.org/conference/osdi20/presentation/sarthi

Re: [PR] [SPARK-46409][CONNECT] Fix spark-connect-scala-client launch script [spark]

2023-12-14 Thread via GitHub
HyukjinKwon commented on PR #44360: URL: https://github.com/apache/spark/pull/44360#issuecomment-1856512811 Mind rerunning https://github.com/vsevolodstep-db/spark/actions/runs/7210676936/job/19644434461?

[PR] [SPARK-46409][CONNECT] Fix spark-connect-scala-client launch script [spark]

2023-12-14 Thread via GitHub
vsevolodstep-db opened a new pull request, #44360: URL: https://github.com/apache/spark/pull/44360 ### What changes were proposed in this pull request? Currently, ClosureCleaner integration with SparkConnect is breaking Scala UDFs, as ClosureCleaner relies heavily on using reflection,

Re: [PR] [SPARK-46331][SQL] Removing CodegenFallback from subset of DateTime expressions and version() expression [spark]

2023-12-14 Thread via GitHub
dtenedor commented on code in PR #44261: URL: https://github.com/apache/spark/pull/44261#discussion_r1427197722 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveInlineTables.scala: ## @@ -33,12 +34,14 @@ import

Re: [PR] [SPARK-42307][SQL] Adding in a better name for `_LEGACY_ERROR_TEMP_2232` [spark]

2023-12-14 Thread via GitHub
MaxGekk commented on PR #44337: URL: https://github.com/apache/spark/pull/44337#issuecomment-1856486265 @hannahkamundson Could you add a test for the error class, please.

Re: [PR] [SPARK-46366][SQL] Use WITH expression in BETWEEN to avoid duplicate expressions [spark]

2023-12-14 Thread via GitHub
dtenedor commented on code in PR #44299: URL: https://github.com/apache/spark/pull/44299#discussion_r1427195654 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3436,6 +3437,21 @@ class Analyzer(override val catalogManager:

Re: [PR] [SPARK-46207][SQL] Support MergeInto in DataFrameWriterV2 [spark]

2023-12-14 Thread via GitHub
huaxingao commented on code in PR #44119: URL: https://github.com/apache/spark/pull/44119#discussion_r1427185535 ## sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriterV2.scala: ## @@ -167,6 +173,229 @@ final class DataFrameWriterV2[T] private[sql](table: String, ds:

Re: [PR] [SPARK-46207][SQL] Support MergeInto in DataFrameWriterV2 [spark]

2023-12-14 Thread via GitHub
huaxingao commented on code in PR #44119: URL: https://github.com/apache/spark/pull/44119#discussion_r1427185349 ## sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala: ## @@ -4129,6 +4129,37 @@ class Dataset[T] private[sql]( new DataFrameWriterV2[T](table, this)

Re: [PR] [SPARK-46225][CONNECT] Collapse withColumns calls [spark]

2023-12-14 Thread via GitHub
hvanhovell commented on code in PR #44162: URL: https://github.com/apache/spark/pull/44162#discussion_r1427140961 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3469,6 +3470,52 @@ class Analyzer(override val catalogManager:

Re: [PR] [SPARK-23890][SQL] Support CHANGE COLUMN to add nested fields to structs [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on PR #21012: URL: https://github.com/apache/spark/pull/21012#issuecomment-1856394162 With DS v2, I think you can use `ADD COLUMN` to add nested fields?

Re: [PR] [SPARK-46043][SQL][FOLLOWUP] Remove the catalog and identifier check in DataSourceV2Relation [spark]

2023-12-14 Thread via GitHub
cloud-fan closed pull request #44348: [SPARK-46043][SQL][FOLLOWUP] Remove the catalog and identifier check in DataSourceV2Relation URL: https://github.com/apache/spark/pull/44348

Re: [PR] [SPARK-46043][SQL][FOLLOWUP] Remove the catalog and identifier check in DataSourceV2Relation [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on PR #44348: URL: https://github.com/apache/spark/pull/44348#issuecomment-1856389058 thanks, merging to master!

Re: [PR] [SPARK-45807][SQL] Improve ViewCatalog API [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on code in PR #44330: URL: https://github.com/apache/spark/pull/44330#discussion_r1427094692 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewInfo.java: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] [SPARK-45807][SQL] Improve ViewCatalog API [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on code in PR #44330: URL: https://github.com/apache/spark/pull/44330#discussion_r1427093844 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewInfo.java: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] [SPARK-45807][SQL] Improve ViewCatalog API [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on code in PR #44330: URL: https://github.com/apache/spark/pull/44330#discussion_r1427093145 ## sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewInfo.java: ## @@ -0,0 +1,47 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] [SPARK-46069][SQL] Support unwrap timestamp type to date type [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on code in PR #43982: URL: https://github.com/apache/spark/pull/43982#discussion_r1427090570 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala: ## @@ -329,6 +334,48 @@ object

Re: [PR] [SPARK-46069][SQL] Support unwrap timestamp type to date type [spark]

2023-12-14 Thread via GitHub
cloud-fan commented on code in PR #43982: URL: https://github.com/apache/spark/pull/43982#discussion_r1427091458 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala: ## @@ -329,6 +334,48 @@ object

Re: [PR] [SPARK-42307][SQL] Adding in a better name for `_LEGACY_ERROR_TEMP_2232` [spark]

2023-12-14 Thread via GitHub
hannahkamundson commented on code in PR #44337: URL: https://github.com/apache/spark/pull/44337#discussion_r1427088760 ## common/utils/src/main/resources/error/error-classes.json: ## @@ -2750,6 +2750,12 @@ ], "sqlState" : "42000" }, + "NULL_INDEX" : { +

Re: [PR] [SPARK-23890][SQL] Support CHANGE COLUMN to add nested fields to structs [spark]

2023-12-14 Thread via GitHub
ottomata commented on PR #21012: URL: https://github.com/apache/spark/pull/21012#issuecomment-1856345956 https://issues.apache.org/jira/browse/SPARK-23890

Re: [PR] [SPARK-43299][FOLLOWUP][CONNECT][SS] Followup on StreamingQueryExceptions [spark]

2023-12-14 Thread via GitHub
WweiL commented on PR #44306: URL: https://github.com/apache/spark/pull/44306#issuecomment-1856341214 @MaxGekk Hi Max can I get another pass? Thank you!
