Re: [PR] [SPARK-47577][CORE][PART1] Migrate logError with variables to structured logging framework [spark]

2024-04-02 Thread via GitHub
pan3793 commented on code in PR #45834: URL: https://github.com/apache/spark/pull/45834#discussion_r1549005195 ## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala: ## @@ -21,17 +21,57 @@ package org.apache.spark.internal * All structured logging keys should b

Re: [PR] [SPARK-47577][CORE][PART1] Migrate logError with variables to structured logging framework [spark]

2024-04-02 Thread via GitHub
pan3793 commented on code in PR #45834: URL: https://github.com/apache/spark/pull/45834#discussion_r1549003961 ## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala: ## @@ -21,17 +21,57 @@ package org.apache.spark.internal * All structured logging keys should b

Re: [PR] [SPARK-47703][SQL] Modify the simpleString of DataSourceV2ScanRelation to distinguish it from DataSourceV2Relation [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on PR #45829: URL: https://github.com/apache/spark/pull/45829#issuecomment-2033674679 What's the diff between before and after? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to t

[PR] [SPARK-43362][SQL][FOLLOWUP] Special handling of JSON type for MySQL … [spark]

2024-04-02 Thread via GitHub
beryllw opened a new pull request, #45835: URL: https://github.com/apache/spark/pull/45835 ### What changes were proposed in this pull request? Some MySQL JDBC drivers convert the JSON type into Types.CHAR with a precision of Int.Max. When receiving CHAR with Int.Max precision

Re: [PR] [SPARK-47704][SQL] JSON parsing fails with "java.lang.ClassCastException" when spark.sql.json.enablePartialResults is enabled [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on PR #45833: URL: https://github.com/apache/spark/pull/45833#issuecomment-2033673584 cc @MaxGekk

Re: [PR] [SPARK-47628][SQL] Fix Postgres bit array issue 'Cannot cast to boolean' [spark]

2024-04-02 Thread via GitHub
yaooqinn commented on code in PR #45751: URL: https://github.com/apache/spark/pull/45751#discussion_r1548995350 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala: ## @@ -70,21 +70,21 @@ private case class PostgresDialect() extends JdbcDialect with SQLCo

Re: [PR] [SPARK-47649][SQL] Make the parameter `inputs` of the function `[csv|parquet|orc|json|text|xml](paths: String*)` non empty [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on PR #45776: URL: https://github.com/apache/spark/pull/45776#issuecomment-2033671882 In addition, the errors I get are: ``` scala> spark.read.json().show() 24/04/03 15:37:24 WARN DataSource: All paths were ignored: org.apache.spark.sql.Analys

Re: [PR] [SPARK-47553][SS] Add Java support for transformWithState operator APIs [spark]

2024-04-02 Thread via GitHub
anishshri-db commented on code in PR #45758: URL: https://github.com/apache/spark/pull/45758#discussion_r1548984946 ## sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala: ## @@ -676,6 +677,33 @@ class KeyValueGroupedDataset[K, V] private[sql]( ) }

Re: [PR] [SPARK-47274][PYTHON][SQL] Provide more useful context for PySpark DataFrame API errors [spark]

2024-04-02 Thread via GitHub
itholic commented on code in PR #45377: URL: https://github.com/apache/spark/pull/45377#discussion_r1548983997 ## python/pyspark/sql/tests/test_dataframe.py: ## @@ -825,6 +828,172 @@ def test_duplicate_field_names(self): self.assertEqual(df.schema, schema) self

Re: [PR] [SPARK-47602][CORE] Resource managers: Migrate logError with variables to structured logging framework [spark]

2024-04-02 Thread via GitHub
gengliangwang commented on code in PR #45808: URL: https://github.com/apache/spark/pull/45808#discussion_r1548975304 ## resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala: ## @@ -532,7 +533,8 @@ class ExecutorPodsA

Re: [PR] [SPARK-47577][Core][PART1] Migrate logError with variables to structured logging framework [spark]

2024-04-02 Thread via GitHub
gengliangwang commented on PR #45834: URL: https://github.com/apache/spark/pull/45834#issuecomment-2033641746 cc @panbingkun @pan3793 @itholic as well

Re: [PR] [SPARK-47452][INFRA][FOLLOWUP] Enforce to install `six` to `Python 3.10` [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun closed pull request #45832: [SPARK-47452][INFRA][FOLLOWUP] Enforce to install `six` to `Python 3.10` URL: https://github.com/apache/spark/pull/45832

Re: [PR] [SPARK-47452][INFRA][FOLLOWUP] Enforce to install `six` to `Python 3.10` [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun commented on PR #45832: URL: https://github.com/apache/spark/pull/45832#issuecomment-2033641613 Thank you, @HyukjinKwon ! Let me merge this since the image building is already tested.

[PR] [SPARK-47577][Core][PART1] Migrate logError with variables to structured logging framework [spark]

2024-04-02 Thread via GitHub
gengliangwang opened a new pull request, #45834: URL: https://github.com/apache/spark/pull/45834 ### What changes were proposed in this pull request? Migrate logError with variables to structured logging framework. This is part1 for the logError entries of the following API in

[PR] [SPARK-47704][SQL] JSON parsing fails with "java.lang.ClassCastException" when spark.sql.json.enablePartialResults is enabled [spark]

2024-04-02 Thread via GitHub
sadikovi opened a new pull request, #45833: URL: https://github.com/apache/spark/pull/45833 ### What changes were proposed in this pull request? This PR fixes a bug that was introduced by [SPARK-47704](https://issues.apache.org/jira/browse/SPARK-47704). To be precise, SPA

Re: [PR] [SPARK-47628][SQL] Fix Postgres bit array issue 'Cannot cast to boolean' [spark]

2024-04-02 Thread via GitHub
cloud-fan commented on code in PR #45751: URL: https://github.com/apache/spark/pull/45751#discussion_r1548965732 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala: ## @@ -70,21 +70,21 @@ private case class PostgresDialect() extends JdbcDialect with SQLC

Re: [PR] [SPARK-47628][SQL] Fix Postgres bit array issue 'Cannot cast to boolean' [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun commented on code in PR #45751: URL: https://github.com/apache/spark/pull/45751#discussion_r1548963207 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala: ## @@ -70,21 +70,21 @@ private case class PostgresDialect() extends JdbcDialect with

Re: [PR] [SPARK-47602][CORE] Resource managers: Migrate logError with variables to structured logging framework [spark]

2024-04-02 Thread via GitHub
pan3793 commented on code in PR #45808: URL: https://github.com/apache/spark/pull/45808#discussion_r1548962525 ## resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala: ## @@ -532,7 +533,8 @@ class ExecutorPodsAllocat

Re: [PR] [SPARK-47653][SS] Add support for negative numeric types and range scan key encoder [spark]

2024-04-02 Thread via GitHub
anishshri-db commented on code in PR #45778: URL: https://github.com/apache/spark/pull/45778#discussion_r1548956564 ## sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateStoreSuite.scala: ## @@ -294,6 +295,60 @@ class RocksDBStateStoreSuite extend

Re: [PR] [SPARK-47602][CORE] Resource managers: Migrate logError with variables to structured logging framework [spark]

2024-04-02 Thread via GitHub
gengliangwang commented on code in PR #45808: URL: https://github.com/apache/spark/pull/45808#discussion_r1548956155 ## resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala: ## @@ -532,7 +533,8 @@ class ExecutorPodsA

Re: [PR] [SPARK-47602][CORE] Resource managers: Migrate logError with variables to structured logging framework [spark]

2024-04-02 Thread via GitHub
pan3793 commented on code in PR #45808: URL: https://github.com/apache/spark/pull/45808#discussion_r1548950710 ## resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala: ## @@ -532,7 +533,8 @@ class ExecutorPodsAllocat

Re: [PR] [SPARK-47653][SS] Add support for negative numeric types and range scan key encoder [spark]

2024-04-02 Thread via GitHub
HeartSaVioR commented on code in PR #45778: URL: https://github.com/apache/spark/pull/45778#discussion_r1548946531 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateEncoder.scala: ## @@ -276,53 +284,111 @@ class RangeKeyScanStateEncoder(

Re: [PR] [SPARK-47701][SQL][TESTS] Postgres: Add test for Composite and Range types [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun commented on PR #45827: URL: https://github.com/apache/spark/pull/45827#issuecomment-2033576501 Merged to master for Apache Spark 4.0.0.

Re: [PR] [SPARK-47701][SQL][TESTS] Postgres: Add test for Composite and Range types [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun closed pull request #45827: [SPARK-47701][SQL][TESTS] Postgres: Add test for Composite and Range types URL: https://github.com/apache/spark/pull/45827

Re: [PR] [SPARK-47452][INFRA][FOLLOWUP] Enforce to install `six` to `Python 3.10` [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun commented on PR #45832: URL: https://github.com/apache/spark/pull/45832#issuecomment-2033569304 Could you review this PR, @HyukjinKwon ? This PR will recover the `Python 3.10` CI pipeline.

Re: [PR] [SPARK-45733][PYTHON][TESTS][FOLLOWUP] Skip `pyspark.sql.tests.connect.client.test_client` if not should_test_connect [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun closed pull request #45830: [SPARK-45733][PYTHON][TESTS][FOLLOWUP] Skip `pyspark.sql.tests.connect.client.test_client` if not should_test_connect URL: https://github.com/apache/spark/pull/45830

Re: [PR] [SPARK-45733][PYTHON][TESTS][FOLLOWUP] Skip `pyspark.sql.tests.connect.client.test_client` if not should_test_connect [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun commented on PR #45830: URL: https://github.com/apache/spark/pull/45830#issuecomment-2033561359 Thank you, @HyukjinKwon . Merged to master.

Re: [PR] [SPARK-47553][SS] Add Java support for transformWithState operator APIs [spark]

2024-04-02 Thread via GitHub
HeartSaVioR commented on code in PR #45758: URL: https://github.com/apache/spark/pull/45758#discussion_r1548917934 ## sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala: ## @@ -676,6 +677,33 @@ class KeyValueGroupedDataset[K, V] private[sql]( ) }

[PR] [SPARK-47452][INFRA][FOLLOWUP] Enforce to install `six` to Python 3.10 [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun opened a new pull request, #45832: URL: https://github.com/apache/spark/pull/45832 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### H

Re: [PR] [SPARK-47696][SQL] try_to_timestamp should handle SparkUpgradeException [spark]

2024-04-02 Thread via GitHub
gengliangwang commented on code in PR #45822: URL: https://github.com/apache/spark/pull/45822#discussion_r1548927521 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala: ## @@ -1235,11 +1235,13 @@ object TryToTimestampExpressionBuil

Re: [PR] [SPARK-47696][SQL] try_to_timestamp should handle SparkUpgradeException [spark]

2024-04-02 Thread via GitHub
cloud-fan commented on code in PR #45822: URL: https://github.com/apache/spark/pull/45822#discussion_r1548924394 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala: ## @@ -1235,11 +1235,13 @@ object TryToTimestampExpressionBuilder

Re: [PR] [SPARK-47696][SQL] try_to_timestamp should handle SparkUpgradeException [spark]

2024-04-02 Thread via GitHub
cloud-fan commented on code in PR #45822: URL: https://github.com/apache/spark/pull/45822#discussion_r1548924928 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala: ## @@ -1235,11 +1235,13 @@ object TryToTimestampExpressionBuilder

[PR] [WIP][SQL][CONNECT] Fix a self-join case [spark]

2024-04-02 Thread via GitHub
zhengruifeng opened a new pull request, #45831: URL: https://github.com/apache/spark/pull/45831 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### Ho

Re: [PR] [MINOR][DOCS] Update tags inside configuration.md tables [spark]

2024-04-02 Thread via GitHub
yaooqinn commented on PR #45731: URL: https://github.com/apache/spark/pull/45731#issuecomment-2033504024 @vadim can you revise the PR description?

Re: [PR] [MINOR][DOCS] Update tags inside configuration.md tables [spark]

2024-04-02 Thread via GitHub
vadim commented on PR #45731: URL: https://github.com/apache/spark/pull/45731#issuecomment-2033499194 first example: https://github.com/apache/spark/assets/86705/8a604c35-b067-4ad3-9670-9f3a00d01a20 second example: https://github.com/apache/spark/assets/86705/097cbb82-cc95-496

Re: [PR] [SPARK-47649][SQL] Make the parameter `inputs` of the function `[csv|parquet|orc|json|text|xml](paths: String*)` non empty [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun commented on PR #45776: URL: https://github.com/apache/spark/pull/45776#issuecomment-2033496439 To be sure, gentle ping once more, @HyukjinKwon , @cloud-fan , @LuciferYang , @koedlt - https://github.com/apache/spark/pull/45776#discussion_r1545520694 - https://github.com/

Re: [PR] [SPARK-47643][SS][PYTHON] Add pyspark test for python streaming source [spark]

2024-04-02 Thread via GitHub
HeartSaVioR commented on PR #45768: URL: https://github.com/apache/spark/pull/45768#issuecomment-2033495338 Late +1. It would be even better to add more advanced test cases, e.g. multiple partitions, which we could revisit in the e2e test for python streaming source to python streaming sink.

Re: [PR] [SPARK-47696][SQL] try_to_timestamp should handle SparkUpgradeException [spark]

2024-04-02 Thread via GitHub
gengliangwang commented on code in PR #45822: URL: https://github.com/apache/spark/pull/45822#discussion_r1548887170 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala: ## @@ -1235,11 +1235,13 @@ object TryToTimestampExpressionBuil

Re: [PR] [SPARK-45733][PYTHON][TESTS][FOLLOWUP] Skip `pyspark.sql.tests.connect.client.test_client` if not should_test_connect [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun commented on PR #45830: URL: https://github.com/apache/spark/pull/45830#issuecomment-2033491877 cc @HyukjinKwon

Re: [PR] [SPARK-47653][SS] Add support for negative numeric types and range scan key encoder [spark]

2024-04-02 Thread via GitHub
HeartSaVioR commented on code in PR #45778: URL: https://github.com/apache/spark/pull/45778#discussion_r154888 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateEncoder.scala: ## @@ -276,53 +284,111 @@ class RangeKeyScanStateEncoder(

Re: [PR] [SPARK-47454][PYTHON][TESTS][FOLLOWUP] Skip `test_create_dataframe_from_pandas_with_day_time_interval` if pandas is not avaiable [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun closed pull request #45828: [SPARK-47454][PYTHON][TESTS][FOLLOWUP] Skip `test_create_dataframe_from_pandas_with_day_time_interval` if pandas is not avaiable URL: https://github.com/apache/spark/pull/45828

Re: [PR] [SPARK-47454][PYTHON][TESTS][FOLLOWUP] Skip `test_create_dataframe_from_pandas_with_day_time_interval` if pandas is not avaiable [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun commented on PR #45828: URL: https://github.com/apache/spark/pull/45828#issuecomment-2033488061 Thank you, @yaooqinn ! Merged to master.

Re: [PR] [SPARK-45733][PYTHON][TESTS][FOLLOWUP] Skip `pyspark.sql.tests.connect.client.test_client` if not should_test_connect [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun commented on code in PR #45830: URL: https://github.com/apache/spark/pull/45830#discussion_r1548879653 ## python/pyspark/sql/tests/connect/client/test_client.py: ## @@ -134,18 +245,6 @@ def test_channel_builder_with_session(self): self.assertEqual(client._

Re: [PR] [SPARK-47701][SQL][TESTS] Postgres: Add test for Composite and Range types [spark]

2024-04-02 Thread via GitHub
yaooqinn commented on PR #45827: URL: https://github.com/apache/spark/pull/45827#issuecomment-2033481150 Thank you @dongjoon-hyun, I further added tests for Range types to reduce # of small PRs

[PR] [SPARK-45733][PYTHON][TESTS][FOLLOWUP] Skip `pyspark.sql.tests.connect.client.test_client` if not should_test_connect [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun opened a new pull request, #45830: URL: https://github.com/apache/spark/pull/45830 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

Re: [PR] [SPARK-47628][SQL] Fix Postgres bit array issue 'Cannot cast to boolean' [spark]

2024-04-02 Thread via GitHub
yaooqinn commented on code in PR #45751: URL: https://github.com/apache/spark/pull/45751#discussion_r1548873575 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala: ## @@ -70,21 +70,21 @@ private case class PostgresDialect() extends JdbcDialect with SQLCo

[PR] [SPARK-47703][SQL] Modify the simpleString of DataSourceV2ScanRelation to distinguish it from DataSourceV2Relation [spark]

2024-04-02 Thread via GitHub
Zouxxyy opened a new pull request, #45829: URL: https://github.com/apache/spark/pull/45829 ### What changes were proposed in this pull request? Modify the simpleString of DataSourceV2ScanRelation to distinguish it from DataSourceV2Relation ### Why are the changes needed

Re: [PR] [SPARK-47628][SQL] Fix Postgres bit array issue 'Cannot cast to boolean' [spark]

2024-04-02 Thread via GitHub
cloud-fan commented on code in PR #45751: URL: https://github.com/apache/spark/pull/45751#discussion_r1548869846 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala: ## @@ -70,21 +70,21 @@ private case class PostgresDialect() extends JdbcDialect with SQLC

Re: [PR] [SPARK-47696][SQL] try_to_timestamp should handle SparkUpgradeException [spark]

2024-04-02 Thread via GitHub
cloud-fan commented on code in PR #45822: URL: https://github.com/apache/spark/pull/45822#discussion_r1548867885 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala: ## @@ -1235,11 +1235,13 @@ object TryToTimestampExpressionBuilder

Re: [PR] [SPARK-47558][SS] State TTL support for ValueState [spark]

2024-04-02 Thread via GitHub
sahnib commented on code in PR #45674: URL: https://github.com/apache/spark/pull/45674#discussion_r1548800286 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithStateTTLSuite.scala: ## @@ -0,0 +1,579 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] [SPARK-47558][SS] State TTL support for ValueState [spark]

2024-04-02 Thread via GitHub
sahnib commented on code in PR #45674: URL: https://github.com/apache/spark/pull/45674#discussion_r1548864182 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TTLState.scala: ## @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[PR] [SPARK-47454][PYTHON][TESTS][FOLLOWUP] Skip `test_create_dataframe_from_pandas_with_day_time_interval` if pandas is not avaiable [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun opened a new pull request, #45828: URL: https://github.com/apache/spark/pull/45828 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

Re: [PR] [SPARK-47655][SS] Integrate timer with Initial State handling for state-v2 [spark]

2024-04-02 Thread via GitHub
HeartSaVioR closed pull request #45780: [SPARK-47655][SS] Integrate timer with Initial State handling for state-v2 URL: https://github.com/apache/spark/pull/45780

Re: [PR] [SPARK-47655][SS] Integrate timer with Initial State handling for state-v2 [spark]

2024-04-02 Thread via GitHub
HeartSaVioR commented on PR #45780: URL: https://github.com/apache/spark/pull/45780#issuecomment-2033458353 Thanks! Merging to master.

[PR] [SPARK-47701][SQL][TESTS] Postgres: Add test for complex types [spark]

2024-04-02 Thread via GitHub
yaooqinn opened a new pull request, #45827: URL: https://github.com/apache/spark/pull/45827 ### What changes were proposed in this pull request? Add tests for complex types of Postgres. ### Why are the changes needed? Test improvements ### Does this

Re: [PR] [SPARK-47691][SQL] Postgres: Support multi dimensional array on the write side [spark]

2024-04-02 Thread via GitHub
yaooqinn commented on PR #45815: URL: https://github.com/apache/spark/pull/45815#issuecomment-2033450239 Thank you @dongjoon-hyun

Re: [PR] [SPARK-47688][CORE] Support `three` methods of the log `concatenation` in the `structured logging framework` [spark]

2024-04-02 Thread via GitHub
panbingkun commented on PR #45813: URL: https://github.com/apache/spark/pull/45813#issuecomment-2033440717 > > Do we need to support the second (String + MDC) and third (MDC + String) methods besides the first one? > > So by the end of this project, all the log entries containing variable

Re: [PR] [SPARK-47691][SQL] Postgres: Support multi dimensional array on the write side [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun closed pull request #45815: [SPARK-47691][SQL] Postgres: Support multi dimensional array on the write side URL: https://github.com/apache/spark/pull/45815

Re: [PR] [SPARK-47318][CORE] Adds HKDF round to AuthEngine key derivation to follow standard KEX practices [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun commented on code in PR #45425: URL: https://github.com/apache/spark/pull/45425#discussion_r1548842531 ## common/network-common/src/main/java/org/apache/spark/network/crypto/AuthEngine.java: ## @@ -224,7 +236,7 @@ private TransportCipher generateTransportCipher(

Re: [PR] [SPARK-47318][CORE] Adds HKDF round to AuthEngine key derivation to follow standard KEX practices [spark]

2024-04-02 Thread via GitHub
dongjoon-hyun commented on code in PR #45425: URL: https://github.com/apache/spark/pull/45425#discussion_r1548843062 ## common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java: ## @@ -213,6 +213,11 @@ public boolean encryptionEnabled() { return

Re: [PR] [SPARK-47688][CORE] Support `three` methods of the log `concatenation` in the `structured logging framework` [spark]

2024-04-02 Thread via GitHub
gengliangwang commented on PR #45813: URL: https://github.com/apache/spark/pull/45813#issuecomment-2033407009 > Do we need to support the second (String + MDC) and third (MDC + String) methods besides the first one? So by the end of this project, all the log entries containing variables w

Re: [PR] [SPARK-47318][CORE] Adds HKDF round to AuthEngine key derivation to follow standard KEX practices [spark]

2024-04-02 Thread via GitHub
mridulm commented on code in PR #45425: URL: https://github.com/apache/spark/pull/45425#discussion_r1548815203 ## common/network-common/src/main/java/org/apache/spark/network/crypto/README.md: ## @@ -99,3 +103,13 @@ sessions. It would, however, allow impersonation of future ses

Re: [PR] [SPARK-47274][PYTHON][SQL] Provide more useful context for PySpark DataFrame API errors [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on PR #45377: URL: https://github.com/apache/spark/pull/45377#issuecomment-2033393150 cc @cloud-fan too

Re: [PR] [SPARK-47274][PYTHON][SQL] Provide more useful context for PySpark DataFrame API errors [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on code in PR #45377: URL: https://github.com/apache/spark/pull/45377#discussion_r1548814091 ## python/pyspark/testing/utils.py: ## @@ -280,7 +282,14 @@ def check_error( exception: PySparkException, error_class: str, message_param

Re: [PR] [SPARK-47691][SQL] Postgres: Support multi dimensional array on the write side [spark]

2024-04-02 Thread via GitHub
yaooqinn commented on PR #45815: URL: https://github.com/apache/spark/pull/45815#issuecomment-2033390778 cc @cloud-fan @dongjoon-hyun thank you

Re: [PR] [SPARK-47274][PYTHON][SQL] Provide more useful context for PySpark DataFrame API errors [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on code in PR #45377: URL: https://github.com/apache/spark/pull/45377#discussion_r1548812966 ## python/pyspark/errors/exceptions/captured.py: ## @@ -379,5 +379,17 @@ def fragment(self) -> str: def callSite(self) -> str: return str(self._q.call

Re: [PR] [SPARK-47681][SQL] Add schema_of_variant expression. [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on code in PR #45806: URL: https://github.com/apache/spark/pull/45806#discussion_r1548811808 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/variant/variantExpressions.scala: ## @@ -403,3 +405,134 @@ object VariantGetExpressionBuild

Re: [PR] [SPARK-47558][SS] State TTL support for ValueState [spark]

2024-04-02 Thread via GitHub
anishshri-db commented on code in PR #45674: URL: https://github.com/apache/spark/pull/45674#discussion_r1548810035 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulProcessorHandleImpl.scala: ## @@ -103,22 +113,35 @@ class StatefulProcessorHandleImpl(

Re: [PR] [SPARK-47558][SS] State TTL support for ValueState [spark]

2024-04-02 Thread via GitHub
anishshri-db commented on code in PR #45674: URL: https://github.com/apache/spark/pull/45674#discussion_r1548807708 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TTLState.scala: ## @@ -0,0 +1,150 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Re: [PR] [SPARK-47366][PYTHON] Add VariantVal for PySpark [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on code in PR #45826: URL: https://github.com/apache/spark/pull/45826#discussion_r1548802161 ## python/pyspark/sql/types.py: ## @@ -1468,6 +1475,36 @@ def __eq__(self, other: Any) -> bool: return type(self) == type(other) +class VariantVal: Re

Re: [PR] [SPARK-47653][SS] Add support for negative numeric types and range scan key encoder [spark]

2024-04-02 Thread via GitHub
anishshri-db commented on code in PR #45778: URL: https://github.com/apache/spark/pull/45778#discussion_r1548801165 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateEncoder.scala: ## @@ -276,53 +283,113 @@ class RangeKeyScanStateEncoder(

Re: [PR] [SPARK-47653][SS] Add support for negative numeric types and range scan key encoder [spark]

2024-04-02 Thread via GitHub
anishshri-db commented on code in PR #45778: URL: https://github.com/apache/spark/pull/45778#discussion_r1548800791 ## sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateStoreSuite.scala: ## @@ -294,6 +295,55 @@ class RocksDBStateStoreSuite extend

Re: [PR] [SPARK-47653][SS] Add support for negative numeric types and range scan key encoder [spark]

2024-04-02 Thread via GitHub
anishshri-db commented on code in PR #45778: URL: https://github.com/apache/spark/pull/45778#discussion_r1548800930 ## sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateStoreSuite.scala: ## @@ -294,6 +295,55 @@ class RocksDBStateStoreSuite extend

Re: [PR] [SPARK-47558][SS] State TTL support for ValueState [spark]

2024-04-02 Thread via GitHub
sahnib commented on code in PR #45674: URL: https://github.com/apache/spark/pull/45674#discussion_r1548795527 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithStateTTLSuite.scala: ## @@ -0,0 +1,579 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] [SPARK-47558][SS] State TTL support for ValueState [spark]

2024-04-02 Thread via GitHub
sahnib commented on code in PR #45674: URL: https://github.com/apache/spark/pull/45674#discussion_r1548791096 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithStateTTLSuite.scala: ## @@ -0,0 +1,579 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] [SPARK-47558][SS] State TTL support for ValueState [spark]

2024-04-02 Thread via GitHub
sahnib commented on code in PR #45674: URL: https://github.com/apache/spark/pull/45674#discussion_r1548791295 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithStateTTLSuite.scala: ## @@ -0,0 +1,579 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] [SPARK-47210][SQL] Addition of implicit casting without indeterminate support [spark]

2024-04-02 Thread via GitHub
cloud-fan commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1548791139 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCasts.scala: ## @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundatio

Re: [PR] [SPARK-47210][SQL] Addition of implicit casting without indeterminate support [spark]

2024-04-02 Thread via GitHub
cloud-fan commented on code in PR #45383: URL: https://github.com/apache/spark/pull/45383#discussion_r1548790383 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CollationTypeCasts.scala: ## @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundatio

Re: [PR] [SPARK-47558][SS] State TTL support for ValueState [spark]

2024-04-02 Thread via GitHub
sahnib commented on code in PR #45674: URL: https://github.com/apache/spark/pull/45674#discussion_r1548788984 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithStateTTLSuite.scala: ## @@ -0,0 +1,579 @@ +/* + * Licensed to the Apache Software Foundation (ASF)

Re: [PR] [SPARK-47558][SS] State TTL support for ValueState [spark]

2024-04-02 Thread via GitHub
anishshri-db commented on code in PR #45674: URL: https://github.com/apache/spark/pull/45674#discussion_r1548788722 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithStateTTLSuite.scala: ## @@ -0,0 +1,579 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] [SPARK-47683][PYTHON][BUILD] Decouple PySpark core API to pyspark.core package [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on PR #45053: URL: https://github.com/apache/spark/pull/45053#issuecomment-2033351917 cc @zhengruifeng @grundprinzip @ueshin @hvanhovell @itholic @WeichenXu123 @mengxr @allisonwang-db @xinrong-meng @gatorsmile @cloud-fan This is ready for a look (before merging, should

Re: [PR] [SPARK-47558][SS] State TTL support for ValueState [spark]

2024-04-02 Thread via GitHub
sahnib commented on code in PR #45674: URL: https://github.com/apache/spark/pull/45674#discussion_r1548787953 ## sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/ValueStateSuite.scala: ## @@ -303,6 +311,244 @@ class ValueStateSuite extends StateVariableSuit

Re: [PR] [SPARK-47688][CORE] Support `three` methods of the log `concatenation` in the `structured logging framework` [spark]

2024-04-02 Thread via GitHub
panbingkun commented on PR #45813: URL: https://github.com/apache/spark/pull/45813#issuecomment-2033348771 @gengliangwang Do we need to support the `second`(String + MDC) and `third`(MDC + String) methods besides the `first` one? -- This is an automated message from the Apache Git Servi

Re: [PR] [SPARK-47682][SQL] Support cast from variant. [spark]

2024-04-02 Thread via GitHub
chenhao-db commented on PR #45807: URL: https://github.com/apache/spark/pull/45807#issuecomment-2033348791 @HyukjinKwon Sure. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

Re: [PR] [SPARK-47558][SS] State TTL support for ValueState [spark]

2024-04-02 Thread via GitHub
anishshri-db commented on code in PR #45674: URL: https://github.com/apache/spark/pull/45674#discussion_r1548785634 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithStateTTLSuite.scala: ## @@ -0,0 +1,579 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] [SPARK-47653][SS] Add support for negative numeric types and range scan key encoder [spark]

2024-04-02 Thread via GitHub
neilramaswamy commented on code in PR #45778: URL: https://github.com/apache/spark/pull/45778#discussion_r1548751462 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBStateEncoder.scala: ## @@ -276,53 +283,113 @@ class RangeKeyScanStateEncoder(

Re: [PR] [SPARK-47602][CORE] Resource managers: Migrate logError with variables to structured logging framework [spark]

2024-04-02 Thread via GitHub
panbingkun commented on code in PR #45808: URL: https://github.com/apache/spark/pull/45808#discussion_r1548785661 ## resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala: ## @@ -854,7 +856,8 @@ private[spark] class ApplicationMaster(

Re: [PR] [WIP][SPARK-41811][PYTHON][CONNECT] Implement `SQLStringFormatter` with `WithRelations` [spark]

2024-04-02 Thread via GitHub
zhengruifeng commented on code in PR #45614: URL: https://github.com/apache/spark/pull/45614#discussion_r1548784881 ## connector/connect/common/src/main/protobuf/spark/connect/relations.proto: ## @@ -131,6 +132,23 @@ message SQL { repeated Expression pos_arguments = 5; } +

Re: [PR] [SPARK-47682][SQL] Support cast from variant. [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on PR #45807: URL: https://github.com/apache/spark/pull/45807#issuecomment-2033343284 Can you keep the PR template please? https://github.com/apache/spark/blob/master/.github/PULL_REQUEST_TEMPLATE -- This is an automated message from the Apache Git Service. To respon

Re: [PR] [SPARK-47693][SQL] Add optimization for lowercase comparison of UTF8String used in UTF8_BINARY_LCASE collation [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on code in PR #45816: URL: https://github.com/apache/spark/pull/45816#discussion_r1548780039 ## common/unsafe/src/test/java/org/apache/spark/unsafe/types/UTF8StringSuite.java: ## @@ -107,6 +107,28 @@ public void binaryCompareTo() { assertTrue(fromStrin

Re: [PR] [SPARK-47693][SQL] Add optimization for lowercase comparison of UTF8String used in UTF8_BINARY_LCASE collation [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on code in PR #45816: URL: https://github.com/apache/spark/pull/45816#discussion_r1548779756 ## common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java: ## @@ -447,6 +447,37 @@ private UTF8String toUpperCaseSlow() { return fromString(

Re: [PR] [SPARK-47359][SQL] Support TRANSLATE function to work with collated strings [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on PR #45820: URL: https://github.com/apache/spark/pull/45820#issuecomment-207001 Let's follow https://github.com/databricks/scala-style-guide, and remove unrelated changes, e.g., adding newlines which makes cherry-pick/backporting/reverting difficult. -- This i

Re: [PR] [SPARK-47692][SQL] Addition of priority flag to StringType [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on PR #45819: URL: https://github.com/apache/spark/pull/45819#issuecomment-206686 Let's follow https://github.com/databricks/scala-style-guide, and remove unrelated changes, e.g., adding newlines which makes cherry-pick/backporting/reverting difficult. -- This i

Re: [PR] [SPARK-47692][SQL] Addition of priority flag to StringType [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on code in PR #45819: URL: https://github.com/apache/spark/pull/45819#discussion_r1548778640 ## sql/core/src/test/scala/org/apache/spark/sql/CollationSuite.scala: ## @@ -509,6 +497,209 @@ class CollationSuite extends DatasourceV2SQLBase with AdaptiveSparkP

Re: [PR] [SPARK-47692][SQL] Addition of priority flag to StringType [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on code in PR #45819: URL: https://github.com/apache/spark/pull/45819#discussion_r1548778507 ## sql/core/src/test/scala/org/apache/spark/sql/CollationStringExpressionsSuite.scala: ## @@ -20,57 +20,60 @@ package org.apache.spark.sql import scala.collection.

Re: [PR] [SPARK-47692][SQL] Addition of priority flag to StringType [spark]

2024-04-02 Thread via GitHub
HyukjinKwon commented on code in PR #45819: URL: https://github.com/apache/spark/pull/45819#discussion_r1548778237 ## sql/core/src/test/scala/org/apache/spark/sql/CollationRegexpExpressionsSuite.scala: ## @@ -20,421 +20,406 @@ package org.apache.spark.sql import scala.collectio
