Re: [PR] [SPARK-47759][CORE] Fix an issue where JavaUtils.timeStringAs fails to parse a legitimate time string [spark]

2024-04-08 Thread via GitHub
mridulm commented on PR #45942: URL: https://github.com/apache/spark/pull/45942#issuecomment-2044203113 `Pattern.compile` is operating on a new instance of `Pattern` object - and so inherits the thread safety of `Pattern`. It is not clear to me what the issue being observed is, and how

Re: [PR] [SPARK-47595][CORE] Streaming: Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on PR #45910: URL: https://github.com/apache/spark/pull/45910#issuecomment-2044203709 @panbingkun shall we mark this as ready to review? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

Re: [PR] [SPARK-47581][CORE] SQL catalyst: Migrate logWarning with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang closed pull request #45904: [SPARK-47581][CORE] SQL catalyst: Migrate logWarning with variables to structured logging framework URL: https://github.com/apache/spark/pull/45904

Re: [PR] [SPARK-47581][CORE] SQL catalyst: Migrate logWarning with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on PR #45904: URL: https://github.com/apache/spark/pull/45904#issuecomment-2044200833 Thanks, merging to master

Re: [PR] [SPARK-47774][INFRA] Remove redundant rules from `MimaExcludes` [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on PR #45944: URL: https://github.com/apache/spark/pull/45944#issuecomment-2044200548 Thank you, @pan3793 .

Re: [PR] [SPARK-47581][CORE] SQL catalyst: Migrate logWarning with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on code in PR #45904: URL: https://github.com/apache/spark/pull/45904#discussion_r1556952867 ## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala: ## @@ -95,6 +117,8 @@ object LogKey extends Enumeration { val SHUFFLE_MERGE_ID = Value

Re: [PR] [SPARK-47589][SQL]Hive-Thriftserver: Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
panbingkun commented on PR #45936: URL: https://github.com/apache/spark/pull/45936#issuecomment-2044193279 +1, LGTM. Thank you!

Re: [PR] [SPARK-47754][SQL] Postgres: Support reading multidimensional arrays [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on PR #45917: URL: https://github.com/apache/spark/pull/45917#issuecomment-2044191406 Merged to master. Thank you!

Re: [PR] [SPARK-47274][PYTHON][SQL] Provide more useful context for PySpark DataFrame API errors [spark]

2024-04-08 Thread via GitHub
itholic commented on PR #45377: URL: https://github.com/apache/spark/pull/45377#issuecomment-2044191299 @cloud-fan @ueshin I believe the previous comments are now all resolved, and I have added more tests accordingly. Could you take a look when you find some time?

Re: [PR] [SPARK-47754][SQL] Postgres: Support reading multidimensional arrays [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun closed pull request #45917: [SPARK-47754][SQL] Postgres: Support reading multidimensional arrays URL: https://github.com/apache/spark/pull/45917

Re: [PR] [SPARK-47774][INFRA] Remove redundant rules from `MimaExcludes` [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on PR #45944: URL: https://github.com/apache/spark/pull/45944#issuecomment-2044190123 MiMa test passed on this PR. ![Screenshot 2024-04-08 at 22 43 49](https://github.com/apache/spark/assets/9700541/5f103235-5199-4c24-9d6f-4cc676d25220)

Re: [PR] [SPARK-47771][PYTHON][TESTS] Make `max_by, min_by` doctests deterministic [spark]

2024-04-08 Thread via GitHub
zhengruifeng commented on PR #45939: URL: https://github.com/apache/spark/pull/45939#issuecomment-2044186910 thanks, merged to master

Re: [PR] [SPARK-47771][PYTHON][TESTS] Make `max_by, min_by` doctests deterministic [spark]

2024-04-08 Thread via GitHub
zhengruifeng closed pull request #45939: [SPARK-47771][PYTHON][TESTS] Make `max_by, min_by` doctests deterministic URL: https://github.com/apache/spark/pull/45939

[PR] [SPARK-47775][SQL] Support remaining scalar types in the variant spec. [spark]

2024-04-08 Thread via GitHub
chenhao-db opened a new pull request, #45945: URL: https://github.com/apache/spark/pull/45945 ### What changes were proposed in this pull request? This PR adds support for the remaining scalar types defined in the variant spec (DATE, TIMESTAMP, TIMESTAMP_NTZ, FLOAT, BINARY). The

[PR] [SPARK-47774][INFRA] Remove redundant rules from `MimaExcludes` [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun opened a new pull request, #45944: URL: https://github.com/apache/spark/pull/45944 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How was this

Re: [PR] [SPARK-47706][BUILD] Bump json4s 4.0.7 [spark]

2024-04-08 Thread via GitHub
pan3793 commented on code in PR #45838: URL: https://github.com/apache/spark/pull/45838#discussion_r1556905110 ## sql/api/pom.xml: ## @@ -56,13 +56,6 @@ org.json4s json4s-jackson_${scala.binary.version} -3.7.0-M11 Review Comment:

[PR] [MINOR][BUILD] Remove redundant json4s-jackson exclusions in sql/api POM [spark]

2024-04-08 Thread via GitHub
pan3793 opened a new pull request, #45943: URL: https://github.com/apache/spark/pull/45943 ### What changes were proposed in this pull request? The dependency's version and exclusions are managed by the root `pom.xml` `dependencyManagement`, so we don't need to replicate the

Re: [PR] [SPARK-47706][BUILD] Bump json4s 4.0.7 [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on code in PR #45838: URL: https://github.com/apache/spark/pull/45838#discussion_r1556899643 ## sql/api/pom.xml: ## @@ -56,13 +56,6 @@ org.json4s json4s-jackson_${scala.binary.version} -3.7.0-M11 Review

Re: [PR] [SPARK-47706][BUILD] Bump json4s 4.0.7 [spark]

2024-04-08 Thread via GitHub
pan3793 commented on code in PR #45838: URL: https://github.com/apache/spark/pull/45838#discussion_r1556900847 ## sql/api/pom.xml: ## @@ -56,13 +56,6 @@ org.json4s json4s-jackson_${scala.binary.version} -3.7.0-M11 Review Comment:

Re: [PR] [SPARK-47706][BUILD] Bump json4s 4.0.7 [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on PR #45838: URL: https://github.com/apache/spark/pull/45838#issuecomment-2044149193 Thank you for rebasing.

Re: [PR] [SPARK-47770][INFRA] Fix `GenerateMIMAIgnore.isPackagePrivateModule` to return `false` instead of failing [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on PR #45938: URL: https://github.com/apache/spark/pull/45938#issuecomment-2044144931 Since this is a kind of error handling in the infra module, I backported this to branch-3.5/3.4, too.

Re: [PR] [SPARK-47770][INFRA] Fix `GenerateMIMAIgnore.isPackagePrivateModule` to return `false` instead of failing [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on PR #45938: URL: https://github.com/apache/spark/pull/45938#issuecomment-2044139422 Thank you, @HyukjinKwon , @LuciferYang , @yaooqinn , @pan3793 !

Re: [PR] [SPARK-47680][SQL] Add variant_explode expression. [spark]

2024-04-08 Thread via GitHub
cloud-fan commented on code in PR #45805: URL: https://github.com/apache/spark/pull/45805#discussion_r1556871731 ## sql/core/src/test/scala/org/apache/spark/sql/VariantSuite.scala: ## @@ -298,4 +300,28 @@ class VariantSuite extends QueryTest with SharedSparkSession { }

Re: [PR] [SPARK-47680][SQL] Add variant_explode expression. [spark]

2024-04-08 Thread via GitHub
srielau commented on code in PR #45805: URL: https://github.com/apache/spark/pull/45805#discussion_r1556857735 ## sql/core/src/test/scala/org/apache/spark/sql/VariantSuite.scala: ## @@ -298,4 +300,28 @@ class VariantSuite extends QueryTest with SharedSparkSession { }

Re: [PR] Add support for AbstractArrayType(StringTypeCollated) [spark]

2024-04-08 Thread via GitHub
cloud-fan commented on code in PR #45891: URL: https://github.com/apache/spark/pull/45891#discussion_r1556851091 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/StringTypeCollated.scala: ## @@ -52,3 +52,20 @@ case object StringTypeAnyCollation extends

[PR] [SPARK-47759][CORE] Fix an issue where JavaUtils.timeStringAs fails to parse a legitimate time string [spark]

2024-04-08 Thread via GitHub
xiongbo-sjtu opened a new pull request, #45942: URL: https://github.com/apache/spark/pull/45942 ### What changes were proposed in this pull request? Fix an issue where JavaUtils.timeStringAs fails to parse a legitimate time string. ### Why are the changes needed? It's

Re: [PR] [SPARK-47682][SQL] Support cast from variant. [spark]

2024-04-08 Thread via GitHub
cloud-fan commented on PR #45807: URL: https://github.com/apache/spark/pull/45807#issuecomment-2044110214 thanks, merging to master!

Re: [PR] [SPARK-47682][SQL] Support cast from variant. [spark]

2024-04-08 Thread via GitHub
cloud-fan closed pull request #45807: [SPARK-47682][SQL] Support cast from variant. URL: https://github.com/apache/spark/pull/45807

Re: [PR] [SPARK-47706][BUILD] Bump json4s 4.0.7 [spark]

2024-04-08 Thread via GitHub
pan3793 commented on PR #45838: URL: https://github.com/apache/spark/pull/45838#issuecomment-2044102847 @dongjoon-hyun @cloud-fan @gengliangwang @LuciferYang I updated MimaExcludes, now it looks good.

Re: [PR] [SPARK-47706][BUILD] Bump json4s 4.0.7 [spark]

2024-04-08 Thread via GitHub
pan3793 commented on code in PR #45838: URL: https://github.com/apache/spark/pull/45838#discussion_r1556835313 ## project/MimaExcludes.scala: ## @@ -90,7 +90,21 @@ object MimaExcludes {

Re: [PR] [SPARK-47706][BUILD] Bump json4s 4.0.7 [spark]

2024-04-08 Thread via GitHub
pan3793 commented on code in PR #45838: URL: https://github.com/apache/spark/pull/45838#discussion_r1556834466 ## project/MimaExcludes.scala: ## @@ -90,7 +90,11 @@ object MimaExcludes {

Re: [PR] [SPARK-47706][BUILD] Bump json4s 4.0.7 [spark]

2024-04-08 Thread via GitHub
pan3793 commented on code in PR #45838: URL: https://github.com/apache/spark/pull/45838#discussion_r1556833527 ## project/MimaExcludes.scala: ## @@ -90,7 +90,11 @@ object MimaExcludes {

Re: [PR] [SPARK-47770][INFRA] Fix `GenerateMIMAIgnore.isPackagePrivateModule` to return `false` instead of failing [spark]

2024-04-08 Thread via GitHub
LuciferYang commented on PR #45938: URL: https://github.com/apache/spark/pull/45938#issuecomment-2044079821 Merged into master for Spark 4.0. Thanks @dongjoon-hyun @HyukjinKwon @yaooqinn @pan3793

Re: [PR] [SPARK-47770][INFRA] Fix `GenerateMIMAIgnore.isPackagePrivateModule` to return `false` instead of failing [spark]

2024-04-08 Thread via GitHub
LuciferYang closed pull request #45938: [SPARK-47770][INFRA] Fix `GenerateMIMAIgnore.isPackagePrivateModule` to return `false` instead of failing URL: https://github.com/apache/spark/pull/45938

Re: [PR] [SPARK-47593][CORE] Connector module: Migrate logWarn with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
panbingkun commented on PR #45879: URL: https://github.com/apache/spark/pull/45879#issuecomment-2044077387 > @panbingkun Thanks for the works! LGTM except for some minor comments. @gengliangwang all done. Thanks!

Re: [PR] [SPARK-47761][SQL] Oracle: Support reading AnsiIntervalTypes [spark]

2024-04-08 Thread via GitHub
yaooqinn commented on code in PR #45925: URL: https://github.com/apache/spark/pull/45925#discussion_r1556791614 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala: ## @@ -526,6 +526,16 @@ object JdbcUtils extends Logging with

Re: [PR] [SPARK-47754][SQL] Postgres: Support reading multidimensional arrays [spark]

2024-04-08 Thread via GitHub
yaooqinn commented on code in PR #45917: URL: https://github.com/apache/spark/pull/45917#discussion_r1556790492 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala: ## @@ -798,6 +798,9 @@ abstract class JdbcDialect extends Serializable with Logging {

Re: [PR] [SPARK-47770][INFRA] Fix `GenerateMIMAIgnore.isPackagePrivateModule` to return `false` instead of failing [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on PR #45938: URL: https://github.com/apache/spark/pull/45938#issuecomment-2044054285 Could you review this PR, @HyukjinKwon ?

[PR] [SPARK-47772][PYTHON][TESTS] Fix the doctest of `mode` function [spark]

2024-04-08 Thread via GitHub
zhengruifeng opened a new pull request, #45940: URL: https://github.com/apache/spark/pull/45940 ### What changes were proposed in this pull request? Fix the doctest of `mode` function ### Why are the changes needed? When the data has multiple modes, `mode("col",
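The snippet above is cut off, so the exact fix in the PR is not shown here. As a plain-Python illustration only (not PySpark code; `all_modes` and `deterministic_mode` are hypothetical helper names), the sketch below shows why a doctest over a mode can be unstable when several values tie for the highest frequency, and one way to make the answer repeatable by breaking ties on the smallest value.

```python
from collections import Counter


def all_modes(values):
    """Return every value that shares the highest frequency, sorted."""
    counts = Counter(values)
    top = max(counts.values())  # raises ValueError on empty input
    return sorted(v for v, c in counts.items() if c == top)


def deterministic_mode(values):
    """Pick one mode by breaking ties on the smallest tied value."""
    return all_modes(values)[0]


# Two values tie here, so "the" mode is ambiguous without a tie-break rule.
tied = [2, 2, 1, 1, 3]
print(all_modes(tied))
print(deterministic_mode(tied))
```

A doctest that hard-codes one of the tied values passes only if the engine happens to return that value; asserting on a tie-broken result, or using data with a single mode, avoids that dependence.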

[PR] [SPARK-47771][PYTHON][TESTS] Make `max_by, min_by` doctests deterministic [spark]

2024-04-08 Thread via GitHub
zhengruifeng opened a new pull request, #45939: URL: https://github.com/apache/spark/pull/45939 ### What changes were proposed in this pull request? Make `max_by, min_by` doctests deterministic ### Why are the changes needed? they do fail when testing env (python version,
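The reason given in the snippet is truncated. One common cause of this kind of instability, assumed here rather than confirmed by the text, is a tie on the ordering column. The plain-Python sketch below (not PySpark; `max_by_candidates` and the sample rows are hypothetical) lists every valid answer for a max-by lookup, which makes it easy to see when example data admits more than one.

```python
def max_by_candidates(rows):
    """Return every x whose ordering value y equals the maximum y."""
    top = max(y for _, y in rows)
    return [x for x, y in rows if y == top]


# With a tie on y, both "a" and "b" are valid answers.
rows_with_tie = [("a", 10), ("b", 10), ("c", 5)]
# With distinct y values, exactly one answer exists.
rows_unique = [("a", 10), ("b", 7), ("c", 5)]

print(max_by_candidates(rows_with_tie))
print(max_by_candidates(rows_unique))
```

Example data for which this helper returns a single candidate gives the same expected output regardless of row order or environment.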

Re: [PR] [SPARK-47767][SQL] Show offset value in TakeOrderedAndProjectExec [spark]

2024-04-08 Thread via GitHub
HyukjinKwon commented on code in PR #45931: URL: https://github.com/apache/spark/pull/45931#discussion_r1556759763 ## sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala: ## @@ -358,7 +358,8 @@ case class TakeOrderedAndProjectExec( val orderByString =

Re: [PR] [MINOR][TESTS] Make `check_invalid_args` reusable in parity test [spark]

2024-04-08 Thread via GitHub
HyukjinKwon closed pull request #45928: [MINOR][TESTS] Make `check_invalid_args` reusable in parity test URL: https://github.com/apache/spark/pull/45928

Re: [PR] [MINOR][TESTS] Make `check_invalid_args` reusable in parity test [spark]

2024-04-08 Thread via GitHub
HyukjinKwon commented on PR #45928: URL: https://github.com/apache/spark/pull/45928#issuecomment-2043999514 Merged to master.

Re: [PR] [SPARK-47586][SQL] Hive module: Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
itholic commented on code in PR #45876: URL: https://github.com/apache/spark/pull/45876#discussion_r1556686231 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala: ## @@ -686,17 +687,19 @@ private[hive] class HiveClientImpl( } catch {

Re: [PR] [WIP][SPARK-47758][BUILD] Upgrade commons-collections4 to 4.5.0-M1 [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on code in PR #45921: URL: https://github.com/apache/spark/pull/45921#discussion_r1556679227 ## dev/deps/spark-deps-hadoop-3-hive-2.3: ## @@ -38,7 +38,7 @@ chill_2.13/0.10.0//chill_2.13-0.10.0.jar commons-cli/1.6.0//commons-cli-1.6.0.jar

Re: [PR] [WIP][SPARK-47758][BUILD] Upgrade commons-collections4 to 4.5.0-M1 [spark]

2024-04-08 Thread via GitHub
panbingkun commented on code in PR #45921: URL: https://github.com/apache/spark/pull/45921#discussion_r1556677753 ## dev/deps/spark-deps-hadoop-3-hive-2.3: ## @@ -38,7 +38,7 @@ chill_2.13/0.10.0//chill_2.13-0.10.0.jar commons-cli/1.6.0//commons-cli-1.6.0.jar

Re: [PR] Operator 1.0.0-alpha [spark-kubernetes-operator]

2024-04-08 Thread via GitHub
jiangzho commented on code in PR #2: URL: https://github.com/apache/spark-kubernetes-operator/pull/2#discussion_r1556672404 ## build-tools/helm/spark-kubernetes-operator/crds/sparkapplications.org.apache.spark-v1.yml: ## @@ -0,0 +1,6947 @@ +# Generated by Fabric8 CRDGenerator,

Re: [PR] [SPARK-47609][SQL] Making CacheLookup more optimal to minimize cache miss [spark]

2024-04-08 Thread via GitHub
ahshahid commented on PR #45935: URL: https://github.com/apache/spark/pull/45935#issuecomment-2043936475 @cloud-fan I have extracted the cache lookup optimization part from the PR for SPARK-45959. I have added a sanity test and will add some more (more tests to check caching

Re: [PR] Operator 1.0.0-alpha [spark-kubernetes-operator]

2024-04-08 Thread via GitHub
jiangzho commented on code in PR #2: URL: https://github.com/apache/spark-kubernetes-operator/pull/2#discussion_r155140 ## README.md: ## @@ -1 +1,63 @@ -# spark-kubernetes-operator + Review Comment: removed per

Re: [PR] [SPARK-47586][SQL] Hive module: Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
itholic commented on code in PR #45876: URL: https://github.com/apache/spark/pull/45876#discussion_r1556661227 ## sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala: ## @@ -686,17 +687,19 @@ private[hive] class HiveClientImpl( } catch {

Re: [PR] Operator 1.0.0-alpha [spark-kubernetes-operator]

2024-04-08 Thread via GitHub
jiangzho commented on code in PR #2: URL: https://github.com/apache/spark-kubernetes-operator/pull/2#discussion_r1556661478 ## spark-operator/src/main/java/org/apache/spark/kubernetes/operator/config/SparkOperatorConf.java: ## @@ -0,0 +1,407 @@ +/* + * Licensed to the Apache

Re: [PR] Operator 1.0.0-alpha [spark-kubernetes-operator]

2024-04-08 Thread via GitHub
jiangzho commented on code in PR #2: URL: https://github.com/apache/spark-kubernetes-operator/pull/2#discussion_r1556660989 ## spark-operator-docs/spark_application.md: ## @@ -0,0 +1,212 @@ +## Spark Application API + +The core user facing API of the Spark Kubernetes Operator

Re: [PR] [SPARK-47770][INFRA] Fix `GenerateMIMAIgnore.isPackagePrivateModule` to return false instead of failing [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on PR #45938: URL: https://github.com/apache/spark/pull/45938#issuecomment-2043930082 cc @cloud-fan , @gengliangwang , @pan3793 from https://github.com/apache/spark/pull/45838

Re: [PR] [SPARK-47706][BUILD] Bump json4s 4.0.7 [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on code in PR #45838: URL: https://github.com/apache/spark/pull/45838#discussion_r1556656706 ## project/MimaExcludes.scala: ## @@ -90,7 +90,21 @@ object MimaExcludes {

Re: [PR] [SPARK-47706][BUILD] Bump json4s 4.0.7 [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on PR #45838: URL: https://github.com/apache/spark/pull/45838#issuecomment-2043927068 To @pan3793 , for MIMA issue, I made the following PR to fix it. - #45938

[PR] [SPARK-47770][INFRA] Fix `GenerateMIMAIgnore.isPackagePrivateModule` to return false instead of failing [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun opened a new pull request, #45938: URL: https://github.com/apache/spark/pull/45938 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

Re: [PR] [DO-NOT-REVIEW][DRAFT] Spark 45637 multiple state test [spark]

2024-04-08 Thread via GitHub
github-actions[bot] closed pull request #44076: [DO-NOT-REVIEW][DRAFT] Spark 45637 multiple state test URL: https://github.com/apache/spark/pull/44076

Re: [PR] [DOCS][MINOR] Fix typos [spark]

2024-04-08 Thread via GitHub
github-actions[bot] commented on PR #44276: URL: https://github.com/apache/spark/pull/44276#issuecomment-2043923101 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-47581][CORE] SQL catalyst: Migrate logWarning with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on code in PR #45904: URL: https://github.com/apache/spark/pull/45904#discussion_r1556652407 ## common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala: ## @@ -95,6 +117,8 @@ object LogKey extends Enumeration { val SHUFFLE_MERGE_ID = Value

Re: [PR] [SPARK-47589][SQL]Hive-Thriftserver: Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang closed pull request #45936: [SPARK-47589][SQL]Hive-Thriftserver: Migrate logError with variables to structured logging framework URL: https://github.com/apache/spark/pull/45936

Re: [PR] [SPARK-47589][SQL]Hive-Thriftserver: Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on code in PR #45936: URL: https://github.com/apache/spark/pull/45936#discussion_r1556646755 ## sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala: ## @@ -142,7 +144,9 @@ private[hive] class

Re: [PR] [SPARK-47589][SQL]Hive-Thriftserver: Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on PR #45936: URL: https://github.com/apache/spark/pull/45936#issuecomment-2043916102 @dongjoon-hyun @itholic Thanks for the review. Merging to master

Re: [PR] [SPARK-47586][SQL] Hive module: Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
itholic commented on PR #45876: URL: https://github.com/apache/spark/pull/45876#issuecomment-2043911366 > @itholic we need more migrations: I believe `src/main/scala/org/apache/spark/sql/hive/TableReader.scala` is already addressed. Just migrated

[PR] [SPARK-47733] Add custom metrics for transformWithState operator part of query progress [spark]

2024-04-08 Thread via GitHub
anishshri-db opened a new pull request, #45937: URL: https://github.com/apache/spark/pull/45937 ### What changes were proposed in this pull request? Add custom metrics for transformWithState operator part of query progress ### Why are the changes needed? Adding operational

Re: [PR] [SPARK-47586][SQL] Hive module: Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
itholic commented on code in PR #45876: URL: https://github.com/apache/spark/pull/45876#discussion_r1556625020 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala: ## @@ -200,8 +202,9 @@ class HiveExternalCatalogVersionsSuite extends

Re: [PR] [SPARK-47274][PYTHON][SQL] Provide more useful context for PySpark DataFrame API errors [spark]

2024-04-08 Thread via GitHub
itholic commented on code in PR #45377: URL: https://github.com/apache/spark/pull/45377#discussion_r1556624078 ## sql/core/src/main/scala/org/apache/spark/sql/Column.scala: ## @@ -699,6 +699,13 @@ class Column(val expr: Expression) extends Logging { */ def plus(other:

Re: [PR] [SPARK-47737][PYTHON] Bump PyArrow to 10.0.0 [spark]

2024-04-08 Thread via GitHub
itholic commented on PR #45892: URL: https://github.com/apache/spark/pull/45892#issuecomment-2043887649 Thanks for the review @dongjoon-hyun and @HyukjinKwon !

Re: [PR] [SPARK-47589][SQL]Hive-Thriftserver: Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
itholic commented on code in PR #45936: URL: https://github.com/apache/spark/pull/45936#discussion_r1556619849 ## sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala: ## @@ -142,7 +144,9 @@ private[hive] class

Re: [PR] [SPARK-47230] [SQL] Push projection through Filter-> Generate [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on PR #45839: URL: https://github.com/apache/spark/pull/45839#issuecomment-2043844239 Thank you for making a PR, @theHarpers . - Could you make a PR to `master` branch instead of `branch-3.5`? - Do you mean this is a regression of any previous JIRA issue?

Re: [PR] [SPARK-47706][BUILD] Bump json4s 4.0.7 [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on code in PR #45838: URL: https://github.com/apache/spark/pull/45838#discussion_r1556539422 ## project/MimaExcludes.scala: ## @@ -90,7 +90,21 @@ object MimaExcludes {

Re: [PR] [SPARK-47706][BUILD] Bump json4s 4.0.7 [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on code in PR #45838: URL: https://github.com/apache/spark/pull/45838#discussion_r1556151836 ## project/MimaExcludes.scala: ## @@ -90,7 +90,21 @@ object MimaExcludes {

Re: [PR] [SPARK-47581][CORE] SQL catalyst: Migrate logWarning with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
dtenedor commented on code in PR #45904: URL: https://github.com/apache/spark/pull/45904#discussion_r1556511199 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/StreamingJoinHelper.scala: ## @@ -284,7 +287,8 @@ object StreamingJoinHelper extends

Re: [PR] [SPARK-47696][SQL] try_to_timestamp should handle SparkUpgradeException [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on code in PR #45822: URL: https://github.com/apache/spark/pull/45822#discussion_r1556503931 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala: ## @@ -1235,11 +1235,13 @@ object

Re: [PR] [SPARK-47696][SQL] try_to_timestamp should handle SparkUpgradeException [spark]

2024-04-08 Thread via GitHub
gengliangwang closed pull request #45822: [SPARK-47696][SQL] try_to_timestamp should handle SparkUpgradeException URL: https://github.com/apache/spark/pull/45822

Re: [PR] [SPARK-47589][SQL]Hive-Thriftserver: Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on PR #45936: URL: https://github.com/apache/spark/pull/45936#issuecomment-2043743550 > the organization of this log migration workitem. I am trying to split the tasks as per the "project" definition in SBT. For example, we got project

Re: [PR] [SPARK-47589][SQL]Hive-Thriftserver: Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on PR #45936: URL: https://github.com/apache/spark/pull/45936#issuecomment-2043738478 Oh, I didn't mean it was wrong here. I want to understand the organization of this log migration workitem. :)

Re: [PR] [SPARK-47589][SQL]Hive-Thriftserver: Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on PR #45936: URL: https://github.com/apache/spark/pull/45936#issuecomment-2043737090 @dongjoon-hyun ok, I just changed back to `[SQL]`. Thanks for pointing it out!

Re: [PR] [SPARK-47589][HIVE-THRIFTSERVER] Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on PR #45936: URL: https://github.com/apache/spark/pull/45936#issuecomment-2043735363 Just for my knowledge, may I ask why this should be different from the GitHub Action label, @gengliangwang ?

Re: [PR] [SPARK-47589][HIVE-THRIFTSERVER] Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on PR #45936: URL: https://github.com/apache/spark/pull/45936#issuecomment-2043734561 Oh, should this be `[HIVE-THRIFTSERVER]`, @gengliangwang ?

Re: [PR] [SPARK-47761][SQL] Oracle: Support reading AnsiIntervalTypes [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on PR #45925: URL: https://github.com/apache/spark/pull/45925#issuecomment-2043731308 Also, cc @gengliangwang , too.

Re: [PR] [SPARK-47589][CORE] Hive-thriftserver: Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on PR #45936: URL: https://github.com/apache/spark/pull/45936#issuecomment-2043731880 cc @dtenedor @panbingkun @itholic

Re: [PR] [SPARK-47761][SQL] Oracle: Support reading AnsiIntervalTypes [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on code in PR #45925: URL: https://github.com/apache/spark/pull/45925#discussion_r1556482381 ## connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/OracleIntegrationSuite.scala: ## @@ -540,4 +541,30 @@ class OracleIntegrationSuite

Re: [PR] [SPARK-47761][SQL] Oracle: Support reading AnsiIntervalTypes [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on code in PR #45925: URL: https://github.com/apache/spark/pull/45925#discussion_r1556482166 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcDialects.scala: ## @@ -164,6 +166,17 @@ abstract class JdbcDialect extends Serializable with Logging {

[PR] [SPARK-47589][CORE] Hive-thriftserver: Migrate logError with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang opened a new pull request, #45936: URL: https://github.com/apache/spark/pull/45936 ### What changes were proposed in this pull request? Migrate logError with variables of Hive-thriftserver module to the structured logging framework. ### Why are the

Re: [PR] [SPARK-47761][SQL] Oracle: Support reading AnsiIntervalTypes [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on code in PR #45925: URL: https://github.com/apache/spark/pull/45925#discussion_r1556481273 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala: ## @@ -526,6 +526,16 @@ object JdbcUtils extends Logging with

[PR] [SPARK-47609][SQL] Making CacheLookup more optimal to minimize cache miss [spark]

2024-04-08 Thread via GitHub
ahshahid opened a new pull request, #45935: URL: https://github.com/apache/spark/pull/45935 ### What changes were proposed in this pull request? Currently the CacheLookup relies on the fragment ( sub plan of incoming query) matching exactly the analyzed Logical Plan in CacheManager for

Re: [PR] [SPARK-47366][PYTHON] Add VariantVal for PySpark [spark]

2024-04-08 Thread via GitHub
gene-db commented on code in PR #45826: URL: https://github.com/apache/spark/pull/45826#discussion_r1556457254 ## python/pyspark/sql/tests/test_types.py: ## @@ -1406,6 +1407,68 @@ def test_calendar_interval_type_with_sf(self): schema1 =

Re: [PR] [SPARK-47366][PYTHON] Add VariantVal for PySpark [spark]

2024-04-08 Thread via GitHub
gene-db commented on code in PR #45826: URL: https://github.com/apache/spark/pull/45826#discussion_r1556456624 ## python/pyspark/sql/types.py: ## @@ -1468,6 +1475,36 @@ def __eq__(self, other: Any) -> bool: return type(self) == type(other) +class VariantVal:

Re: [PR] [SPARK-47734][PYTHON][TESTS][3.4] Fix flaky DataFrame.writeStream doctest by stopping streaming query [spark]

2024-04-08 Thread via GitHub
dongjoon-hyun commented on PR #45908: URL: https://github.com/apache/spark/pull/45908#issuecomment-2043670805 Thank you, @JoshRosen and all!

Re: [PR] [SPARK-47581][CORE] SQL catalyst: Migrate logWarning with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on code in PR #45904: URL: https://github.com/apache/spark/pull/45904#discussion_r1556423196 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala: ## @@ -136,9 +137,11 @@ object StringUtils extends Logging { override

Re: [PR] [SPARK-47581][CORE] SQL catalyst: Migrate logWarning with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on code in PR #45904: URL: https://github.com/apache/spark/pull/45904#discussion_r1556422516 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala: ## @@ -321,8 +322,11 @@ object ResolveDefaultColumns extends

Re: [PR] [SPARK-47581][CORE] SQL catalyst: Migrate logWarning with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on code in PR #45904: URL: https://github.com/apache/spark/pull/45904#discussion_r1556421508 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/CharVarcharUtils.scala: ## @@ -74,10 +75,10 @@ object CharVarcharUtils extends Logging with

Re: [PR] [SPARK-47581][CORE] SQL catalyst: Migrate logWarning with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on code in PR #45904: URL: https://github.com/apache/spark/pull/45904#discussion_r1556417726 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVHeaderChecker.scala: ## @@ -75,19 +75,21 @@ class CSVHeaderChecker( }

Re: [PR] [SPARK-47581][CORE] SQL catalyst: Migrate logWarning with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on code in PR #45904: URL: https://github.com/apache/spark/pull/45904#discussion_r1556415217 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/StreamingJoinHelper.scala: ## @@ -284,7 +287,8 @@ object StreamingJoinHelper extends

Re: [PR] [SPARK-47593][CORE] Connector module: Migrate logWarn with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on PR #45879: URL: https://github.com/apache/spark/pull/45879#issuecomment-2043577634 @panbingkun Thanks for the works! LGTM except for some minor comments.

Re: [PR] [SPARK-47593][CORE] Connector module: Migrate logWarn with variables to structured logging framework [spark]

2024-04-08 Thread via GitHub
gengliangwang commented on code in PR #45879: URL: https://github.com/apache/spark/pull/45879#discussion_r1556386948 ## connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala: ## @@ -653,13 +654,13 @@ class KafkaTestUtils(
