Re: [PR] [SPARK-47787][BUILD][3.5] Upgrade `commons-compress` to 1.26.1 [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun closed pull request #45967: [SPARK-47787][BUILD][3.5] Upgrade `commons-compress` to 1.26.1 URL: https://github.com/apache/spark/pull/45967 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

Re: [PR] [SPARK-47960][SS] Allow chaining other stateful operators after transformWIthState operator. [spark]

2024-05-03 Thread via GitHub
sahnib commented on code in PR #45376: URL: https://github.com/apache/spark/pull/45376#discussion_r1589364420 ## sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala: ## @@ -702,6 +742,39 @@ class KeyValueGroupedDataset[K, V] private[sql](

Re: [PR] [SPARK-47960][SS] Allow chaining other stateful operators after transformWIthState operator. [spark]

2024-05-03 Thread via GitHub
sahnib commented on code in PR #45376: URL: https://github.com/apache/spark/pull/45376#discussion_r1589375804 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/EventTimeWatermarkExec.scala: ## @@ -107,25 +108,67 @@ case class EventTimeWatermarkExec( }

Re: [PR] [SPARK-48113][CONNECT] Allow Plugins to integrate with Spark Connect [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46364: URL: https://github.com/apache/spark/pull/46364#issuecomment-2093265227 Could you re-trigger CIs, @tomvanbussel ?

Re: [PR] [SPARK-47960][SS] Allow chaining other stateful operators after transformWIthState operator. [spark]

2024-05-03 Thread via GitHub
sahnib commented on code in PR #45376: URL: https://github.com/apache/spark/pull/45376#discussion_r1589383634 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala: ## @@ -347,6 +347,33 @@ class IncrementalExecution(

Re: [PR] [SPARK-48113][CONNECT] Allow Plugins to integrate with Spark Connect [spark]

2024-05-03 Thread via GitHub
HyukjinKwon commented on code in PR #46364: URL: https://github.com/apache/spark/pull/46364#discussion_r1589059935 ## connector/connect/server/src/test/scala/org/apache/spark/sql/connect/plugin/SparkConnectPluginRegistrySuite.scala: ## @@ -20,8 +20,8 @@ package

[PR] [SPARK-48115][INFRA] Remove `Python 3.11` from `build_python.yml` [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun opened a new pull request, #46366: URL: https://github.com/apache/spark/pull/46366 ### What changes were proposed in this pull request? This PR aims to remove `Python 3.11` from `build_python.yml` Daily CI because `Python 3.11` is the main python version in the PR and

Re: [PR] [SPARK-48059][CORE] Implement the structured log framework on the java side [spark]

2024-05-03 Thread via GitHub
panbingkun commented on code in PR #46301: URL: https://github.com/apache/spark/pull/46301#discussion_r1589171804 ## common/utils/src/main/java/org/apache/spark/internal/Logger.java: ## @@ -0,0 +1,184 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more

Re: [PR] [SPARK-48049][BUILD] Upgrade Scala to 2.13.14 [spark]

2024-05-03 Thread via GitHub
panbingkun commented on PR #46288: URL: https://github.com/apache/spark/pull/46288#issuecomment-2093048107 @SethTisue Just right, you are here too https://github.com/scala/scala/pull/10717

Re: [PR] [SPARK-47960][SS] Allow chaining other stateful operators after transformWIthState operator. [spark]

2024-05-03 Thread via GitHub
sahnib commented on code in PR #45376: URL: https://github.com/apache/spark/pull/45376#discussion_r1589359837 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/EventTimeWatermark.scala: ## @@ -49,26 +80,31 @@ case class EventTimeWatermark( // logic

Re: [PR] [SPARK-48049][BUILD] Upgrade Scala to 2.13.14 [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46288: URL: https://github.com/apache/spark/pull/46288#issuecomment-2093258975 Thank you for the analysis and sharing, @panbingkun and @LuciferYang .

[PR] [SPARK-48116][INFRA] Move `pyspark-pandas*` tests to Daily Python CIs [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun opened a new pull request, #46367: URL: https://github.com/apache/spark/pull/46367 ### What changes were proposed in this pull request? This PR aims to remove `Python 3.11` from `build_python.yml` Daily CI because `Python 3.11` is the main python version in the PR and

[PR] [SPARK-48114][CORE] Precompile template regex to avoid unnecessary work [spark]

2024-05-03 Thread via GitHub
vladimirg-db opened a new pull request, #46365: URL: https://github.com/apache/spark/pull/46365 ### What changes were proposed in this pull request? Error message template regex is now precompiled to avoid unnecessary work ### Why are the changes needed?
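
The idea named in this PR title can be illustrated with a minimal sketch. It is written in Python rather than Spark's Scala, and the `<name>` placeholder syntax, `_PARAM_PATTERN`, and `render` are assumptions made for illustration; this is not the code from the PR.

```python
import re

# Hypothetical template syntax: parameters are written as <name>.
# The pattern is compiled once at import time, not on every call.
_PARAM_PATTERN = re.compile(r"<([a-zA-Z0-9_-]+)>")


def render(template: str, params: dict[str, str]) -> str:
    """Substitute <name> placeholders using the precompiled pattern."""
    return _PARAM_PATTERN.sub(lambda m: params[m.group(1)], template)
```

Calling `render("Cannot find <col> in <table>", {"col": "id", "table": "t"})` reuses the same compiled pattern on every invocation.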

Re: [PR] [SPARK-47960][SS] Allow chaining other stateful operators after transformWIthState operator. [spark]

2024-05-03 Thread via GitHub
sahnib commented on code in PR #45376: URL: https://github.com/apache/spark/pull/45376#discussion_r1589450965 ## sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithStateChainingSuite.scala: ## @@ -0,0 +1,364 @@ +/* + * Licensed to the Apache Software Foundation

Re: [PR] [SPARK-48049][BUILD] Upgrade Scala to 2.13.14 [spark]

2024-05-03 Thread via GitHub
LuciferYang commented on PR #46288: URL: https://github.com/apache/spark/pull/46288#issuecomment-2093132416 https://github.com/jline/jline3/blob/c75301facc8716b59c1d57d3e3c5943358022560/pom.xml#L93-L95

Re: [PR] [SPARK-48049][BUILD] Upgrade Scala to 2.13.14 [spark]

2024-05-03 Thread via GitHub
LuciferYang commented on PR #46288: URL: https://github.com/apache/spark/pull/46288#issuecomment-2093165641 [INFO] Restricted to JDK 17 yet org.jline:jline:jar:3.25.1:compile contains org/jline/terminal/impl/ffm/CLibrary$termios.class targeted to 65.-257 Oh... It seems that

Re: [PR] [SPARK-48112][CONNECT] Expose session in SparkConnectPlanner to plugins [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on code in PR #46363: URL: https://github.com/apache/spark/pull/46363#discussion_r1589376607 ## connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala: ## @@ -101,7 +101,8 @@ class SparkConnectPlanner(

Re: [PR] [SPARK-47960][SS] Allow chaining other stateful operators after transformWIthState operator. [spark]

2024-05-03 Thread via GitHub
sahnib commented on code in PR #45376: URL: https://github.com/apache/spark/pull/45376#discussion_r1589380529 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/TransformWithStateExec.scala: ## @@ -78,15 +78,32 @@ case class TransformWithStateExec( override

Re: [PR] [SPARK-48115][INFRA] Remove `Python 3.11` from `build_python.yml` [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46366: URL: https://github.com/apache/spark/pull/46366#issuecomment-2093697000 Could you review this PR, @viirya ?

Re: [PR] [SPARK-48105] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
HeartSaVioR commented on code in PR #46351: URL: https://github.com/apache/spark/pull/46351#discussion_r1589713513 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala: ## @@ -351,7 +351,7 @@ private[sql] class

Re: [PR] [SPARK-48118][SQL] Support `SPARK_SQL_LEGACY_CREATE_HIVE_TABLE` env variable [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46369: URL: https://github.com/apache/spark/pull/46369#issuecomment-2093744223 cc @cloud-fan

Re: [PR] [SPARK-48105][SS] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
anishshri-db commented on PR #46351: URL: https://github.com/apache/spark/pull/46351#issuecomment-2093801768 @HeartSaVioR - do we also need to port to older Spark versions ? 3.5 maybe ?

[PR] [SPARK-48120] Enable autolink to SPARK jira issue [spark-kubernetes-operator]

2024-05-03 Thread via GitHub
dongjoon-hyun opened a new pull request, #11: URL: https://github.com/apache/spark-kubernetes-operator/pull/11 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[PR] [SPARK-47097][CONNECT][TESTS][FOLLOWUP] Increase timeout to 1 minute for `interrupt tag` test [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun opened a new pull request, #46374: URL: https://github.com/apache/spark/pull/46374 ### What changes were proposed in this pull request? This is a follow-up. - #45173 ### Why are the changes needed? To reduce the flakiness more -

Re: [PR] [SPARK-48120] Enable autolink to SPARK jira issue [spark-kubernetes-operator]

2024-05-03 Thread via GitHub
dongjoon-hyun closed pull request #11: [SPARK-48120] Enable autolink to SPARK jira issue URL: https://github.com/apache/spark-kubernetes-operator/pull/11

Re: [PR] [SPARK-48123][Core] Provide a constant table schema for querying structured logs [spark]

2024-05-03 Thread via GitHub
gengliangwang commented on code in PR #46375: URL: https://github.com/apache/spark/pull/46375#discussion_r1589812045 ## common/utils/src/main/scala/org/apache/spark/util/LogUtils.scala: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or

Re: [PR] [SPARK-47578][CORE] Migrate logWarning with variables to structured logging framework [spark]

2024-05-03 Thread via GitHub
dtenedor commented on PR #46309: URL: https://github.com/apache/spark/pull/46309#issuecomment-2093911859 The two remaining test failures appear to be flaky/unrelated.

Re: [PR] [SPARK-48059][CORE] Implement the structured log framework on the java side [spark]

2024-05-03 Thread via GitHub
panbingkun commented on PR #46301: URL: https://github.com/apache/spark/pull/46301#issuecomment-2093912354 > Thanks, merging to master Thank you very much. Based on this, I will do log migration and auxiliary work on the java side. (eg, after the migration is completed, we can

Re: [PR] [SPARK-48105] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
HeartSaVioR commented on code in PR #46351: URL: https://github.com/apache/spark/pull/46351#discussion_r1589718949 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala: ## @@ -351,7 +351,7 @@ private[sql] class

[PR] [SPARK-48119][K8S] Promote `KubernetesDriverSpec` to `DeveloperApi` [spark]

2024-05-03 Thread via GitHub
jiangzho opened a new pull request, #46371: URL: https://github.com/apache/spark/pull/46371 ### What changes were proposed in this pull request? This PR aims to promote `KubernetesDriverSpec` to `DeveloperApi` ### Why are the changes needed? Since Apache

Re: [PR] [SPARK-48019][FOLLOWUP] Set nulls correctly for primitive arrays [spark]

2024-05-03 Thread via GitHub
gene-db commented on PR #46372: URL: https://github.com/apache/spark/pull/46372#issuecomment-2093825065 @cloud-fan Could you take a look at this followup to improve the primitive array with nulls issue. Thanks!

Re: [PR] [SPARK-48019][SQL][FOLLOWUP] Set nulls correctly for primitive arrays [spark]

2024-05-03 Thread via GitHub
gene-db commented on PR #46372: URL: https://github.com/apache/spark/pull/46372#issuecomment-2093845240 @dongjoon-hyun > * Could you provide a test case to validate your contribution and to prevent any future regression at this area? Yes, the previous PR added many

[PR] Promote `KubernetesDriverConf` to `DeveloperApi` [spark]

2024-05-03 Thread via GitHub
jiangzho opened a new pull request, #46373: URL: https://github.com/apache/spark/pull/46373 ### What changes were proposed in this pull request? This PR aims to promote `KubernetesDriverConf` to `DeveloperApi` ### Why are the changes needed? Since Apache

Re: [PR] [SPARK-48123][Core] Provide a constant table schema for querying structured logs [spark]

2024-05-03 Thread via GitHub
gengliangwang commented on code in PR #46375: URL: https://github.com/apache/spark/pull/46375#discussion_r1589811109 ## common/utils/src/main/scala/org/apache/spark/util/LogUtils.scala: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or

Re: [PR] [SPARK-47594] Connector module: Migrate logInfo with variables to structured logging framework [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on code in PR #46022: URL: https://github.com/apache/spark/pull/46022#discussion_r1589845846 ## connector/profiler/src/main/scala/org/apache/spark/executor/profiler/ExecutorJVMProfiler.scala: ## @@ -25,7 +25,8 @@ import org.apache.hadoop.fs.{FileSystem,

Re: [PR] [SPARK-48058][SPARK-43727][PYTHON][CONNECT] `UserDefinedFunction.returnType` parse the DDL string [spark]

2024-05-03 Thread via GitHub
zhengruifeng commented on PR #46300: URL: https://github.com/apache/spark/pull/46300#issuecomment-2093941093 thanks @HyukjinKwon and @dongjoon-hyun for reviews!

Re: [PR] [SPARK-47594] Connector module: Migrate logInfo with variables to structured logging framework [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on code in PR #46022: URL: https://github.com/apache/spark/pull/46022#discussion_r1589845864 ## connector/profiler/src/main/scala/org/apache/spark/executor/profiler/ExecutorProfilerPlugin.scala: ## @@ -23,7 +23,8 @@ import scala.util.Random import

Re: [PR] [SPARK-48115][INFRA] Remove `Python 3.11` from `build_python.yml` [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun closed pull request #46366: [SPARK-48115][INFRA] Remove `Python 3.11` from `build_python.yml` URL: https://github.com/apache/spark/pull/46366

Re: [PR] [SPARK-48115][INFRA] Remove `Python 3.11` from `build_python.yml` [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46366: URL: https://github.com/apache/spark/pull/46366#issuecomment-2093766064 Thank you! Sure. I'll update it.

Re: [PR] [SPARK-48115][INFRA] Remove `Python 3.11` from `build_python.yml` [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46366: URL: https://github.com/apache/spark/pull/46366#issuecomment-2093769123 Merged to master.

Re: [PR] [SPARK-48105][SS] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
anishshri-db commented on code in PR #46351: URL: https://github.com/apache/spark/pull/46351#discussion_r1589732558 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala: ## @@ -351,7 +351,7 @@ private[sql] class

Re: [PR] [SPARK-48119][K8S] Promote `KubernetesDriverSpec` to `DeveloperApi` [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun closed pull request #46371: [SPARK-48119][K8S] Promote `KubernetesDriverSpec` to `DeveloperApi` URL: https://github.com/apache/spark/pull/46371

Re: [PR] [SPARK-47578][CORE] Migrate logWarning with variables to structured logging framework [spark]

2024-05-03 Thread via GitHub
dtenedor commented on code in PR #46309: URL: https://github.com/apache/spark/pull/46309#discussion_r1589791573 ## core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala: ## @@ -127,7 +127,7 @@ class RPackageUtilsSuite

Re: [PR] [SPARK-48123][Core] Provide a constant table schema for querying structured logs [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on code in PR #46375: URL: https://github.com/apache/spark/pull/46375#discussion_r1589805250 ## common/utils/src/main/scala/org/apache/spark/util/LogUtils.scala: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or

Re: [PR] [SPARK-47097][CONNECT][TESTS][FOLLOWUP] Increase timeout to `1 minute` for `interrupt tag` test [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun closed pull request #46374: [SPARK-47097][CONNECT][TESTS][FOLLOWUP] Increase timeout to `1 minute` for `interrupt tag` test URL: https://github.com/apache/spark/pull/46374

Re: [PR] [SPARK-47097][CONNECT][TESTS][FOLLOWUP] Increase timeout to `1 minute` for `interrupt tag` test [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46374: URL: https://github.com/apache/spark/pull/46374#issuecomment-2093895254 Thank you!

Re: [PR] [SPARK-48049][BUILD] Upgrade Scala to 2.13.14 [spark]

2024-05-03 Thread via GitHub
panbingkun commented on PR #46288: URL: https://github.com/apache/spark/pull/46288#issuecomment-2093915493 > Thank you for the analysis and sharing, @panbingkun and @LuciferYang . > [INFO] Restricted to JDK 17 yet org.jline:jline:jar:3.25.1:compile contains

[PR] [SPARK-48056][PYTHON][CONNECT][FOLLOW-UP] Use `assertEqual` instead of `assertEquals` for Python 3.12 [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun opened a new pull request, #46377: URL: https://github.com/apache/spark/pull/46377 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?
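
For context on this PR title: Python 3.12 removed the long-deprecated `assertEquals` alias from `unittest.TestCase`, so tests need to call `assertEqual`. A minimal sketch follows; `ExampleTest` and `run_example` are hypothetical names used for illustration and are not taken from the PR.

```python
import unittest


class ExampleTest(unittest.TestCase):
    def test_sum(self):
        # assertEqual is the supported name; the deprecated
        # assertEquals alias is no longer available on Python 3.12.
        self.assertEqual(1 + 1, 2)


def run_example() -> bool:
    """Run ExampleTest in-process and report whether it passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()
```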

Re: [PR] Operator 0.1.0 [spark-kubernetes-operator]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #2: URL: https://github.com/apache/spark-kubernetes-operator/pull/2#issuecomment-2093722066 Hi, All. For the record, - We have been working on `submission` module and will merge with `Apache Spark 4.0.0-preview` as the first version.

Re: [PR] [SPARK-48105] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
HeartSaVioR commented on code in PR #46351: URL: https://github.com/apache/spark/pull/46351#discussion_r1589714412 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala: ## @@ -351,7 +351,7 @@ private[sql] class

Re: [PR] [SPARK-48105][SS] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
HeartSaVioR commented on code in PR #46351: URL: https://github.com/apache/spark/pull/46351#discussion_r1589746596 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala: ## @@ -351,7 +351,7 @@ private[sql] class

Re: [PR] [SPARK-48105][SS] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
anishshri-db commented on PR #46351: URL: https://github.com/apache/spark/pull/46351#issuecomment-2093810520 @huanliwang-db - actually do we need `clear` in the HDFSStateStoreMap implementation any more - are there any more callers for that function ? I guess we can remove that interface

Re: [PR] [SPARK-48119][K8S] Promote `KubernetesDriverSpec` to `DeveloperApi` [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46371: URL: https://github.com/apache/spark/pull/46371#issuecomment-2093819636 Merged to master for Apache Spark 4.0.0-preview.

Re: [PR] [SPARK-48120] Enable autolink to SPARK jira issue [spark-kubernetes-operator]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #11: URL: https://github.com/apache/spark-kubernetes-operator/pull/11#issuecomment-2093823794 Could you review this, @jiangzho ?

Re: [PR] [SPARK-48019][SQL][FOLLOWUP] Set nulls correctly for primitive arrays [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46372: URL: https://github.com/apache/spark/pull/46372#issuecomment-2093850534 To @gene-db , for the following purpose, you should not use `correctly` in the PR title because it's misleading. > faster than the object ones.

Re: [PR] [SPARK-48019][SQL][FOLLOWUP] Set nulls correctly for primitive arrays [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46372: URL: https://github.com/apache/spark/pull/46372#issuecomment-2093851288 Please revise the PR title and description.

Re: [PR] [SPARK-48120] Enable autolink to SPARK jira issue [spark-kubernetes-operator]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #11: URL: https://github.com/apache/spark-kubernetes-operator/pull/11#issuecomment-2093859360 Thank you for review and approval, @jiangzho !

Re: [PR] [SPARK-48123][Core] Provide a constant table schema for querying structured logs [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on code in PR #46375: URL: https://github.com/apache/spark/pull/46375#discussion_r1589807167 ## sql/core/src/test/scala/org/apache/spark/sql/LogQuerySuite.scala: ## @@ -0,0 +1,86 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or

Re: [PR] [SPARK-47097][CONNECT][TESTS][FOLLOWUP] Increase timeout to 1 minute for `interrupt tag` test [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46374: URL: https://github.com/apache/spark/pull/46374#issuecomment-2093886085 Could you review this test PR, @gengliangwang ?

Re: [PR] [SPARK-48123][Core] Provide a constant table schema for querying structured logs [spark]

2024-05-03 Thread via GitHub
ueshin commented on code in PR #46375: URL: https://github.com/apache/spark/pull/46375#discussion_r1589805454 ## common/utils/src/main/scala/org/apache/spark/util/LogUtils.scala: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + *

Re: [PR] [SPARK-47097][CONNECT][TESTS][FOLLOWUP] Increase timeout to `1 minute` for `interrupt tag` test [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46374: URL: https://github.com/apache/spark/pull/46374#issuecomment-2093895463 Merged to master.

Re: [PR] [SPARK-46813][CORE] Don't set the executor id to "driver" when SparkContext is created by the executor side [spark]

2024-05-03 Thread via GitHub
github-actions[bot] commented on PR #44855: URL: https://github.com/apache/spark/pull/44855#issuecomment-2093908111 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [WIP][SQL] Avoid parquet footer reads twice [spark]

2024-05-03 Thread via GitHub
github-actions[bot] closed pull request #44853: [WIP][SQL] Avoid parquet footer reads twice URL: https://github.com/apache/spark/pull/44853

Re: [PR] [SPARK-46738][CONNECT][PYTHON] Make the display of `cast` in `Regular Spark` and `Spark Connect` consistent [spark]

2024-05-03 Thread via GitHub
github-actions[bot] closed pull request #44829: [SPARK-46738][CONNECT][PYTHON] Make the display of `cast` in `Regular Spark` and `Spark Connect` consistent URL: https://github.com/apache/spark/pull/44829

Re: [PR] [SPARK-46714][SQL] Overwrite a partition with custom location [spark]

2024-05-03 Thread via GitHub
github-actions[bot] closed pull request #44725: [SPARK-46714][SQL] Overwrite a partition with custom location URL: https://github.com/apache/spark/pull/44725

Re: [PR] [SPARK-42669][CONNECT] Short circuit local relation RPCs [spark]

2024-05-03 Thread via GitHub
github-actions[bot] closed pull request #40782: [SPARK-42669][CONNECT] Short circuit local relation RPCs URL: https://github.com/apache/spark/pull/40782

Re: [PR] [SPARK-48116][INFRA] Run `pyspark-pandas*` only in PR builder and Daily Python CIs [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46367: URL: https://github.com/apache/spark/pull/46367#issuecomment-2093950583 Could you review this PR, @zhengruifeng ?

Re: [PR] [SPARK-48105] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
HeartSaVioR commented on code in PR #46351: URL: https://github.com/apache/spark/pull/46351#discussion_r1589729057 ## sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/StateStoreSuite.scala: ## @@ -388,6 +388,33 @@ class StateStoreSuite extends

Re: [PR] [SPARK-48105] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
anishshri-db commented on code in PR #46351: URL: https://github.com/apache/spark/pull/46351#discussion_r1589729603 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala: ## @@ -351,7 +351,7 @@ private[sql] class

Re: [PR] [SPARK-48118][SQL] Support `SPARK_SQL_LEGACY_CREATE_HIVE_TABLE` env variable [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46369: URL: https://github.com/apache/spark/pull/46369#issuecomment-2093770166 Could you review this SQL PR too for Apache Spark 4.0.0-preview, @viirya ?

Re: [PR] [SPARK-48105][SS] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
anishshri-db commented on code in PR #46351: URL: https://github.com/apache/spark/pull/46351#discussion_r1589731857 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala: ## @@ -351,7 +351,7 @@ private[sql] class

Re: [PR] [SPARK-48105][SS] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
huanliwang-db commented on code in PR #46351: URL: https://github.com/apache/spark/pull/46351#discussion_r1589740422 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala: ## @@ -351,7 +351,7 @@ private[sql] class

Re: [PR] [SPARK-48105][SS] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
HeartSaVioR commented on code in PR #46351: URL: https://github.com/apache/spark/pull/46351#discussion_r1589744613 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala: ## @@ -351,7 +351,7 @@ private[sql] class

Re: [PR] [SPARK-48105][SS] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
HeartSaVioR commented on code in PR #46351: URL: https://github.com/apache/spark/pull/46351#discussion_r1589764171 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala: ## @@ -351,7 +351,7 @@ private[sql] class

Re: [PR] [SPARK-48019][SQL][FOLLOWUP] Use primitive arrays over object arrays when nulls exist [spark]

2024-05-03 Thread via GitHub
gene-db commented on PR #46372: URL: https://github.com/apache/spark/pull/46372#issuecomment-2093857652 > Please revise the PR title and description. @dongjoon-hyun Updated the title and description.

Re: [PR] [SPARK-48118][SQL] Support `SPARK_SQL_LEGACY_CREATE_HIVE_TABLE` env variable [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun closed pull request #46369: [SPARK-48118][SQL] Support `SPARK_SQL_LEGACY_CREATE_HIVE_TABLE` env variable URL: https://github.com/apache/spark/pull/46369

Re: [PR] [SPARK-48118][SQL] Support `SPARK_SQL_LEGACY_CREATE_HIVE_TABLE` env variable [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46369: URL: https://github.com/apache/spark/pull/46369#issuecomment-2093816952 Thank you, @viirya !

Re: [PR] [SPARK-48118][SQL] Support `SPARK_SQL_LEGACY_CREATE_HIVE_TABLE` env variable [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46369: URL: https://github.com/apache/spark/pull/46369#issuecomment-2093817733 Merged to master.

Re: [PR] [SPARK-48121][K8S] Promote `KubernetesDriverConf` to `DeveloperApi` [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun closed pull request #46373: [SPARK-48121][K8S] Promote `KubernetesDriverConf` to `DeveloperApi` URL: https://github.com/apache/spark/pull/46373

Re: [PR] [SPARK-48121][K8S] Promote `KubernetesDriverConf` to `DeveloperApi` [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on PR #46373: URL: https://github.com/apache/spark/pull/46373#issuecomment-2093881734 Merged to master for Apache Spark 4.0.0-preview. Thank you, @jiangzho .

Re: [PR] [SPARK-48059][CORE] Implement the structured log framework on the java side [spark]

2024-05-03 Thread via GitHub
gengliangwang commented on PR #46301: URL: https://github.com/apache/spark/pull/46301#issuecomment-2093912796 @panbingkun Awesome, thanks for the work!

Re: [PR] [SPARK-48105][SS] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
huanliwang-db commented on PR #46351: URL: https://github.com/apache/spark/pull/46351#issuecomment-2093814026 > @huanliwang-db - actually do we need clear in the HDFSStateStoreMap implementation any more - are there any more callers for that function ? I guess we can remove that interface

Re: [PR] [SPARK-48105][SS] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
HeartSaVioR commented on code in PR #46351: URL: https://github.com/apache/spark/pull/46351#discussion_r1589764171 ## sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala: ## @@ -351,7 +351,7 @@ private[sql] class

Re: [PR] [SPARK-48114][CORE] Precompile template regex to avoid unnecessary work [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun closed pull request #46365: [SPARK-48114][CORE] Precompile template regex to avoid unnecessary work URL: https://github.com/apache/spark/pull/46365

Re: [PR] [SPARK-48105][SS] Fix the race condition between state store unloading and snapshotting [spark]

2024-05-03 Thread via GitHub
HeartSaVioR commented on PR #46351: URL: https://github.com/apache/spark/pull/46351#issuecomment-2093861697 > @HeartSaVioR - do we also need to port to older Spark versions ? 3.5 maybe ? Ideally all non-EOL version lines, 3.5/3.4.

Re: [PR] [SPARK-48121][K8S] Promote `KubernetesDriverConf` to `DeveloperApi` [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on code in PR #46373: URL: https://github.com/apache/spark/pull/46373#discussion_r1589790522 ## resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesConf.scala: ## @@ -78,7 +79,16 @@ private[spark] abstract class

Re: [PR] [SPARK-47578][CORE] Migrate logWarning with variables to structured logging framework [spark]

2024-05-03 Thread via GitHub
dtenedor commented on code in PR #46309: URL: https://github.com/apache/spark/pull/46309#discussion_r1589790627 ## core/src/test/scala/org/apache/spark/deploy/RPackageUtilsSuite.scala: ## @@ -127,7 +127,7 @@ class RPackageUtilsSuite

Re: [PR] [SPARK-48121][K8S] Promote `KubernetesDriverConf` to `DeveloperApi` [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on code in PR #46373: URL: https://github.com/apache/spark/pull/46373#discussion_r1589790988 ## resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesConf.scala: ## @@ -78,7 +79,16 @@ private[spark] abstract class

Re: [PR] [SPARK-48121][K8S] Promote `KubernetesDriverConf` to `DeveloperApi` [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on code in PR #46373: URL: https://github.com/apache/spark/pull/46373#discussion_r1589790586 ## resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/KubernetesConf.scala: ## @@ -78,7 +79,16 @@ private[spark] abstract class

[PR] [SPARK-48123][Core] Provide a constant table schema for querying structured logs [spark]

2024-05-03 Thread via GitHub
gengliangwang opened a new pull request, #46375: URL: https://github.com/apache/spark/pull/46375 ### What changes were proposed in this pull request? Providing a table schema LOG_SCHEMA, so that users can load structured logs with the following code: ``` import

Re: [PR] [SPARK-48123][Core] Provide a constant table schema for querying structured logs [spark]

2024-05-03 Thread via GitHub
ueshin commented on code in PR #46375: URL: https://github.com/apache/spark/pull/46375#discussion_r1589805454 ## common/utils/src/main/scala/org/apache/spark/util/LogUtils.scala: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + *

Re: [PR] [SPARK-48059][CORE] Implement the structured log framework on the java side [spark]

2024-05-03 Thread via GitHub
gengliangwang closed pull request #46301: [SPARK-48059][CORE] Implement the structured log framework on the java side URL: https://github.com/apache/spark/pull/46301

Re: [PR] [SPARK-48123][Core] Provide a constant table schema for querying structured logs [spark]

2024-05-03 Thread via GitHub
dongjoon-hyun commented on code in PR #46375: URL: https://github.com/apache/spark/pull/46375#discussion_r1589805435 ## common/utils/src/main/scala/org/apache/spark/util/LogUtils.scala: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or

Re: [PR] [SPARK-48059][CORE] Implement the structured log framework on the java side [spark]

2024-05-03 Thread via GitHub
gengliangwang commented on code in PR #46301: URL: https://github.com/apache/spark/pull/46301#discussion_r1589805794 ## common/utils/src/test/java/org/apache/spark/util/LoggerSuiteBase.java: ## @@ -0,0 +1,248 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one

Re: [PR] [SPARK-48059][CORE] Implement the structured log framework on the java side [spark]

2024-05-03 Thread via GitHub
gengliangwang commented on PR #46301: URL: https://github.com/apache/spark/pull/46301#issuecomment-2093883660 Thanks, merging to master

Re: [PR] [SPARK-48049][BUILD] Upgrade Scala to 2.13.14 [spark]

2024-05-03 Thread via GitHub
panbingkun commented on PR #46288: URL: https://github.com/apache/spark/pull/46288#issuecomment-2093917835 Currently, this GA has turned green.
