Re: [PR] [SPARK-47847][CORE] Deprecate `spark.network.remoteReadNioBufferConversion` [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46047: URL: https://github.com/apache/spark/pull/46047#issuecomment-2103920027 Merged to master/3.5. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [SPARK-47847][CORE] Deprecate `spark.network.remoteReadNioBufferConversion` [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun closed pull request #46047: [SPARK-47847][CORE] Deprecate `spark.network.remoteReadNioBufferConversion` URL: https://github.com/apache/spark/pull/46047

Re: [PR] [SPARK-47847][CORE] Deprecate `spark.network.remoteReadNioBufferConversion` [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46047: URL: https://github.com/apache/spark/pull/46047#issuecomment-2103919298 Since this is irrelevant to CI, I verified manually like the following. ``` $ bin/spark-shell -c spark.network.remoteReadNioBufferConversion=true WARNING: Using

[PR] [SPARK-48230][BUILD] Remove unused jodd-core [spark]

2024-05-09 Thread via GitHub
pan3793 opened a new pull request, #46520: URL: https://github.com/apache/spark/pull/46520 ### What changes were proposed in this pull request? Remove a jar that has CVE https://github.com/advisories/GHSA-jrg3-qq99-35g7 ### Why are the changes needed? Previously,

Re: [PR] [SPARK-48219][CORE] StreamReader Charset fix with UTF8 [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46509: URL: https://github.com/apache/spark/pull/46509#issuecomment-2103908417 Sorry but I'll leave this to the other reviewers, @xuzifu666 .

Re: [PR] [SPARK-48219][CORE] StreamReader Charset fix with UTF8 [spark]

2024-05-09 Thread via GitHub
xuzifu666 commented on PR #46509: URL: https://github.com/apache/spark/pull/46509#issuecomment-2103906360 @dongjoon-hyun Could you give a final review? Thanks

Re: [PR] [SPARK-48222][INFRA][DOCS] Sync Ruby Bundler to 2.4.22 and refresh Gem lock file [spark]

2024-05-09 Thread via GitHub
cloud-fan commented on PR #46512: URL: https://github.com/apache/spark/pull/46512#issuecomment-2103907125 I see, I'll install Python 3.9 on the release Docker image.

Re: [PR] [SPARK-47847][CORE] Deprecate spark.network.remoteReadNioBufferConversion [spark]

2024-05-09 Thread via GitHub
pan3793 commented on PR #46047: URL: https://github.com/apache/spark/pull/46047#issuecomment-2103900781 @dongjoon-hyun thanks for your suggestion. I updated the deprecation message, and we can consider removing it in 4.1.0 or later

Re: [PR] [SPARK-48228][PYTHON][CONNECT] Implement the missing function validation in ApplyInXXX [spark]

2024-05-09 Thread via GitHub
zhengruifeng commented on PR #46519: URL: https://github.com/apache/spark/pull/46519#issuecomment-2103877686 @dongjoon-hyun and @HyukjinKwon thanks for reviews

Re: [PR] [SPARK-48201][DOCS][PYTHON] Make some corrections in the docstring of pyspark DataStreamReader methods [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46416: URL: https://github.com/apache/spark/pull/46416#issuecomment-2103874418 Welcome to the Apache Spark community, @chloeh13q . I added you to the Apache Spark contributor group and assigned SPARK-48201 to you. Congratulations on your first commit!

Re: [PR] [SPARK-48201][DOCS][PYTHON] Make some corrections in the docstring of pyspark DataStreamReader methods [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun closed pull request #46416: [SPARK-48201][DOCS][PYTHON] Make some corrections in the docstring of pyspark DataStreamReader methods URL: https://github.com/apache/spark/pull/46416

Re: [PR] Fix previous reader checks in Vectorized DELTA_BYTE_ARRAY decoder [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on code in PR #46485: URL: https://github.com/apache/spark/pull/46485#discussion_r1596258635 ## sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java: ## @@ -353,8 +353,9 @@ private void initDataReader(

Re: [PR] Fix previous reader checks in Vectorized DELTA_BYTE_ARRAY decoder [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46485: URL: https://github.com/apache/spark/pull/46485#issuecomment-2103871560 cc @sunchao

Re: [PR] [SPARK-48228][PYTHON][CONNECT] Implement the missing function validation in ApplyInXXX [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46519: URL: https://github.com/apache/spark/pull/46519#issuecomment-2103868519 Merged to master for Apache Spark 4.0.0.

Re: [PR] [SPARK-47834][SQL][CONNECT] Mark deprecated functions with `@deprecated` in `SQLImplicits` [spark]

2024-05-09 Thread via GitHub
LuciferYang commented on PR #46029: URL: https://github.com/apache/spark/pull/46029#issuecomment-2103868266 Thanks @dongjoon-hyun

Re: [PR] [SPARK-48228][PYTHON][CONNECT] Implement the missing function validation in ApplyInXXX [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun closed pull request #46519: [SPARK-48228][PYTHON][CONNECT] Implement the missing function validation in ApplyInXXX URL: https://github.com/apache/spark/pull/46519

Re: [PR] [SPARK-48210][DOC]Modify the description of whether dynamic partition… [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46496: URL: https://github.com/apache/spark/pull/46496#issuecomment-2103866076 cc @mridulm and @tgravescs

Re: [PR] [SPARK-48224][SQL] Disallow map keys from being of variant type [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun closed pull request #46516: [SPARK-48224][SQL] Disallow map keys from being of variant type URL: https://github.com/apache/spark/pull/46516

Re: [PR] [SPARK-47018][BUILD][SQL] Bump built-in Hive to 2.3.10 [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46468: URL: https://github.com/apache/spark/pull/46468#issuecomment-2103844998 Also, cc @cloud-fan and @HyukjinKwon . This fixes not only the Hive dependency but also a long-standing `libthrift` library issue.

Re: [PR] [SPARK-47018][BUILD][SQL] Bump built-in Hive to 2.3.10 [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46468: URL: https://github.com/apache/spark/pull/46468#issuecomment-2103844347 Merged to master! Thank you so much, @pan3793 and @sunchao . From now on, many people will use Hive 2.3.10. I believe we can build more confidence before Apache Spark

Re: [PR] [SPARK-47018][BUILD][SQL] Bump built-in Hive to 2.3.10 [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun closed pull request #46468: [SPARK-47018][BUILD][SQL] Bump built-in Hive to 2.3.10 URL: https://github.com/apache/spark/pull/46468

Re: [PR] [DRAFT][BUILD] Test upgrading built-in Hive to 2.3.10 [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun closed pull request #45372: [DRAFT][BUILD] Test upgrading built-in Hive to 2.3.10 URL: https://github.com/apache/spark/pull/45372

Re: [PR] [SPARK-47119][BUILD] Add `hive-jackson-provided` profile [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #45201: URL: https://github.com/apache/spark/pull/45201#issuecomment-2103839853 It's supposed to be here as a last resort until we release Apache Spark 4.0.0 successfully without reverting Hive 2.3.10, @pan3793 .

Re: [PR] [SPARK-48222][INFRA][DOCS] Sync Ruby Bundler to 2.4.22 and refresh Gem lock file [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46512: URL: https://github.com/apache/spark/pull/46512#issuecomment-2103835797 Yes, as @nchammas mentioned, it seems that we forgot to bump Python to 3.9 in `spark-rm` in the following PR: #46228

Re: [PR] [SPARK-48222][INFRA][DOCS] Sync Ruby Bundler to 2.4.22 and refresh Gem lock file [spark]

2024-05-09 Thread via GitHub
nchammas commented on PR #46512: URL: https://github.com/apache/spark/pull/46512#issuecomment-2103835083 Yes, and we dropped support for Python 3.8 in #46228.

Re: [PR] [SPARK-48222][INFRA][DOCS] Sync Ruby Bundler to 2.4.22 and refresh Gem lock file [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46512: URL: https://github.com/apache/spark/pull/46512#issuecomment-2103834606 It seems that the `files` attribute was added in Python 3.9, but the running Python version is `Python 3.8`, @cloud-fan . -
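
The kind of mismatch described in this comment can be caught early with a version guard. The following is a minimal sketch, assuming the `files` attribute in question is `importlib.resources.files` (which was added in Python 3.9); the truncated comment does not name the module, and the helper names and the `MIN_VERSION` constant are hypothetical.

```python
import importlib.resources
import sys

# Assumption for illustration: the docs build needs Python 3.9 or newer.
MIN_VERSION = (3, 9)


def interpreter_is_new_enough(version_info=sys.version_info):
    """Return True if the given interpreter version is at least MIN_VERSION."""
    return tuple(version_info[:2]) >= MIN_VERSION


def resources_files_available():
    """Return True if importlib.resources exposes `files` on this interpreter."""
    return hasattr(importlib.resources, "files")


print("Python version OK:", interpreter_is_new_enough())
print("importlib.resources.files present:", resources_files_available())
```

Checking for the attribute with `hasattr` rather than only comparing version numbers keeps the guard tied to the feature that is actually needed.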

Re: [PR] [SPARK-47441][YARN] Do not add log link for unmanaged AM in Spark UI [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #45565: URL: https://github.com/apache/spark/pull/45565#issuecomment-2103823764 Also, cc @mridulm

Re: [PR] [SPARK-47847][CORE] Deprecate spark.network.remoteReadNioBufferConversion [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46047: URL: https://github.com/apache/spark/pull/46047#issuecomment-2103818630 BTW, we need to give users enough time to report issues. So, we cannot delete this configuration in Apache Spark 4.0.0 because Apache Spark 3.5.2 is not released yet and we

Re: [PR] [SPARK-47847][CORE] Deprecate spark.network.remoteReadNioBufferConversion [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46047: URL: https://github.com/apache/spark/pull/46047#issuecomment-2103817471 To @pan3793 , I rethink about this. > I fill the deprecated message with "Not used anymore", to be consistent with existing items > > ``` >

Re: [PR] [SPARK-48222][INFRA][DOCS] Sync Ruby Bundler to 2.4.22 and refresh Gem lock file [spark]

2024-05-09 Thread via GitHub
cloud-fan commented on PR #46512: URL: https://github.com/apache/spark/pull/46512#issuecomment-2103814645 cc @HyukjinKwon

Re: [PR] [SPARK-48222][INFRA][DOCS] Sync Ruby Bundler to 2.4.22 and refresh Gem lock file [spark]

2024-05-09 Thread via GitHub
cloud-fan commented on PR #46512: URL: https://github.com/apache/spark/pull/46512#issuecomment-2103814454 the Bundler issue is resolved, but I hit a new issue for generating pyspark docs ``` Configuration error: There is a programmable error in your configuration file:

Re: [PR] [MINOR][BUILD] Remove duplicate configuration of maven-compiler-plugin [spark]

2024-05-09 Thread via GitHub
zml1206 commented on PR #46024: URL: https://github.com/apache/spark/pull/46024#issuecomment-2103810977 Thank you for review. @dongjoon-hyun

Re: [PR] [MINOR][BUILD] Remove duplicate configuration of maven-compiler-plugin [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun closed pull request #46024: [MINOR][BUILD] Remove duplicate configuration of maven-compiler-plugin URL: https://github.com/apache/spark/pull/46024

Re: [PR] [MINOR][BUILD] Remove duplicate configuration of maven-compiler-plugin [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46024: URL: https://github.com/apache/spark/pull/46024#issuecomment-2103808514 Sorry for being late. I missed your ping here, @zml1206 .

Re: [PR] [SPARK-48228][PYTHON][CONNECT] Implement the missing function validation in ApplyInXXX [spark]

2024-05-09 Thread via GitHub
zhengruifeng commented on code in PR #46519: URL: https://github.com/apache/spark/pull/46519#discussion_r1596217131 ## python/pyspark/sql/connect/group.py: ## @@ -34,6 +34,7 @@ from pyspark.util import PythonEvalType from pyspark.sql.group import GroupedData as

[PR] [SPARK-48228][PYTHON][CONNECT] Implement the missing function validation in ApplyInXXX [spark]

2024-05-09 Thread via GitHub
zhengruifeng opened a new pull request, #46519: URL: https://github.com/apache/spark/pull/46519 ### What changes were proposed in this pull request? Implement the missing function validation in ApplyInXXX https://github.com/apache/spark/pull/46397 fixed this issue for

Re: [PR] [SPARK-47834][SQL][CONNECT] Mark deprecated functions with `@deprecated` in `SQLImplicits` [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46029: URL: https://github.com/apache/spark/pull/46029#issuecomment-2103806851 Merged to master for Apache Spark 4.0.0.

Re: [PR] [SPARK-47834][SQL][CONNECT] Mark deprecated functions with `@deprecated` in `SQLImplicits` [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun closed pull request #46029: [SPARK-47834][SQL][CONNECT] Mark deprecated functions with `@deprecated` in `SQLImplicits` URL: https://github.com/apache/spark/pull/46029

Re: [PR] [SPARK-47954][K8S] Support creating ingress entry for external UI access [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46184: URL: https://github.com/apache/spark/pull/46184#issuecomment-2103803949 Just FYI, please take your time. We can target this for Apache Spark 4.0.0.

Re: [PR] [SPARK-48144][SQL] Fix `canPlanAsBroadcastHashJoin` to respect shuffle join hints [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46401: URL: https://github.com/apache/spark/pull/46401#issuecomment-2103802765 Gentle ping, @fred-db .

Re: [PR] [SPARK-47995][PYTHON][INFRA][TESTS] Upgrade `pyarrow` to 16.0.0 in GitHub Action CI [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46232: URL: https://github.com/apache/spark/pull/46232#issuecomment-2103795700 This is still blocked by `mlflow 2.12.2` ``` mlflow 2.12.2 requires pyarrow<16,>=4.0.0, but you have pyarrow 16.0.0 which is incompatible. ```
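
To see why `pyarrow 16.0.0` fails the `pyarrow<16,>=4.0.0` requirement quoted above, the range check can be sketched by hand. This is a simplified illustration, not how pip resolves dependencies: it handles only plain dotted numeric versions, and the function names are hypothetical.

```python
def parse_release(version, width=3):
    """Turn '16' or '16.0.0' into a zero-padded tuple such as (16, 0, 0)."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def satisfies(version, lower="4.0.0", upper="16"):
    """Check lower <= version < upper, mirroring 'pyarrow<16,>=4.0.0'."""
    v = parse_release(version)
    return parse_release(lower) <= v < parse_release(upper)


print(satisfies("16.0.0"))  # False: 16.0.0 is not strictly below 16
print(satisfies("15.0.2"))  # True: inside the allowed range
```

For real use, the `packaging` library's `SpecifierSet` handles the full PEP 440 grammar, including pre-releases and local versions that this sketch ignores.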

Re: [PR] [SPARK-27900][CORE][K8s] Add `spark.driver.killOnOOMError` flag in cluster mode [spark]

2024-05-09 Thread via GitHub
dimon222 commented on PR #26161: URL: https://github.com/apache/spark/pull/26161#issuecomment-2103793011 Was this ever fixed?

Re: [PR] [SPARK-48225][BUILD] Upgrade `sbt` to 1.10.0 [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun closed pull request #46517: [SPARK-48225][BUILD] Upgrade `sbt` to 1.10.0 URL: https://github.com/apache/spark/pull/46517

Re: [PR] [SPARK-48176][SQL] Adjust name of FIELD_ALREADY_EXISTS error condition [spark]

2024-05-09 Thread via GitHub
HyukjinKwon closed pull request #46510: [SPARK-48176][SQL] Adjust name of FIELD_ALREADY_EXISTS error condition URL: https://github.com/apache/spark/pull/46510

Re: [PR] [SPARK-48176][SQL] Adjust name of FIELD_ALREADY_EXISTS error condition [spark]

2024-05-09 Thread via GitHub
HyukjinKwon commented on PR #46510: URL: https://github.com/apache/spark/pull/46510#issuecomment-2103788152 Merged to master.

Re: [PR] [SPARK-48214][INFRA] Ban import `org.slf4j.Logger` & `org.slf4j.LoggerFactory` [spark]

2024-05-09 Thread via GitHub
LuciferYang commented on PR #46502: URL: https://github.com/apache/spark/pull/46502#issuecomment-2103783129 Or how about having these modules depend on the `common/utils` module? `common/utils` doesn't seem to be a heavyweight module.

Re: [PR] [SPARK-47018][BUILD][SQL] Bump built-in Hive to 2.3.10 [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46468: URL: https://github.com/apache/spark/pull/46468#issuecomment-2103781151 Thank you!

Re: [PR] [SPARK-48225][BUILD] Upgrade `sbt` to 1.10.0 [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46517: URL: https://github.com/apache/spark/pull/46517#issuecomment-2103780307 Thank you so much for sharing!

Re: [PR] [SPARK-48197][SQL][TESTS][FOLLOWUP][3.5] Regenerate golden files [spark]

2024-05-09 Thread via GitHub
cloud-fan commented on PR #46514: URL: https://github.com/apache/spark/pull/46514#issuecomment-2103779630 thanks for the fix!

Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-09 Thread via GitHub
panbingkun commented on PR #46493: URL: https://github.com/apache/spark/pull/46493#issuecomment-2103774486 cc @gengliangwang

Re: [PR] [SPARK-48225][BUILD] Upgrade `sbt` to 1.10.0 [spark]

2024-05-09 Thread via GitHub
LuciferYang commented on PR #46517: URL: https://github.com/apache/spark/pull/46517#issuecomment-2103771513 @dongjoon-hyun I was discussing this issue with @panbingkun offline yesterday. From the responses of sbt and coursier, it seems difficult to solve this problem in the short term

Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-09 Thread via GitHub
panbingkun commented on code in PR #46493: URL: https://github.com/apache/spark/pull/46493#discussion_r1596192395 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -1999,7 +2042,9 @@ private AppPathsInfo(

Re: [PR] [SPARK-48226][BUILD] Add `spark-ganglia-lgpl` to `lint-java` & `spark-ganglia-lgpl` and `jvm-profiler` to `sbt-checkstyle` [spark]

2024-05-09 Thread via GitHub
LuciferYang commented on PR #46501: URL: https://github.com/apache/spark/pull/46501#issuecomment-2103760020 late LGTM

Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-09 Thread via GitHub
panbingkun commented on code in PR #46493: URL: https://github.com/apache/spark/pull/46493#discussion_r1596186119 ## common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java: ## @@ -363,7 +367,8 @@ static MergedShuffleFileManager

Re: [PR] [SPARK-48031][SQL] Support view schema evolution [spark]

2024-05-09 Thread via GitHub
srielau commented on code in PR #46267: URL: https://github.com/apache/spark/pull/46267#discussion_r1596184905 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala: ## @@ -945,54 +945,73 @@ class SessionCatalog( throw

Re: [PR] [SPARK-48031][SQL] Support view schema evolution [spark]

2024-05-09 Thread via GitHub
srielau commented on code in PR #46267: URL: https://github.com/apache/spark/pull/46267#discussion_r1596184739 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala: ## @@ -945,54 +945,73 @@ class SessionCatalog( throw

Re: [PR] [SPARK-48031][SQL] Support view schema evolution [spark]

2024-05-09 Thread via GitHub
srielau commented on code in PR #46267: URL: https://github.com/apache/spark/pull/46267#discussion_r1596180917 ## sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala: ## @@ -224,6 +224,7 @@ abstract class BaseSessionStateBuilder(

Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-09 Thread via GitHub
panbingkun commented on code in PR #46493: URL: https://github.com/apache/spark/pull/46493#discussion_r1596176953 ## common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java: ## @@ -298,7 +303,9 @@ public void onFailure(Throwable e) {

Re: [PR] [SPARK-46632][SQL] Fix subexpression elimination when equivalent ternary expressions have different children [spark]

2024-05-09 Thread via GitHub
zml1206 commented on PR #46135: URL: https://github.com/apache/spark/pull/46135#issuecomment-2103726355 @peter-toth Can this PR be merged?

Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-09 Thread via GitHub
panbingkun commented on code in PR #46493: URL: https://github.com/apache/spark/pull/46493#discussion_r1596171652 ## common/network-common/src/main/java/org/apache/spark/network/ssl/ReloadingX509TrustManager.java: ## @@ -211,13 +210,13 @@ public void run() {

Re: [PR] [SPARK-48031][SQL] Support view schema evolution [spark]

2024-05-09 Thread via GitHub
cloud-fan commented on code in PR #46267: URL: https://github.com/apache/spark/pull/46267#discussion_r1596169480 ## sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala: ## @@ -224,6 +224,7 @@ abstract class BaseSessionStateBuilder(

Re: [PR] [SPARK-47119][BUILD] Add `hive-jackson-provided` profile [spark]

2024-05-09 Thread via GitHub
pan3793 commented on PR #45201: URL: https://github.com/apache/spark/pull/45201#issuecomment-2103713964 @dongjoon-hyun Jackson 1.x can be removed after SPARK-47018 (bump Hive 2.3.10), what should we do for `hive-jackson-provided`?

Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-09 Thread via GitHub
panbingkun commented on code in PR #46493: URL: https://github.com/apache/spark/pull/46493#discussion_r1596167842 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java: ## @@ -472,7 +487,8 @@ static ConcurrentMap

Re: [PR] [SPARK-48222][INFRA][DOCS] Sync Ruby Bundler to 2.4.22 and refresh Gem lock file [spark]

2024-05-09 Thread via GitHub
cloud-fan closed pull request #46512: [SPARK-48222][INFRA][DOCS] Sync Ruby Bundler to 2.4.22 and refresh Gem lock file URL: https://github.com/apache/spark/pull/46512

Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-09 Thread via GitHub
panbingkun commented on code in PR #46493: URL: https://github.com/apache/spark/pull/46493#discussion_r1596166648 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java: ## @@ -368,7 +382,8 @@ public int removeBlocks(String

Re: [PR] [SPARK-48222][INFRA][DOCS] Sync Ruby Bundler to 2.4.22 and refresh Gem lock file [spark]

2024-05-09 Thread via GitHub
cloud-fan commented on PR #46512: URL: https://github.com/apache/spark/pull/46512#issuecomment-2103711389 thanks, merging to master! (it's easier for me to test after merging it)

Re: [PR] [SPARK-47018][BUILD][SQL] Bump built-in Hive to 2.3.10 [spark]

2024-05-09 Thread via GitHub
pan3793 commented on PR #46468: URL: https://github.com/apache/spark/pull/46468#issuecomment-2103709988 Hive 2.3.10 jars should be available on Google Maven Central Mirror now, re-triggered CI

Re: [PR] [SPARK-48219][CORE] StreamReader Charset fix with UTF8 [spark]

2024-05-09 Thread via GitHub
xuzifu666 commented on PR #46509: URL: https://github.com/apache/spark/pull/46509#issuecomment-2103709532 > Do you think you can provide a test coverage to protect your contribution from potential future regression, @xuzifu666 ? > > > Not need @dongjoon-hyun Thanks for you

Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-09 Thread via GitHub
panbingkun commented on code in PR #46493: URL: https://github.com/apache/spark/pull/46493#discussion_r1596163455 ## common/utils/src/main/java/org/apache/spark/internal/LoggerFactory.java: ## @@ -19,6 +19,11 @@ public class LoggerFactory { + public static Logger

Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-09 Thread via GitHub
panbingkun commented on code in PR #46493: URL: https://github.com/apache/spark/pull/46493#discussion_r1596162272 ## core/src/main/scala/org/apache/spark/network/netty/NettyBlockTransferService.scala: ## @@ -30,7 +30,7 @@ import com.codahale.metrics.{Metric, MetricSet}

Re: [PR] [SPARK-47018][BUILD][SQL] Bump built-in Hive to 2.3.10 [spark]

2024-05-09 Thread via GitHub
pan3793 commented on code in PR #46468: URL: https://github.com/apache/spark/pull/46468#discussion_r1596161241 ## sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveExternalCatalogVersionsSuite.scala: ## @@ -211,7 +211,7 @@ class HiveExternalCatalogVersionsSuite extends

Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-09 Thread via GitHub
panbingkun commented on code in PR #46493: URL: https://github.com/apache/spark/pull/46493#discussion_r1596158815 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcRetryHandler.scala: ## @@ -200,7 +200,7 @@ private[sql] object GrpcRetryHandler

Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-09 Thread via GitHub
panbingkun commented on code in PR #46493: URL: https://github.com/apache/spark/pull/46493#discussion_r1596157433 ## common/utils/src/main/java/org/apache/spark/internal/LoggerFactory.java: ## @@ -19,6 +19,11 @@ public class LoggerFactory { + public static Logger

Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-09 Thread via GitHub
panbingkun commented on code in PR #46493: URL: https://github.com/apache/spark/pull/46493#discussion_r1596151348 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RetryingBlockTransferor.java: ## @@ -177,10 +179,16 @@ private void

Re: [PR] [SPARK-48209][CORE] Common (java side): Migrate `error/warn/info` with variables to structured logging framework [spark]

2024-05-09 Thread via GitHub
panbingkun commented on code in PR #46493: URL: https://github.com/apache/spark/pull/46493#discussion_r1596148922 ## common/network-common/src/main/java/org/apache/spark/network/ssl/SSLFactory.java: ## @@ -136,7 +135,7 @@ public void destroy() { try {

Re: [PR] [SPARK-48031][SQL] Support view schema evolution [spark]

2024-05-09 Thread via GitHub
gengliangwang commented on code in PR #46267: URL: https://github.com/apache/spark/pull/46267#discussion_r1596122256 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala: ## @@ -945,54 +945,73 @@ class SessionCatalog( throw

Re: [PR] [SPARK-44609][K8S] Remove executor pod from PodsAllocator if it was removed from scheduler backend [spark]

2024-05-09 Thread via GitHub
github-actions[bot] closed pull request #42297: [SPARK-44609][K8S] Remove executor pod from PodsAllocator if it was removed from scheduler backend URL: https://github.com/apache/spark/pull/42297

Re: [PR] [SPARK-46885][SQL] Push down filters through `TypedFilter` [spark]

2024-05-09 Thread via GitHub
github-actions[bot] closed pull request #44911: [SPARK-46885][SQL] Push down filters through `TypedFilter` URL: https://github.com/apache/spark/pull/44911

Re: [PR] [SPARK-46108][SQL] keepInnerXmlAsRaw option for Built-in XML Data Source [spark]

2024-05-09 Thread via GitHub
github-actions[bot] commented on PR #44022: URL: https://github.com/apache/spark/pull/44022#issuecomment-2103637588 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [WIP] docs: restructure the docs index page [spark]

2024-05-09 Thread via GitHub
github-actions[bot] commented on PR #44812: URL: https://github.com/apache/spark/pull/44812#issuecomment-2103637571 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-45708][BUILD] Retry mvn deploy [spark]

2024-05-09 Thread via GitHub
github-actions[bot] commented on PR #43559: URL: https://github.com/apache/spark/pull/43559#issuecomment-2103637610 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-48214][INFRA] Ban import `org.slf4j.Logger` & `org.slf4j.LoggerFactory` [spark]

2024-05-09 Thread via GitHub
panbingkun commented on PR #46502: URL: https://github.com/apache/spark/pull/46502#issuecomment-2103636458 > I am +1 for the idea. However, I wonder if there will be suggestions about why the two imports are not allowed and how to fix the style error. If that's not feasible with

Re: [PR] [SPARK-48227][PYTHON][DOC] Document the requirement of seed in protos [spark]

2024-05-09 Thread via GitHub
zhengruifeng commented on PR #46518: URL: https://github.com/apache/spark/pull/46518#issuecomment-2103633099 thanks @HyukjinKwon and @dongjoon-hyun for reviews

Re: [PR] [SPARK-48214][INFRA] Ban import `org.slf4j.Logger` & `org.slf4j.LoggerFactory` [spark]

2024-05-09 Thread via GitHub
gengliangwang commented on PR #46502: URL: https://github.com/apache/spark/pull/46502#issuecomment-2103626513 I am +1 for the idea. However, I wonder if there will be suggestions about why the two imports are not allowed and how to fix the style error. If that's not feasible with

Re: [PR] [SPARK-48227][PYTHON][DOC] Document the requirement of seed in protos [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46518: URL: https://github.com/apache/spark/pull/46518#issuecomment-2103621276 Merged to master.

Re: [PR] [SPARK-48227][PYTHON][DOC] Document the requirement of seed in protos [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun closed pull request #46518: [SPARK-48227][PYTHON][DOC] Document the requirement of seed in protos URL: https://github.com/apache/spark/pull/46518

Re: [PR] [SPARK-48214][INFRA] Ban import `org.slf4j.Logger` & `org.slf4j.LoggerFactory` [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46502: URL: https://github.com/apache/spark/pull/46502#issuecomment-2103617009 cc @gengliangwang

Re: [PR] [SPARK-48214][INFRA] Ban import `org.slf4j.Logger` & `org.slf4j.LoggerFactory` [spark]

2024-05-09 Thread via GitHub
panbingkun commented on PR #46502: URL: https://github.com/apache/spark/pull/46502#issuecomment-2103613069 - According to @gengliangwang's suggestion, we did not migrate the `test` code in the `structured log`, so we need to exclude them, e.g.:

Re: [PR] [SPARK-47793][TEST][FOLLOWUP] Fix flaky test for Python data source exactly once. [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on PR #46481: URL: https://github.com/apache/spark/pull/46481#issuecomment-2103611989 Could you do the final review and sign-off, please, @HyukjinKwon ?

Re: [PR] [SPARK-48180][SQL] Improve error when UDTF call with TABLE arg forgets parentheses around multiple PARTITION/ORDER BY exprs [spark]

2024-05-09 Thread via GitHub
HyukjinKwon closed pull request #46451: [SPARK-48180][SQL] Improve error when UDTF call with TABLE arg forgets parentheses around multiple PARTITION/ORDER BY exprs URL: https://github.com/apache/spark/pull/46451

Re: [PR] [SPARK-48180][SQL] Improve error when UDTF call with TABLE arg forgets parentheses around multiple PARTITION/ORDER BY exprs [spark]

2024-05-09 Thread via GitHub
HyukjinKwon commented on PR #46451: URL: https://github.com/apache/spark/pull/46451#issuecomment-2103611091 Merged to master.

Re: [PR] [SPARK-48226][BUILD] Add `spark-ganglia-lgpl` to `lint-java` & `spark-ganglia-lgpl` and `jvm-profiler` to `sbt-checkstyle` [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun closed pull request #46501: [SPARK-48226][BUILD] Add `spark-ganglia-lgpl` to `lint-java` & `spark-ganglia-lgpl` and `jvm-profiler` to `sbt-checkstyle` URL: https://github.com/apache/spark/pull/46501

Re: [PR] [SPARK-48227][PYTHON][DOC] Document the requirement of seed in protos [spark]

2024-05-09 Thread via GitHub
dongjoon-hyun commented on code in PR #46518: URL: https://github.com/apache/spark/pull/46518#discussion_r1596102379 ## connector/connect/common/src/main/protobuf/spark/connect/relations.proto: ## @@ -467,7 +467,9 @@ message Sample { // (Optional) Whether to sample with

Re: [PR] [SPARK-48148][CORE] JSON objects should not be modified when read as STRING [spark]

2024-05-09 Thread via GitHub
HyukjinKwon commented on PR #46408: URL: https://github.com/apache/spark/pull/46408#issuecomment-2103609498 btw you can trigger on your own https://github.com/eric-maynard/spark/runs/24789350525 I can't trigger :-).

Re: [PR] [SPARK-48148][CORE] JSON objects should not be modified when read as STRING [spark]

2024-05-09 Thread via GitHub
HyukjinKwon closed pull request #46408: [SPARK-48148][CORE] JSON objects should not be modified when read as STRING URL: https://github.com/apache/spark/pull/46408

Re: [PR] [SPARK-48148][CORE] JSON objects should not be modified when read as STRING [spark]

2024-05-09 Thread via GitHub
HyukjinKwon commented on PR #46408: URL: https://github.com/apache/spark/pull/46408#issuecomment-2103609100 Merged to master.

Re: [PR] [SPARK-48089][SS][CONNECT] Fix 3.5 <> 4.0 StreamingQueryListener compatibility test [spark]

2024-05-09 Thread via GitHub
HyukjinKwon commented on PR #46513: URL: https://github.com/apache/spark/pull/46513#issuecomment-2103607909 Merged to branch-3.5.

Re: [PR] [SPARK-48089][SS][CONNECT] Fix 3.5 <> 4.0 StreamingQueryListener compatibility test [spark]

2024-05-09 Thread via GitHub
HyukjinKwon closed pull request #46513: [SPARK-48089][SS][CONNECT] Fix 3.5 <> 4.0 StreamingQueryListener compatibility test URL: https://github.com/apache/spark/pull/46513

[PR] [SPARK-48227][PYTHON][DOC] Document the requirement of seed in protos [spark]

2024-05-09 Thread via GitHub
zhengruifeng opened a new pull request, #46518: URL: https://github.com/apache/spark/pull/46518 ### What changes were proposed in this pull request? Document the requirement of seed in protos ### Why are the changes needed? the seed should be set at client side
