Re: [PR] [SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure [spark]

2023-12-29 Thread via GitHub
MaxGekk commented on PR #44527: URL: https://github.com/apache/spark/pull/44527#issuecomment-1871842676 @milastdbx Are you ok with the changes? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

Re: [PR] [SPARK-46524][SQL] Improve error messages for invalid save mode [spark]

2023-12-29 Thread via GitHub
MaxGekk commented on PR #44508: URL: https://github.com/apache/spark/pull/44508#issuecomment-1871843480 @allisonwang-db The test failures are related to your changes, please, fix them.

Re: [PR] [SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure [spark]

2023-12-29 Thread via GitHub
milastdbx commented on code in PR #44527: URL: https://github.com/apache/spark/pull/44527#discussion_r1438097001 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala: ## @@ -325,11 +325,17 @@ case class ExpressionEncoder[T]( assert(ser

[PR] [SPARK-46544][SQL] Support v2 DESCRIBE TABLE EXTENDED with table stats [spark]

2023-12-29 Thread via GitHub
Zouxxyy opened a new pull request, #44535: URL: https://github.com/apache/spark/pull/44535 ### What changes were proposed in this pull request? Support v2 DESCRIBE TABLE EXTENDED with table stats ### Why are the changes needed? Similar to #40058, make DS v1/v2 co

Re: [PR] [SPARK-46512][CORE] Optimize shuffle reading when both sort and combine are used. [spark]

2023-12-29 Thread via GitHub
zhengchenyu commented on PR #44512: URL: https://github.com/apache/spark/pull/44512#issuecomment-1871886874 @Ngone51 Thanks for your review! The combine is the key problem. I still combine, but combine in ExternalSorter will never trigger extra spill. Combine happens in insertAllAndUpd

Re: [PR] [SPARK-46490][SQL] Require error classes in `SparkThrowable` sub-classes [spark]

2023-12-29 Thread via GitHub
MaxGekk commented on PR #44464: URL: https://github.com/apache/spark/pull/44464#issuecomment-1871940803 @cloud-fan @dongjoon-hyun @HyukjinKwon @heyihong Please, review this PR.

Re: [PR] [SPARK-46490][SQL] Require error classes in `SparkThrowable` sub-classes [spark]

2023-12-29 Thread via GitHub
dongjoon-hyun commented on PR #44464: URL: https://github.com/apache/spark/pull/44464#issuecomment-1871945159 Could you make the CI successful, @MaxGekk ?

Re: [PR] [SPARK-46490][SQL] Require error classes in `SparkThrowable` sub-classes [spark]

2023-12-29 Thread via GitHub
MaxGekk commented on PR #44464: URL: https://github.com/apache/spark/pull/44464#issuecomment-1871948187 @dongjoon-hyun I guess [Run / Run Spark on Kubernetes Integration test](https://github.com/MaxGekk/spark/actions/runs/7354935787/job/20024695951#logs) is not related to the changes, but I re

[PR] [SPARK-46504][PS][TESTS][FOLLOWUPS] Moving slow tests out of `IndexesTests`: conversion and drop methods [spark]

2023-12-29 Thread via GitHub
zhengruifeng opened a new pull request, #44536: URL: https://github.com/apache/spark/pull/44536 ### What changes were proposed in this pull request? Moving slow tests out of `IndexesTests`: conversion and drop methods ### Why are the changes needed? for testing parallelism

Re: [PR] [SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure [spark]

2023-12-29 Thread via GitHub
cloud-fan commented on PR #44527: URL: https://github.com/apache/spark/pull/44527#issuecomment-1871953377 Does it mean we will hit this bug if an empty struct is returned? `EXCEPT` may not be the only way to create empty struct.

Re: [PR] [SPARK-46544][SQL] Support v2 DESCRIBE TABLE EXTENDED with table stats [spark]

2023-12-29 Thread via GitHub
MaxGekk commented on PR #44535: URL: https://github.com/apache/spark/pull/44535#issuecomment-1871957198 @Zouxxyy Please, trigger GitHub actions.

Re: [PR] [SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure [spark]

2023-12-29 Thread via GitHub
stefankandic commented on PR #44527: URL: https://github.com/apache/spark/pull/44527#issuecomment-1871977812 > Does it mean we will hit this bug if an empty struct is returned? `EXCEPT` may not be the only way to create empty struct. yes, i think we will @cloud-fan

Re: [PR] [SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure [spark]

2023-12-29 Thread via GitHub
stefankandic commented on code in PR #44527: URL: https://github.com/apache/spark/pull/44527#discussion_r1438165576 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala: ## @@ -325,11 +325,17 @@ case class ExpressionEncoder[T]( assert(

Re: [PR] [SPARK-46490][SQL] Require error classes in `SparkThrowable` sub-classes [spark]

2023-12-29 Thread via GitHub
heyihong commented on code in PR #44464: URL: https://github.com/apache/spark/pull/44464#discussion_r1438176043 ## connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/SparkConnectClientSuite.scala: ## @@ -211,23 +211,27 @@ class SparkConnectClientSuit

Re: [PR] [SPARK-46490][SQL] Require error classes in `SparkThrowable` sub-classes [spark]

2023-12-29 Thread via GitHub
heyihong commented on code in PR #44464: URL: https://github.com/apache/spark/pull/44464#discussion_r1438179268 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala: ## @@ -230,55 +230,75 @@ private[client] object GrpcExcept

Re: [PR] [SPARK-46490][SQL] Require error classes in `SparkThrowable` sub-classes [spark]

2023-12-29 Thread via GitHub
heyihong commented on code in PR #44464: URL: https://github.com/apache/spark/pull/44464#discussion_r1438180887 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala: ## @@ -230,55 +230,75 @@ private[client] object GrpcExcept

Re: [PR] [SPARK-46490][SQL] Require error classes in `SparkThrowable` sub-classes [spark]

2023-12-29 Thread via GitHub
heyihong commented on code in PR #44464: URL: https://github.com/apache/spark/pull/44464#discussion_r1438184661 ## connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/SparkConnectClientSuite.scala: ## @@ -211,23 +211,27 @@ class SparkConnectClientSuit

Re: [PR] [SPARK-46490][SQL] Require error classes in `SparkThrowable` sub-classes [spark]

2023-12-29 Thread via GitHub
MaxGekk commented on code in PR #44464: URL: https://github.com/apache/spark/pull/44464#discussion_r1438203537 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala: ## @@ -230,55 +230,75 @@ private[client] object GrpcExcepti

[PR] Collations proof of concept [spark]

2023-12-29 Thread via GitHub
dbatomic opened a new pull request, #44537: URL: https://github.com/apache/spark/pull/44537 # Rough POC for collations in Spark ## High level changes - Collation suite that tests currently supported features (start with this file). - Global, singleton `CollatorFactory`. For given

Re: [PR] [SPARK-46490][SQL] Require error classes in `SparkThrowable` sub-classes [spark]

2023-12-29 Thread via GitHub
heyihong commented on code in PR #44464: URL: https://github.com/apache/spark/pull/44464#discussion_r1438236564 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala: ## @@ -230,55 +230,75 @@ private[client] object GrpcExcept

Re: [PR] [SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure [spark]

2023-12-29 Thread via GitHub
beliefer commented on code in PR #44527: URL: https://github.com/apache/spark/pull/44527#discussion_r1438237574 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala: ## @@ -325,11 +325,17 @@ case class ExpressionEncoder[T]( assert(seri

Re: [PR] [SPARK-46331][SQL] Removing CodegenFallback from subset of DateTime expressions and version() expression [spark]

2023-12-29 Thread via GitHub
dbatomic commented on PR #44261: URL: https://github.com/apache/spark/pull/44261#issuecomment-1872068792 > LGTM except for [#44261 (comment)](https://github.com/apache/spark/pull/44261#discussion_r1437460354) . Can we verify the test coverage for current datetime functions in `ComputeCurre

Re: [PR] [SPARK-46331][SQL] Removing CodegenFallback from subset of DateTime expressions and version() expression [spark]

2023-12-29 Thread via GitHub
cloud-fan commented on code in PR #44261: URL: https://github.com/apache/spark/pull/44261#discussion_r1438257053 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/DateExpressionsSuite.scala: ## @@ -960,9 +946,9 @@ class DateExpressionsSuite extends SparkFu

Re: [PR] [SPARK-46490][SQL] Require error classes in `SparkThrowable` sub-classes [spark]

2023-12-29 Thread via GitHub
MaxGekk commented on code in PR #44464: URL: https://github.com/apache/spark/pull/44464#discussion_r1438258631 ## connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala: ## @@ -230,55 +230,75 @@ private[client] object GrpcExcepti

Re: [PR] [SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure [spark]

2023-12-29 Thread via GitHub
cloud-fan commented on code in PR #44527: URL: https://github.com/apache/spark/pull/44527#discussion_r1438259318 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala: ## @@ -325,11 +325,17 @@ case class ExpressionEncoder[T]( assert(ser

Re: [PR] [SPARK-46490][SQL] Require error classes in `SparkThrowable` sub-classes [spark]

2023-12-29 Thread via GitHub
MaxGekk commented on code in PR #44464: URL: https://github.com/apache/spark/pull/44464#discussion_r1438269331 ## connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/SparkConnectClientSuite.scala: ## @@ -211,23 +211,27 @@ class SparkConnectClientSuite

[PR] [WIP][SPARK-46536]Support GROUP BY calendar_interval_type [spark]

2023-12-29 Thread via GitHub
stefankandic opened a new pull request, #44538: URL: https://github.com/apache/spark/pull/44538 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### Ho

Re: [PR] [SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure [spark]

2023-12-29 Thread via GitHub
stefankandic commented on code in PR #44527: URL: https://github.com/apache/spark/pull/44527#discussion_r1438299590 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala: ## @@ -325,11 +325,17 @@ case class ExpressionEncoder[T]( assert(

Re: [PR] [SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure [spark]

2023-12-29 Thread via GitHub
milastdbx commented on code in PR #44527: URL: https://github.com/apache/spark/pull/44527#discussion_r1438321786 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala: ## @@ -325,11 +325,17 @@ case class ExpressionEncoder[T]( assert(ser

Re: [PR] [SPARK-46539][SQL] SELECT * EXCEPT(all fields from a struct) results in an assertion failure [spark]

2023-12-29 Thread via GitHub
stefankandic commented on code in PR #44527: URL: https://github.com/apache/spark/pull/44527#discussion_r1438336908 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoder.scala: ## @@ -325,11 +325,17 @@ case class ExpressionEncoder[T]( assert(

Re: [PR] [SPARK-39800][SQL][WIP] DataSourceV2: View Support [spark]

2023-12-29 Thread via GitHub
jzhuge commented on PR #44197: URL: https://github.com/apache/spark/pull/44197#issuecomment-1872227012 Almost done converting to v2 command framework. Seeing great reduction in changes.

Re: [PR] [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources [spark]

2023-12-29 Thread via GitHub
HyukjinKwon commented on PR #44519: URL: https://github.com/apache/spark/pull/44519#issuecomment-1872401581 @LuciferYang @dongjoon-hyun @zhengruifeng if anyone is online, can you merge this one please? I will be away from keyboard today, and this technically fixes the build.

Re: [PR] [MiNOR][DOCS] Fix a typo in HashAggregateExec.scala [spark]

2023-12-29 Thread via GitHub
github-actions[bot] commented on PR #42916: URL: https://github.com/apache/spark/pull/42916#issuecomment-1872402298 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

Re: [PR] [SPARK-44459][Core] Add System.runFinalization() to periodic cleanup [spark]

2023-12-29 Thread via GitHub
github-actions[bot] closed pull request #42893: [SPARK-44459][Core] Add System.runFinalization() to periodic cleanup URL: https://github.com/apache/spark/pull/42893

Re: [PR] [SPARK-44812][SQL] Push all predicates according to EqualTo and EqualNullSafe [spark]

2023-12-29 Thread via GitHub
github-actions[bot] closed pull request #42495: [SPARK-44812][SQL] Push all predicates according to EqualTo and EqualNullSafe URL: https://github.com/apache/spark/pull/42495

Re: [PR] [SPARK-45199][SQL] Release cast from attribute in filter to support predicate push down [spark]

2023-12-29 Thread via GitHub
github-actions[bot] closed pull request #42978: [SPARK-45199][SQL] Release cast from attribute in filter to support predicate push down URL: https://github.com/apache/spark/pull/42978

Re: [PR] [SPARK-43496][KUBERNETES] Add configuration for pod memory limits [spark]

2023-12-29 Thread via GitHub
github-actions[bot] closed pull request #41067: [SPARK-43496][KUBERNETES] Add configuration for pod memory limits URL: https://github.com/apache/spark/pull/41067

Re: [PR] [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources [spark]

2023-12-29 Thread via GitHub
zhengruifeng closed pull request #44519: [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources URL: https://github.com/apache/spark/pull/44519

Re: [PR] [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources [spark]

2023-12-29 Thread via GitHub
HyukjinKwon commented on PR #44519: URL: https://github.com/apache/spark/pull/44519#issuecomment-1872405120 Thx thx

Re: [PR] [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources [spark]

2023-12-29 Thread via GitHub
zhengruifeng commented on PR #44519: URL: https://github.com/apache/spark/pull/44519#issuecomment-1872405056 merged to master

Re: [PR] [SPARK-46509][CORE][SS] Replace `.reverse.find` with `.findLast` [spark]

2023-12-29 Thread via GitHub
dongjoon-hyun closed pull request #44495: [SPARK-46509][CORE][SS] Replace `.reverse.find` with `.findLast` URL: https://github.com/apache/spark/pull/44495
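
For readers skimming the archive, the substitution named in this PR title can be sketched as follows. This is only an illustration, assuming Scala 2.13 (where `Seq` provides `findLast`); the object name, helper names, and sample data are hypothetical and are not taken from the PR.

```scala
// Illustrative sketch only; names and data below are hypothetical.
object FindLastSketch {
  // Older idiom: builds a reversed copy, then scans it from the front.
  def lastMatchViaReverse[A](xs: Seq[A])(p: A => Boolean): Option[A] =
    xs.reverse.find(p)

  // Scala 2.13 idiom: asks the collection for the last matching element directly.
  def lastMatchViaFindLast[A](xs: Seq[A])(p: A => Boolean): Option[A] =
    xs.findLast(p)

  def main(args: Array[String]): Unit = {
    val sample = Vector(3, 8, 5, 12, 7)
    // Both calls look for the last even number in the sample.
    println(lastMatchViaReverse(sample)(_ % 2 == 0))
    println(lastMatchViaFindLast(sample)(_ % 2 == 0))
  }
}
```

Both helpers return the same `Option`; the second simply states the intent in one call.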

Re: [PR] [SPARK-46509][CORE][SS] Replace `.reverse.find` with `.findLast` [spark]

2023-12-29 Thread via GitHub
dongjoon-hyun commented on PR #44495: URL: https://github.com/apache/spark/pull/44495#issuecomment-1872410647 +1, LGTM. Merged to master. Thank you, @LuciferYang .

[PR] [INFRA] Pin `lxml==4.9.4` [spark]

2023-12-29 Thread via GitHub
zhengruifeng opened a new pull request, #44539: URL: https://github.com/apache/spark/pull/44539 ### What changes were proposed in this pull request? Pin 'lxml==4.9.4' ### Why are the changes needed? it seems the newly released lxml 5.0.0 breaks the CI ``` Collecting

Re: [PR] [INFRA] Pin `lxml==4.9.4` [spark]

2023-12-29 Thread via GitHub
zhengruifeng commented on PR #44539: URL: https://github.com/apache/spark/pull/44539#issuecomment-1872411270 let's wait for CI: https://github.com/zhengruifeng/spark/actions/runs/7361000751

Re: [PR] [INFRA] Pin `lxml==4.9.4` [spark]

2023-12-29 Thread via GitHub
zhengruifeng commented on PR #44539: URL: https://github.com/apache/spark/pull/44539#issuecomment-1872412049 the `Install Python packages (Python 3.9)` step in `SQL - Slow tests` now runs successfully

Re: [PR] [SPARK-46545][INFRA] Pin `lxml==4.9.4` [spark]

2023-12-29 Thread via GitHub
zhengruifeng commented on PR #44539: URL: https://github.com/apache/spark/pull/44539#issuecomment-1872412298 cc @dongjoon-hyun @LuciferYang

Re: [PR] [SPARK-43496][KUBERNETES] Add configuration for pod memory limits [spark]

2023-12-29 Thread via GitHub
yerenkow commented on PR #41067: URL: https://github.com/apache/spark/pull/41067#issuecomment-1872414623 Can anyone PTAL at this?

Re: [PR] [SPARK-46542][SQL] Remove the check for `c>=0` from `ExternalCatalogUtils#needsEscaping` as it is always true [spark]

2023-12-29 Thread via GitHub
dongjoon-hyun closed pull request #44533: [SPARK-46542][SQL] Remove the check for `c>=0` from `ExternalCatalogUtils#needsEscaping` as it is always true URL: https://github.com/apache/spark/pull/44533
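
The title's claim that `c>=0` is always true can be checked with a small sketch, assuming `c` is a Scala `Char` (the archive preview does not show the method body, so that is an assumption). The object and helper names are hypothetical.

```scala
// Illustrative sketch only; assumes the checked value is a Char.
object CharRangeSketch {
  // Hypothetical helper mirroring the kind of check the PR title calls redundant.
  def isNonNegative(c: Char): Boolean = c >= 0

  def main(args: Array[String]): Unit = {
    // Char is an unsigned 16-bit value, so its smallest value is 0.
    println(Char.MinValue.toInt)
    // Converting a negative Int wraps around instead of producing a negative Char.
    val wrapped: Char = (-1).toChar
    println(wrapped.toInt)
    println(isNonNegative(wrapped))
  }
}
```

Because no `Char` can be negative, a guard of this shape never filters anything out, which is consistent with the removal described in the title.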

Re: [PR] [SPARK-46542][SQL] Remove the check for `c>=0` from `ExternalCatalogUtils#needsEscaping` as it is always true [spark]

2023-12-29 Thread via GitHub
dongjoon-hyun commented on PR #44533: URL: https://github.com/apache/spark/pull/44533#issuecomment-1872416373 Merged to master. Thank you, @LuciferYang .

Re: [PR] [SPARK-46545][INFRA] Pin `lxml==4.9.4` [spark]

2023-12-29 Thread via GitHub
dongjoon-hyun closed pull request #44539: [SPARK-46545][INFRA] Pin `lxml==4.9.4` URL: https://github.com/apache/spark/pull/44539

Re: [PR] [SPARK-46545][INFRA] Pin `lxml==4.9.4` [spark]

2023-12-29 Thread via GitHub
dongjoon-hyun commented on PR #44539: URL: https://github.com/apache/spark/pull/44539#issuecomment-1872416800 Merged to master.

Re: [PR] [SPARK-46545][INFRA] Pin `lxml==4.9.4` [spark]

2023-12-29 Thread via GitHub
zhengruifeng commented on PR #44539: URL: https://github.com/apache/spark/pull/44539#issuecomment-1872416935 thanks @dongjoon-hyun

Re: [PR] [SPARK-46531][BUILD] Move the dependency management of `datasketches-java` from `catalyst` to `parent` [spark]

2023-12-29 Thread via GitHub
dongjoon-hyun closed pull request #44521: [SPARK-46531][BUILD] Move the dependency management of `datasketches-java` from `catalyst` to `parent` URL: https://github.com/apache/spark/pull/44521

Re: [PR] [SPARK-46531][BUILD] Move the dependency management of `datasketches-java` from `catalyst` to `parent` [spark]

2023-12-29 Thread via GitHub
dongjoon-hyun commented on PR #44521: URL: https://github.com/apache/spark/pull/44521#issuecomment-1872417207 +1, LGTM. Merged to master. Thank you, @LuciferYang .

Re: [PR] [SPARK-46543][PYTHON][CONNECT] Make `json_tuple` throw PySparkValueError for empty fields [spark]

2023-12-29 Thread via GitHub
zhengruifeng commented on PR #44534: URL: https://github.com/apache/spark/pull/44534#issuecomment-1872422308 ci: https://github.com/zhengruifeng/spark/actions/runs/7361173505

Re: [PR] Identify issue - Run Spark on Kubernetes Integration test failure [spark]

2023-12-29 Thread via GitHub
panbingkun closed pull request #44530: Identify issue - Run Spark on Kubernetes Integration test failure URL: https://github.com/apache/spark/pull/44530

[PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-kubernetes` and `running-on-yarn` pages [spark]

2023-12-29 Thread via GitHub
panbingkun opened a new pull request, #44540: URL: https://github.com/apache/spark/pull/44540 ### What changes were proposed in this pull request? This PR aims to fix the formatting of tables in `running-on-kubernetes` and `running-on-yarn` pages. ### Why are the changes needed?

Re: [PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-kubernetes` and `running-on-yarn` pages [spark]

2023-12-29 Thread via GitHub
panbingkun commented on PR #44540: URL: https://github.com/apache/spark/pull/44540#issuecomment-1872428044 1. https://spark.apache.org/docs/latest/running-on-kubernetes.html#configuration Before: https://github.com/apache/spark/assets/15246973/25207f6e-0d60-4381-89c1-2c7361bb72eb

Re: [PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-kubernetes` and `running-on-yarn` pages [spark]

2023-12-29 Thread via GitHub
panbingkun commented on PR #44540: URL: https://github.com/apache/spark/pull/44540#issuecomment-1872428686 cc @zhengruifeng @HyukjinKwon @dongjoon-hyun @LuciferYang

[PR] debug appveyor [spark]

2023-12-29 Thread via GitHub
panbingkun opened a new pull request, #44541: URL: https://github.com/apache/spark/pull/44541 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ### How

Re: [PR] [SPARK-46504][PS][TESTS][FOLLOWUPS] Moving slow tests out of `IndexesTests`: conversion and drop methods [spark]

2023-12-29 Thread via GitHub
zhengruifeng commented on PR #44536: URL: https://github.com/apache/spark/pull/44536#issuecomment-1872432447 cc @HyukjinKwon @itholic

Re: [PR] [SPARK-46543][PYTHON][CONNECT] Make `json_tuple` throw PySparkValueError for empty fields [spark]

2023-12-29 Thread via GitHub
zhengruifeng commented on PR #44534: URL: https://github.com/apache/spark/pull/44534#issuecomment-1872433182 cc @HyukjinKwon and @itholic

Re: [PR] [SPARK-46531][BUILD] Move the dependency management of `datasketches-java` from `catalyst` to `parent` [spark]

2023-12-29 Thread via GitHub
LuciferYang commented on PR #44521: URL: https://github.com/apache/spark/pull/44521#issuecomment-1872454895 Thanks @dongjoon-hyun ~

Re: [PR] [SPARK-46545][INFRA] Pin `lxml==4.9.4` [spark]

2023-12-29 Thread via GitHub
LuciferYang commented on PR #44539: URL: https://github.com/apache/spark/pull/44539#issuecomment-1872454941 late LGTM

Re: [PR] [SPARK-46542][SQL] Remove the check for `c>=0` from `ExternalCatalogUtils#needsEscaping` as it is always true [spark]

2023-12-29 Thread via GitHub
LuciferYang commented on PR #44533: URL: https://github.com/apache/spark/pull/44533#issuecomment-1872454973 Thanks @dongjoon-hyun

Re: [PR] [SPARK-46509][CORE][SS] Replace `.reverse.find` with `.findLast` [spark]

2023-12-29 Thread via GitHub
LuciferYang commented on PR #44495: URL: https://github.com/apache/spark/pull/44495#issuecomment-1872455014 Thanks @dongjoon-hyun

Re: [PR] [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources [spark]

2023-12-29 Thread via GitHub
LuciferYang commented on PR #44519: URL: https://github.com/apache/spark/pull/44519#issuecomment-1872455892 late LGTM, thanks @HyukjinKwon for fixing this

Re: [PR] [SPARK-46546][DOCS] Fix the formatting of tables in `running-on-kubernetes` and `running-on-yarn` pages [spark]

2023-12-29 Thread via GitHub
nchammas commented on PR #44540: URL: https://github.com/apache/spark/pull/44540#issuecomment-1872460079 I think a better approach to fixing the styling of tables is to assign them to a class and then define the appropriate rules in [`custom.css`][1]. It will be much simpler and more robust

Re: [PR] [SPARK-45862][PYTHON][DOCS] Add user guide for basic dataframe operations [spark]

2023-12-29 Thread via GitHub
nchammas commented on code in PR #43972: URL: https://github.com/apache/spark/pull/43972#discussion_r1438476161 ## python/docs/source/user_guide/basic_dataframe_operations.rst: ## @@ -0,0 +1,169 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +or more cont

Re: [PR] [SPARK-46504][PS][TESTS][FOLLOWUPS] Moving slow tests out of `IndexesTests`: conversion and drop methods [spark]

2023-12-29 Thread via GitHub
zhengruifeng closed pull request #44536: [SPARK-46504][PS][TESTS][FOLLOWUPS] Moving slow tests out of `IndexesTests`: conversion and drop methods URL: https://github.com/apache/spark/pull/44536

Re: [PR] [SPARK-46504][PS][TESTS][FOLLOWUPS] Moving slow tests out of `IndexesTests`: conversion and drop methods [spark]

2023-12-29 Thread via GitHub
zhengruifeng commented on PR #44536: URL: https://github.com/apache/spark/pull/44536#issuecomment-1872461987 merged to master

Re: [PR] [SPARK-46179][SQL] Add CrossDbmsQueryTestSuites, which allows generating golden files with other DBMS, starting off with Postgres [spark]

2023-12-29 Thread via GitHub
andylam-db commented on PR #44084: URL: https://github.com/apache/spark/pull/44084#issuecomment-1872466384 > Our goal is to guarantee Spark produces the same result as the reference DBMS, why do we want to have one golden file for each DBMS? What if they are different? Which one should Spar

Re: [PR] [DOCS][MINOR] Fix typos [spark]

2023-12-29 Thread via GitHub
nchammas commented on PR #44276: URL: https://github.com/apache/spark/pull/44276#issuecomment-1872471792 cc @yaooqinn - Small typo fix PR.