[GitHub] [spark] itholic commented on pull request #42954: [WIP][SPARK-43458][SPARK-43561][PS][TESTS] Enable `test_to_latex` for (Series|DataFrame) conversion

2023-09-16 Thread via GitHub
itholic commented on PR #42954: URL: https://github.com/apache/spark/pull/42954#issuecomment-1722155802 We've been disabling this test so far since we pin `jinja2<3.0.0` for documentation build, but ``jinja2>=3.1.2`` is required for `to_latex` from Pandas 2.0.0. But I just realized t
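The version constraint described in this comment can be expressed as a test guard. The following is only a minimal sketch, not the code from the PR: the helper names `parse_version` and `jinja2_is_at_least` and the test class are hypothetical, and the `3.1.2` threshold is taken from the comment above.

```python
import unittest
from importlib import metadata


def parse_version(text):
    """Turn '3.1.2' into (3, 1, 2), dropping any non-numeric suffix (hypothetical helper)."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def jinja2_is_at_least(minimum):
    """Return True if jinja2 is installed and its version is >= minimum (hypothetical helper)."""
    try:
        installed = metadata.version("jinja2")
    except metadata.PackageNotFoundError:
        return False
    return parse_version(installed) >= parse_version(minimum)


class ToLatexSketch(unittest.TestCase):
    # Skip rather than fail when the environment pins an older jinja2.
    @unittest.skipIf(
        not jinja2_is_at_least("3.1.2"),
        "to_latex needs jinja2>=3.1.2 with pandas 2.0.0 or later",
    )
    def test_to_latex(self):
        import pandas as pd  # imported lazily so the module loads without pandas

        latex = pd.DataFrame({"a": [1, 2]}).to_latex()
        self.assertIn("tabular", latex)
```

Running the module with `python -m unittest` would report the test as skipped, with the reason shown, in an environment that still pins `jinja2<3.0.0`.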

[GitHub] [spark] Hisoka-X commented on a diff in pull request #42952: [SPARK-45184][SQL] Remove orphaned error class documents

2023-09-16 Thread via GitHub
Hisoka-X commented on code in PR #42952: URL: https://github.com/apache/spark/pull/42952#discussion_r1327921647 ## core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala: ## @@ -222,6 +223,21 @@ class SparkThrowableSuite extends SparkFunSuite { |---""".stripMar

[GitHub] [spark] Hisoka-X commented on a diff in pull request #42952: [SPARK-45184][SQL] Remove orphaned error class documents

2023-09-16 Thread via GitHub
Hisoka-X commented on code in PR #42952: URL: https://github.com/apache/spark/pull/42952#discussion_r1327921121 ## core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala: ## @@ -323,6 +339,23 @@ class SparkThrowableSuite extends SparkFunSuite { assert(sqlErrorPare

[GitHub] [spark] itholic commented on a diff in pull request #42938: [SPARK-44788][CONNECT][PYTHON][SQL] Add from_xml and schema_of_xml to pyspark, spark connect and sql function

2023-09-16 Thread via GitHub
itholic commented on code in PR #42938: URL: https://github.com/apache/spark/pull/42938#discussion_r1327920967 ## python/pyspark/errors/error_classes.py: ## @@ -477,6 +477,11 @@ "Argument `<arg_name>` should be a Column or str, got <arg_type>." ] }, + "NOT_COLUMN_OR_STR_OR_STRUCT" :

[GitHub] [spark] itholic commented on a diff in pull request #42938: [SPARK-44788][CONNECT][PYTHON][SQL] Add from_xml and schema_of_xml to pyspark, spark connect and sql function

2023-09-16 Thread via GitHub
itholic commented on code in PR #42938: URL: https://github.com/apache/spark/pull/42938#discussion_r1327925718 ## python/pyspark/sql/tests/test_functions.py: ## @@ -1286,6 +1281,27 @@ def test_from_csv(self): message_parameters={"arg_name": "schema", "arg_type": "in

[GitHub] [spark] itholic opened a new pull request, #42955: [SPARK-43628][SPARK-43629][CONNECT][PS][TESTS] Clear message for JVM dependent tests.

2023-09-16 Thread via GitHub
itholic opened a new pull request, #42955: URL: https://github.com/apache/spark/pull/42955 ### What changes were proposed in this pull request? This PR proposes to correct the message for JVM only tests from Spark Connect, and enable the tests when possible to workaround without JVM f

[GitHub] [spark] itholic commented on a diff in pull request #42955: [SPARK-43628][SPARK-43629][CONNECT][PS][TESTS] Clear message for JVM dependent tests.

2023-09-16 Thread via GitHub
itholic commented on code in PR #42955: URL: https://github.com/apache/spark/pull/42955#discussion_r1327935904 ## python/pyspark/pandas/tests/computation/test_compute.py: ## @@ -101,16 +101,10 @@ def test_mode(self): with self.assertRaises(ValueError): psdf

[GitHub] [spark] panbingkun commented on a diff in pull request #42952: [SPARK-45184][SQL] Remove orphaned error class documents

2023-09-16 Thread via GitHub
panbingkun commented on code in PR #42952: URL: https://github.com/apache/spark/pull/42952#discussion_r1327937136 ## core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala: ## @@ -222,6 +223,21 @@ class SparkThrowableSuite extends SparkFunSuite { |---""".stripM

[GitHub] [spark] itholic opened a new pull request, #42956: [SPARK-43654][CONNECT][PS][TESTS] Enable `InternalFrameParityTests.test_from_pandas`

2023-09-16 Thread via GitHub
itholic opened a new pull request, #42956: URL: https://github.com/apache/spark/pull/42956 ### What changes were proposed in this pull request? This PR proposes to enable `InternalFrameParityTests.test_from_pandas` ### Why are the changes needed? To improv

[GitHub] [spark] itholic commented on a diff in pull request #42956: [SPARK-43654][CONNECT][PS][TESTS] Enable `InternalFrameParityTests.test_from_pandas`

2023-09-16 Thread via GitHub
itholic commented on code in PR #42956: URL: https://github.com/apache/spark/pull/42956#discussion_r1327938259 ## python/pyspark/pandas/tests/connect/test_parity_internal.py: ## @@ -15,18 +15,86 @@ # limitations under the License. # import unittest +import pandas as pd fro

[GitHub] [spark] Daniel-Davies commented on pull request #42951: [SPARK-45078][SQL] Fix `array_insert` ImplicitCastInputTypes not work

2023-09-16 Thread via GitHub
Daniel-Davies commented on PR #42951: URL: https://github.com/apache/spark/pull/42951#issuecomment-1722195288 Thank you for fixing this @Hisoka-X! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] MaxGekk opened a new pull request, #42957: [WIP][SQL] Update error messages related to parameterized `sql()`

2023-09-16 Thread via GitHub
MaxGekk opened a new pull request, #42957: URL: https://github.com/apache/spark/pull/42957 ### What changes were proposed in this pull request? TODO ### Why are the changes needed? TODO ### Does this PR introduce _any_ user-facing change? No. ### How was this pat

[GitHub] [spark] panbingkun commented on pull request #42881: [SPARK-45122][DOCS] Automate updating versions.json

2023-09-16 Thread via GitHub
panbingkun commented on PR #42881: URL: https://github.com/apache/spark/pull/42881#issuecomment-1722216562 cc @srowen @HyukjinKwon -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

[GitHub] [spark] srowen commented on pull request #42919: [SPARK-45146][DOCS]Update the default value of 'spark.executor.logs.rolling.strategy'

2023-09-16 Thread via GitHub
srowen commented on PR #42919: URL: https://github.com/apache/spark/pull/42919#issuecomment-178752 Merged to master. You reused a JIRA, when this should use a new JIRA. They're really closely related so it's OK here, but you should edit the JIRA to describe the second change you mad

[GitHub] [spark] srowen closed pull request #42919: [SPARK-45146][DOCS]Update the default value of 'spark.executor.logs.rolling.strategy'

2023-09-16 Thread via GitHub
srowen closed pull request #42919: [SPARK-45146][DOCS]Update the default value of 'spark.executor.logs.rolling.strategy' URL: https://github.com/apache/spark/pull/42919 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] srowen commented on pull request #42881: [SPARK-45122][DOCS] Automate updating versions.json

2023-09-16 Thread via GitHub
srowen commented on PR #42881: URL: https://github.com/apache/spark/pull/42881#issuecomment-178939 Seems OK. We also have to at some point remove older releases from the dropdown -- This is an automated message from the Apache Git Service. To respond to the message, please log on to G

[GitHub] [spark] panbingkun commented on pull request #42881: [SPARK-45122][DOCS] Automate updating versions.json

2023-09-16 Thread via GitHub
panbingkun commented on PR #42881: URL: https://github.com/apache/spark/pull/42881#issuecomment-1722232012 > Seems OK. We also have to at some point remove older releases from the dropdown At present, I haven't thought of it because it seems difficult to define `what is older release

[GitHub] [spark] panbingkun commented on pull request #42883: [SPARK-45127][DOCS] Exclude README.md from document build

2023-09-16 Thread via GitHub
panbingkun commented on PR #42883: URL: https://github.com/apache/spark/pull/42883#issuecomment-1722232170 cc @srowen -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [spark] Hisoka-X commented on pull request #42951: [SPARK-45078][SQL] Fix `array_insert` ImplicitCastInputTypes not work

2023-09-16 Thread via GitHub
Hisoka-X commented on PR #42951: URL: https://github.com/apache/spark/pull/42951#issuecomment-1722234399 > Can't we require if someone extends `ExpectsInputTypes` then `inputTypes` shall return a non-empty `Seq`? Maybe because empty means do nothing this time? -- This is an automat

[GitHub] [spark] grundprinzip commented on a diff in pull request #42929: [SPARK-45167][CONNECT][PYTHON] Python client must call `release_all`

2023-09-16 Thread via GitHub
grundprinzip commented on code in PR #42929: URL: https://github.com/apache/spark/pull/42929#discussion_r1327963867 ## python/pyspark/sql/connect/client/reattach.py: ## @@ -14,14 +14,16 @@ # See the License for the specific language governing permissions and # limitations unde

[GitHub] [spark] MaxGekk commented on pull request #42951: [SPARK-45078][SQL] Fix `array_insert` ImplicitCastInputTypes not work

2023-09-16 Thread via GitHub
MaxGekk commented on PR #42951: URL: https://github.com/apache/spark/pull/42951#issuecomment-1722235564 > Maybe because empty means do nothing this time? If it does nothing, why expect any input types at all, as extending `ExpectsInputTypes` assumes? -- This is an automa

[GitHub] [spark] srowen closed pull request #42883: [SPARK-45127][DOCS] Exclude README.md from document build

2023-09-16 Thread via GitHub
srowen closed pull request #42883: [SPARK-45127][DOCS] Exclude README.md from document build URL: https://github.com/apache/spark/pull/42883 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the speci

[GitHub] [spark] srowen commented on pull request #42883: [SPARK-45127][DOCS] Exclude README.md from document build

2023-09-16 Thread via GitHub
srowen commented on PR #42883: URL: https://github.com/apache/spark/pull/42883#issuecomment-1722237524 Merged to master/3.5/3.4/3.3 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific co

[GitHub] [spark] Hisoka-X commented on pull request #42951: [SPARK-45078][SQL] Fix `array_insert` ImplicitCastInputTypes not work

2023-09-16 Thread via GitHub
Hisoka-X commented on PR #42951: URL: https://github.com/apache/spark/pull/42951#issuecomment-1722242679 > > Maybe because empty means do nothing this time? > > If do nothing, what's the reason to expect some input types which extending of `ExpectsInputTypes` assumes? Different

[GitHub] [spark] MaxGekk commented on pull request #42951: [SPARK-45078][SQL] Fix `array_insert` ImplicitCastInputTypes not work

2023-09-16 Thread via GitHub
MaxGekk commented on PR #42951: URL: https://github.com/apache/spark/pull/42951#issuecomment-1722282491 > Maybe because empty means do nothing this time? > You mean we should do like this? @MaxGekk Yep, eliminate the special meaning of empty `Seq`. -- This is an automated messag

[GitHub] [spark] gbloisi-openaire commented on pull request #42634: [SPARK-44910][SQL] Encoders.bean does not support superclasses with generic type arguments

2023-09-16 Thread via GitHub
gbloisi-openaire commented on PR #42634: URL: https://github.com/apache/spark/pull/42634#issuecomment-1722306800 Since I was not able to re-run all tests successfully anymore (tried a few times) I had to merge with master to get a [green workflow run](https://github.com/gbloisi-openaire/spa

[GitHub] [spark] itholic commented on pull request #42953: [SPARK-45185][BUILD][PYTHON] Ignore type check for preventing unexpected linter failure

2023-09-16 Thread via GitHub
itholic commented on PR #42953: URL: https://github.com/apache/spark/pull/42953#issuecomment-1722310243 For some reason, it doesn't seem to be causing any more errors, so let me just close it. https://github.com/apache/spark/assets/44108233/d433808d-4b64-4639-9695-1020af122789

[GitHub] [spark] itholic closed pull request #42953: [SPARK-45185][BUILD][PYTHON] Ignore type check for preventing unexpected linter failure

2023-09-16 Thread via GitHub
itholic closed pull request #42953: [SPARK-45185][BUILD][PYTHON] Ignore type check for preventing unexpected linter failure URL: https://github.com/apache/spark/pull/42953 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

[GitHub] [spark] itholic commented on a diff in pull request #42793: [SPARK-45065][PYTHON][PS] Support Pandas 2.1.0

2023-09-16 Thread via GitHub
itholic commented on code in PR #42793: URL: https://github.com/apache/spark/pull/42793#discussion_r1328008105 ## python/docs/source/migration_guide/pyspark_upgrade.rst: ## @@ -42,6 +42,8 @@ Upgrading from PySpark 3.5 to 4.0 * In Spark 4.0, ``squeeze`` parameter from ``ps.read_

[GitHub] [spark] itholic opened a new pull request, #42958: [SPARK-45168][PYTHON][FOLLOWUP] Add migration guide for Pandas minimum version upgrade

2023-09-16 Thread via GitHub
itholic opened a new pull request, #42958: URL: https://github.com/apache/spark/pull/42958 ### What changes were proposed in this pull request? This is follow-up for https://github.com/apache/spark/pull/42930 to add migration guide about the minimum version of supported Pandas is upgr

[GitHub] [spark] itholic commented on pull request #42958: [SPARK-45168][PYTHON][FOLLOWUP] Add migration guide for Pandas minimum version upgrade

2023-09-16 Thread via GitHub
itholic commented on PR #42958: URL: https://github.com/apache/spark/pull/42958#issuecomment-1722315155 cc @zhengruifeng -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] valentinp17 commented on pull request #42884: [SPARK-42304][FOLLOWUP][SQL] Add test for `GET_TABLES_BY_TYPE_UNSUPPORTED_BY_HIVE_VERSION`

2023-09-16 Thread via GitHub
valentinp17 commented on PR #42884: URL: https://github.com/apache/spark/pull/42884#issuecomment-1722320412 @itholic Done! But I messed up and did a merge on the first attempt, so I had to rebase and force-push. -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] sandip-db commented on a diff in pull request #42938: [SPARK-44788][CONNECT][PYTHON][SQL] Add from_xml and schema_of_xml to pyspark, spark connect and sql function

2023-09-16 Thread via GitHub
sandip-db commented on code in PR #42938: URL: https://github.com/apache/spark/pull/42938#discussion_r1328021810 ## python/pyspark/sql/functions.py: ## @@ -13041,6 +13041,120 @@ def json_object_keys(col: "ColumnOrName") -> Column: return _invoke_function_over_columns("json_

[GitHub] [spark] github-actions[bot] commented on pull request #41417: [SPARK-43908][SQL] Choose the bigger rowCount to initialize BloomFilterAggregate in InjectRuntimeFilter

2023-09-16 Thread via GitHub
github-actions[bot] commented on PR #41417: URL: https://github.com/apache/spark/pull/41417#issuecomment-1722349647 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #41108: [SPARK-43427][Protobuf] spark protobuf: modify serde behavior of unsigned integer types

2023-09-16 Thread via GitHub
github-actions[bot] commented on PR #41108: URL: https://github.com/apache/spark/pull/41108#issuecomment-1722349658 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #40990: [SPARK-43317][SQL] Support combine adjacent aggregation

2023-09-16 Thread via GitHub
github-actions[bot] commented on PR #40990: URL: https://github.com/apache/spark/pull/40990#issuecomment-1722349664 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] github-actions[bot] commented on pull request #39691: [SPARK-31561][SQL] Add QUALIFY clause

2023-09-16 Thread via GitHub
github-actions[bot] commented on PR #39691: URL: https://github.com/apache/spark/pull/39691#issuecomment-1722349673 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

[GitHub] [spark] Hisoka-X commented on pull request #42951: [SPARK-45078][SQL] Fix `array_insert` ImplicitCastInputTypes not work

2023-09-16 Thread via GitHub
Hisoka-X commented on PR #42951: URL: https://github.com/apache/spark/pull/42951#issuecomment-1722357585 `collectionOperations.scala` has a lot of `Seq.empty`. If we need to remove them all, I can create a PR for it. -- This is an automated message from the Apache Git Service. To respond

[GitHub] [spark] dongjoon-hyun commented on pull request #42954: [SPARK-43458][SPARK-43561][PS][TESTS] Enable `test_to_latex` for (Series|DataFrame) conversion

2023-09-16 Thread via GitHub
dongjoon-hyun commented on PR #42954: URL: https://github.com/apache/spark/pull/42954#issuecomment-1722384729 Yay! The CI passed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

[GitHub] [spark] dongjoon-hyun closed pull request #42954: [SPARK-43458][SPARK-43561][PS][TESTS] Enable `test_to_latex` for (Series|DataFrame) conversion

2023-09-16 Thread via GitHub
dongjoon-hyun closed pull request #42954: [SPARK-43458][SPARK-43561][PS][TESTS] Enable `test_to_latex` for (Series|DataFrame) conversion URL: https://github.com/apache/spark/pull/42954 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to Git

[GitHub] [spark] dongjoon-hyun commented on pull request #42954: [SPARK-43458][SPARK-43561][PS][TESTS] Enable `test_to_latex` for (Series|DataFrame) conversion

2023-09-16 Thread via GitHub
dongjoon-hyun commented on PR #42954: URL: https://github.com/apache/spark/pull/42954#issuecomment-1722384847 Merged to master for Apache Spark 4.0.0. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #42956: [SPARK-43654][CONNECT][PS][TESTS] Enable `InternalFrameParityTests.test_from_pandas`

2023-09-16 Thread via GitHub
dongjoon-hyun commented on code in PR #42956: URL: https://github.com/apache/spark/pull/42956#discussion_r1328040066 ## python/pyspark/pandas/tests/connect/test_parity_internal.py: ## @@ -15,18 +15,86 @@ # limitations under the License. # import unittest +import pandas as pd

[GitHub] [spark] dongjoon-hyun closed pull request #42958: [SPARK-45168][PYTHON][FOLLOWUP] Add migration guide for Pandas minimum version upgrade

2023-09-16 Thread via GitHub
dongjoon-hyun closed pull request #42958: [SPARK-45168][PYTHON][FOLLOWUP] Add migration guide for Pandas minimum version upgrade URL: https://github.com/apache/spark/pull/42958 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and