[GitHub] [spark] HyukjinKwon commented on pull request #38480: [SPARK-40994][DOCS][SQL] Add code example in JDBC data source with partitionColumn

2022-11-08 Thread GitBox
HyukjinKwon commented on PR #38480: URL: https://github.com/apache/spark/pull/38480#issuecomment-1307042228 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HyukjinKwon closed pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

2022-11-08 Thread GitBox
HyukjinKwon closed pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2 URL: https://github.com/apache/spark/pull/38547 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] swamirishi commented on a diff in pull request #38377: [SPARK-40901][CORE] Unable to store Spark Driver logs with Absolute Hadoop based URI FS Path

2022-11-08 Thread GitBox
swamirishi commented on code in PR #38377: URL: https://github.com/apache/spark/pull/38377#discussion_r1016480277 ## core/src/main/scala/org/apache/spark/util/logging/DriverLogger.scala: ## @@ -126,13 +126,13 @@ private[spark] class DriverLogger(conf: SparkConf) extends

[GitHub] [spark] HyukjinKwon commented on pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

2022-11-08 Thread GitBox
HyukjinKwon commented on PR #38547: URL: https://github.com/apache/spark/pull/38547#issuecomment-1307039107 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] LuciferYang commented on pull request #38536: [SPARK-40984][CORE][SQL] Use `NON_FOLDABLE_INPUT` instead of `FRAME_LESS_OFFSET_WITHOUT_FOLDABLE`

2022-11-08 Thread GitBox
LuciferYang commented on PR #38536: URL: https://github.com/apache/spark/pull/38536#issuecomment-1307033431 Thanks @panbingkun @MaxGekk -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] LuciferYang commented on pull request #38530: [SPARK-41027][SQL] Use `UNEXPECTED_INPUT_TYPE` instead of `MAP_FROM_ENTRIES_WRONG_TYPE`

2022-11-08 Thread GitBox
LuciferYang commented on PR #38530: URL: https://github.com/apache/spark/pull/38530#issuecomment-1307033028 Thanks @MaxGekk @itholic -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] ulysses-you commented on a diff in pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

2022-11-08 Thread GitBox
ulysses-you commented on code in PR #38547: URL: https://github.com/apache/spark/pull/38547#discussion_r1016463908 ## sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala: ## @@ -129,7 +129,9 @@ class AlterTableAddPartitionSuite

[GitHub] [spark] ulysses-you opened a new pull request, #38558: [SPARK-41048][SQL] Improve output partitioning and ordering with AQE cache

2022-11-08 Thread GitBox
ulysses-you opened a new pull request, #38558: URL: https://github.com/apache/spark/pull/38558 ### What changes were proposed in this pull request? Try our best to give a stable output partitioning and ordering if current executed plan is final plan. ### Why are the

[GitHub] [spark] MaxGekk closed pull request #38447: [SPARK-40973][SQL] Rename `_LEGACY_ERROR_TEMP_0055` to `UNCLOSED_BRACKETED_COMMENT`

2022-11-08 Thread GitBox
MaxGekk closed pull request #38447: [SPARK-40973][SQL] Rename `_LEGACY_ERROR_TEMP_0055` to `UNCLOSED_BRACKETED_COMMENT` URL: https://github.com/apache/spark/pull/38447 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] MaxGekk commented on pull request #38447: [SPARK-40973][SQL] Rename `_LEGACY_ERROR_TEMP_0055` to `UNCLOSED_BRACKETED_COMMENT`

2022-11-08 Thread GitBox
MaxGekk commented on PR #38447: URL: https://github.com/apache/spark/pull/38447#issuecomment-1306990491 +1, LGTM. Merging to master. Thank you, @itholic. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] MaxGekk closed pull request #38530: [SPARK-41027][SQL] Use `UNEXPECTED_INPUT_TYPE` instead of `MAP_FROM_ENTRIES_WRONG_TYPE`

2022-11-08 Thread GitBox
MaxGekk closed pull request #38530: [SPARK-41027][SQL] Use `UNEXPECTED_INPUT_TYPE` instead of `MAP_FROM_ENTRIES_WRONG_TYPE` URL: https://github.com/apache/spark/pull/38530 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-08 Thread GitBox
zhengruifeng commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1016358065 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamHandler.scala: ## @@ -117,10 +129,91 @@ class

[GitHub] [spark] MaxGekk commented on pull request #38530: [SPARK-41027][SQL] Use `UNEXPECTED_INPUT_TYPE` instead of `MAP_FROM_ENTRIES_WRONG_TYPE`

2022-11-08 Thread GitBox
MaxGekk commented on PR #38530: URL: https://github.com/apache/spark/pull/38530#issuecomment-1306986962 +1, LGTM. Merging to master. Thank you, @LuciferYang and @itholic for review. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] MaxGekk closed pull request #38515: [SPARK-41015][SQL][PROTOBUF] UnitTest null check for data generator

2022-11-08 Thread GitBox
MaxGekk closed pull request #38515: [SPARK-41015][SQL][PROTOBUF] UnitTest null check for data generator URL: https://github.com/apache/spark/pull/38515 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] MaxGekk commented on pull request #38515: [SPARK-41015][SQL][PROTOBUF] UnitTest null check for data generator

2022-11-08 Thread GitBox
MaxGekk commented on PR #38515: URL: https://github.com/apache/spark/pull/38515#issuecomment-1306969204 +1, LGTM. Merging to master. Thank you, @SandishKumarHN and @rangadi for review. -- This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] MaxGekk closed pull request #38536: [SPARK-40984][CORE][SQL] Use `NON_FOLDABLE_INPUT` instead of `FRAME_LESS_OFFSET_WITHOUT_FOLDABLE`

2022-11-08 Thread GitBox
MaxGekk closed pull request #38536: [SPARK-40984][CORE][SQL] Use `NON_FOLDABLE_INPUT` instead of `FRAME_LESS_OFFSET_WITHOUT_FOLDABLE` URL: https://github.com/apache/spark/pull/38536 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] MaxGekk commented on pull request #38536: [SPARK-40984][CORE][SQL] Use `NON_FOLDABLE_INPUT` instead of `FRAME_LESS_OFFSET_WITHOUT_FOLDABLE`

2022-11-08 Thread GitBox
MaxGekk commented on PR #38536: URL: https://github.com/apache/spark/pull/38536#issuecomment-1306959410 +1, LGTM. Merging to master. Thank you, @LuciferYang. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] MaxGekk commented on a diff in pull request #38547: [SPARK-40798][SQL][TESTS][FOLLOW-UP] Disable ANSI at the test case for DSv2

2022-11-08 Thread GitBox
MaxGekk commented on code in PR #38547: URL: https://github.com/apache/spark/pull/38547#discussion_r1016411170 ## sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/AlterTableAddPartitionSuite.scala: ## @@ -129,7 +129,9 @@ class AlterTableAddPartitionSuite

[GitHub] [spark] AmplabJenkins commented on pull request #38555: [WIP][SPARK-41044][SQL] Convert DATATYPE_MISMATCH.UNSPECIFIED_FRAME to DATATYPE_MISMATCH.INTERNAL_ERROR

2022-11-08 Thread GitBox
AmplabJenkins commented on PR #38555: URL: https://github.com/apache/spark/pull/38555#issuecomment-1306947596 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] MaxGekk closed pull request #38548: [SPARK-40663][SQL][FOLLOWUP] `SparkIllegalArgumentException` should accept `cause`

2022-11-08 Thread GitBox
MaxGekk closed pull request #38548: [SPARK-40663][SQL][FOLLOWUP] `SparkIllegalArgumentException` should accept `cause` URL: https://github.com/apache/spark/pull/38548 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] c21 commented on pull request #38480: [SPARK-40994][DOCS][SQL] Add code example in JDBC data source with partitionColumn

2022-11-08 Thread GitBox
c21 commented on PR #38480: URL: https://github.com/apache/spark/pull/38480#issuecomment-1306928731 cc @HyukjinKwon / @cloud-fan - any comment before merging? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] MaxGekk commented on pull request #38537: [SPARK-41043][SQL] Rename the error class `_LEGACY_ERROR_TEMP_2429` to `COLUMNS_NUM_MISMATCH`

2022-11-08 Thread GitBox
MaxGekk commented on PR #38537: URL: https://github.com/apache/spark/pull/38537#issuecomment-1306921664 @cloud-fan @srielau @itholic @LuciferYang @panbingkun Could you review this PR, please. -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-08 Thread GitBox
zhengruifeng commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1016363328 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamHandler.scala: ## @@ -117,10 +129,91 @@ class

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38468: [SPARK-41005][CONNECT][PYTHON] Arrow-based collect

2022-11-08 Thread GitBox
zhengruifeng commented on code in PR #38468: URL: https://github.com/apache/spark/pull/38468#discussion_r1016362177 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectStreamHandler.scala: ## @@ -117,10 +129,91 @@ class

[GitHub] [spark] panbingkun commented on a diff in pull request #38555: [WIP][SPARK-41044][SQL] Convert DATATYPE_MISMATCH.UNSPECIFIED_FRAME to DATATYPE_MISMATCH.INTERNAL_ERROR

2022-11-08 Thread GitBox
panbingkun commented on code in PR #38555: URL: https://github.com/apache/spark/pull/38555#discussion_r1016344333 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala: ## @@ -66,7 +66,13 @@ case class WindowSpecDefinition( override

[GitHub] [spark] WeichenXu123 commented on a diff in pull request #37734: [SPARK-40264][ML] add batch_infer_udf function to pyspark.ml.functions

2022-11-08 Thread GitBox
WeichenXu123 commented on code in PR #37734: URL: https://github.com/apache/spark/pull/37734#discussion_r1016311501 ## python/pyspark/ml/functions.py: ## @@ -106,6 +117,556 @@ def array_to_vector(col: Column) -> Column: return

[GitHub] [spark] cloud-fan commented on pull request #38557: [SPARK-38959][SQL][FOLLOWUP] Optimizer batch `PartitionPruning` should optimize subqueries

2022-11-08 Thread GitBox
cloud-fan commented on PR #38557: URL: https://github.com/apache/spark/pull/38557#issuecomment-1306818720 cc @aokolnychyi @viirya @wangyum -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] cloud-fan commented on pull request #38526: [SPARK-38959][SQL][FOLLOW-UP] Address feedback for RowLevelOperationRuntimeGroupFiltering

2022-11-08 Thread GitBox
cloud-fan commented on PR #38526: URL: https://github.com/apache/spark/pull/38526#issuecomment-1306817732 For the open question, I'm addressing it in https://github.com/apache/spark/pull/38557 -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] cloud-fan closed pull request #38526: [SPARK-38959][SQL][FOLLOW-UP] Address feedback for RowLevelOperationRuntimeGroupFiltering

2022-11-08 Thread GitBox
cloud-fan closed pull request #38526: [SPARK-38959][SQL][FOLLOW-UP] Address feedback for RowLevelOperationRuntimeGroupFiltering URL: https://github.com/apache/spark/pull/38526 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] cloud-fan commented on pull request #38526: [SPARK-38959][SQL][FOLLOW-UP] Address feedback for RowLevelOperationRuntimeGroupFiltering

2022-11-08 Thread GitBox
cloud-fan commented on PR #38526: URL: https://github.com/apache/spark/pull/38526#issuecomment-1306816909 thanks, merging to master! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] cloud-fan opened a new pull request, #38557: [SPARK-38959][SQL][FOLLOWUP] Optimizer batch `PartitionPruning` should optimize subqueries

2022-11-08 Thread GitBox
cloud-fan opened a new pull request, #38557: URL: https://github.com/apache/spark/pull/38557 ### What changes were proposed in this pull request? This is a followup to https://github.com/apache/spark/pull/36304 to simplify `RowLevelOperationRuntimeGroupFiltering`. It does 3

[GitHub] [spark] EnricoMi commented on pull request #38358: [SPARK-40588] FileFormatWriter materializes AQE plan before accessing outputOrdering

2022-11-08 Thread GitBox
EnricoMi commented on PR #38358: URL: https://github.com/apache/spark/pull/38358#issuecomment-1306782288 There is no Spark 3.x release that does not suffer from this. This blocks people from moving to Spark 3, while Spark 3.0 and 3.1 are already EOL. Please reconsider providing a fix
