[GitHub] [spark] gengliangwang commented on a diff in pull request #38379: [SPARK-40903][SQL] Avoid reordering decimal Add for canonicalization

2022-10-24 Thread GitBox
gengliangwang commented on code in PR #38379: URL: https://github.com/apache/spark/pull/38379#discussion_r1004034513 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -477,7 +477,10 @@ case class Add( override protected def

[GitHub] [spark] gengliangwang commented on a diff in pull request #38379: [SPARK-40903][SQL] Avoid reordering decimal Add for canonicalization

2022-10-24 Thread GitBox
gengliangwang commented on code in PR #38379: URL: https://github.com/apache/spark/pull/38379#discussion_r1004033998 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -477,7 +477,10 @@ case class Add( override protected def

[GitHub] [spark] gengliangwang commented on a diff in pull request #38379: [SPARK-40903][SQL] Avoid reordering decimal Add for canonicalization

2022-10-24 Thread GitBox
gengliangwang commented on code in PR #38379: URL: https://github.com/apache/spark/pull/38379#discussion_r1004033863 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -477,7 +477,10 @@ case class Add( override protected def

[GitHub] [spark] MaxGekk commented on pull request #38350: [WIP][SPARK-40752][SQL] Migrate type check failures of misc expressions onto error classes

2022-10-24 Thread GitBox
MaxGekk commented on PR #38350: URL: https://github.com/apache/spark/pull/38350#issuecomment-1290021018 @panbingkun Please, fix the coding style issue: ``` [error]

[GitHub] [spark] MaxGekk commented on pull request #38359: [WIP][SPARK-40751][SQL] Migrate type check failures of high order functions onto error classes

2022-10-24 Thread GitBox
MaxGekk commented on PR #38359: URL: https://github.com/apache/spark/pull/38359#issuecomment-1290020506 @panbingkun Please, fix coding style issue: ``` [error]

[GitHub] [spark] cloud-fan commented on a diff in pull request #38312: [SPARK-40819][SQL] Timestamp nanos behaviour regression

2022-10-24 Thread GitBox
cloud-fan commented on code in PR #38312: URL: https://github.com/apache/spark/pull/38312#discussion_r1004026112 ## sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala: ## @@ -198,6 +205,31 @@ abstract class ParquetSchemaTest

[GitHub] [spark] cloud-fan commented on a diff in pull request #38345: [SPARK-40879][CONNECT] Support Join UsingColumns in proto

2022-10-24 Thread GitBox
cloud-fan commented on code in PR #38345: URL: https://github.com/apache/spark/pull/38345#discussion_r1004023351 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/dsl/package.scala: ## @@ -238,13 +238,17 @@ package object dsl { def join(

[GitHub] [spark] cloud-fan commented on a diff in pull request #38345: [SPARK-40879][CONNECT] Support Join UsingColumns in proto

2022-10-24 Thread GitBox
cloud-fan commented on code in PR #38345: URL: https://github.com/apache/spark/pull/38345#discussion_r1004022886 ## connector/connect/src/main/protobuf/spark/connect/relations.proto: ## @@ -109,6 +109,10 @@ message Join { Relation right = 2; Expression join_condition = 3;

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38375: [SPARK-40900][SQL] Reimplement `frequentItems` with dataframe operations

2022-10-24 Thread GitBox
zhengruifeng commented on code in PR #38375: URL: https://github.com/apache/spark/pull/38375#discussion_r1004021114 ## sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala: ## @@ -85,42 +51,144 @@ object FrequentItems extends Logging { cols:

[GitHub] [spark] MaxGekk commented on pull request #38360: [SPARK-40811][SQL][TESTS] Use `checkError()` to intercept `ParseException`: phase 2

2022-10-24 Thread GitBox
MaxGekk commented on PR #38360: URL: https://github.com/apache/spark/pull/38360#issuecomment-1290009670 @LuciferYang @panbingkun @itholic Could you review this PR, please. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] cloud-fan commented on a diff in pull request #38320: [SPARK-40857] [CONNECT] Enable configurable GPRC Interceptors.

2022-10-24 Thread GitBox
cloud-fan commented on code in PR #38320: URL: https://github.com/apache/spark/pull/38320#discussion_r1004001685 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala: ## @@ -26,4 +26,12 @@ private[spark] object Connect { .intConf

[GitHub] [spark] cloud-fan commented on a diff in pull request #38375: [SPARK-40900][SQL] Reimplement `frequentItems` with dataframe operations

2022-10-24 Thread GitBox
cloud-fan commented on code in PR #38375: URL: https://github.com/apache/spark/pull/38375#discussion_r1004000396 ## sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala: ## @@ -85,42 +51,144 @@ object FrequentItems extends Logging { cols:

[GitHub] [spark] itholic commented on a diff in pull request #37955: [SPARK-40512][SPARK-40896][PS][INFRA] Upgrade pandas to 1.5.0

2022-10-24 Thread GitBox
itholic commented on code in PR #37955: URL: https://github.com/apache/spark/pull/37955#discussion_r1003999111 ## python/pyspark/pandas/strings.py: ## @@ -2316,7 +2316,7 @@ def zfill(self, width: int) -> "ps.Series": left). 1000 remains unchanged as it is longer than

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38375: [SPARK-40900][SQL] Reimplement `frequentItems` with dataframe operations

2022-10-24 Thread GitBox
zhengruifeng commented on code in PR #38375: URL: https://github.com/apache/spark/pull/38375#discussion_r1003991925 ## sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala: ## @@ -85,42 +51,142 @@ object FrequentItems extends Logging { cols:

[GitHub] [spark] grundprinzip commented on a diff in pull request #38320: [SPARK-40857] [CONNECT] Enable configurable GPRC Interceptors.

2022-10-24 Thread GitBox
grundprinzip commented on code in PR #38320: URL: https://github.com/apache/spark/pull/38320#discussion_r1003991428 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala: ## @@ -26,4 +26,12 @@ private[spark] object Connect { .intConf

[GitHub] [spark] itholic commented on pull request #38177: [SPARK-40663][SQL](Final) Migrate execution errors onto error classes

2022-10-24 Thread GitBox
itholic commented on PR #38177: URL: https://github.com/apache/spark/pull/38177#issuecomment-1289968424 Fixed & re-gened the golden files!

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38320: [SPARK-40857] [CONNECT] Enable configurable GPRC Interceptors.

2022-10-24 Thread GitBox
HyukjinKwon commented on code in PR #38320: URL: https://github.com/apache/spark/pull/38320#discussion_r1003988774 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala: ## @@ -26,4 +26,12 @@ private[spark] object Connect { .intConf

[GitHub] [spark] SandishKumarHN commented on a diff in pull request #38344: [SPARK-40777][SQL][PROTOBUF] Protobuf import support and move error-classes.

2022-10-24 Thread GitBox
SandishKumarHN commented on code in PR #38344: URL: https://github.com/apache/spark/pull/38344#discussion_r1003984608 ## connector/protobuf/src/test/scala/org/apache/spark/sql/protobuf/ProtobufSerdeSuite.scala: ## @@ -163,18 +163,22 @@ class ProtobufSerdeSuite extends

[GitHub] [spark] SandishKumarHN commented on a diff in pull request #38344: [SPARK-40777][SQL][PROTOBUF] Protobuf import support and move error-classes.

2022-10-24 Thread GitBox
SandishKumarHN commented on code in PR #38344: URL: https://github.com/apache/spark/pull/38344#discussion_r1003984452 ## connector/protobuf/src/test/resources/protobuf/timestamp.proto: ## @@ -0,0 +1,26 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or

[GitHub] [spark] SandishKumarHN commented on a diff in pull request #38344: [SPARK-40777][SQL][PROTOBUF] Protobuf import support and move error-classes.

2022-10-24 Thread GitBox
SandishKumarHN commented on code in PR #38344: URL: https://github.com/apache/spark/pull/38344#discussion_r1003983288 ## connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/ProtobufUtils.scala: ## @@ -196,27 +194,31 @@ private[sql] object ProtobufUtils extends

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38375: [SPARK-40900][SQL] Reimplement `frequentItems` with dataframe operations

2022-10-24 Thread GitBox
zhengruifeng commented on code in PR #38375: URL: https://github.com/apache/spark/pull/38375#discussion_r1003976467 ## sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala: ## @@ -85,42 +51,142 @@ object FrequentItems extends Logging { cols:

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38375: [SPARK-40900][SQL] Reimplement `frequentItems` with dataframe operations

2022-10-24 Thread GitBox
zhengruifeng commented on code in PR #38375: URL: https://github.com/apache/spark/pull/38375#discussion_r1003976380 ## sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala: ## @@ -85,42 +51,142 @@ object FrequentItems extends Logging { cols:

[GitHub] [spark] zhengruifeng commented on a diff in pull request #38375: [SPARK-40900][SQL] Reimplement `frequentItems` with dataframe operations

2022-10-24 Thread GitBox
zhengruifeng commented on code in PR #38375: URL: https://github.com/apache/spark/pull/38375#discussion_r1003975996 ## sql/core/src/main/scala/org/apache/spark/sql/execution/stat/FrequentItems.scala: ## @@ -85,42 +51,142 @@ object FrequentItems extends Logging { cols:

[GitHub] [spark] zhengruifeng commented on pull request #38383: [SPARK-40906][SQL] `Mode` should copy keys before inserting into Map

2022-10-24 Thread GitBox
zhengruifeng commented on PR #38383: URL: https://github.com/apache/spark/pull/38383#issuecomment-1289944021 will also send a separate fix for `PandasMode` since it's dedicated for Pandas

[GitHub] [spark] zhengruifeng opened a new pull request, #38383: [SPARK-40906][SQL] `Mode` should copy keys before inserting into Map

2022-10-24 Thread GitBox
zhengruifeng opened a new pull request, #38383: URL: https://github.com/apache/spark/pull/38383 ### What changes were proposed in this pull request? `Mode` should copy keys before inserting into Map ### Why are the changes needed? the result may be incorrect: ``` val df

[GitHub] [spark] LuciferYang opened a new pull request, #38382: [SPARK-40905][BUILD] Upgrade rocksdbjni to 7.7.3

2022-10-24 Thread GitBox
LuciferYang opened a new pull request, #38382: URL: https://github.com/apache/spark/pull/38382 ### What changes were proposed in this pull request? This PR aims to upgrade rocksdbjni from 7.6.0 to 7.7.3. ### Why are the changes needed? This version brings the performance of

[GitHub] [spark] itholic commented on a diff in pull request #38177: [SPARK-40663][SQL](Final) Migrate execution errors onto error classes

2022-10-24 Thread GitBox
itholic commented on code in PR #38177: URL: https://github.com/apache/spark/pull/38177#discussion_r1003966632 ## sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/StringExpressionsSuite.scala: ## @@ -1124,8 +1124,9 @@ class StringExpressionsSuite extends

[GitHub] [spark] itholic commented on a diff in pull request #38177: [SPARK-40663][SQL](Final) Migrate execution errors onto error classes

2022-10-24 Thread GitBox
itholic commented on code in PR #38177: URL: https://github.com/apache/spark/pull/38177#discussion_r1003965906 ## sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala: ## @@ -2602,31 +2604,46 @@ private[sql] object QueryExecutionErrors extends

[GitHub] [spark] chenminghua8 commented on pull request #38381: [SPARK-40793][SQL] Fix the LogicalRelation computeStats for Row-level Runtime Filtering cannot be applied

2022-10-24 Thread GitBox
chenminghua8 commented on PR #38381: URL: https://github.com/apache/spark/pull/38381#issuecomment-1289930238 > e.g., `[SPARK-40793][SQL] Fix the LogicalRelation computeStats for Row-level Runtime Filtering cannot be applied ` @HyukjinKwon I have linked the JIRA ticket into the PR title.

[GitHub] [spark] cloud-fan commented on a diff in pull request #38336: [SPARK-40862][SQL] Support non-aggregated subqueries in RewriteCorrelatedScalarSubquery

2022-10-24 Thread GitBox
cloud-fan commented on code in PR #38336: URL: https://github.com/apache/spark/pull/38336#discussion_r1003962035 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala: ## @@ -561,14 +566,18 @@ object RewriteCorrelatedScalarSubquery extends

[GitHub] [spark] cloud-fan commented on a diff in pull request #38336: [SPARK-40862][SQL] Support non-aggregated subqueries in RewriteCorrelatedScalarSubquery

2022-10-24 Thread GitBox
cloud-fan commented on code in PR #38336: URL: https://github.com/apache/spark/pull/38336#discussion_r1003961963 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala: ## @@ -561,14 +566,18 @@ object RewriteCorrelatedScalarSubquery extends

[GitHub] [spark] cloud-fan commented on a diff in pull request #38320: [SPARK-40857] [CONNECT] Enable configurable GPRC Interceptors.

2022-10-24 Thread GitBox
cloud-fan commented on code in PR #38320: URL: https://github.com/apache/spark/pull/38320#discussion_r1003959231 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectInterceptorRegistry.scala: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache

[GitHub] [spark] cloud-fan commented on a diff in pull request #38320: [SPARK-40857] [CONNECT] Enable configurable GPRC Interceptors.

2022-10-24 Thread GitBox
cloud-fan commented on code in PR #38320: URL: https://github.com/apache/spark/pull/38320#discussion_r1003958927 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala: ## @@ -26,4 +26,12 @@ private[spark] object Connect { .intConf

[GitHub] [spark] cloud-fan closed pull request #38374: [SPARK-40899] [CONNECT] Make UserContext extensible.

2022-10-24 Thread GitBox
cloud-fan closed pull request #38374: [SPARK-40899] [CONNECT] Make UserContext extensible. URL: https://github.com/apache/spark/pull/38374

[GitHub] [spark] cloud-fan commented on pull request #38374: [SPARK-40899] [CONNECT] Make UserContext extensible.

2022-10-24 Thread GitBox
cloud-fan commented on PR #38374: URL: https://github.com/apache/spark/pull/38374#issuecomment-1289921835 thanks, merging to master!

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38320: [SPARK-40857] [CONNECT] Enable configurable GPRC Interceptors.

2022-10-24 Thread GitBox
HyukjinKwon commented on code in PR #38320: URL: https://github.com/apache/spark/pull/38320#discussion_r1003955956 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectInterceptorRegistry.scala: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38320: [SPARK-40857] [CONNECT] Enable configurable GPRC Interceptors.

2022-10-24 Thread GitBox
HyukjinKwon commented on code in PR #38320: URL: https://github.com/apache/spark/pull/38320#discussion_r1003955262 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala: ## @@ -26,4 +26,12 @@ private[spark] object Connect { .intConf

[GitHub] [spark] cloud-fan commented on a diff in pull request #38320: [SPARK-40857] [CONNECT] Enable configurable GPRC Interceptors.

2022-10-24 Thread GitBox
cloud-fan commented on code in PR #38320: URL: https://github.com/apache/spark/pull/38320#discussion_r1003954707 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectInterceptorRegistry.scala: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache

[GitHub] [spark] cloud-fan commented on a diff in pull request #38320: [SPARK-40857] [CONNECT] Enable configurable GPRC Interceptors.

2022-10-24 Thread GitBox
cloud-fan commented on code in PR #38320: URL: https://github.com/apache/spark/pull/38320#discussion_r1003954578 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala: ## @@ -26,4 +26,12 @@ private[spark] object Connect { .intConf

[GitHub] [spark] HyukjinKwon commented on pull request #38381: Fix the LogicalRelation computeStats for Row-level Runtime Filtering cannot be applied

2022-10-24 Thread GitBox
HyukjinKwon commented on PR #38381: URL: https://github.com/apache/spark/pull/38381#issuecomment-1289915367 e.g., `[SPARK-40793][SQL] Fix the LogicalRelation computeStats for Row-level Runtime Filtering cannot be applied `

[GitHub] [spark] chenminghua8 commented on pull request #38381: Fix the LogicalRelation computeStats for Row-level Runtime Filtering cannot be applied

2022-10-24 Thread GitBox
chenminghua8 commented on PR #38381: URL: https://github.com/apache/spark/pull/38381#issuecomment-1289909307 > @chenminghua8 mind linking the JIRA ticket into the PR title? See also https://spark.apache.org/contributing.html @HyukjinKwon Thank you! but I don't know how to get the JIRA

[GitHub] [spark] HyukjinKwon commented on pull request #38381: Fix the LogicalRelation computeStats for Row-level Runtime Filtering cannot be applied

2022-10-24 Thread GitBox
HyukjinKwon commented on PR #38381: URL: https://github.com/apache/spark/pull/38381#issuecomment-1289904503 @chenminghua8 mind linking the JIRA ticket into the PR title? See also https://spark.apache.org/contributing.html

[GitHub] [spark-docker] dcoliversun commented on pull request #19: [SPARK-40855] Add CONTRIBUTING.md for apache/spark-docker

2022-10-24 Thread GitBox
dcoliversun commented on PR #19: URL: https://github.com/apache/spark-docker/pull/19#issuecomment-1289903890 Thanks for your review :) @Yikun @holdenk

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38320: [SPARK-40857] [CONNECT] Enable configurable GPRC Interceptors.

2022-10-24 Thread GitBox
HyukjinKwon commented on code in PR #38320: URL: https://github.com/apache/spark/pull/38320#discussion_r1003942285 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala: ## @@ -26,4 +26,12 @@ private[spark] object Connect { .intConf

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38320: [SPARK-40857] [CONNECT] Enable configurable GPRC Interceptors.

2022-10-24 Thread GitBox
HyukjinKwon commented on code in PR #38320: URL: https://github.com/apache/spark/pull/38320#discussion_r1003941174 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectInterceptorRegistry.scala: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache

[GitHub] [spark] chenminghua8 commented on pull request #38381: Fix the LogicalRelation computeStats for Row-level Runtime Filtering cannot be applied

2022-10-24 Thread GitBox
chenminghua8 commented on PR #38381: URL: https://github.com/apache/spark/pull/38381#issuecomment-1289898298 [SPARK-40793](https://issues.apache.org/jira/browse/SPARK-40793) is the JIRA for this PR.

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #38320: [SPARK-40857] [CONNECT] Enable configurable GPRC Interceptors.

2022-10-24 Thread GitBox
HyukjinKwon commented on code in PR #38320: URL: https://github.com/apache/spark/pull/38320#discussion_r1003940378 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectInterceptorRegistry.scala: ## @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache

[GitHub] [spark] chenminghua8 commented on pull request #38213: fix runtime filter do not execute when no stats

2022-10-24 Thread GitBox
chenminghua8 commented on PR #38213: URL: https://github.com/apache/spark/pull/38213#issuecomment-1289895440 change to Pull request 38381

[GitHub] [spark] dongjoon-hyun commented on pull request #38380: [SPARK-40904][K8S] Support `zsh` in K8s `entrypoint.sh`

2022-10-24 Thread GitBox
dongjoon-hyun commented on PR #38380: URL: https://github.com/apache/spark/pull/38380#issuecomment-1289894414 Thank you, @HyukjinKwon !

[GitHub] [spark] chenminghua8 opened a new pull request, #38381: Fix the LogicalRelation computeStats for Row-level Runtime Filtering cannot be applied

2022-10-24 Thread GitBox
chenminghua8 opened a new pull request, #38381: URL: https://github.com/apache/spark/pull/38381 ### What changes were proposed in this pull request? This PR modifies the "computeStats" method of the "LogicalRelation" class: when the external table does not perform

[GitHub] [spark-docker] Yikun commented on pull request #19: [SPARK-40855] Add CONTRIBUTING.md for apache/spark-docker

2022-10-24 Thread GitBox
Yikun commented on PR #19: URL: https://github.com/apache/spark-docker/pull/19#issuecomment-1289885331 @holdenk @dcoliversun Thanks! Merge to master (3.4.0).

[GitHub] [spark-docker] Yikun closed pull request #19: [SPARK-40855] Add CONTRIBUTING.md for apache/spark-docker

2022-10-24 Thread GitBox
Yikun closed pull request #19: [SPARK-40855] Add CONTRIBUTING.md for apache/spark-docker URL: https://github.com/apache/spark-docker/pull/19

[GitHub] [spark] ulysses-you commented on a diff in pull request #38379: [SPARK-40903][SQL] Avoid reordering decimal Add for canonicalization

2022-10-24 Thread GitBox
ulysses-you commented on code in PR #38379: URL: https://github.com/apache/spark/pull/38379#discussion_r1003919162 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -477,7 +477,10 @@ case class Add( override protected def

[GitHub] [spark] cloud-fan commented on a diff in pull request #38379: [SPARK-40903][SQL] Avoid reordering decimal Add for canonicalization

2022-10-24 Thread GitBox
cloud-fan commented on code in PR #38379: URL: https://github.com/apache/spark/pull/38379#discussion_r1003906257 ## sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala: ## @@ -4518,6 +4518,14 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession with

[GitHub] [spark] warrenzhu25 commented on pull request #38183: [SPARK-32288][UI] Add failure summary for failed tasks in stage page

2022-10-24 Thread GitBox
warrenzhu25 commented on PR #38183: URL: https://github.com/apache/spark/pull/38183#issuecomment-1289850864 @gengliangwang and @sarutak FYI @dongjoon-hyun Could you help take a look?

[GitHub] [spark] cloud-fan commented on a diff in pull request #38379: [SPARK-40903][SQL] Avoid reordering decimal Add for canonicalization

2022-10-24 Thread GitBox
cloud-fan commented on code in PR #38379: URL: https://github.com/apache/spark/pull/38379#discussion_r1003905809 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -477,7 +477,10 @@ case class Add( override protected def

[GitHub] [spark] cloud-fan commented on a diff in pull request #38379: [SPARK-40903][SQL] Avoid reordering decimal Add for canonicalization

2022-10-24 Thread GitBox
cloud-fan commented on code in PR #38379: URL: https://github.com/apache/spark/pull/38379#discussion_r1003905297 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/arithmetic.scala: ## @@ -477,7 +477,10 @@ case class Add( override protected def

[GitHub] [spark] cloud-fan commented on a diff in pull request #38379: [SPARK-40903][SQL] Avoid reordering decimal Add for canonicalization

2022-10-24 Thread GitBox
cloud-fan commented on code in PR #38379: URL: https://github.com/apache/spark/pull/38379#discussion_r1003905057 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala: ## @@ -242,7 +242,13 @@ abstract class Expression extends

[GitHub] [spark] pan3793 commented on pull request #38205: [SPARK-40747][CORE] Support setting driver log url using env vars on other resource managers

2022-10-24 Thread GitBox
pan3793 commented on PR #38205: URL: https://github.com/apache/spark/pull/38205#issuecomment-1289847809 @mridulm Yes, https://github.com/apache/spark/pull/38357 is the proposed version. I opened this PR mostly for collecting feedback in case the community has another idea.

[GitHub] [spark] amaliujia commented on a diff in pull request #38301: [SPARK-40836][CONNECT] AnalyzeResult should use struct for schema

2022-10-24 Thread GitBox
amaliujia commented on code in PR #38301: URL: https://github.com/apache/spark/pull/38301#discussion_r1003890195 ## connector/connect/src/main/scala/org/apache/spark/sql/connect/planner/DataTypeProtoConverter.scala: ## @@ -50,11 +50,27 @@ object DataTypeProtoConverter {

[GitHub] [spark] github-actions[bot] closed pull request #37141: [SPARK-39024][CORE][YARN] Notify External Shuffle Service when Yarn Sends a Node in Decommissioning State

2022-10-24 Thread GitBox
github-actions[bot] closed pull request #37141: [SPARK-39024][CORE][YARN] Notify External Shuffle Service when Yarn Sends a Node in Decommissioning State URL: https://github.com/apache/spark/pull/37141

[GitHub] [spark] github-actions[bot] closed pull request #37053: [SPARK-39452][GraphX] Extend EdgePartition1D with Destination based Strategy

2022-10-24 Thread GitBox
github-actions[bot] closed pull request #37053: [SPARK-39452][GraphX] Extend EdgePartition1D with Destination based Strategy URL: https://github.com/apache/spark/pull/37053

[GitHub] [spark] dongjoon-hyun commented on pull request #38380: [SPARK-40904][K8S] Support `zsh` in K8s `entrypoint.sh`

2022-10-24 Thread GitBox
dongjoon-hyun commented on PR #38380: URL: https://github.com/apache/spark/pull/38380#issuecomment-1289812287 Merged to master for Apache Spark 3.4.0.

[GitHub] [spark] dongjoon-hyun closed pull request #38380: [SPARK-40904][K8S] Support `zsh` in K8s `entrypoint.sh`

2022-10-24 Thread GitBox
dongjoon-hyun closed pull request #38380: [SPARK-40904][K8S] Support `zsh` in K8s `entrypoint.sh` URL: https://github.com/apache/spark/pull/38380

[GitHub] [spark] dongjoon-hyun commented on pull request #38380: [SPARK-40904][K8S] Support `zsh` in K8s `entrypoint.sh`

2022-10-24 Thread GitBox
dongjoon-hyun commented on PR #38380: URL: https://github.com/apache/spark/pull/38380#issuecomment-1289808788 Thank you, @viirya . The `else` statement is only supported by `zsh`~

[GitHub] [spark] dongjoon-hyun commented on pull request #38380: [SPARK-40904][K8S] Support `zsh` in K8s `entrypoint.sh`

2022-10-24 Thread GitBox
dongjoon-hyun commented on PR #38380: URL: https://github.com/apache/spark/pull/38380#issuecomment-1289803308 All tests passed. Could you review this, please, @viirya ?

[GitHub] [spark] amaliujia commented on a diff in pull request #38345: [SPARK-40879][CONNECT] Support Join UsingColumns in proto

2022-10-24 Thread GitBox
amaliujia commented on code in PR #38345: URL: https://github.com/apache/spark/pull/38345#discussion_r1003878598 ## connector/connect/src/main/protobuf/spark/connect/relations.proto: ## @@ -109,6 +109,7 @@ message Join { Relation right = 2; Expression join_condition = 3;

[GitHub] [spark] amaliujia commented on a diff in pull request #38345: [SPARK-40879][CONNECT] Support Join UsingColumns in proto

2022-10-24 Thread GitBox
amaliujia commented on code in PR #38345: URL: https://github.com/apache/spark/pull/38345#discussion_r1003878330 ## connector/connect/src/main/protobuf/spark/connect/relations.proto: ## @@ -109,6 +109,7 @@ message Join { Relation right = 2; Expression join_condition = 3;

[GitHub] [spark] allisonwang-db commented on pull request #38336: [SPARK-40862][SQL] Support non-aggregated subqueries in RewriteCorrelatedScalarSubquery

2022-10-24 Thread GitBox
allisonwang-db commented on PR #38336: URL: https://github.com/apache/spark/pull/38336#issuecomment-1289772401 cc @cloud-fan

[GitHub] [spark] dongjoon-hyun opened a new pull request, #38380: [SPARK-40904][K8S] Support zsh in K8s entrypoint.sh

2022-10-24 Thread GitBox
dongjoon-hyun opened a new pull request, #38380: URL: https://github.com/apache/spark/pull/38380 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? ###

[GitHub] [spark] gengliangwang commented on pull request #38379: [SPARK-40903][SQL] Avoid reordering decimal Add for canonicalization

2022-10-24 Thread GitBox
gengliangwang commented on PR #38379: URL: https://github.com/apache/spark/pull/38379#issuecomment-1289658114 I confirmed that the regression is caused by the refactoring PR https://github.com/apache/spark/pull/36698/. Before the refactor, the query will look like ```

[GitHub] [spark] amaliujia commented on a diff in pull request #38374: [SPARK-40899] [CONNECT] Make UserContext extensible.

2022-10-24 Thread GitBox
amaliujia commented on code in PR #38374: URL: https://github.com/apache/spark/pull/38374#discussion_r1003781192 ## connector/connect/src/main/protobuf/spark/connect/base.proto: ## @@ -51,6 +52,12 @@ message Request { message UserContext { string user_id = 1;

[GitHub] [spark] sadikovi commented on a diff in pull request #38277: [SPARK-40815][SQL] Introduce DelegateSymlinkTextInputFormat to handle empty splits when "spark.hadoopRDD.ignoreEmptySplits" is ena

2022-10-24 Thread GitBox
sadikovi commented on code in PR #38277: URL: https://github.com/apache/spark/pull/38277#discussion_r1003781190 ## sql/hive/src/main/java/org/apache/hadoop/hive/ql/io/DelegateSymlinkTextInputFormat.java: ## @@ -0,0 +1,111 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] [spark] maryannxue commented on a diff in pull request #38358: [SPARK-40588] FileFormatWriter materializes AQE plan before accessing outputOrdering

2022-10-24 Thread GitBox
maryannxue commented on code in PR #38358: URL: https://github.com/apache/spark/pull/38358#discussion_r1003780647 ## sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala: ## @@ -221,6 +221,8 @@ case class AdaptiveSparkPlanExec(

[GitHub] [spark] awdavidson commented on a diff in pull request #38312: [SPARK-40819][SQL] Timestamp nanos behaviour regression

2022-10-24 Thread GitBox
awdavidson commented on code in PR #38312: URL: https://github.com/apache/spark/pull/38312#discussion_r1003756142 ## sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala: ## @@ -198,6 +205,31 @@ abstract class ParquetSchemaTest

[GitHub] [spark] gengliangwang commented on pull request #38379: [SPARK-40903][SQL] Avoid reordering decimal Add for canonicalization

2022-10-24 Thread GitBox
gengliangwang commented on PR #38379: URL: https://github.com/apache/spark/pull/38379#issuecomment-1289600215 cc @peter-toth @ulysses-you

[GitHub] [spark] gengliangwang opened a new pull request, #38379: fix decimal add's canonicalized

2022-10-24 Thread GitBox
gengliangwang opened a new pull request, #38379: URL: https://github.com/apache/spark/pull/38379 ### What changes were proposed in this pull request? Avoid reordering Add for canonicalizing if it is decimal type. Expressions are canonicalized for comparisons and

[GitHub] [spark] rangadi commented on a diff in pull request #38344: [SPARK-40777][SQL][PROTOBUF] Protobuf import support and move error-classes.

2022-10-24 Thread GitBox
rangadi commented on code in PR #38344: URL: https://github.com/apache/spark/pull/38344#discussion_r1003729392 ## connector/protobuf/src/main/scala/org/apache/spark/sql/protobuf/utils/ProtobufUtils.scala: ## @@ -196,27 +194,31 @@ private[sql] object ProtobufUtils extends

[GitHub] [spark] SandishKumarHN commented on a diff in pull request #38344: [SPARK-40777][SQL][PROTOBUF] Protobuf import support and move error-classes.

2022-10-24 Thread GitBox
SandishKumarHN commented on code in PR #38344: URL: https://github.com/apache/spark/pull/38344#discussion_r1003686011 ## connector/protobuf/pom.xml: ## @@ -123,6 +123,7 @@ com.google.protobuf:protoc:${protobuf.version}

[GitHub] [spark] LucaCanali commented on pull request #33559: [SPARK-34265][PYTHON][SQL] Instrument Python UDFs using SQL metrics

2022-10-24 Thread GitBox
LucaCanali commented on PR #33559: URL: https://github.com/apache/spark/pull/33559#issuecomment-1289502676 Thank you @cloud-fan !

[GitHub] [spark] rangadi commented on a diff in pull request #38344: [SPARK-40777][SQL][PROTOBUF] Protobuf import support and move error-classes.

2022-10-24 Thread GitBox
rangadi commented on code in PR #38344: URL: https://github.com/apache/spark/pull/38344#discussion_r1003664791 ## connector/protobuf/pom.xml: ## @@ -123,6 +123,7 @@ com.google.protobuf:protoc:${protobuf.version} ${protobuf.version} +

[GitHub] [spark] srowen commented on pull request #38352: [SPARK-40801][BUILD][3.2] Upgrade `Apache commons-text` to 1.10

2022-10-24 Thread GitBox
srowen commented on PR #38352: URL: https://github.com/apache/spark/pull/38352#issuecomment-1289480615 Huh, that also seems unrelated. Let's hold onto this for a day or two and then rerun if needed

[GitHub] [spark] mridulm commented on a diff in pull request #37638: [SPARK-33573][SHUFFLE][YARN] Shuffle server side metrics for Push-based shuffle

2022-10-24 Thread GitBox
mridulm commented on code in PR #37638: URL: https://github.com/apache/spark/pull/37638#discussion_r1003658008 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -1904,4 +1941,42 @@ long getPos() { return pos;

[GitHub] [spark] mridulm commented on a diff in pull request #37638: [SPARK-33573][SHUFFLE][YARN] Shuffle server side metrics for Push-based shuffle

2022-10-24 Thread GitBox
mridulm commented on code in PR #37638: URL: https://github.com/apache/spark/pull/37638#discussion_r1003656427 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -593,6 +607,9 @@ public void onData(String streamId,

[GitHub] [spark] SandishKumarHN commented on a diff in pull request #38344: [SPARK-40777][SQL][PROTOBUF] Protobuf import support and move error-classes.

2022-10-24 Thread GitBox
SandishKumarHN commented on code in PR #38344: URL: https://github.com/apache/spark/pull/38344#discussion_r1003653232 ## connector/protobuf/pom.xml: ## @@ -123,6 +123,7 @@ com.google.protobuf:protoc:${protobuf.version}

[GitHub] [spark] mridulm commented on pull request #38205: [SPARK-40747][CORE] Support setting driver log url using env vars on other resource managers

2022-10-24 Thread GitBox
mridulm commented on PR #38205: URL: https://github.com/apache/spark/pull/38205#issuecomment-1289465582 I tagged @tgravescs on #38357, assuming that is the version that will get supported - or is this what we are looking at ?

[GitHub] [spark] rangadi commented on a diff in pull request #38344: [SPARK-40777][SQL][PROTOBUF] Protobuf import support and move error-classes.

2022-10-24 Thread GitBox
rangadi commented on code in PR #38344: URL: https://github.com/apache/spark/pull/38344#discussion_r1003651551 ## connector/protobuf/pom.xml: ## @@ -123,6 +123,7 @@ com.google.protobuf:protoc:${protobuf.version} ${protobuf.version} +

[GitHub] [spark] mridulm commented on pull request #38357: [SPARK-40887][K8S] Allow Spark on K8s to integrate w/ Log Service

2022-10-24 Thread GitBox
mridulm commented on PR #38357: URL: https://github.com/apache/spark/pull/38357#issuecomment-1289462750 +CC @tgravescs Since you have more context on this from yarn pov than I do

[GitHub] [spark] otterc commented on a diff in pull request #37638: [SPARK-33573][SHUFFLE][YARN] Shuffle server side metrics for Push-based shuffle

2022-10-24 Thread GitBox
otterc commented on code in PR #37638: URL: https://github.com/apache/spark/pull/37638#discussion_r1003646453 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -1904,4 +1941,42 @@ long getPos() { return pos;

[GitHub] [spark] bjornjorgensen commented on pull request #38352: [SPARK-40801][BUILD][3.2] Upgrade `Apache commons-text` to 1.10

2022-10-24 Thread GitBox
bjornjorgensen commented on PR #38352: URL: https://github.com/apache/spark/pull/38352#issuecomment-1289451302 @srowen I did rerun the failed tests now and now they pass. But there are 2 python tests that don't pass, -The python3.9 -m black command was not found and

[GitHub] [spark] mridulm commented on a diff in pull request #37638: [SPARK-33573][SHUFFLE][YARN] Shuffle server side metrics for Push-based shuffle

2022-10-24 Thread GitBox
mridulm commented on code in PR #37638: URL: https://github.com/apache/spark/pull/37638#discussion_r1003638095 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -1197,15 +1230,15 @@ public void onData(String streamId,

[GitHub] [spark] mridulm commented on a diff in pull request #37638: [SPARK-33573][SHUFFLE][YARN] Shuffle server side metrics for Push-based shuffle

2022-10-24 Thread GitBox
mridulm commented on code in PR #37638: URL: https://github.com/apache/spark/pull/37638#discussion_r1003635126 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -1197,15 +1230,15 @@ public void onData(String streamId,

[GitHub] [spark] SandishKumarHN commented on a diff in pull request #38344: [SPARK-40777][SQL][PROTOBUF] Protobuf import support and move error-classes.

2022-10-24 Thread GitBox
SandishKumarHN commented on code in PR #38344: URL: https://github.com/apache/spark/pull/38344#discussion_r1003632674 ## connector/protobuf/pom.xml: ## @@ -123,6 +123,7 @@ com.google.protobuf:protoc:${protobuf.version}

[GitHub] [spark] otterc commented on a diff in pull request #37638: [SPARK-33573][SHUFFLE][YARN] Shuffle server side metrics for Push-based shuffle

2022-10-24 Thread GitBox
otterc commented on code in PR #37638: URL: https://github.com/apache/spark/pull/37638#discussion_r1003631087 ## common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/RemoteBlockPushResolver.java: ## @@ -593,6 +607,9 @@ public void onData(String streamId,

[GitHub] [spark] rangadi commented on a diff in pull request #38344: [SPARK-40777][SQL][PROTOBUF] Protobuf import support and move error-classes.

2022-10-24 Thread GitBox
rangadi commented on code in PR #38344: URL: https://github.com/apache/spark/pull/38344#discussion_r1003630603 ## connector/protobuf/pom.xml: ## @@ -123,6 +123,7 @@ com.google.protobuf:protoc:${protobuf.version} ${protobuf.version} +

[GitHub] [spark] tgravescs commented on pull request #38032: [WIP][SPARK-40597][CORE] local mode should respect TASK_MAX_FAILURES like all other cluster managers

2022-10-24 Thread GitBox
tgravescs commented on PR #38032: URL: https://github.com/apache/spark/pull/38032#issuecomment-1289432118 I don't have a strong opinion, there is already the LOCAL_N_FAILURES_REGEX mode that could just be used. why not just use that? If we do this I think the default in local mode

[GitHub] [spark] SandishKumarHN commented on a diff in pull request #38344: [SPARK-40777][SQL][PROTOBUF] Protobuf import support and move error-classes.

2022-10-24 Thread GitBox
SandishKumarHN commented on code in PR #38344: URL: https://github.com/apache/spark/pull/38344#discussion_r1003623154 ## connector/protobuf/pom.xml: ## @@ -123,6 +123,7 @@ com.google.protobuf:protoc:${protobuf.version}
