[GitHub] [spark] cloud-fan commented on a diff in pull request #36776: [SPARK-38997][SPARK-39037][SQL][FOLLOWUP] `PushableColumnWithoutNestedColumn` need be translated to predicate too

2022-06-06 Thread GitBox
cloud-fan commented on code in PR #36776: URL: https://github.com/apache/spark/pull/36776#discussion_r890793608 ## sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala: ## @@ -55,8 +55,13 @@ class V2ExpressionBuilder( } else {

[GitHub] [spark] xiuzhu9527 commented on pull request #36784: [SPARK-39396][SQL] Fix LDAP login exception 'error code 49 - invalid credentials'

2022-06-06 Thread GitBox
xiuzhu9527 commented on PR #36784: URL: https://github.com/apache/spark/pull/36784#issuecomment-1148226308 @HyukjinKwon please take a look, thanks!

[GitHub] [spark] xiuzhu9527 opened a new pull request, #36784: [SPARK-39396][SQL] Fix LDAP login exception 'error code 49 - invalid credentials'

2022-06-06 Thread GitBox
xiuzhu9527 opened a new pull request, #36784: URL: https://github.com/apache/spark/pull/36784 ### What changes were proposed in this pull request? This PR fixes the LDAP login failure that occurs when the DN has the form (cn=user,ou=people, dc=example, dc=com). ### Why are the

[GitHub] [spark] xiuzhu9527 closed pull request #36783: [SPARK-39396][SQL] Fix LDAP login exception 'error code 49 - invalid …

2022-06-06 Thread GitBox
xiuzhu9527 closed pull request #36783: [SPARK-39396][SQL] Fix LDAP login exception 'error code 49 - invalid … URL: https://github.com/apache/spark/pull/36783

[GitHub] [spark] xiuzhu9527 opened a new pull request, #36783: [SPARK-39396][SQL] Fix LDAP login exception 'error code 49 - invalid …

2022-06-06 Thread GitBox
xiuzhu9527 opened a new pull request, #36783: URL: https://github.com/apache/spark/pull/36783 ### What changes were proposed in this pull request? This PR fixes the LDAP login failure that occurs when the DN has the form (cn=user,ou=people, dc=example, dc=com). ### Why are

[GitHub] [spark] HyukjinKwon closed pull request #36782: [SPARK-39394][DOCS][SS] Improve PySpark Structured Streaming page more readable

2022-06-06 Thread GitBox
HyukjinKwon closed pull request #36782: [SPARK-39394][DOCS][SS] Improve PySpark Structured Streaming page more readable URL: https://github.com/apache/spark/pull/36782

[GitHub] [spark] HyukjinKwon commented on pull request #36782: [SPARK-39394][DOCS][SS] Improve PySpark Structured Streaming page more readable

2022-06-06 Thread GitBox
HyukjinKwon commented on PR #36782: URL: https://github.com/apache/spark/pull/36782#issuecomment-1148212500 Merged to master.

[GitHub] [spark] mridulm commented on a diff in pull request #36775: [SPARK-39389]Filesystem closed should not be considered as corrupt files

2022-06-06 Thread GitBox
mridulm commented on code in PR #36775: URL: https://github.com/apache/spark/pull/36775#discussion_r890765675 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala: ## @@ -253,6 +253,9 @@ class FileScanRDD( // Throw

[GitHub] [spark] huaxingao commented on pull request #36777: [SPARK-39390][CORE] Hide and optimize `viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` from INFO log

2022-06-06 Thread GitBox
huaxingao commented on PR #36777: URL: https://github.com/apache/spark/pull/36777#issuecomment-1148178978 Merged to master. Thanks @dcoliversun

[GitHub] [spark] huaxingao closed pull request #36777: [SPARK-39390][CORE] Hide and optimize `viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` from INFO log

2022-06-06 Thread GitBox
huaxingao closed pull request #36777: [SPARK-39390][CORE] Hide and optimize `viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` from INFO log URL: https://github.com/apache/spark/pull/36777

[GitHub] [spark] otterc commented on a diff in pull request #35906: [SPARK-33236][shuffle] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2022-06-06 Thread GitBox
otterc commented on code in PR #35906: URL: https://github.com/apache/spark/pull/35906#discussion_r890584847 ## common/network-yarn/src/main/java/org/apache/spark/network/yarn/YarnShuffleService.java: ## @@ -230,11 +241,14 @@ protected void serviceInit(Configuration

[GitHub] [spark] itholic opened a new pull request, #36782: [SPARK-39394][DOCS] Improve PySpark Structured Streaming page more readable

2022-06-06 Thread GitBox
itholic opened a new pull request, #36782: URL: https://github.com/apache/spark/pull/36782 ### What changes were proposed in this pull request? This PR proposes to improve the PySpark Structured Streaming API reference page to be more readable. So far, the PySpark Structured

[GitHub] [spark] JoshRosen commented on a diff in pull request #36775: [SPARK-39389]Filesystem closed should not be considered as corrupt files

2022-06-06 Thread GitBox
JoshRosen commented on code in PR #36775: URL: https://github.com/apache/spark/pull/36775#discussion_r890727456 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala: ## @@ -253,6 +253,9 @@ class FileScanRDD( // Throw

[GitHub] [spark] HyukjinKwon commented on pull request #35484: [SPARK-38181][SS][DOCS] Update comments in KafkaDataConsumer.scala

2022-06-06 Thread GitBox
HyukjinKwon commented on PR #35484: URL: https://github.com/apache/spark/pull/35484#issuecomment-1148140363 Oh, also please enable GitHub Actions in your forked repository (https://github.com/ArvinZheng/spark). The Apache Spark repository leverages the PR author's GitHub Actions resources.

[GitHub] [spark] HyukjinKwon commented on pull request #35484: [SPARK-38181][SS][DOCS] Update comments in KafkaDataConsumer.scala

2022-06-06 Thread GitBox
HyukjinKwon commented on PR #35484: URL: https://github.com/apache/spark/pull/35484#issuecomment-1148140028 @ArvinZheng mind rebasing this PR? Seems like something went wrong with finding GitHub Actions in your fork.

[GitHub] [spark] anishshri-db commented on pull request #35484: [SPARK-38181][SS][DOCS] Update comments in KafkaDataConsumer.scala

2022-06-06 Thread GitBox
anishshri-db commented on PR #35484: URL: https://github.com/apache/spark/pull/35484#issuecomment-1148138847 Seems like the tests are failing though? Maybe try merging back and re-triggering?

[GitHub] [spark] anishshri-db commented on a diff in pull request #35484: [SPARK-38181][SS][DOCS] Update comments in KafkaDataConsumer.scala

2022-06-06 Thread GitBox
anishshri-db commented on code in PR #35484: URL: https://github.com/apache/spark/pull/35484#discussion_r890722033 ## connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/consumer/KafkaDataConsumer.scala: ## @@ -298,9 +296,10 @@ private[kafka010] class

[GitHub] [spark] JoshRosen commented on pull request #36775: [SPARK-39389]Filesystem closed should not be considered as corrupt files

2022-06-06 Thread GitBox
JoshRosen commented on PR #36775: URL: https://github.com/apache/spark/pull/36775#issuecomment-1148136217 Does checking for filesystem closed exceptions completely fix this issue or are we vulnerable to race conditions? Skimming through the [Hadoop DFSClient
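For context, a minimal sketch of the kind of check being discussed here: treating a "Filesystem closed" IOException as a signal to rethrow rather than as a corrupt file. The helper name and the message-based predicate are assumptions for illustration, not the code in PR #36775.

```scala
import java.io.IOException

object FileSystemClosedCheck {
  // Sketch only: walk the cause chain looking for the "Filesystem closed"
  // message that Hadoop's DFSClient raises once the filesystem is closed.
  def isFileSystemClosed(e: Throwable): Boolean = e match {
    case io: IOException =>
      Option(io.getMessage).exists(_.contains("Filesystem closed")) ||
        Option(io.getCause).exists(isFileSystemClosed)
    case _ => false
  }
}

// Hypothetical use inside a corrupt-file handler: only swallow the exception
// as a corrupt file when it is not a filesystem-closed error.
// case e: IOException if ignoreCorruptFiles && !FileSystemClosedCheck.isFileSystemClosed(e) => ...
```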

[GitHub] [spark] beliefer commented on pull request #36776: [SPARK-38997][SPARK-39037][SQL][FOLLOWUP] `PushableColumnWithoutNestedColumn` need be translated to predicate too

2022-06-06 Thread GitBox
beliefer commented on PR #36776: URL: https://github.com/apache/spark/pull/36776#issuecomment-1148132608 ping @huaxingao cc @cloud-fan

[GitHub] [spark] beliefer commented on a diff in pull request #36773: [SPARK-39385][SQL] Translate linear regression aggregate functions for pushdown

2022-06-06 Thread GitBox
beliefer commented on code in PR #36773: URL: https://github.com/apache/spark/pull/36773#discussion_r890715857 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala: ## @@ -750,6 +750,22 @@ object DataSourceStrategy

[GitHub] [spark] HeartSaVioR commented on pull request #36737: [SPARK-39347] [SS] Generate wrong time window when (timestamp-startTime) % slideDuration…

2022-06-06 Thread GitBox
HeartSaVioR commented on PR #36737: URL: https://github.com/apache/spark/pull/36737#issuecomment-1148131018 Sorry, I'll find time for this soon. I'll also try to find someone who can review it beforehand.

[GitHub] [spark] beliefer commented on a diff in pull request #36773: [SPARK-39385][SQL] Translate linear regression aggregate functions for pushdown

2022-06-06 Thread GitBox
beliefer commented on code in PR #36773: URL: https://github.com/apache/spark/pull/36773#discussion_r890714758 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala: ## @@ -72,6 +72,26 @@ private[sql] object H2Dialect extends JdbcDialect {

[GitHub] [spark] AngersZhuuuu commented on pull request #35612: [SPARK-38289][SQL] Refactor SQL CLI exit code to make it more clear

2022-06-06 Thread GitBox
AngersZhuuuu commented on PR #35612: URL: https://github.com/apache/spark/pull/35612#issuecomment-1148126351 > Tests are running in https://github.com/AngersZh/spark/runs/6765635006 > > Seems like Scala 2.13 build fails as below: > > ``` > [error]

[GitHub] [spark] beliefer commented on a diff in pull request #36773: [SPARK-39385][SQL] Translate linear regression aggregate functions for pushdown

2022-06-06 Thread GitBox
beliefer commented on code in PR #36773: URL: https://github.com/apache/spark/pull/36773#discussion_r890696454 ## sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala: ## @@ -,6 +,28 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with

[GitHub] [spark] beliefer commented on a diff in pull request #36773: [SPARK-39385][SQL] Translate linear regression aggregate functions for pushdown

2022-06-06 Thread GitBox
beliefer commented on code in PR #36773: URL: https://github.com/apache/spark/pull/36773#discussion_r890694554 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala: ## @@ -72,6 +72,26 @@ private[sql] object H2Dialect extends JdbcDialect {

[GitHub] [spark] beliefer commented on a diff in pull request #36662: [SPARK-39286][DOC] Update documentation for the decode function

2022-06-06 Thread GitBox
beliefer commented on code in PR #36662: URL: https://github.com/apache/spark/pull/36662#discussion_r890692975 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala: ## @@ -2504,9 +2504,10 @@ object Decode { usage = """

[GitHub] [spark] beliefer commented on a diff in pull request #36662: [SPARK-39286][DOC] Update documentation for the decode function

2022-06-06 Thread GitBox
beliefer commented on code in PR #36662: URL: https://github.com/apache/spark/pull/36662#discussion_r890691547 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala: ## @@ -2504,9 +2504,10 @@ object Decode { usage = """

[GitHub] [spark] dcoliversun commented on a diff in pull request #36777: [SPARK-39390][CORE] Hide and optimize `viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` from INFO log

2022-06-06 Thread GitBox
dcoliversun commented on code in PR #36777: URL: https://github.com/apache/spark/pull/36777#discussion_r890683738 ## core/src/main/scala/org/apache/spark/SecurityManager.scala: ## @@ -87,10 +87,14 @@ private[spark] class SecurityManager( private var secretKey: String = _

[GitHub] [spark] dcoliversun commented on a diff in pull request #36777: [SPARK-39390][CORE] Hide and optimize `viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` from INFO log

2022-06-06 Thread GitBox
dcoliversun commented on code in PR #36777: URL: https://github.com/apache/spark/pull/36777#discussion_r890683347 ## core/src/main/scala/org/apache/spark/SecurityManager.scala: ## @@ -87,10 +87,14 @@ private[spark] class SecurityManager( private var secretKey: String = _

[GitHub] [spark] HyukjinKwon commented on pull request #35612: [SPARK-38289][SQL] Refactor SQL CLI exit code to make it more clear

2022-06-06 Thread GitBox
HyukjinKwon commented on PR #35612: URL: https://github.com/apache/spark/pull/35612#issuecomment-1148079558 Tests are running in https://github.com/AngersZh/spark/runs/6765635006 Seems like Scala 2.13 build fails as below: ``` [error]

[GitHub] [spark] huaxingao commented on a diff in pull request #36777: [SPARK-39390][CORE] Hide and optimize `viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` from INFO log

2022-06-06 Thread GitBox
huaxingao commented on code in PR #36777: URL: https://github.com/apache/spark/pull/36777#discussion_r890673570 ## core/src/main/scala/org/apache/spark/SecurityManager.scala: ## @@ -87,10 +87,14 @@ private[spark] class SecurityManager( private var secretKey: String = _

[GitHub] [spark] github-actions[bot] closed pull request #32397: [WIP][SPARK-35084][CORE] Spark 3: supporting "--packages" in k8s cluster mode

2022-06-06 Thread GitBox
github-actions[bot] closed pull request #32397: [WIP][SPARK-35084][CORE] Spark 3: supporting "--packages" in k8s cluster mode URL: https://github.com/apache/spark/pull/32397

[GitHub] [spark] github-actions[bot] closed pull request #35417: [SPARK-38102][CORE] Support custom commitProtocolClass in saveAsNewAPIHadoopDataset

2022-06-06 Thread GitBox
github-actions[bot] closed pull request #35417: [SPARK-38102][CORE] Support custom commitProtocolClass in saveAsNewAPIHadoopDataset URL: https://github.com/apache/spark/pull/35417

[GitHub] [spark] github-actions[bot] closed pull request #35363: [SPARK-38066][SQL] evaluateEquality should ignore attribute without min/max ColumnStat

2022-06-06 Thread GitBox
github-actions[bot] closed pull request #35363: [SPARK-38066][SQL] evaluateEquality should ignore attribute without min/max ColumnStat URL: https://github.com/apache/spark/pull/35363

[GitHub] [spark] huaxingao commented on a diff in pull request #36773: [SPARK-39385][SQL] Translate linear regression aggregate functions for pushdown

2022-06-06 Thread GitBox
huaxingao commented on code in PR #36773: URL: https://github.com/apache/spark/pull/36773#discussion_r890658489 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala: ## @@ -72,6 +72,26 @@ private[sql] object H2Dialect extends JdbcDialect {

[GitHub] [spark] vli-databricks commented on a diff in pull request #36780: [SPARK-39392][SQL] Refine ANSI error messages for try_* function hints

2022-06-06 Thread GitBox
vli-databricks commented on code in PR #36780: URL: https://github.com/apache/spark/pull/36780#discussion_r890657167 ## core/src/main/resources/error/error-classes.json: ## @@ -195,7 +195,7 @@ }, "INVALID_ARRAY_INDEX_IN_ELEMENT_AT" : { "message" : [ - "The index

[GitHub] [spark] huaxingao commented on pull request #36781: [SPARK-39393][SQL] Parquet data source only supports push-down predicate filters for non-repeated primitive types

2022-06-06 Thread GitBox
huaxingao commented on PR #36781: URL: https://github.com/apache/spark/pull/36781#issuecomment-1148022608 @Borjianamin98 Could you please add a test?

[GitHub] [spark] mridulm commented on pull request #36734: [SPARK-38987][SHUFFLE] Throw FetchFailedException when merged shuffle blocks are corrupted and spark.shuffle.detectCorrupt is set to true

2022-06-06 Thread GitBox
mridulm commented on PR #36734: URL: https://github.com/apache/spark/pull/36734#issuecomment-1147982565 Merged to master, thanks for working on this @akpatnam25! Thanks for the review @otterc :-)

[GitHub] [spark] mridulm closed pull request #36734: [SPARK-38987][SHUFFLE] Throw FetchFailedException when merged shuffle blocks are corrupted and spark.shuffle.detectCorrupt is set to true

2022-06-06 Thread GitBox
mridulm closed pull request #36734: [SPARK-38987][SHUFFLE] Throw FetchFailedException when merged shuffle blocks are corrupted and spark.shuffle.detectCorrupt is set to true URL: https://github.com/apache/spark/pull/36734

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #36775: [SPARK-39389]Filesystem closed should not be considered as corrupt files

2022-06-06 Thread GitBox
dongjoon-hyun commented on code in PR #36775: URL: https://github.com/apache/spark/pull/36775#discussion_r890600656 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala: ## @@ -253,6 +253,9 @@ class FileScanRDD( // Throw

[GitHub] [spark] dtenedor commented on a diff in pull request #36745: [SPARK-39359][SQL] Restrict DEFAULT columns to allowlist of supported data source types

2022-06-06 Thread GitBox
dtenedor commented on code in PR #36745: URL: https://github.com/apache/spark/pull/36745#discussion_r890579762 ## sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala: ## @@ -41,21 +41,29 @@ import

[GitHub] [spark] dtenedor commented on a diff in pull request #36745: [SPARK-39359][SQL] Restrict DEFAULT columns to allowlist of supported data source types

2022-06-06 Thread GitBox
dtenedor commented on code in PR #36745: URL: https://github.com/apache/spark/pull/36745#discussion_r890579364 ## sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala: ## @@ -186,7 +186,7 @@ abstract class BaseSessionStateBuilder( new

[GitHub] [spark] dongjoon-hyun commented on pull request #36772: [SPARK-39387][BUILD] Upgrade hive-storage-api to 2.7.3

2022-06-06 Thread GitBox
dongjoon-hyun commented on PR #36772: URL: https://github.com/apache/spark/pull/36772#issuecomment-1147900459 The Apache ORC community uses 2.8.1 based on `Panagiotis Garefalakis`'s comment (which I shared here) because he is a Hive committer and ORC PMC member. In the Apache Spark community,

[GitHub] [spark] Borjianamin98 opened a new pull request, #36781: [SPARK-39393][SQL] Parquet data source only supports push-down predicate filters for non-repeated primitive types

2022-06-06 Thread GitBox
Borjianamin98 opened a new pull request, #36781: URL: https://github.com/apache/spark/pull/36781 ### What changes were proposed in this pull request? In Spark version 3.1.0 and newer, Spark creates extra filter predicate conditions for repeated parquet columns. These fields do not
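A tiny sketch of the scenario described above (the file path and column names are made up; this is not the test requested for PR #36781): a Parquet file with a repeated (array) column and a filter on that column, which cannot be expressed as a Parquet predicate filter and should therefore be evaluated by Spark rather than pushed down.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.array_contains

val spark = SparkSession.builder().master("local[*]").appName("repeated-column-demo").getOrCreate()
import spark.implicits._

// Write a Parquet file whose `tags` column is repeated (an array type).
Seq((1, Seq("a", "b")), (2, Seq("c"))).toDF("id", "tags")
  .write.mode("overwrite").parquet("/tmp/repeated_column_demo")

// The condition on `tags` targets a repeated column, so it should stay in
// Spark instead of becoming a pushed-down Parquet predicate filter.
spark.read.parquet("/tmp/repeated_column_demo")
  .filter(array_contains($"tags", "a"))
  .show()
```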

[GitHub] [spark] gengliangwang commented on a diff in pull request #36745: [SPARK-39359][SQL] Restrict DEFAULT columns to allowlist of supported data source types

2022-06-06 Thread GitBox
gengliangwang commented on code in PR #36745: URL: https://github.com/apache/spark/pull/36745#discussion_r890519223 ## sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala: ## @@ -41,21 +41,29 @@ import

[GitHub] [spark] gengliangwang commented on a diff in pull request #36745: [SPARK-39359][SQL] Restrict DEFAULT columns to allowlist of supported data source types

2022-06-06 Thread GitBox
gengliangwang commented on code in PR #36745: URL: https://github.com/apache/spark/pull/36745#discussion_r890517016 ## sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala: ## @@ -186,7 +186,7 @@ abstract class BaseSessionStateBuilder(

[GitHub] [spark] LucaCanali commented on a diff in pull request #36662: [SPARK-39286][DOC] Update documentation for the decode function

2022-06-06 Thread GitBox
LucaCanali commented on code in PR #36662: URL: https://github.com/apache/spark/pull/36662#discussion_r890504556 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala: ## @@ -2504,9 +2504,10 @@ object Decode { usage = """

[GitHub] [spark] hvanhovell closed pull request #36779: [SPARK-39391][CORE] Reuse Partitioner classes

2022-06-06 Thread GitBox
hvanhovell closed pull request #36779: [SPARK-39391][CORE] Reuse Partitioner classes URL: https://github.com/apache/spark/pull/36779

[GitHub] [spark] LucaCanali commented on a diff in pull request #36662: [SPARK-39286][DOC] Update documentation for the decode function

2022-06-06 Thread GitBox
LucaCanali commented on code in PR #36662: URL: https://github.com/apache/spark/pull/36662#discussion_r890493367 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala: ## @@ -2504,9 +2504,10 @@ object Decode { usage = """

[GitHub] [spark] cxzl25 commented on pull request #36772: [SPARK-39387][BUILD] Upgrade hive-storage-api to 2.7.3

2022-06-06 Thread GitBox
cxzl25 commented on PR #36772: URL: https://github.com/apache/spark/pull/36772#issuecomment-1147811164 > can we have a test case for this I see that the UT of HIVE-25190 has adjusted the xmx to 3g. The current xmx of Spark's unit tests is 4g. I'm not sure if this scenario can be

[GitHub] [spark] gengliangwang commented on pull request #36780: [SPARK-39392] Refine ANSI error messages.

2022-06-06 Thread GitBox
gengliangwang commented on PR #36780: URL: https://github.com/apache/spark/pull/36780#issuecomment-1147808366 cc @srielau as well

[GitHub] [spark] gengliangwang commented on a diff in pull request #36780: [SPARK-39392] Refine ANSI error messages.

2022-06-06 Thread GitBox
gengliangwang commented on code in PR #36780: URL: https://github.com/apache/spark/pull/36780#discussion_r890466926 ## core/src/main/resources/error/error-classes.json: ## @@ -195,7 +195,7 @@ }, "INVALID_ARRAY_INDEX_IN_ELEMENT_AT" : { "message" : [ - "The index

[GitHub] [spark] vli-databricks commented on pull request #36780: [SPARK-39392] Refine ANSI error messages.

2022-06-06 Thread GitBox
vli-databricks commented on PR #36780: URL: https://github.com/apache/spark/pull/36780#issuecomment-1147802194 @gengliangwang please take a look

[GitHub] [spark] vli-databricks opened a new pull request, #36780: [SPARK-39392] Refine ANSI error messages.

2022-06-06 Thread GitBox
vli-databricks opened a new pull request, #36780: URL: https://github.com/apache/spark/pull/36780 ### What changes were proposed in this pull request? Refine ANSI error messages and remove 'To return NULL instead' ### Why are the changes needed? Improve error

[GitHub] [spark] gengliangwang commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-06 Thread GitBox
gengliangwang commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r890447000 ## core/src/test/scala/org/apache/spark/SparkFunSuite.scala: ## @@ -264,6 +264,81 @@ abstract class SparkFunSuite } } + /** + * Checks an exception

[GitHub] [spark] gengliangwang commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-06 Thread GitBox
gengliangwang commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r890443211 ## core/src/test/scala/org/apache/spark/SparkFunSuite.scala: ## @@ -264,6 +264,81 @@ abstract class SparkFunSuite } } + /** + * Checks an exception

[GitHub] [spark] gengliangwang commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-06 Thread GitBox
gengliangwang commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r890433533 ## core/src/main/java/org/apache/spark/SparkThrowable.java: ## @@ -35,6 +35,9 @@ public interface SparkThrowable { // Succinct, human-readable, unique, and

[GitHub] [spark] gengliangwang commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-06 Thread GitBox
gengliangwang commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r890432585 ## core/src/main/java/org/apache/spark/memory/SparkOutOfMemoryError.java: ## @@ -28,6 +28,7 @@ @Private public final class SparkOutOfMemoryError extends

[GitHub] [spark] srielau commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-06 Thread GitBox
srielau commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r890414735 ## core/src/main/java/org/apache/spark/SparkThrowable.java: ## @@ -46,4 +49,13 @@ default String getSqlState() { default boolean isInternalError() { return

[GitHub] [spark] srielau commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-06 Thread GitBox
srielau commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r890413958 ## core/src/main/java/org/apache/spark/SparkThrowable.java: ## @@ -46,4 +49,13 @@ default String getSqlState() { default boolean isInternalError() { return

[GitHub] [spark] srielau commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-06 Thread GitBox
srielau commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r890411559 ## core/src/main/java/org/apache/spark/SparkThrowable.java: ## @@ -35,6 +35,9 @@ public interface SparkThrowable { // Succinct, human-readable, unique, and

[GitHub] [spark] srielau commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-06 Thread GitBox
srielau commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r890410763 ## core/src/main/java/org/apache/spark/memory/SparkOutOfMemoryError.java: ## @@ -28,6 +28,7 @@ @Private public final class SparkOutOfMemoryError extends

[GitHub] [spark] dongjoon-hyun commented on pull request #36772: [SPARK-39387][BUILD] Upgrade hive-storage-api to 2.7.3

2022-06-06 Thread GitBox
dongjoon-hyun commented on PR #36772: URL: https://github.com/apache/spark/pull/36772#issuecomment-1147728250 BTW, the Apache ORC community upgraded the storage API to 2.8.1 in [ORC-867](https://issues.apache.org/jira/browse/ORC-867) for Apache ORC 1.7.0+

[GitHub] [spark] gengliangwang commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-06 Thread GitBox
gengliangwang commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r890392259 ## core/src/main/scala/org/apache/spark/SparkException.scala: ## @@ -28,23 +28,47 @@ class SparkException( message: String, cause: Throwable,

[GitHub] [spark] gengliangwang commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-06 Thread GitBox
gengliangwang commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r890391580 ## core/src/main/scala/org/apache/spark/SparkException.scala: ## @@ -28,23 +28,47 @@ class SparkException( message: String, cause: Throwable,

[GitHub] [spark] gengliangwang commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-06 Thread GitBox
gengliangwang commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r890373656 ## core/src/main/java/org/apache/spark/memory/SparkOutOfMemoryError.java: ## @@ -28,6 +28,7 @@ @Private public final class SparkOutOfMemoryError extends

[GitHub] [spark] JoshRosen commented on pull request #36753: [SPARK-39259][SQL][3.2] Evaluate timestamps consistently in subqueries

2022-06-06 Thread GitBox
JoshRosen commented on PR #36753: URL: https://github.com/apache/spark/pull/36753#issuecomment-1147680899 When updating this PR, let's also pull in my changes from https://github.com/apache/spark/pull/36765. When merging this, we should probably pick it all the way back to 3.0 (since it

[GitHub] [spark] hvanhovell opened a new pull request, #36779: [SPARK-39391][CORE] Reuse Partitioner classes

2022-06-06 Thread GitBox
hvanhovell opened a new pull request, #36779: URL: https://github.com/apache/spark/pull/36779 ### What changes were proposed in this pull request? This PR creates two new `Partitioner` classes: - `ConstantPartitioner`: This moves all tuples in an RDD into a single partition. This
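The body above is truncated; as a rough illustration of what a constant partitioner looks like against Spark's public `Partitioner` API (a sketch only — the actual classes added in PR #36779 may differ):

```scala
import org.apache.spark.Partitioner

// Sends every record of a key-value RDD to partition 0, i.e. collapses the
// RDD into a single partition during a shuffle.
class ConstantPartitioner extends Partitioner {
  override def numPartitions: Int = 1
  override def getPartition(key: Any): Int = 0
}

// Hypothetical usage on a pair RDD:
// val single = pairRdd.partitionBy(new ConstantPartitioner)
```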

[GitHub] [spark] AmplabJenkins commented on pull request #36775: [SPARK-39389]Filesystem closed should not be considered as corrupt files

2022-06-06 Thread GitBox
AmplabJenkins commented on PR #36775: URL: https://github.com/apache/spark/pull/36775#issuecomment-1147644959 Can one of the admins verify this patch?

[GitHub] [spark] dtenedor opened a new pull request, #36778: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ALTER COLUMNS to V2 data sources

2022-06-06 Thread GitBox
dtenedor opened a new pull request, #36778: URL: https://github.com/apache/spark/pull/36778 ### What changes were proposed in this pull request? Extend DEFAULT column support in ALTER TABLE ALTER COLUMNS commands to include V2 data sources. (Note: this depends on

[GitHub] [spark] dtenedor commented on pull request #36745: [SPARK-39359][SQL] Restrict DEFAULT columns to allowlist of supported data source types

2022-06-06 Thread GitBox
dtenedor commented on PR #36745: URL: https://github.com/apache/spark/pull/36745#issuecomment-1147641785 @gengliangwang @HyukjinKwon @cloud-fan this is ready to merge at your convenience (or leave review comment(s) for further iteration if desired)

[GitHub] [spark] LuciferYang commented on a diff in pull request #36774: [WIP][SPARK-39388][SQL] Reuse `orcScheam` when push down Orc predicates

2022-06-06 Thread GitBox
LuciferYang commented on code in PR #36774: URL: https://github.com/apache/spark/pull/36774#discussion_r890306149 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala: ## @@ -153,10 +152,9 @@ class OrcFileFormat } else {

[GitHub] [spark] LuciferYang commented on a diff in pull request #36774: [WIP][SPARK-39388][SQL] Reuse `orcScheam` when push down Orc predicates

2022-06-06 Thread GitBox
LuciferYang commented on code in PR #36774: URL: https://github.com/apache/spark/pull/36774#discussion_r890291383 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala: ## @@ -153,10 +152,9 @@ class OrcFileFormat } else {

[GitHub] [spark] LuciferYang commented on a diff in pull request #36774: [WIP][SPARK-39388][SQL] Reuse `orcScheam` when push down Orc predicates

2022-06-06 Thread GitBox
LuciferYang commented on code in PR #36774: URL: https://github.com/apache/spark/pull/36774#discussion_r890285150 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala: ## @@ -153,10 +152,9 @@ class OrcFileFormat } else {

[GitHub] [spark] chenzhx commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

2022-06-06 Thread GitBox
chenzhx commented on code in PR #36663: URL: https://github.com/apache/spark/pull/36663#discussion_r890253947 ## sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala: ## @@ -259,6 +259,55 @@ class V2ExpressionBuilder( } else {

[GitHub] [spark] srowen commented on a diff in pull request #36773: [SPARK-39385][SQL] Translate linear regression aggregate functions for pushdown

2022-06-06 Thread GitBox
srowen commented on code in PR #36773: URL: https://github.com/apache/spark/pull/36773#discussion_r890245878 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala: ## @@ -750,6 +750,22 @@ object DataSourceStrategy

[GitHub] [spark] AmplabJenkins commented on pull request #36777: [SPARK-39390][CORE] Hide and optimize `viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` from INFO log

2022-06-06 Thread GitBox
AmplabJenkins commented on PR #36777: URL: https://github.com/apache/spark/pull/36777#issuecomment-1147544750 Can one of the admins verify this patch?

[GitHub] [spark] srielau commented on pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-06 Thread GitBox
srielau commented on PR #36693: URL: https://github.com/apache/spark/pull/36693#issuecomment-1147543765 @gengliangwang @cloud-fan Hoping for a review.

[GitHub] [spark] cloud-fan commented on a diff in pull request #35612: [SPARK-38289][SQL] Refactor SQL CLI exit code to make it more clear

2022-06-06 Thread GitBox
cloud-fan commented on code in PR #35612: URL: https://github.com/apache/spark/pull/35612#discussion_r890193035 ## sql/hive-thriftserver/src/main/java/org/apache/hive/service/server/HiveServer2.java: ## @@ -259,7 +260,7 @@ static class HelpOptionExecutor implements

[GitHub] [spark] cloud-fan commented on a diff in pull request #36662: [SPARK-39286][DOC] Update documentation for the decode function

2022-06-06 Thread GitBox
cloud-fan commented on code in PR #36662: URL: https://github.com/apache/spark/pull/36662#discussion_r890191027 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala: ## @@ -2504,9 +2504,10 @@ object Decode { usage = """

[GitHub] [spark] cloud-fan commented on a diff in pull request #36663: [SPARK-38899][SQL]DS V2 supports push down datetime functions

2022-06-06 Thread GitBox
cloud-fan commented on code in PR #36663: URL: https://github.com/apache/spark/pull/36663#discussion_r890188771 ## sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala: ## @@ -259,6 +259,55 @@ class V2ExpressionBuilder( } else {

[GitHub] [spark] Eugene-Mark commented on pull request #36499: [SPARK-38846][SQL] Add explicit data mapping between Teradata Numeric Type and Spark DecimalType

2022-06-06 Thread GitBox
Eugene-Mark commented on PR #36499: URL: https://github.com/apache/spark/pull/36499#issuecomment-1147448291 Agreed, it's better not to modify the Oracle-related part; I just removed it from the commit. Yes, I suggest we use scale = 18. And for precision, when `Number(*)` or
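For illustration, a sketch of how the "scale = 18" idea could be expressed as a JDBC dialect type mapping. The object name, the precision of 38, and the `size == 0` condition are assumptions for the sketch, not necessarily what PR #36499 does:

```scala
import java.sql.Types
import org.apache.spark.sql.jdbc.JdbcDialect
import org.apache.spark.sql.types.{DataType, DecimalType, MetadataBuilder}

object TeradataNumberMappingSketch extends JdbcDialect {
  override def canHandle(url: String): Boolean =
    url.toLowerCase.startsWith("jdbc:teradata")

  override def getCatalystType(
      sqlType: Int, typeName: String, size: Int, md: MetadataBuilder): Option[DataType] = {
    // When Teradata reports NUMBER(*) without usable precision/scale metadata,
    // fall back to a wide decimal with a fixed scale of 18.
    if (sqlType == Types.NUMERIC && size == 0) Some(DecimalType(38, 18)) else None
  }
}
```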

[GitHub] [spark] cloud-fan commented on a diff in pull request #36698: [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

2022-06-06 Thread GitBox
cloud-fan commented on code in PR #36698: URL: https://github.com/apache/spark/pull/36698#discussion_r890146130 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/decimalExpressions.scala: ## @@ -232,3 +216,33 @@ case class CheckOverflowInSum( override

[GitHub] [spark] cloud-fan commented on pull request #35250: [SPARK-37961][SQL] Override maxRows/maxRowsPerPartition for some logical operators

2022-06-06 Thread GitBox
cloud-fan commented on PR #35250: URL: https://github.com/apache/spark/pull/35250#issuecomment-1147439706 Somehow this PR lost track. @zhengruifeng do you want to reopen it and get it in?

[GitHub] [spark] srowen commented on pull request #36732: [SPARK-39345][CORE][SQL][DSTREAM][ML][MESOS][SS] Replace `filter(!condition)` with `filterNot(condition)`

2022-06-06 Thread GitBox
srowen commented on PR #36732: URL: https://github.com/apache/spark/pull/36732#issuecomment-1147422759 No, we wouldn't backport this, that's more change. Does this offer any benefit? I'm not sure it's more readable even.
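For reference, the two spellings discussed in SPARK-39345 are behavior-equivalent; the change is purely about readability, e.g.:

```scala
val nums = Seq(1, 2, 3, 4, 5)

val odds1 = nums.filter(n => !(n % 2 == 0))  // filter(!condition)
val odds2 = nums.filterNot(n => n % 2 == 0)  // filterNot(condition)

assert(odds1 == odds2)  // both yield Seq(1, 3, 5)
```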

[GitHub] [spark] srowen commented on pull request #36499: [SPARK-38846][SQL] Add explicit data mapping between Teradata Numeric Type and Spark DecimalType

2022-06-06 Thread GitBox
srowen commented on PR #36499: URL: https://github.com/apache/spark/pull/36499#issuecomment-1147421904 You're saying, basically, assume scale=18? That seems reasonable. Or are you saying there needs to be an arbitrary precision type? I don't see how a DB would support that. I'm

[GitHub] [spark] cloud-fan closed pull request #36763: [SPARK-39376][SQL] Hide duplicated columns in star expansion of subquery alias from NATURAL/USING JOIN

2022-06-06 Thread GitBox
cloud-fan closed pull request #36763: [SPARK-39376][SQL] Hide duplicated columns in star expansion of subquery alias from NATURAL/USING JOIN URL: https://github.com/apache/spark/pull/36763

[GitHub] [spark] dcoliversun opened a new pull request, #36777: [SPARK-39390][CORE] Hide and optimize `viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` from INFO log

2022-06-06 Thread GitBox
dcoliversun opened a new pull request, #36777: URL: https://github.com/apache/spark/pull/36777 ### What changes were proposed in this pull request? This PR aims to hide and optimize `viewAcls`/`viewAclsGroups`/`modifyAcls`/`modifyAclsGroups` from INFO log. ### Why are

[GitHub] [spark] beliefer commented on a diff in pull request #36774: [WIP][SPARK-39388][SQL] Reuse `orcScheam` when push down Orc predicates

2022-06-06 Thread GitBox
beliefer commented on code in PR #36774: URL: https://github.com/apache/spark/pull/36774#discussion_r890086418 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala: ## @@ -153,10 +152,9 @@ class OrcFileFormat } else { //

[GitHub] [spark] beliefer opened a new pull request, #36776: [SPARK-38997][SPARK-39037][FOLLOWUP] `PushableColumnWithoutNestedColumn` need be translated to predicate too

2022-06-06 Thread GitBox
beliefer opened a new pull request, #36776: URL: https://github.com/apache/spark/pull/36776 ### What changes were proposed in this pull request? https://github.com/apache/spark/pull/35768 assumed that the expressions in `And`, `Or` and `Not` must be predicates.

[GitHub] [spark] HyukjinKwon commented on pull request #36772: [SPARK-39387][BUILD] Upgrade hive-storage-api to 2.7.3

2022-06-06 Thread GitBox
HyukjinKwon commented on PR #36772: URL: https://github.com/apache/spark/pull/36772#issuecomment-1147321618 Merged to master.

[GitHub] [spark] HyukjinKwon closed pull request #36772: [SPARK-39387][BUILD] Upgrade hive-storage-api to 2.7.3

2022-06-06 Thread GitBox
HyukjinKwon closed pull request #36772: [SPARK-39387][BUILD] Upgrade hive-storage-api to 2.7.3 URL: https://github.com/apache/spark/pull/36772

[GitHub] [spark] HyukjinKwon commented on pull request #36772: [SPARK-39387][BUILD] Upgrade hive-storage-api to 2.7.3

2022-06-06 Thread GitBox
HyukjinKwon commented on PR #36772: URL: https://github.com/apache/spark/pull/36772#issuecomment-1147321275 Merged to master.

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36775: [SPARK-39389]Filesystem closed should not be considered as corrupt files

2022-06-06 Thread GitBox
HyukjinKwon commented on code in PR #36775: URL: https://github.com/apache/spark/pull/36775#discussion_r890038485 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileScanRDD.scala: ## @@ -253,6 +253,9 @@ class FileScanRDD( // Throw

[GitHub] [spark] LuciferYang commented on a diff in pull request #36774: [WIP][SPARK-39388][SQL] Reuse `orcScheam` when push down Orc predicates

2022-06-06 Thread GitBox
LuciferYang commented on code in PR #36774: URL: https://github.com/apache/spark/pull/36774#discussion_r890028863 ## sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala: ## @@ -153,10 +152,9 @@ class OrcFileFormat } else {
