[GitHub] [spark] pan3793 commented on a diff in pull request #36789: [SPARK-39403] Add SPARK_SUBMIT_OPTS in spark-env.sh.template

2022-06-07 Thread GitBox
pan3793 commented on code in PR #36789: URL: https://github.com/apache/spark/pull/36789#discussion_r891948046 ## conf/spark-env.sh.template: ## @@ -79,3 +80,6 @@ # Options for beeline # - SPARK_BEELINE_OPTS, to set config properties only for the beeline cli (e.g. "-Dx=y") #

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36789: [SPARK-39403] Add SPARK_SUBMIT_OPTS in spark-env.sh.template

2022-06-07 Thread GitBox
HyukjinKwon commented on code in PR #36789: URL: https://github.com/apache/spark/pull/36789#discussion_r891946395 ## conf/spark-env.sh.template: ## @@ -79,3 +80,6 @@ # Options for beeline # - SPARK_BEELINE_OPTS, to set config properties only for the beeline cli (e.g.

[GitHub] [spark] HyukjinKwon commented on pull request #36683: [SPARK-39301][SQL][PYTHON] Leverage LocalRelation and respect Arrow batch size in createDataFrame with Arrow optimization

2022-06-07 Thread GitBox
HyukjinKwon commented on PR #36683: URL: https://github.com/apache/spark/pull/36683#issuecomment-1149492294 cc @mengxr and @WeichenXu123 in case you guys have some comments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36683: [SPARK-39301][SQL][PYTHON] Leverage LocalRelation and respect Arrow batch size in createDataFrame with Arrow optimization

2022-06-07 Thread GitBox
HyukjinKwon commented on code in PR #36683: URL: https://github.com/apache/spark/pull/36683#discussion_r891944792 ## python/pyspark/sql/pandas/conversion.py: ## @@ -596,7 +596,7 @@ def _create_from_pandas_with_arrow( ] # Slice the DataFrame to be batched

[GitHub] [spark] LuciferYang commented on pull request #36800: [SPARK-39409][BUILD] Upgrade scala-maven-plugin to 4.6.2

2022-06-07 Thread GitBox
LuciferYang commented on PR #36800: URL: https://github.com/apache/spark/pull/36800#issuecomment-1149491231 thanks @HyukjinKwon

[GitHub] [spark] HyukjinKwon closed pull request #36800: [SPARK-39409][BUILD] Upgrade scala-maven-plugin to 4.6.2

2022-06-07 Thread GitBox
HyukjinKwon closed pull request #36800: [SPARK-39409][BUILD] Upgrade scala-maven-plugin to 4.6.2 URL: https://github.com/apache/spark/pull/36800

[GitHub] [spark] HyukjinKwon commented on pull request #36800: [SPARK-39409][BUILD] Upgrade scala-maven-plugin to 4.6.2

2022-06-07 Thread GitBox
HyukjinKwon commented on PR #36800: URL: https://github.com/apache/spark/pull/36800#issuecomment-1149490024 Merged to master.

[GitHub] [spark] HyukjinKwon commented on pull request #36802: [SPARK-39321][SQL][TESTS][FOLLOW-UP] Respect CastWithAnsiOffSuite.ansiEnabled in 'cast string to date #2'

2022-06-07 Thread GitBox
HyukjinKwon commented on PR #36802: URL: https://github.com/apache/spark/pull/36802#issuecomment-1149489864 cc @cloud-fan

[GitHub] [spark] HyukjinKwon opened a new pull request, #36802: [SPARK-39321][SQL][TESTS][FOLLOW-UP] Respect CastWithAnsiOffSuite.ansiEnabled in 'cast string to date #2'

2022-06-07 Thread GitBox
HyukjinKwon opened a new pull request, #36802: URL: https://github.com/apache/spark/pull/36802 ### What changes were proposed in this pull request? This PR fixes the test to make `CastWithAnsiOffSuite` properly respect `ansiEnabled` in `cast string to date #2` test by using

[GitHub] [spark] HyukjinKwon commented on pull request #36797: [SPARK-39394][DOCS][SS][3.3] Improve PySpark Structured Streaming page more readable

2022-06-07 Thread GitBox
HyukjinKwon commented on PR #36797: URL: https://github.com/apache/spark/pull/36797#issuecomment-1149483125 Merged to branch-3.3.

[GitHub] [spark] HyukjinKwon closed pull request #36797: [SPARK-39394][DOCS][SS][3.3] Improve PySpark Structured Streaming page more readable

2022-06-07 Thread GitBox
HyukjinKwon closed pull request #36797: [SPARK-39394][DOCS][SS][3.3] Improve PySpark Structured Streaming page more readable URL: https://github.com/apache/spark/pull/36797

[GitHub] [spark] LuciferYang commented on pull request #36781: [SPARK-39393][SQL] Parquet data source only supports push-down predicate filters for non-repeated primitive types

2022-06-07 Thread GitBox
LuciferYang commented on PR #36781: URL: https://github.com/apache/spark/pull/36781#issuecomment-1149455635 I think this pr should be backport to previous Spark version, because when run `SPARK-39393: Do not push down predicate filters for repeated primitive fields` without this pr, I

[GitHub] [spark] Yaohua628 opened a new pull request, #36801: [SPARK-39404][SQL][Streaming] Minor fix for querying `_metadata` in streaming

2022-06-07 Thread GitBox
Yaohua628 opened a new pull request, #36801: URL: https://github.com/apache/spark/pull/36801 ### What changes were proposed in this pull request? We added the support to query the `_metadata` column with a file-based streaming source: https://github.com/apache/spark/pull/35676.

[GitHub] [spark] AmplabJenkins commented on pull request #36787: [SPARK-39387][FOLLOWUP][TESTS] Add a test case for HIVE-25190

2022-06-07 Thread GitBox
AmplabJenkins commented on PR #36787: URL: https://github.com/apache/spark/pull/36787#issuecomment-1149442173 Can one of the admins verify this patch?

[GitHub] [spark] AngersZhuuuu commented on pull request #36786: [SPARK-39400][SQL] spark-sql should remove hive resource dir in all case

2022-06-07 Thread GitBox
AngersZhuuuu commented on PR #36786: URL: https://github.com/apache/spark/pull/36786#issuecomment-1149434288 > Could we add a test? Need to test after built, seems hard to write test in UT...

[GitHub] [spark] LuciferYang opened a new pull request, #36800: [SPARK-39409][BUILD] Upgrade scala-maven-plugin to 4.6.2

2022-06-07 Thread GitBox
LuciferYang opened a new pull request, #36800: URL: https://github.com/apache/spark/pull/36800 ### What changes were proposed in this pull request? This pr aims upgrade scala-maven-plugin to 4.6.2 ### Why are the changes needed? This version brings some bug fix related to

[GitHub] [spark] amaliujia commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
amaliujia commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891887309 ## sql/core/src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala: ## @@ -553,4 +571,103 @@ class CatalogSuite extends SharedSparkSession with AnalysisTest

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891881553 ## sql/core/src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala: ## @@ -553,4 +571,103 @@ class CatalogSuite extends SharedSparkSession with AnalysisTest

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891880524 ## sql/core/src/test/scala/org/apache/spark/sql/execution/GlobalTempViewSuite.scala: ## @@ -165,7 +165,8 @@ class GlobalTempViewSuite extends QueryTest with

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891879819 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -117,14 +131,45 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {

[GitHub] [spark] itholic commented on pull request #36782: [SPARK-39394][DOCS][SS] Improve PySpark Structured Streaming page more readable

2022-06-07 Thread GitBox
itholic commented on PR #36782: URL: https://github.com/apache/spark/pull/36782#issuecomment-1149410531 @HyukjinKwon Sure, I created a PR: https://github.com/apache/spark/pull/36797

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891879436 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -117,14 +131,45 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891879053 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -97,8 +98,21 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {

[GitHub] [spark] Borjianamin98 commented on a diff in pull request #36781: [SPARK-39393][SQL] Parquet data source only supports push-down predicate filters for non-repeated primitive types

2022-06-07 Thread GitBox
Borjianamin98 commented on code in PR #36781: URL: https://github.com/apache/spark/pull/36781#discussion_r891878580 ## sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala: ## @@ -1316,6 +1317,34 @@ abstract class

[GitHub] [spark] cloud-fan commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891878020 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala: ## @@ -965,6 +965,10 @@ class SessionCatalog(

[GitHub] [spark] cloud-fan commented on a diff in pull request #36704: [SPARK-39346][SQL] Convert asserts/illegal state exception to internal errors on each phase

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36704: URL: https://github.com/apache/spark/pull/36704#discussion_r891876268 ## connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala: ## @@ -666,9 +667,10 @@ abstract class

[GitHub] [spark] dtenedor opened a new pull request, #36799: [SPARK-39350][SQL] Add flag to control breaking change process for: DESC NAMESPACE EXTENDED should redact properties

2022-06-07 Thread GitBox
dtenedor opened a new pull request, #36799: URL: https://github.com/apache/spark/pull/36799 ### What changes were proposed in this pull request? Add a flag to control breaking change process for: DESC NAMESPACE EXTENDED should redact properties. ### Why are the changes needed?

[GitHub] [spark] dtenedor commented on pull request #36799: [SPARK-39350][SQL] Add flag to control breaking change process for: DESC NAMESPACE EXTENDED should redact properties

2022-06-07 Thread GitBox
dtenedor commented on PR #36799: URL: https://github.com/apache/spark/pull/36799#issuecomment-1149401878 @HyukjinKwon would you be able to help with this PR for the breaking change process?

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #36704: [SPARK-39346][SQL] Convert asserts/illegal state exception to internal errors on each phase

2022-06-07 Thread GitBox
HeartSaVioR commented on code in PR #36704: URL: https://github.com/apache/spark/pull/36704#discussion_r891867452 ## connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala: ## @@ -666,9 +667,10 @@ abstract class

[GitHub] [spark] AmplabJenkins commented on pull request #36789: [SPARK-39403] Add SPARK_SUBMIT_OPTS in spark-env.sh.template

2022-06-07 Thread GitBox
AmplabJenkins commented on PR #36789: URL: https://github.com/apache/spark/pull/36789#issuecomment-1149394119 Can one of the admins verify this patch?

[GitHub] [spark] wangyum commented on pull request #36798: [SPARK-39408][SQL] Update the buildKeys for DynamicPruningSubquery.withNewPlan

2022-06-07 Thread GitBox
wangyum commented on PR #36798: URL: https://github.com/apache/spark/pull/36798#issuecomment-1149393100 cc @maryannxue @cloud-fan

[GitHub] [spark] wangyum opened a new pull request, #36798: [SPARK-39408][SQL] Update the buildKeys for DynamicPruningSubquery.withNewPlan

2022-06-07 Thread GitBox
wangyum opened a new pull request, #36798: URL: https://github.com/apache/spark/pull/36798 ### What changes were proposed in this pull request? This pr updates the buildKeys for `DynamicPruningSubquery.withNewPlan`. ### Why are the changes needed? Fix bug. Otherwise, the

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #36704: [SPARK-39346][SQL] Convert asserts/illegal state exception to internal errors on each phase

2022-06-07 Thread GitBox
HeartSaVioR commented on code in PR #36704: URL: https://github.com/apache/spark/pull/36704#discussion_r891861831 ## connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala: ## @@ -666,9 +667,10 @@ abstract class

[GitHub] [spark] nyingping commented on a diff in pull request #36737: [SPARK-39347] [SS] Generate wrong time window when (timestamp-startTime) % slideDuration…

2022-06-07 Thread GitBox
nyingping commented on code in PR #36737: URL: https://github.com/apache/spark/pull/36737#discussion_r891847061 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3963,8 +3966,10 @@ object TimeWindowing extends Rule[LogicalPlan] {

[GitHub] [spark] pan3793 commented on a diff in pull request #36789: [SPARK-39403] Add SPARK_SUBMIT_OPTS in spark-env.sh.template

2022-06-07 Thread GitBox
pan3793 commented on code in PR #36789: URL: https://github.com/apache/spark/pull/36789#discussion_r891852053 ## conf/spark-env.sh.template: ## @@ -79,3 +80,6 @@ # Options for beeline # - SPARK_BEELINE_OPTS, to set config properties only for the beeline cli (e.g. "-Dx=y") #

[GitHub] [spark] nyingping commented on a diff in pull request #36737: [SPARK-39347] [SS] Generate wrong time window when (timestamp-startTime) % slideDuration…

2022-06-07 Thread GitBox
nyingping commented on code in PR #36737: URL: https://github.com/apache/spark/pull/36737#discussion_r891842533 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3963,8 +3966,10 @@ object TimeWindowing extends Rule[LogicalPlan] {

[GitHub] [spark] nyingping commented on a diff in pull request #36737: [SPARK-39347] [SS] Generate wrong time window when (timestamp-startTime) % slideDuration…

2022-06-07 Thread GitBox
nyingping commented on code in PR #36737: URL: https://github.com/apache/spark/pull/36737#discussion_r891848795 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3963,8 +3966,10 @@ object TimeWindowing extends Rule[LogicalPlan] {

[GitHub] [spark] srowen commented on a diff in pull request #36737: [SPARK-39347] [SS] Generate wrong time window when (timestamp-startTime) % slideDuration…

2022-06-07 Thread GitBox
srowen commented on code in PR #36737: URL: https://github.com/apache/spark/pull/36737#discussion_r891850083 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3963,8 +3966,10 @@ object TimeWindowing extends Rule[LogicalPlan] {

[GitHub] [spark] nyingping commented on a diff in pull request #36737: [SPARK-39347] [SS] Generate wrong time window when (timestamp-startTime) % slideDuration…

2022-06-07 Thread GitBox
nyingping commented on code in PR #36737: URL: https://github.com/apache/spark/pull/36737#discussion_r891848795 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3963,8 +3966,10 @@ object TimeWindowing extends Rule[LogicalPlan] {

[GitHub] [spark] xiuzhu9527 commented on pull request #36784: [SPARK-39396][SQL] Fix LDAP login exception 'error code 49 - invalid credentials'

2022-06-07 Thread GitBox
xiuzhu9527 commented on PR #36784: URL: https://github.com/apache/spark/pull/36784#issuecomment-1149366600 @pan3793 Thank you very much for your reply. This problem has affected users' use. Can we fix this problem first?

[GitHub] [spark] nyingping commented on a diff in pull request #36737: [SPARK-39347] [SS] Generate wrong time window when (timestamp-startTime) % slideDuration…

2022-06-07 Thread GitBox
nyingping commented on code in PR #36737: URL: https://github.com/apache/spark/pull/36737#discussion_r891847061 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3963,8 +3966,10 @@ object TimeWindowing extends Rule[LogicalPlan] {

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #36704: [SPARK-39346][SQL] Convert asserts/illegal state exception to internal errors on each phase

2022-06-07 Thread GitBox
HeartSaVioR commented on code in PR #36704: URL: https://github.com/apache/spark/pull/36704#discussion_r891846346 ## connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala: ## @@ -666,9 +667,10 @@ abstract class

[GitHub] [spark] cloud-fan closed pull request #36703: [SPARK-39321][SQL] Refactor TryCast to use RuntimeReplaceable

2022-06-07 Thread GitBox
cloud-fan closed pull request #36703: [SPARK-39321][SQL] Refactor TryCast to use RuntimeReplaceable URL: https://github.com/apache/spark/pull/36703

[GitHub] [spark] cloud-fan commented on pull request #36703: [SPARK-39321][SQL] Refactor TryCast to use RuntimeReplaceable

2022-06-07 Thread GitBox
cloud-fan commented on PR #36703: URL: https://github.com/apache/spark/pull/36703#issuecomment-1149360596 thanks for the review, merging to master!

[GitHub] [spark] dcoliversun commented on a diff in pull request #36781: [SPARK-39393][SQL] Parquet data source only supports push-down predicate filters for non-repeated primitive types

2022-06-07 Thread GitBox
dcoliversun commented on code in PR #36781: URL: https://github.com/apache/spark/pull/36781#discussion_r891843252 ## sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala: ## @@ -1316,6 +1317,34 @@ abstract class ParquetFilterSuite

[GitHub] [spark] nyingping commented on a diff in pull request #36737: [SPARK-39347] [SS] Generate wrong time window when (timestamp-startTime) % slideDuration…

2022-06-07 Thread GitBox
nyingping commented on code in PR #36737: URL: https://github.com/apache/spark/pull/36737#discussion_r891842533 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3963,8 +3966,10 @@ object TimeWindowing extends Rule[LogicalPlan] {

[GitHub] [spark] cloud-fan commented on a diff in pull request #36704: [SPARK-39346][SQL] Convert asserts/illegal state exception to internal errors on each phase

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36704: URL: https://github.com/apache/spark/pull/36704#discussion_r891840233 ## connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala: ## @@ -666,9 +667,10 @@ abstract class

[GitHub] [spark] cloud-fan commented on a diff in pull request #36704: [SPARK-39346][SQL] Convert asserts/illegal state exception to internal errors on each phase

2022-06-07 Thread GitBox
cloud-fan commented on code in PR #36704: URL: https://github.com/apache/spark/pull/36704#discussion_r891839935 ## connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala: ## @@ -666,9 +667,10 @@ abstract class

[GitHub] [spark] itholic opened a new pull request, #36797: Spark 39394 3.3

2022-06-07 Thread GitBox
itholic opened a new pull request, #36797: URL: https://github.com/apache/spark/pull/36797 ### What changes were proposed in this pull request? Hotfix https://github.com/apache/spark/pull/36782 for branch-3.3. ### Why are the changes needed? The improvement of

[GitHub] [spark] AmplabJenkins commented on pull request #36792: [SPARK-39392][SQL][3.3] Refine ANSI error messages for try_* function hints

2022-06-07 Thread GitBox
AmplabJenkins commented on PR #36792: URL: https://github.com/apache/spark/pull/36792#issuecomment-1149336001 Can one of the admins verify this patch?

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #36704: [SPARK-39346][SQL] Convert asserts/illegal state exception to internal errors on each phase

2022-06-07 Thread GitBox
HeartSaVioR commented on code in PR #36704: URL: https://github.com/apache/spark/pull/36704#discussion_r891827819 ## connector/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala: ## @@ -666,9 +667,10 @@ abstract class

[GitHub] [spark] wangyum commented on pull request #36790: [SPARK-39402][SQL] Optimize ReplaceCTERefWithRepartition to support coalesce partitions

2022-06-07 Thread GitBox
wangyum commented on PR #36790: URL: https://github.com/apache/spark/pull/36790#issuecomment-1149320236 cc @maryannxue @cloud-fan

[GitHub] [spark] HyukjinKwon closed pull request #36788: [SPARK-39401][SQL][TESTS] Replace withView with withTempView in CTEInlineSuite

2022-06-07 Thread GitBox
HyukjinKwon closed pull request #36788: [SPARK-39401][SQL][TESTS] Replace withView with withTempView in CTEInlineSuite URL: https://github.com/apache/spark/pull/36788

[GitHub] [spark] HyukjinKwon commented on pull request #36788: [SPARK-39401][SQL][TESTS] Replace withView with withTempView in CTEInlineSuite

2022-06-07 Thread GitBox
HyukjinKwon commented on PR #36788: URL: https://github.com/apache/spark/pull/36788#issuecomment-1149311965 Merged to master.

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36789: [SPARK-39403] Add SPARK_SUBMIT_OPTS in spark-env.sh.template

2022-06-07 Thread GitBox
HyukjinKwon commented on code in PR #36789: URL: https://github.com/apache/spark/pull/36789#discussion_r891812728 ## conf/spark-env.sh.template: ## @@ -79,3 +80,6 @@ # Options for beeline # - SPARK_BEELINE_OPTS, to set config properties only for the beeline cli (e.g.

[GitHub] [spark] HyukjinKwon commented on pull request #36782: [SPARK-39394][DOCS][SS] Improve PySpark Structured Streaming page more readable

2022-06-07 Thread GitBox
HyukjinKwon commented on PR #36782: URL: https://github.com/apache/spark/pull/36782#issuecomment-1149309255 @itholic mind creating a backporting PR to 3.3?

[GitHub] [spark] HyukjinKwon commented on a diff in pull request #36793: [SPARK-39406][PYTHON] Accept NumPy array in createDataFrame

2022-06-07 Thread GitBox
HyukjinKwon commented on code in PR #36793: URL: https://github.com/apache/spark/pull/36793#discussion_r891811193 ## python/pyspark/sql/tests/test_session.py: ## @@ -379,6 +381,54 @@ def test_use_custom_class_for_extensions(self): ) +class

[GitHub] [spark] dongjoon-hyun commented on pull request #36795: [SPARK-39407][SQL][TESTS] Fix ParquetIOSuite to handle the case where DNS responses on `nonexistent`

2022-06-07 Thread GitBox
dongjoon-hyun commented on PR #36795: URL: https://github.com/apache/spark/pull/36795#issuecomment-1149307716 Thank you, @HyukjinKwon !

[GitHub] [spark] dongjoon-hyun closed pull request #36795: [SPARK-39407][SQL][TESTS] Fix ParquetIOSuite to handle the case where DNS responses on `nonexistent`

2022-06-07 Thread GitBox
dongjoon-hyun closed pull request #36795: [SPARK-39407][SQL][TESTS] Fix ParquetIOSuite to handle the case where DNS responses on `nonexistent` URL: https://github.com/apache/spark/pull/36795

[GitHub] [spark] dongjoon-hyun commented on pull request #36795: [SPARK-39407][SQL][TESTS] Fix ParquetIOSuite to handle the case where DNS responses on `nonexistent`

2022-06-07 Thread GitBox
dongjoon-hyun commented on PR #36795: URL: https://github.com/apache/spark/pull/36795#issuecomment-1149307127 Thank you, @huaxingao . Merged to master.

[GitHub] [spark] JoshRosen closed pull request #36794: Add Serializable Trait to DirectBinaryAvroDecoder

2022-06-07 Thread GitBox
JoshRosen closed pull request #36794: Add Serializable Trait to DirectBinaryAvroDecoder URL: https://github.com/apache/spark/pull/36794

[GitHub] [spark] eswardhinakaran-toast commented on pull request #36794: Add Serializable Trait to DirectBinaryAvroDecoder

2022-06-07 Thread GitBox
eswardhinakaran-toast commented on PR #36794: URL: https://github.com/apache/spark/pull/36794#issuecomment-1149281389 i meant to close this PR. accidental pull request

[GitHub] [spark] AmplabJenkins commented on pull request #36794: Add Serializable Trait to DirectBinaryAvroDecoder

2022-06-07 Thread GitBox
AmplabJenkins commented on PR #36794: URL: https://github.com/apache/spark/pull/36794#issuecomment-1149277712 Can one of the admins verify this patch?

[GitHub] [spark] eswardhinakaran-toast closed pull request #36796: Add serializable trait to direct binary avro decoder

2022-06-07 Thread GitBox
eswardhinakaran-toast closed pull request #36796: Add serializable trait to direct binary avro decoder URL: https://github.com/apache/spark/pull/36796

[GitHub] [spark] eswardhinakaran-toast opened a new pull request, #36796: Add serializable trait to direct binary avro decoder

2022-06-07 Thread GitBox
eswardhinakaran-toast opened a new pull request, #36796: URL: https://github.com/apache/spark/pull/36796 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[GitHub] [spark] dongjoon-hyun opened a new pull request, #36795: [SPARK-39407][SQL][TESTS] Fix ParquetIOSuite to handle the case where DNS responses on

2022-06-07 Thread GitBox
dongjoon-hyun opened a new pull request, #36795: URL: https://github.com/apache/spark/pull/36795 … ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[GitHub] [spark] eswardhinakaran-toast opened a new pull request, #36794: Add Serializable Trait to DirectBinaryAvroDecoder

2022-06-07 Thread GitBox
eswardhinakaran-toast opened a new pull request, #36794: URL: https://github.com/apache/spark/pull/36794 ### What changes were proposed in this pull request? ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change?

[GitHub] [spark] xinrong-databricks commented on a diff in pull request #36793: [SPARK-39406][PYTHON] Accept NumPy array in createDataFrame

2022-06-07 Thread GitBox
xinrong-databricks commented on code in PR #36793: URL: https://github.com/apache/spark/pull/36793#discussion_r891581163 ## python/pyspark/sql/session.py: ## @@ -952,12 +953,30 @@ def createDataFrame( # type: ignore[misc] schema = [x.encode("utf-8") if not

[GitHub] [spark] dongjoon-hyun commented on pull request #35561: [MINOR][DOCS] Fixed closing tags in running-on-kubernetes.md

2022-06-07 Thread GitBox
dongjoon-hyun commented on PR #35561: URL: https://github.com/apache/spark/pull/35561#issuecomment-1149066369 Thank YOU for your contribution. :)

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #36787: [SPARK-39387][BUILD][FOLLOWUP] Upgrade hive-storage-api to 2.7.3

2022-06-07 Thread GitBox
dongjoon-hyun commented on code in PR #36787: URL: https://github.com/apache/spark/pull/36787#discussion_r891607243 ## sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcQuerySuite.scala: ## @@ -832,6 +832,18 @@ abstract class OrcQuerySuite extends

[GitHub] [spark] zr-msft commented on pull request #35561: [MINOR][DOCS] Fixed closing tags in running-on-kubernetes.md

2022-06-07 Thread GitBox
zr-msft commented on PR #35561: URL: https://github.com/apache/spark/pull/35561#issuecomment-1149052032 thank you @dongjoon-hyun, I made the assumption that the live site was updated after a PR was merged. I see my updates on the rc5 docs and thank you for the quick response

[GitHub] [spark] srielau commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
srielau commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891606582 ## sql/catalyst/src/main/scala/org/apache/spark/sql/AnalysisException.scala: ## @@ -36,13 +36,31 @@ class AnalysisException protected[sql] ( @transient val plan:

[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #36787: [SPARK-39387][BUILD][FOLLOWUP] Upgrade hive-storage-api to 2.7.3

2022-06-07 Thread GitBox
dongjoon-hyun commented on code in PR #36787: URL: https://github.com/apache/spark/pull/36787#discussion_r891605317 ## sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcQuerySuite.scala: ## @@ -832,6 +832,18 @@ abstract class OrcQuerySuite extends

[GitHub] [spark] Eugene-Mark commented on a diff in pull request #36499: [SPARK-38846][SQL] Add explicit data mapping between Teradata Numeric Type and Spark DecimalType

2022-06-07 Thread GitBox
Eugene-Mark commented on code in PR #36499: URL: https://github.com/apache/spark/pull/36499#discussion_r891588142 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala: ## @@ -96,4 +97,29 @@ private case object TeradataDialect extends JdbcDialect {

[GitHub] [spark] xinrong-databricks commented on a diff in pull request #36793: [SPARK-39406][PYTHON] Accept NumPy array in createDataFrame

2022-06-07 Thread GitBox
xinrong-databricks commented on code in PR #36793: URL: https://github.com/apache/spark/pull/36793#discussion_r891581163 ## python/pyspark/sql/session.py: ## @@ -952,12 +953,30 @@ def createDataFrame( # type: ignore[misc] schema = [x.encode("utf-8") if not

[GitHub] [spark] xinrong-databricks opened a new pull request, #36793: [SPARK-39406][PYTHON] Accept NumPy array in createDataFrame

2022-06-07 Thread GitBox
xinrong-databricks opened a new pull request, #36793: URL: https://github.com/apache/spark/pull/36793 ### What changes were proposed in this pull request? Accept numpy array in createDataFrame, with existing dtypes support. ### Why are the changes needed? As part of

[GitHub] [spark] amaliujia commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
amaliujia commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891563533 ## sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala: ## @@ -117,14 +128,44 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {

[GitHub] [spark] LuciferYang closed pull request #36694: [MINOR][BUILD] Remove redundant maven `` definition

2022-06-07 Thread GitBox
LuciferYang closed pull request #36694: [MINOR][BUILD] Remove redundant maven `` definition URL: https://github.com/apache/spark/pull/36694

[GitHub] [spark] vli-databricks commented on pull request #36792: [SPARK-39392][SQL][3.3] Refine ANSI error messages for try_* function hints

2022-06-07 Thread GitBox
vli-databricks commented on PR #36792: URL: https://github.com/apache/spark/pull/36792#issuecomment-1149003141 @MaxGekk

[GitHub] [spark] vli-databricks opened a new pull request, #36792: [SPARK-39392][SQL][3.3] Refine ANSI error messages for try_* function hints

2022-06-07 Thread GitBox
vli-databricks opened a new pull request, #36792: URL: https://github.com/apache/spark/pull/36792 Refine ANSI error messages and remove 'To return NULL instead' ### What changes were proposed in this pull request? ### Why are the changes needed? Improve

[GitHub] [spark] jerrypeng commented on a diff in pull request #36737: [SPARK-39347] [SS] Generate wrong time window when (timestamp-startTime) % slideDuration…

2022-06-07 Thread GitBox
jerrypeng commented on code in PR #36737: URL: https://github.com/apache/spark/pull/36737#discussion_r891557370 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3963,8 +3966,10 @@ object TimeWindowing extends Rule[LogicalPlan] {

[GitHub] [spark] amaliujia commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
amaliujia commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891537461 ## sql/core/src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala: ## @@ -553,4 +570,100 @@ class CatalogSuite extends SharedSparkSession with AnalysisTest

[GitHub] [spark] amaliujia commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
amaliujia commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891534128 ## sql/core/src/main/scala/org/apache/spark/sql/catalog/interface.scala: ## @@ -64,15 +65,34 @@ class Database( @Stable class Table( val name: String, -

[GitHub] [spark] jerrypeng commented on a diff in pull request #36737: [SPARK-39347] [SS] Generate wrong time window when (timestamp-startTime) % slideDuration…

2022-06-07 Thread GitBox
jerrypeng commented on code in PR #36737: URL: https://github.com/apache/spark/pull/36737#discussion_r891532480 ## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala: ## @@ -3963,8 +3966,10 @@ object TimeWindowing extends Rule[LogicalPlan] {

[GitHub] [spark] Eugene-Mark commented on a diff in pull request #36499: [SPARK-38846][SQL] Add explicit data mapping between Teradata Numeric Type and Spark DecimalType

2022-06-07 Thread GitBox
Eugene-Mark commented on code in PR #36499: URL: https://github.com/apache/spark/pull/36499#discussion_r891528661 ## sql/core/src/main/scala/org/apache/spark/sql/jdbc/TeradataDialect.scala: ## @@ -96,4 +97,29 @@ private case object TeradataDialect extends JdbcDialect {

[GitHub] [spark] amaliujia commented on a diff in pull request #36586: [SPARK-39236][SQL] Make CreateTable and ListTables be compatible with 3 layer namespace

2022-06-07 Thread GitBox
amaliujia commented on code in PR #36586: URL: https://github.com/apache/spark/pull/36586#discussion_r891497676 ## sql/core/src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala: ## @@ -553,4 +570,100 @@ class CatalogSuite extends SharedSparkSession with AnalysisTest

[GitHub] [spark] vli-databricks closed pull request #36791: [SPARK-39392][SQL][3.3] Refine ANSI error messages for try_* function…

2022-06-07 Thread GitBox
vli-databricks closed pull request #36791: [SPARK-39392][SQL][3.3] Refine ANSI error messages for try_* function… URL: https://github.com/apache/spark/pull/36791

[GitHub] [spark] vli-databricks opened a new pull request, #36791: [SPARK-39392][SQL][3.3] Refine ANSI error messages for try_* function…

2022-06-07 Thread GitBox
vli-databricks opened a new pull request, #36791: URL: https://github.com/apache/spark/pull/36791 … hints ### What changes were proposed in this pull request? Refine ANSI error messages and remove 'To return NULL instead' ### Why are the changes needed?

[GitHub] [spark] dongjoon-hyun commented on pull request #35561: [MINOR][DOCS] Fixed closing tags in running-on-kubernetes.md

2022-06-07 Thread GitBox
dongjoon-hyun commented on PR #35561: URL: https://github.com/apache/spark/pull/35561#issuecomment-1148929814 Hi, @zr-msft . Did you check Apache Spark 3.3 RC5 document? It should be there. - https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc5-docs/_site/index.html For

[GitHub] [spark] zr-msft commented on pull request #35561: [MINOR][DOCS] Fixed closing tags in running-on-kubernetes.md

2022-06-07 Thread GitBox
zr-msft commented on PR #35561: URL: https://github.com/apache/spark/pull/35561#issuecomment-1148923071 @dongjoon-hyun I've periodically checked the docs site and I'm not seeing any changes show up based on commits I've added from this PR: *

[GitHub] [spark] pan3793 commented on pull request #36784: [SPARK-39396][SQL] Fix LDAP login exception 'error code 49 - invalid credentials'

2022-06-07 Thread GitBox
pan3793 commented on PR #36784: URL: https://github.com/apache/spark/pull/36784#issuecomment-1148900904 Thanks for pinging me. I think the current `LdapAuthenticationProviderImpl` comes from a very old version of Hive w/o UT, so the existing UT cannot cover your change. The

[GitHub] [spark] dtenedor commented on a diff in pull request #36745: [SPARK-39359][SQL] Restrict DEFAULT columns to allowlist of supported data source types

2022-06-07 Thread GitBox
dtenedor commented on code in PR #36745: URL: https://github.com/apache/spark/pull/36745#discussion_r891452022 ## sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala: ## @@ -2881,6 +2881,15 @@ object SQLConf { .booleanConf

[GitHub] [spark] wangyum opened a new pull request, #36790: [SPARK-39402][SQL] Optimize ReplaceCTERefWithRepartition to support coalesce partitions

2022-06-07 Thread GitBox
wangyum opened a new pull request, #36790: URL: https://github.com/apache/spark/pull/36790 ### What changes were proposed in this pull request? Optimize `ReplaceCTERefWithRepartition` to support coalesce partitions. For example: Before this PR | After this PR -- | --

[GitHub] [spark] pan3793 opened a new pull request, #36789: [SPARK-39403] Add SPARK_SUBMIT_OPTS in spark-env.sh.template

2022-06-07 Thread GitBox
pan3793 opened a new pull request, #36789: URL: https://github.com/apache/spark/pull/36789 ### What changes were proposed in this pull request? Add SPARK_SUBMIT_OPTS in spark-env.sh.template ### Why are the changes needed? Spark supports using SPARK_SUBMIT_OPTS to

[GitHub] [spark] srielau commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
srielau commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891425678 ## core/src/main/java/org/apache/spark/memory/SparkOutOfMemoryError.java: ## @@ -39,11 +39,17 @@ public SparkOutOfMemoryError(OutOfMemoryError e) { } public

[GitHub] [spark] srielau commented on a diff in pull request #36693: [SPARK-39349] Add a centralized CheckError method for QA of error path

2022-06-07 Thread GitBox
srielau commented on code in PR #36693: URL: https://github.com/apache/spark/pull/36693#discussion_r891425369 ## core/src/test/scala/org/apache/spark/SparkFunSuite.scala: ## @@ -264,6 +264,87 @@ abstract class SparkFunSuite } } + /** + * Checks an exception with an
