[GitHub] [spark] SparkQA commented on pull request #31480: [SPARK-32384][CORE] repartitionAndSortWithinPartitions avoid shuffle with same partitioner

2021-02-05 Thread GitBox
SparkQA commented on pull request #31480: URL: https://github.com/apache/spark/pull/31480#issuecomment-773777161 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] AngersZhuuuu edited a comment on pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform

2021-02-05 Thread GitBox
AngersZhuuuu edited a comment on pull request #30957: URL: https://github.com/apache/spark/pull/30957#issuecomment-773141579 Gentle ping @maropu @cloud-fan Sorry for my late reply. I have now changed it to use the JSON format, and it works well. Since there are two functions `StructToJson` and

[GitHub] [spark] dongjoon-hyun commented on pull request #31133: [SPARK-26836][SQL] Supporting Avro schema evolution for partitioned Hive tables

2021-02-05 Thread GitBox
dongjoon-hyun commented on pull request #31133: URL: https://github.com/apache/spark/pull/31133#issuecomment-773564889 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] Ngone51 commented on pull request #30650: [SPARK-24818][CORE] Support delay scheduling for barrier execution

2021-02-05 Thread GitBox
Ngone51 commented on pull request #30650: URL: https://github.com/apache/spark/pull/30650#issuecomment-773860013 That sounds great to me. I've addressed it in 555b3a8 with a unit test. Please take another look. BTW, do you think we need a conf to be able to restore the original

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31482: [SPARK-34346][CORE][SQL][3.1] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31482: URL: https://github.com/apache/spark/pull/31482#issuecomment-773839228 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] jnh5y commented on pull request #31461: [SPARK-7768][CORE][SQL] Open UserDefinedType as a Developer API

2021-02-05 Thread GitBox
jnh5y commented on pull request #31461: URL: https://github.com/apache/spark/pull/31461#issuecomment-773355284 Hi all, LocationTech GeoMesa is a consumer of the previously package-private UDT/UDF APIs in order to implement spatial data types (like Point, LineString, Polygon) and associated

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31448: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`.

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31448: URL: https://github.com/apache/spark/pull/31448#issuecomment-773111773 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] HyukjinKwon closed pull request #31460: [SPARK-34346][CORE][SQL] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cause perf regres

2021-02-05 Thread GitBox
HyukjinKwon closed pull request #31460: URL: https://github.com/apache/spark/pull/31460 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] razajafri commented on pull request #31284: [SPARK-34167][SQL]Reading parquet with IntDecimal written as a LongDecimal blows up

2021-02-05 Thread GitBox
razajafri commented on pull request #31284: URL: https://github.com/apache/spark/pull/31284#issuecomment-773471709 > Do we need to change this part? > >

[GitHub] [spark] SparkQA removed a comment on pull request #31475: [SPARK-34360][SQL] Support table truncation by v2 Table Catalogs

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31475: URL: https://github.com/apache/spark/pull/31475#issuecomment-773379588 **[Test build #134876 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134876/testReport)** for PR 31475 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #31481: [SQL][MINOR][TEST][3.1] Re-enable some DS v2 char/varchar test

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31481: URL: https://github.com/apache/spark/pull/31481#issuecomment-773796124 **[Test build #134909 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134909/testReport)** for PR 31481 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #31462: [SPARK-34347][SQL] CatalogImpl.uncacheTable should invalidate in cascade for temp views

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31462: URL: https://github.com/apache/spark/pull/31462#issuecomment-773065658 **[Test build #134858 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134858/testReport)** for PR 31462 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #31476: [SPARK-34366][SQL] Add interface for DS v2 metrics

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31476: URL: https://github.com/apache/spark/pull/31476#issuecomment-773632690 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31475: [SPARK-34360][SQL] Support table truncation by v2 Table Catalogs

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31475: URL: https://github.com/apache/spark/pull/31475#issuecomment-773460642 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31474: [SPARK-34359][SQL] add a legacy config to restore the output schema of SHOW DATABASES

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31474: URL: https://github.com/apache/spark/pull/31474#issuecomment-773460645 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] cloud-fan commented on pull request #31481: [MINOR][REST] Re-enable some DS v2 char/varchar test

2021-02-05 Thread GitBox
cloud-fan commented on pull request #31481: URL: https://github.com/apache/spark/pull/31481#issuecomment-773790463 @yaooqinn @maropu This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] Ngone51 commented on pull request #31348: [SPARK-34245][CORE] Ensure Master removes executors that failed to send finished state

2021-02-05 Thread GitBox
Ngone51 commented on pull request #31348: URL: https://github.com/apache/spark/pull/31348#issuecomment-773878666 cc @cloud-fan @jiangxb1987 This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] HyukjinKwon commented on a change in pull request #31474: [SPARK-34359][SQL] Add a legacy config to restore the output schema of SHOW DATABASES

2021-02-05 Thread GitBox
HyukjinKwon commented on a change in pull request #31474: URL: https://github.com/apache/spark/pull/31474#discussion_r570649690 ## File path: docs/sql-migration-guide.md ## @@ -105,6 +105,8 @@ license: | - In Spark 3.0.2, `PARTITION(col=null)` is always parsed as a null

[GitHub] [spark] beliefer commented on a change in pull request #31466: [SPARK-34352][SQL] Improve SQLQueryTestSuite so as could run on windows system

2021-02-05 Thread GitBox
beliefer commented on a change in pull request #31466: URL: https://github.com/apache/spark/pull/31466#discussion_r570699101 ## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ## @@ -260,9 +260,6 @@ class SQLQueryTestSuite extends QueryTest with

[GitHub] [spark] beliefer commented on a change in pull request #31245: [SPARK-34157][SQL] Unify output of SHOW TABLES and pass output attributes properly

2021-02-05 Thread GitBox
beliefer commented on a change in pull request #31245: URL: https://github.com/apache/spark/pull/31245#discussion_r570791022 ## File path: docs/sql-migration-guide.md ## @@ -40,6 +40,10 @@ license: | - In Spark 3.2, script transform default FIELD DELIMIT is `\u0001` for no

[GitHub] [spark] maropu commented on a change in pull request #31466: [SPARK-34352][SQL] Improve SQLQueryTestSuite so as could run on windows system

2021-02-05 Thread GitBox
maropu commented on a change in pull request #31466: URL: https://github.com/apache/spark/pull/31466#discussion_r570644961 ## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ## @@ -566,7 +563,14 @@ class SQLQueryTestSuite extends QueryTest with

[GitHub] [spark] yaooqinn commented on pull request #31481: [MINOR][TEST] Re-enable some DS v2 char/varchar test

2021-02-05 Thread GitBox
yaooqinn commented on pull request #31481: URL: https://github.com/apache/spark/pull/31481#issuecomment-773798240 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] HeartSaVioR commented on pull request #31471: [SPARK-34355][SQL] Add log and time cost for commit job

2021-02-05 Thread GitBox
HeartSaVioR commented on pull request #31471: URL: https://github.com/apache/spark/pull/31471#issuecomment-773744959 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] dongjoon-hyun commented on pull request #31482: [SPARK-34346][CORE][SQL][3.1] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may caus

2021-02-05 Thread GitBox
dongjoon-hyun commented on pull request #31482: URL: https://github.com/apache/spark/pull/31482#issuecomment-773823111 Thank you for making a backport, @yaooqinn . This is an automated message from the Apache Git Service. To

[GitHub] [spark] SparkQA commented on pull request #31437: [SPARK-34329][YARN] When hit ApplicationAttemptNotFoundException, we can't just stop app for all case

2021-02-05 Thread GitBox
SparkQA commented on pull request #31437: URL: https://github.com/apache/spark/pull/31437#issuecomment-773393996 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31472: [SPARK-34356][ML] OVR transform fix potential column conflict

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31472: URL: https://github.com/apache/spark/pull/31472#issuecomment-773287504 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31384: [SPARK-31816][SQL][DOCS] Added high level description about JDBC connection providers for users/developers

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31384: URL: https://github.com/apache/spark/pull/31384#issuecomment-773460643 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] Ngone51 commented on a change in pull request #31348: [SPARK-34245][CORE] Ensure Master removes executors that failed to send finished state

2021-02-05 Thread GitBox
Ngone51 commented on a change in pull request #31348: URL: https://github.com/apache/spark/pull/31348#discussion_r570351605 ## File path: core/src/main/scala/org/apache/spark/deploy/DeployMessage.scala ## @@ -200,6 +200,8 @@ private[deploy] object DeployMessages { case

[GitHub] [spark] SparkQA commented on pull request #31484: [SPARK-34374][SQL][DSTREAM] Use standard methods to extract keys or values from a Map

2021-02-05 Thread GitBox
SparkQA commented on pull request #31484: URL: https://github.com/apache/spark/pull/31484#issuecomment-773846745 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] maropu commented on pull request #31456: [SPARK-34343][SQL][TESTS] Add missing test for some non-array types in PostgreSQL

2021-02-05 Thread GitBox
maropu commented on pull request #31456: URL: https://github.com/apache/spark/pull/31456#issuecomment-773798533 Thanks! Merged to master. This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] SparkQA removed a comment on pull request #31445: [SPARK-34334][K8S] Correctly identify timed out pending pod requests as excess request

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31445: URL: https://github.com/apache/spark/pull/31445#issuecomment-773404117 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #31485: [SPARK-SQL][34137] Update suquery's stats when build LogicalPlan's stats

2021-02-05 Thread GitBox
SparkQA commented on pull request #31485: URL: https://github.com/apache/spark/pull/31485#issuecomment-773875429 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] SparkQA removed a comment on pull request #28885: [SPARK-29375][SPARK-28940][SPARK-32041][SQL] Whole plan exchange and subquery reuse

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #28885: URL: https://github.com/apache/spark/pull/28885#issuecomment-773464818 **[Test build #134885 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134885/testReport)** for PR 28885 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #31133: [SPARK-26836][SQL] Supporting Avro schema evolution for partitioned Hive tables with "avro.schema.literal"

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31133: URL: https://github.com/apache/spark/pull/31133#issuecomment-773582316 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #31348: [SPARK-34245][CORE] Ensure Master removes executors that failed to send finished state

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31348: URL: https://github.com/apache/spark/pull/31348#issuecomment-773794502 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] viirya commented on a change in pull request #31440: [SPARK-34331][SQL] Speed up DS v2 metadata col resolution

2021-02-05 Thread GitBox
viirya commented on a change in pull request #31440: URL: https://github.com/apache/spark/pull/31440#discussion_r570697705 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -964,11 +964,37 @@ class Analyzer(override val

[GitHub] [spark] zhengruifeng commented on pull request #31480: [SPARK-32384][CORE] repartitionAndSortWithinPartitions avoid shuffle with same partitioner

2021-02-05 Thread GitBox
zhengruifeng commented on pull request #31480: URL: https://github.com/apache/spark/pull/31480#issuecomment-773752136 this previous pr https://github.com/apache/spark/pull/29185 is stale, so I open this one. testCode: ``` import org.apache.spark.HashPartitioner val

[GitHub] [spark] AmplabJenkins commented on pull request #31133: [SPARK-26836][SQL] Supporting Avro schema evolution for partitioned Hive tables with "avro.schema.literal"

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31133: URL: https://github.com/apache/spark/pull/31133#issuecomment-773608335 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA removed a comment on pull request #31474: [SPARK-34359][SQL] add a legacy config to restore the output schema of SHOW DATABASES

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31474: URL: https://github.com/apache/spark/pull/31474#issuecomment-773388356 **[Test build #134877 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134877/testReport)** for PR 31474 at commit

[GitHub] [spark] cloud-fan commented on pull request #31478: [SPARK-34371][SQL][TESTS] Run the datetime rebasing tests for Parquet datasource v1 and v2

2021-02-05 Thread GitBox
cloud-fan commented on pull request #31478: URL: https://github.com/apache/spark/pull/31478#issuecomment-773842135 thanks, merging to master! This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] maropu commented on pull request #31477: [SPARK-34369][SQL][WEBUI] Track number of pairs processed out of Join.

2021-02-05 Thread GitBox
maropu commented on pull request #31477: URL: https://github.com/apache/spark/pull/31477#issuecomment-773681938 ok to test This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] jiangxb1987 commented on a change in pull request #31348: [SPARK-34245][CORE] Ensure Master removes executors that failed to send finished state

2021-02-05 Thread GitBox
jiangxb1987 commented on a change in pull request #31348: URL: https://github.com/apache/spark/pull/31348#discussion_r570008104 ## File path: core/src/main/scala/org/apache/spark/deploy/client/StandaloneAppClient.scala ## @@ -321,4 +321,15 @@ private[spark] class

[GitHub] [spark] SparkQA removed a comment on pull request #31437: [SPARK-34329][YARN] When hit ApplicationAttemptNotFoundException, we can't just stop app for all case

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31437: URL: https://github.com/apache/spark/pull/31437#issuecomment-773393996 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #31284: [SPARK-34167][SQL]Reading parquet with IntDecimal written as a LongDecimal blows up

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31284: URL: https://github.com/apache/spark/pull/31284#issuecomment-773632692 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #30957: URL: https://github.com/apache/spark/pull/30957#issuecomment-773222390 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] cloud-fan commented on pull request #31486: [SPARK-34359][SQL][3.1] Add a legacy config to restore the output schema of SHOW DATABASES

2021-02-05 Thread GitBox
cloud-fan commented on pull request #31486: URL: https://github.com/apache/spark/pull/31486#issuecomment-773870656 @HyukjinKwon @dongjoon-hyun @maropu This is an automated message from the Apache Git Service. To respond to

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #31474: [SPARK-34359][SQL] add a legacy config to restore the output schema of SHOW DATABASES

2021-02-05 Thread GitBox
AngersZhuuuu commented on a change in pull request #31474: URL: https://github.com/apache/spark/pull/31474#discussion_r570299776 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowNamespacesSuite.scala ## @@ -38,6 +38,12 @@ trait

[GitHub] [spark] AngersZhuuuu edited a comment on pull request #31437: [SPARK-34329][YARN] When hit ApplicationAttemptNotFoundException, we can't just stop app for all case

2021-02-05 Thread GitBox
AngersZhuuuu edited a comment on pull request #31437: URL: https://github.com/apache/spark/pull/31437#issuecomment-773369627 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #31258: [SPARK-34168] [SQL] Support DPP in AQE when the join is Broadcast hash join at the beginning

2021-02-05 Thread GitBox
SparkQA commented on pull request #31258: URL: https://github.com/apache/spark/pull/31258#issuecomment-773092921 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] SparkQA commented on pull request #31466: [WIP][SPARK-34352][SQL] Improve SQLQueryTestSuite so as could run on windows system

2021-02-05 Thread GitBox
SparkQA commented on pull request #31466: URL: https://github.com/apache/spark/pull/31466#issuecomment-773122521 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] AmplabJenkins commented on pull request #31455: [SPARK-34342][SQL] Format DateLiteral and TimestampLiteral toString

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31455: URL: https://github.com/apache/spark/pull/31455#issuecomment-773393046 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #31479: [SPARK-34373][SQL] HiveThriftServer2 startWithContext may hang with a race issue

2021-02-05 Thread GitBox
SparkQA commented on pull request #31479: URL: https://github.com/apache/spark/pull/31479#issuecomment-773776064 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform

2021-02-05 Thread GitBox
AngersZhuuuu commented on a change in pull request #30957: URL: https://github.com/apache/spark/pull/30957#discussion_r570798783 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/BaseScriptTransformationSuite.scala ## @@ -471,6 +473,126 @@ abstract class

[GitHub] [spark] cloud-fan commented on a change in pull request #31440: [SPARK-34331][SQL] Speed up DS v2 metadata col resolution

2021-02-05 Thread GitBox
cloud-fan commented on a change in pull request #31440: URL: https://github.com/apache/spark/pull/31440#discussion_r570797933 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ## @@ -964,11 +964,37 @@ class Analyzer(override val

[GitHub] [spark] SparkQA commented on pull request #31481: [MINOR][TEST] Re-enable some DS v2 char/varchar test

2021-02-05 Thread GitBox
SparkQA commented on pull request #31481: URL: https://github.com/apache/spark/pull/31481#issuecomment-773796124 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] maropu commented on pull request #31483: [SPARK-33434][PYTHON][DOCS] Added RuntimeConfig to PySpark docs

2021-02-05 Thread GitBox
maropu commented on pull request #31483: URL: https://github.com/apache/spark/pull/31483#issuecomment-773843320 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] ulysses-you commented on pull request #31471: [SPARK-34355][SQL] Add log and time cost for commit job

2021-02-05 Thread GitBox
ulysses-you commented on pull request #31471: URL: https://github.com/apache/spark/pull/31471#issuecomment-773773206 @HeartSaVioR thanks for sharing. Yes, agreed. It would be better to provide some kind of progress indication during the commit job, but it seems hard to do this on the Spark side.

[GitHub] [spark] AmplabJenkins commented on pull request #31204: [SPARK-26399][WEBUI][CORE] Add new stage-level REST APIs and parameters

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31204: URL: https://github.com/apache/spark/pull/31204#issuecomment-773839233 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] cloud-fan commented on pull request #31474: [SPARK-34359][SQL] add a legacy config to restore the output schema of SHOW DATABASES

2021-02-05 Thread GitBox
cloud-fan commented on pull request #31474: URL: https://github.com/apache/spark/pull/31474#issuecomment-773350647 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub

[GitHub] [spark] AmplabJenkins commented on pull request #31445: [SPARK-34334][K8S] Correctly identify timed out pending pod requests as excess request

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31445: URL: https://github.com/apache/spark/pull/31445#issuecomment-773417127 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] cloud-fan closed pull request #31474: [SPARK-34359][SQL] Add a legacy config to restore the output schema of SHOW DATABASES

2021-02-05 Thread GitBox
cloud-fan closed pull request #31474: URL: https://github.com/apache/spark/pull/31474 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31468: URL: https://github.com/apache/spark/pull/31468#issuecomment-773222391 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #31464: [SPARK-34339][CORE][SQL] Expose the number of total paths in Utils.buildLocationMetadata()

2021-02-05 Thread GitBox
SparkQA commented on pull request #31464: URL: https://github.com/apache/spark/pull/31464#issuecomment-773079984 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] gaborgsomogyi commented on pull request #31384: [SPARK-31816][SQL][DOCS] Added high level description about JDBC connection providers for users/developers

2021-02-05 Thread GitBox
gaborgsomogyi commented on pull request #31384: URL: https://github.com/apache/spark/pull/31384#issuecomment-773353352 @HeartSaVioR thanks for having a look and making it better! This is an automated message from the Apache

[GitHub] [spark] HyukjinKwon commented on pull request #31467: [SPARK-33212][FOLLOW-UP][BUILD] Uses provided properties for Hadoop client dependencies in root pom

2021-02-05 Thread GitBox
HyukjinKwon commented on pull request #31467: URL: https://github.com/apache/spark/pull/31467#issuecomment-773128556 Merged to master. This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] AmplabJenkins commented on pull request #31466: [SPARK-34352][SQL] Improve SQLQueryTestSuite so as could run on windows system

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31466: URL: https://github.com/apache/spark/pull/31466#issuecomment-773146059 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #31133: [SPARK-26836][SQL] Supporting Avro schema evolution for partitioned Hive tables

2021-02-05 Thread GitBox
dongjoon-hyun edited a comment on pull request #31133: URL: https://github.com/apache/spark/pull/31133#issuecomment-773567152 Thank you for updating, @attilapiros . But, I'm still not sure about the other properties in those namespace. For `avro.schema.literal`, your test case provides

[GitHub] [spark] maropu commented on a change in pull request #31477: [SPARK-34369][SQL][WEBUI] Track number of pairs processed out of Join.

2021-02-05 Thread GitBox
maropu commented on a change in pull request #31477: URL: https://github.com/apache/spark/pull/31477#discussion_r570624874 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala ## @@ -57,7 +57,8 @@ case class

[GitHub] [spark] zsxwing commented on a change in pull request #31476: [SPARK-34366][SQL] Add interface for DS v2 metrics

2021-02-05 Thread GitBox
zsxwing commented on a change in pull request #31476: URL: https://github.com/apache/spark/pull/31476#discussion_r57030 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/CustomMetric.java ## @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] maryannxue commented on pull request #30829: [SPARK-33832][SQL] Add an option in AQE to mitigate skew even if it c…

2021-02-05 Thread GitBox
maryannxue commented on pull request #30829: URL: https://github.com/apache/spark/pull/30829#issuecomment-77371 @ekoifman This check is to guard against re-planning that turns an already shuffle-materialized SMJ to BHJ while causing extra shuffles downstream. One way around this

[GitHub] [spark] SparkQA commented on pull request #29210: [SPARK-24497][SQL] Support recursive SQL query

2021-02-05 Thread GitBox
SparkQA commented on pull request #29210: URL: https://github.com/apache/spark/pull/29210#issuecomment-773464278 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] sarutak commented on pull request #31456: [SPARK-34343][SQL][TESTS] Add missing test for some non-array types in PostgreSQL

2021-02-05 Thread GitBox
sarutak commented on pull request #31456: URL: https://github.com/apache/spark/pull/31456#issuecomment-773514333 SPARK-34357 (#31473) changed the type-mapping of JDBC `time` from `Integer` to `Timestamp` so I've modified the test for `time` in this PR to follow the previous change. I

[GitHub] [spark] saikocat commented on a change in pull request #31473: [SPARK-34357][SQL] Map JDBC SQL TIME type to TimestampType with time portion fixed regardless of timezone

2021-02-05 Thread GitBox
saikocat commented on a change in pull request #31473: URL: https://github.com/apache/spark/pull/31473#discussion_r570218581 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala ## @@ -470,6 +455,27 @@ object JdbcUtils extends

[GitHub] [spark] zhengruifeng commented on a change in pull request #31394: [SPARK-34291][ML] LSH hashDistance optimization

2021-02-05 Thread GitBox
zhengruifeng commented on a change in pull request #31394: URL: https://github.com/apache/spark/pull/31394#discussion_r569996166 ## File path: mllib/src/main/scala/org/apache/spark/ml/feature/BucketedRandomProjectionLSH.scala ## @@ -97,7 +97,19 @@ class

[GitHub] [spark] sunchao commented on a change in pull request #31462: [SPARK-34347][SQL] CatalogImpl.uncacheTable should invalidate in cascade for temp views

2021-02-05 Thread GitBox
sunchao commented on a change in pull request #31462: URL: https://github.com/apache/spark/pull/31462#discussion_r570011001 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ## @@ -909,12 +909,12 @@ class SessionCatalog(

[GitHub] [spark] AmplabJenkins commented on pull request #31481: [SQL][MINOR][TEST][3.1] Re-enable some DS v2 char/varchar test

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31481: URL: https://github.com/apache/spark/pull/31481#issuecomment-773813067 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA removed a comment on pull request #31348: [SPARK-34245][CORE] Ensure Master removes executors that failed to send finished state

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31348: URL: https://github.com/apache/spark/pull/31348#issuecomment-773753068 **[Test build #134902 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134902/testReport)** for PR 31348 at commit

[GitHub] [spark] sarutak edited a comment on pull request #31473: [SPARK-34357] Revert JDBC SQL TIME type to TimestampType with time portion fixed regardless of timezone

2021-02-05 Thread GitBox
sarutak edited a comment on pull request #31473: URL: https://github.com/apache/spark/pull/31473#issuecomment-773268380 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] MaxGekk commented on a change in pull request #31460: [SPARK-34346][CORE][SQL] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cau

2021-02-05 Thread GitBox
MaxGekk commented on a change in pull request #31460: URL: https://github.com/apache/spark/pull/31460#discussion_r570032069 ## File path: core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala ## @@ -450,13 +450,22 @@ private[spark] object SparkHadoopUtil {

[GitHub] [spark] SparkQA commented on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-05 Thread GitBox
SparkQA commented on pull request #31468: URL: https://github.com/apache/spark/pull/31468#issuecomment-773118983 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] viirya commented on a change in pull request #31451: [WIP][SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-02-05 Thread GitBox
viirya commented on a change in pull request #31451: URL: https://github.com/apache/spark/pull/31451#discussion_r570444688 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExecBase.scala ## @@ -32,8 +32,13 @@ import

[GitHub] [spark] wangyum commented on a change in pull request #31455: [SPARK-34342][SQL] Format DateLiteral and TimestampLiteral toString

2021-02-05 Thread GitBox
wangyum commented on a change in pull request #31455: URL: https://github.com/apache/spark/pull/31455#discussion_r570665361 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala ## @@ -298,11 +298,21 @@ case class Literal (value:

[GitHub] [spark] SparkQA removed a comment on pull request #31470: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31470: URL: https://github.com/apache/spark/pull/31470#issuecomment-773155684 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins commented on pull request #31474: [SPARK-34359][SQL] add a legacy config to restore the output schema of SHOW DATABASES

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31474: URL: https://github.com/apache/spark/pull/31474#issuecomment-773460645 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #31204: [SPARK-26399][WEBUI][CORE] Add new stage-level REST APIs and parameters

2021-02-05 Thread GitBox
SparkQA commented on pull request #31204: URL: https://github.com/apache/spark/pull/31204#issuecomment-773796762 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] maropu commented on a change in pull request #31349: [SPARK-34246][SQL] New type coercion syntax rules in ANSI mode

2021-02-05 Thread GitBox
maropu commented on a change in pull request #31349: URL: https://github.com/apache/spark/pull/31349#discussion_r570661910 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercion.scala ## @@ -0,0 +1,316 @@ +/* + * Licensed to the

[GitHub] [spark] sarutak commented on pull request #31473: [SPARK-34357] Revert JDBC SQL TIME type to TimestampType with time portion fixed regardless of timezone

2021-02-05 Thread GitBox
sarutak commented on pull request #31473: URL: https://github.com/apache/spark/pull/31473#issuecomment-773264767 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] SparkQA commented on pull request #31455: [SPARK-34342][SQL] Format DateLiteral and TimestampLiteral toString

2021-02-05 Thread GitBox
SparkQA commented on pull request #31455: URL: https://github.com/apache/spark/pull/31455#issuecomment-773307449 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] tgravescs commented on pull request #30650: [SPARK-24818][CORE] Support delay scheduling for barrier execution

2021-02-05 Thread GitBox
tgravescs commented on pull request #30650: URL: https://github.com/apache/spark/pull/30650#issuecomment-773575406 I don't have a strong opinion, seems like a good alternative. We generally recommend people run with locality wait=0 most of the time anyway because you can see very bad

[GitHub] [spark] zhengruifeng commented on pull request #31469: [MINOR][ML] Param Validation should throw IllegalArgumentException

2021-02-05 Thread GitBox
zhengruifeng commented on pull request #31469: URL: https://github.com/apache/spark/pull/31469#issuecomment-773133202 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31204: [SPARK-26399][WEBUI][CORE] Add new stage-level REST APIs and parameters

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31204: URL: https://github.com/apache/spark/pull/31204#issuecomment-773839233 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #31472: [SPARK-34356][ML] OVR transform fix potential column conflict

2021-02-05 Thread GitBox
SparkQA commented on pull request #31472: URL: https://github.com/apache/spark/pull/31472#issuecomment-773239575 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] AngersZhuuuu commented on pull request #31437: [SPARK-34329][YARN] When hit ApplicationAttemptNotFoundException, we can't just stop app for all case

2021-02-05 Thread GitBox
AngersZhuuuu commented on pull request #31437: URL: https://github.com/apache/spark/pull/31437#issuecomment-773369627 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30869: [SPARK-33865][SQL] When HiveDDL, we need check avro schema too

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #30869: URL: https://github.com/apache/spark/pull/30869#issuecomment-773901509 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134927/

[GitHub] [spark] SparkQA commented on pull request #31463: [PYTHON][MINOR] Fix docstring of DataFrame.join

2021-02-05 Thread GitBox
SparkQA commented on pull request #31463: URL: https://github.com/apache/spark/pull/31463#issuecomment-773743657 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] github-actions[bot] commented on pull request #30158: [SPARK-33249][CORE][UI] Add status plugin for live application

2021-02-05 Thread GitBox
github-actions[bot] commented on pull request #30158: URL: https://github.com/apache/spark/pull/30158#issuecomment-773702450 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue

[GitHub] [spark] SparkQA commented on pull request #31460: [SPARK-34346][CORE][SQL] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cause perf regr

2021-02-05 Thread GitBox
SparkQA commented on pull request #31460: URL: https://github.com/apache/spark/pull/31460#issuecomment-773084538 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] ulysses-you commented on a change in pull request #31471: [SPARK-34355][SQL] Add log and time cost for commit job

2021-02-05 Thread GitBox
ulysses-you commented on a change in pull request #31471: URL: https://github.com/apache/spark/pull/31471#discussion_r570209231 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala ## @@ -217,8 +217,11 @@ object
