[GitHub] [spark] AmplabJenkins commented on pull request #29210: [SPARK-24497][SQL] Support recursive SQL query

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #29210: URL: https://github.com/apache/spark/pull/29210#issuecomment-773499544 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] cloud-fan closed pull request #31473: [SPARK-34357][SQL] Map JDBC SQL TIME type to TimestampType with time portion fixed regardless of timezone

2021-02-05 Thread GitBox
cloud-fan closed pull request #31473: URL: https://github.com/apache/spark/pull/31473

[GitHub] [spark] SparkQA commented on pull request #31133: [SPARK-26836][SQL] Supporting Avro schema evolution for partitioned Hive tables

2021-02-05 Thread GitBox
SparkQA commented on pull request #31133: URL: https://github.com/apache/spark/pull/31133#issuecomment-773582316

[GitHub] [spark] sunchao commented on a change in pull request #31476: [SPARK-34366][SQL] Add interface for DS v2 metrics

2021-02-05 Thread GitBox
sunchao commented on a change in pull request #31476: URL: https://github.com/apache/spark/pull/31476#discussion_r570505479 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/CustomMetric.java ## @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache

[GitHub] [spark] saikocat commented on pull request #31473: [SPARK-34357] Revert JDBC SQL TIME type to TimestampType with time portion fixed regardless of timezone

2021-02-05 Thread GitBox
saikocat commented on pull request #31473: URL: https://github.com/apache/spark/pull/31473#issuecomment-773235806

[GitHub] [spark] beliefer commented on pull request #31448: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`.

2021-02-05 Thread GitBox
beliefer commented on pull request #31448: URL: https://github.com/apache/spark/pull/31448#issuecomment-773760880 cc @cloud-fan

[GitHub] [spark] viirya commented on pull request #31451: [WIP][SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-02-05 Thread GitBox
viirya commented on pull request #31451: URL: https://github.com/apache/spark/pull/31451#issuecomment-773564010 > I think that this could use some tests. It would be easier to add tests if it were split to add the interfaces, a batch implementation (with tests), and a stream

[GitHub] [spark] SparkQA removed a comment on pull request #31258: [SPARK-34168] [SQL] Support DPP in AQE when the join is Broadcast hash join at the beginning

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31258: URL: https://github.com/apache/spark/pull/31258#issuecomment-773092921

[GitHub] [spark] zhengruifeng edited a comment on pull request #31472: [SPARK-34356][ML] OVR transform fix potential column conflict

2021-02-05 Thread GitBox
zhengruifeng edited a comment on pull request #31472: URL: https://github.com/apache/spark/pull/31472#issuecomment-773203630 in 3.0.1 and master ``` scala> val df =

[GitHub] [spark] SparkQA commented on pull request #31467: [SPARK-33212][FOLLOW-UP][BUILD] Uses provided properties for Hadoop client dependencies in root pom

2021-02-05 Thread GitBox
SparkQA commented on pull request #31467: URL: https://github.com/apache/spark/pull/31467#issuecomment-773083370

[GitHub] [spark] SparkQA removed a comment on pull request #31478: [SPARK-34371][SQL][TESTS] Run the datetime rebasing tests for Parquet datasource v1 and v2

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31478: URL: https://github.com/apache/spark/pull/31478#issuecomment-773691454 **[Test build #134894 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134894/testReport)** for PR 31478 at commit

[GitHub] [spark] maropu commented on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-05 Thread GitBox
maropu commented on pull request #31468: URL: https://github.com/apache/spark/pull/31468#issuecomment-773688646 The fix itself looks fine.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31481: [SQL][MINOR][TEST][3.1] Re-enable some DS v2 char/varchar test

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31481: URL: https://github.com/apache/spark/pull/31481#issuecomment-773813067 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39492/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31455: [SPARK-34342][SQL] Format DateLiteral and TimestampLiteral toString

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31455: URL: https://github.com/apache/spark/pull/31455#issuecomment-773393046

[GitHub] [spark] SparkQA removed a comment on pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #30957: URL: https://github.com/apache/spark/pull/30957#issuecomment-773150220

[GitHub] [spark] zhengruifeng commented on a change in pull request #31472: [SPARK-34356][ML] OVR transform fix potential column conflict

2021-02-05 Thread GitBox
zhengruifeng commented on a change in pull request #31472: URL: https://github.com/apache/spark/pull/31472#discussion_r570108063 ## File path: mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ## @@ -223,6 +223,13 @@ class OneVsRestSuite extends

[GitHub] [spark] maropu commented on a change in pull request #31455: [SPARK-34342][SQL] Format DateLiteral and TimestampLiteral toString

2021-02-05 Thread GitBox
maropu commented on a change in pull request #31455: URL: https://github.com/apache/spark/pull/31455#discussion_r570631170 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala ## @@ -298,11 +298,21 @@ case class Literal (value:

[GitHub] [spark] HyukjinKwon commented on a change in pull request #31466: [SPARK-34352][SQL] Improve SQLQueryTestSuite so as could run on windows system

2021-02-05 Thread GitBox
HyukjinKwon commented on a change in pull request #31466: URL: https://github.com/apache/spark/pull/31466#discussion_r570176457 ## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ## @@ -260,9 +260,6 @@ class SQLQueryTestSuite extends QueryTest

[GitHub] [spark] HyukjinKwon edited a comment on pull request #31460: [SPARK-34346][CORE][SQL] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may caus

2021-02-05 Thread GitBox
HyukjinKwon edited a comment on pull request #31460: URL: https://github.com/apache/spark/pull/31460#issuecomment-773711647 Merged to master.

[GitHub] [spark] zhengruifeng commented on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-05 Thread GitBox
zhengruifeng commented on pull request #31468: URL: https://github.com/apache/spark/pull/31468#issuecomment-773112265

[GitHub] [spark] SparkQA commented on pull request #30650: [SPARK-24818][CORE] Support delay scheduling for barrier execution

2021-02-05 Thread GitBox
SparkQA commented on pull request #30650: URL: https://github.com/apache/spark/pull/30650#issuecomment-773876021 **[Test build #134923 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134923/testReport)** for PR 30650 at commit

[GitHub] [spark] viirya commented on pull request #31461: [SPARK-7768][CORE][SQL] Open UserDefinedType as a Developer API

2021-02-05 Thread GitBox
viirya commented on pull request #31461: URL: https://github.com/apache/spark/pull/31461#issuecomment-773084514 > So what about #16478 @viirya ? is that still a sound proposal for refactoring the interface? That seems more ideal to add first. Hm, yea, the refactoring is trying to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31460: [SPARK-34346][CORE][SQL] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may c

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31460: URL: https://github.com/apache/spark/pull/31460#issuecomment-773087469

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform

2021-02-05 Thread GitBox
AngersZhuuuu commented on a change in pull request #30957: URL: https://github.com/apache/spark/pull/30957#discussion_r570795335 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/BaseScriptTransformationExec.scala ## @@ -47,7 +47,13 @@ trait

[GitHub] [spark] SparkQA commented on pull request #31482: [SPARK-34346][CORE][SQL] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cause perf regr

2021-02-05 Thread GitBox
SparkQA commented on pull request #31482: URL: https://github.com/apache/spark/pull/31482#issuecomment-773819012

[GitHub] [spark] cloud-fan commented on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-05 Thread GitBox
cloud-fan commented on pull request #31468: URL: https://github.com/apache/spark/pull/31468#issuecomment-773128349

[GitHub] [spark] AmplabJenkins commented on pull request #31477: [SPARK-34369][WEBUI][WIP] Track number of pairs processed out of Join.

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31477: URL: https://github.com/apache/spark/pull/31477#issuecomment-773574298

[GitHub] [spark] SparkQA commented on pull request #31476: [SPARK-34366][SQL] Add interface for DS v2 metrics

2021-02-05 Thread GitBox
SparkQA commented on pull request #31476: URL: https://github.com/apache/spark/pull/31476#issuecomment-773579295

[GitHub] [spark] zhengruifeng commented on pull request #29185: [SPARK-32384][CORE] repartitionAndSortWithinPartitions avoid shuffle with same partitioner

2021-02-05 Thread GitBox
zhengruifeng commented on pull request #29185: URL: https://github.com/apache/spark/pull/29185#issuecomment-773135060 retest this please

[GitHub] [spark] cloud-fan commented on pull request #31440: [SPARK-34331][SQL] Speed up DS v2 metadata col resolution

2021-02-05 Thread GitBox
cloud-fan commented on pull request #31440: URL: https://github.com/apache/spark/pull/31440#issuecomment-773475101

[GitHub] [spark] beliefer commented on pull request #31466: [SPARK-34352][SQL] Improve SQLQueryTestSuite so as could run on windows system

2021-02-05 Thread GitBox
beliefer commented on pull request #31466: URL: https://github.com/apache/spark/pull/31466#issuecomment-773140157 cc @cloud-fan @wangyum @maropu

[GitHub] [spark] Ngone51 commented on a change in pull request #31348: [SPARK-34245][CORE] Ensure Master removes executors that failed to send finished state

2021-02-05 Thread GitBox
Ngone51 commented on a change in pull request #31348: URL: https://github.com/apache/spark/pull/31348#discussion_r570796853 ## File path: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ## @@ -550,6 +502,55 @@ private[deploy] class Master( } else {

[GitHub] [spark] viirya commented on a change in pull request #31476: [SPARK-34366][SQL] Add interface for DS v2 metrics

2021-02-05 Thread GitBox
viirya commented on a change in pull request #31476: URL: https://github.com/apache/spark/pull/31476#discussion_r570503897 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/CustomMetric.java ## @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31478: [SPARK-34371][SQL][TESTS] Run the datetime rebasing tests for Parquet datasource v1 and v2

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31478: URL: https://github.com/apache/spark/pull/31478#issuecomment-773684374

[GitHub] [spark] SparkQA commented on pull request #31475: [SPARK-34360][SQL] Support table truncation by v2 Table Catalogs

2021-02-05 Thread GitBox
SparkQA commented on pull request #31475: URL: https://github.com/apache/spark/pull/31475#issuecomment-773379588

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31477: [SPARK-34369][SQL][WEBUI] Track number of pairs processed out of Join.

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31477: URL: https://github.com/apache/spark/pull/31477#issuecomment-773574298

[GitHub] [spark] cloud-fan commented on pull request #31460: [SPARK-34346][CORE][SQL] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cause perf re

2021-02-05 Thread GitBox
cloud-fan commented on pull request #31460: URL: https://github.com/apache/spark/pull/31460#issuecomment-773127405 Hi @dongjoon-hyun, do you have more concerns about this fix?

[GitHub] [spark] SparkQA removed a comment on pull request #31463: [PYTHON][MINOR] Fix docstring of DataFrame.join

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31463: URL: https://github.com/apache/spark/pull/31463#issuecomment-773743657 **[Test build #134900 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134900/testReport)** for PR 31463 at commit

[GitHub] [spark] srowen commented on pull request #31372: [SPARK-34272][SQL] Pretty SQL should check NonSQLExpression

2021-02-05 Thread GitBox
srowen commented on pull request #31372: URL: https://github.com/apache/spark/pull/31372#issuecomment-773362443 Fair enough, but it exists and already purports to provide a `.sql` method. You're already proposing a different/better `.sql` representation. Why wouldn't it be the natural

[GitHub] [spark] mridulm edited a comment on pull request #30650: [SPARK-24818][CORE] Support delay scheduling for barrier execution

2021-02-05 Thread GitBox
mridulm edited a comment on pull request #30650: URL: https://github.com/apache/spark/pull/30650#issuecomment-773512448

[GitHub] [spark] beliefer commented on a change in pull request #31474: [SPARK-34359][SQL] Add a legacy config to restore the output schema of SHOW DATABASES

2021-02-05 Thread GitBox
beliefer commented on a change in pull request #31474: URL: https://github.com/apache/spark/pull/31474#discussion_r570697714 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -3069,6 +3069,15 @@ object SQLConf { .booleanConf

[GitHub] [spark] SparkQA commented on pull request #31472: [SPARK-34356][ML] OVR transform fix potential column conflict

2021-02-05 Thread GitBox
SparkQA commented on pull request #31472: URL: https://github.com/apache/spark/pull/31472#issuecomment-773877317 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39501/

[GitHub] [spark] SparkQA removed a comment on pull request #31204: [SPARK-26399][WEBUI][CORE] Add new stage-level REST APIs and parameters

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31204: URL: https://github.com/apache/spark/pull/31204#issuecomment-773796762 **[Test build #134911 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134911/testReport)** for PR 31204 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31456: [SPARK-34343][SQL][TESTS] Add missing test for some non-array types in PostgreSQL

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31456: URL: https://github.com/apache/spark/pull/31456#issuecomment-773603415

[GitHub] [spark] AmplabJenkins commented on pull request #31394: [SPARK-34291][ML] LSH hashDistance optimization

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31394: URL: https://github.com/apache/spark/pull/31394#issuecomment-773155810

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31445: [SPARK-34334][K8S] Correctly identify timed out pending pod requests as excess request

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31445: URL: https://github.com/apache/spark/pull/31445#issuecomment-773417127

[GitHub] [spark] razajafri commented on a change in pull request #31284: [SPARK-34167][SQL]Reading parquet with IntDecimal written as a LongDecimal blows up

2021-02-05 Thread GitBox
razajafri commented on a change in pull request #31284: URL: https://github.com/apache/spark/pull/31284#discussion_r570430807 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java ## @@ -145,7 +146,24 @@ public

[GitHub] [spark] SparkQA removed a comment on pull request #31473: [SPARK-34357][SQL] Map JDBC SQL TIME type to TimestampType with time portion fixed regardless of timezone

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31473: URL: https://github.com/apache/spark/pull/31473#issuecomment-773306832

[GitHub] [spark] SparkQA commented on pull request #31486: [SPARK-34359][SQL][3.1] Add a legacy config to restore the output schema of SHOW DATABASES

2021-02-05 Thread GitBox
SparkQA commented on pull request #31486: URL: https://github.com/apache/spark/pull/31486#issuecomment-773877044 **[Test build #134924 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134924/testReport)** for PR 31486 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #31413: [SPARK-32985][SQL] Decouple bucket scan and bucket filter pruning for data source v1

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31413: URL: https://github.com/apache/spark/pull/31413#issuecomment-773056185

[GitHub] [spark] AngersZhuuuu commented on pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform

2021-02-05 Thread GitBox
AngersZhuuuu commented on pull request #30957: URL: https://github.com/apache/spark/pull/30957#issuecomment-773141579

[GitHub] [spark] cloud-fan commented on pull request #31179: [SPARK-34113][SQL] Use metric data update metadata statistic's size and rowCount

2021-02-05 Thread GitBox
cloud-fan commented on pull request #31179: URL: https://github.com/apache/spark/pull/31179#issuecomment-773131159 how big is the overhead? I had an impression that auto stats update is very expensive and not many people are using it...

[GitHub] [spark] SparkQA commented on pull request #31466: [SPARK-34352][SQL] Improve SQLQueryTestSuite so as could run on windows system

2021-02-05 Thread GitBox
SparkQA commented on pull request #31466: URL: https://github.com/apache/spark/pull/31466#issuecomment-773878017 **[Test build #134907 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134907/testReport)** for PR 31466 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #31483: [SPARK-33434][PYTHON][DOCS] Added RuntimeConfig to PySpark docs

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31483: URL: https://github.com/apache/spark/pull/31483#issuecomment-773851149 **[Test build #134920 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134920/testReport)** for PR 31483 at commit

[GitHub] [spark] holdenk commented on a change in pull request #31427: [SPARK-34209][SQL] Delegate table name validation to the session catalog

2021-02-05 Thread GitBox
holdenk commented on a change in pull request #31427: URL: https://github.com/apache/spark/pull/31427#discussion_r570428856 ## File path: sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala ## @@ -2100,8 +2100,7 @@ class DataSourceV2SQLSuite

[GitHub] [spark] SparkQA commented on pull request #31348: [SPARK-34245][CORE] Ensure Master removes executors that failed to send finished state

2021-02-05 Thread GitBox
SparkQA commented on pull request #31348: URL: https://github.com/apache/spark/pull/31348#issuecomment-773753068

[GitHub] [spark] SparkQA commented on pull request #31471: [SPARK-34355][SQL] Add log and time cost for commit job

2021-02-05 Thread GitBox
SparkQA commented on pull request #31471: URL: https://github.com/apache/spark/pull/31471#issuecomment-773187923

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31258: [SPARK-34168] [SQL] Support DPP in AQE when the join is Broadcast hash join at the beginning

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31258: URL: https://github.com/apache/spark/pull/31258#issuecomment-773146054

[GitHub] [spark] tgravescs commented on pull request #31437: [SPARK-34329][YARN] When hit ApplicationAttemptNotFoundException, we can't just stop app for all case

2021-02-05 Thread GitBox
tgravescs commented on pull request #31437: URL: https://github.com/apache/spark/pull/31437#issuecomment-773366575 ok, please update the description with those details. maybe cluster mode doesn't matter here because application master would be killed anyway. The original change I

[GitHub] [spark] rdblue commented on a change in pull request #31476: [SPARK-34366][SQL] Add interface for DS v2 metrics

2021-02-05 Thread GitBox
rdblue commented on a change in pull request #31476: URL: https://github.com/apache/spark/pull/31476#discussion_r570608230 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/Scan.java ## @@ -102,4 +102,13 @@ default MicroBatchStream

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #31476: [SPARK-34366][SQL] Add interface for DS v2 metrics

2021-02-05 Thread GitBox
dongjoon-hyun commented on a change in pull request #31476: URL: https://github.com/apache/spark/pull/31476#discussion_r570572005 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/CustomMetric.java ## @@ -0,0 +1,38 @@ +/* + * Licensed to the Apache

[GitHub] [spark] AmplabJenkins commented on pull request #31470: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31470: URL: https://github.com/apache/spark/pull/31470#issuecomment-773222379

[GitHub] [spark] SparkQA removed a comment on pull request #31468: [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has single partition

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31468: URL: https://github.com/apache/spark/pull/31468#issuecomment-773118983

[GitHub] [spark] srowen commented on pull request #31463: [WIP] Fix docstring of join

2021-02-05 Thread GitBox
srowen commented on pull request #31463: URL: https://github.com/apache/spark/pull/31463#issuecomment-773356694 Seems fine - are there other instances? You can mark it as non-WIP / not draft.

[GitHub] [spark] SparkQA commented on pull request #31469: [MINOR][ML] Param Validation should throw IllegalArgumentException

2021-02-05 Thread GitBox
SparkQA commented on pull request #31469: URL: https://github.com/apache/spark/pull/31469#issuecomment-773148826

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31437: [SPARK-34329][YARN] When hit ApplicationAttemptNotFoundException, we can't just stop app for all case

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31437: URL: https://github.com/apache/spark/pull/31437#issuecomment-773395117

[GitHub] [spark] SparkQA commented on pull request #29185: [SPARK-32384][CORE] repartitionAndSortWithinPartitions avoid shuffle with same partitioner

2021-02-05 Thread GitBox
SparkQA commented on pull request #29185: URL: https://github.com/apache/spark/pull/29185#issuecomment-773152387

[GitHub] [spark] attilapiros commented on pull request #31133: [SPARK-26836][SQL] Supporting Avro schema evolution for partitioned Hive tables

2021-02-05 Thread GitBox
attilapiros commented on pull request #31133: URL: https://github.com/apache/spark/pull/31133#issuecomment-773570005 @dongjoon-hyun this sound good to me. Here I will use explicitly `AvroSerdeUtils.AvroTableProperties.SCHEMA_LITERAL.getPropName()` and for the rest I will open separate

[GitHub] [spark] gaborgsomogyi commented on a change in pull request #31384: [SPARK-31816][SQL][DOCS] Added high level description about JDBC connection providers for users/developers

2021-02-05 Thread GitBox
gaborgsomogyi commented on a change in pull request #31384: URL: https://github.com/apache/spark/pull/31384#discussion_r570119198 ## File path: sql/core/src/main/scala/org/apache/spark/sql/jdbc/README.md ## @@ -0,0 +1,81 @@ +--- +license: | + Licensed to the Apache Software

[GitHub] [spark] SparkQA commented on pull request #31474: [SPARK-34359][SQL] add a legacy config to restore the output schema of SHOW DATABASES

2021-02-05 Thread GitBox
SparkQA commented on pull request #31474: URL: https://github.com/apache/spark/pull/31474#issuecomment-773388356

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31479: [SPARK-34373][SQL] HiveThriftServer2 startWithContext may hang with a race issue

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31479: URL: https://github.com/apache/spark/pull/31479#issuecomment-773790360

[GitHub] [spark] AngersZhuuuu edited a comment on pull request #29087: [SPARK-28227][SQL] Support projection, aggregate/window functions, and lateral view in the TRANSFORM clause

2021-02-05 Thread GitBox
AngersZhuuuu edited a comment on pull request #29087: URL: https://github.com/apache/spark/pull/29087#issuecomment-773142558 ping @HyukjinKwon @cloud-fan @viirya Can you take a look about this pr's current change.

[GitHub] [spark] attilapiros commented on a change in pull request #31445: [SPARK-34334][K8S] Correctly identify timed out pending pod requests as excess request

2021-02-05 Thread GitBox
attilapiros commented on a change in pull request #31445: URL: https://github.com/apache/spark/pull/31445#discussion_r570299313 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsAllocator.scala ## @@ -223,14

[GitHub] [spark] SparkQA removed a comment on pull request #31460: [SPARK-34346][CORE][SQL] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cause p

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31460: URL: https://github.com/apache/spark/pull/31460#issuecomment-773056155 **[Test build #134856 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134856/testReport)** for PR 31460 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31348: [SPARK-34245][CORE] Ensure Master removes executors that failed to send finished state

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31348: URL: https://github.com/apache/spark/pull/31348#issuecomment-773813065 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134902/

[GitHub] [spark] SparkQA removed a comment on pull request #31469: [MINOR][ML] Param Validation should throw IllegalArgumentException

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31469: URL: https://github.com/apache/spark/pull/31469#issuecomment-773148826 **[Test build #134864 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134864/testReport)** for PR 31469 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #30957: URL: https://github.com/apache/spark/pull/30957#issuecomment-773222390

[GitHub] [spark] MaxGekk commented on pull request #31478: [SPARK-34371][SQL][TESTS] Run the datetime rebasing tests for Parquet datasource v1 and v2

2021-02-05 Thread GitBox
MaxGekk commented on pull request #31478: URL: https://github.com/apache/spark/pull/31478#issuecomment-773816713 @cloud-fan @gengliangwang @HyukjinKwon Could you review this PR.

[GitHub] [spark] AmplabJenkins commented on pull request #30869: [SPARK-33865][SQL] When HiveDDL, we need check avro schema too

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #30869: URL: https://github.com/apache/spark/pull/30869#issuecomment-773901509 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134927/

[GitHub] [spark] HeartSaVioR commented on pull request #31464: [SPARK-34339][CORE][SQL] Expose the number of total paths in Utils.buildLocationMetadata()

2021-02-05 Thread GitBox
HeartSaVioR commented on pull request #31464: URL: https://github.com/apache/spark/pull/31464#issuecomment-773072398

[GitHub] [spark] AmplabJenkins commented on pull request #31462: [SPARK-34347][SQL] CatalogImpl.uncacheTable should invalidate in cascade for temp views

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31462: URL: https://github.com/apache/spark/pull/31462#issuecomment-773111769

[GitHub] [spark] AmplabJenkins commented on pull request #31475: [SPARK-34360][SQL] Support table truncation by v2 Table Catalogs

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31475: URL: https://github.com/apache/spark/pull/31475#issuecomment-773460642

[GitHub] [spark] github-actions[bot] closed pull request #29185: [SPARK-32384][CORE] repartitionAndSortWithinPartitions avoid shuffle with same partitioner

2021-02-05 Thread GitBox
github-actions[bot] closed pull request #29185: URL: https://github.com/apache/spark/pull/29185

[GitHub] [spark] viirya commented on a change in pull request #31462: [SPARK-34347][SQL] CatalogImpl.uncacheTable should invalidate in cascade for temp views

2021-02-05 Thread GitBox
viirya commented on a change in pull request #31462: URL: https://github.com/apache/spark/pull/31462#discussion_r569995302 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ## @@ -909,12 +909,12 @@ class SessionCatalog(

[GitHub] [spark] SparkQA commented on pull request #31456: [SPARK-34343][SQL][TESTS] Add missing test for some non-array types in PostgreSQL

2021-02-05 Thread GitBox
SparkQA commented on pull request #31456: URL: https://github.com/apache/spark/pull/31456#issuecomment-773574181

[GitHub] [spark] SparkQA commented on pull request #31488: [SPARK-34376][SQL] Support regexp as a SQL function

2021-02-05 Thread GitBox
SparkQA commented on pull request #31488: URL: https://github.com/apache/spark/pull/31488#issuecomment-773914040 **[Test build #134931 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134931/testReport)** for PR 31488 at commit

[GitHub] [spark] cloud-fan commented on pull request #31473: [SPARK-34357][SQL] Map JDBC SQL TIME type to TimestampType with time portion fixed regardless of timezone

2021-02-05 Thread GitBox
cloud-fan commented on pull request #31473: URL: https://github.com/apache/spark/pull/31473#issuecomment-773470315 thanks, merging to master!

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31484: [SPARK-34374][SQL][DSTREAM] Use standard methods to extract keys or values from a Map

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31484: URL: https://github.com/apache/spark/pull/31484#issuecomment-773897293 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39499/

[GitHub] [spark] AmplabJenkins commented on pull request #31448: [SPARK-28137][SQL] Data Type Formatting Functions: `to_number`.

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31448: URL: https://github.com/apache/spark/pull/31448#issuecomment-773111773

[GitHub] [spark] SparkQA commented on pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform

2021-02-05 Thread GitBox
SparkQA commented on pull request #30957: URL: https://github.com/apache/spark/pull/30957#issuecomment-773916553 **[Test build #134912 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134912/testReport)** for PR 30957 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #30957: URL: https://github.com/apache/spark/pull/30957#issuecomment-773792948 **[Test build #134912 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134912/testReport)** for PR 30957 at commit

[GitHub] [spark] SparkQA commented on pull request #31481: [SQL][MINOR][TEST][3.1] Re-enable some DS v2 char/varchar test

2021-02-05 Thread GitBox
SparkQA commented on pull request #31481: URL: https://github.com/apache/spark/pull/31481#issuecomment-773916759 **[Test build #134909 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134909/testReport)** for PR 31481 at commit

[GitHub] [spark] SparkQA commented on pull request #31484: [SPARK-34374][SQL][DSTREAM] Use standard methods to extract keys or values from a Map

2021-02-05 Thread GitBox
SparkQA commented on pull request #31484: URL: https://github.com/apache/spark/pull/31484#issuecomment-773875198 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39499/

[GitHub] [spark] SparkQA removed a comment on pull request #31467: [SPARK-33212][FOLLOW-UP][BUILD] Uses provided properties for Hadoop client dependencies in root pom

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31467: URL: https://github.com/apache/spark/pull/31467#issuecomment-773056148 **[Test build #134855 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134855/testReport)** for PR 31467 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31133: [SPARK-26836][SQL] Supporting Avro schema evolution for partitioned Hive tables with "avro.schema.literal"

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31133: URL: https://github.com/apache/spark/pull/31133#issuecomment-773608335

[GitHub] [spark] LuciferYang commented on a change in pull request #31471: [SPARK-34355][SQL] Add log and time cost for commit job

2021-02-05 Thread GitBox
LuciferYang commented on a change in pull request #31471: URL: https://github.com/apache/spark/pull/31471#discussion_r570096278 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileFormatWriter.scala ## @@ -217,8 +217,11 @@ object

[GitHub] [spark] cloud-fan commented on a change in pull request #31245: [SPARK-34157][SQL] Unify output of SHOW TABLES and pass output attributes properly

2021-02-05 Thread GitBox
cloud-fan commented on a change in pull request #31245: URL: https://github.com/apache/spark/pull/31245#discussion_r570793713 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowTablesSuite.scala ## @@ -119,6 +102,34 @@ trait

[GitHub] [spark] SparkQA commented on pull request #31485: [SPARK-SQL][34137] Update suquery's stats when build LogicalPlan's stats

2021-02-05 Thread GitBox
SparkQA commented on pull request #31485: URL: https://github.com/apache/spark/pull/31485#issuecomment-773909514 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39506/

[GitHub] [spark] yaooqinn commented on pull request #31482: [SPARK-34346][CORE][SQL][3.1] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cause per

2021-02-05 Thread GitBox
yaooqinn commented on pull request #31482: URL: https://github.com/apache/spark/pull/31482#issuecomment-773910641 retest this please

[GitHub] [spark] SparkQA commented on pull request #31486: [SPARK-34359][SQL][3.1] Add a legacy config to restore the output schema of SHOW DATABASES

2021-02-05 Thread GitBox
SparkQA commented on pull request #31486: URL: https://github.com/apache/spark/pull/31486#issuecomment-773906733 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39507/
