[GitHub] [spark] SparkQA commented on pull request #30485: [SPARK-33533][SQL] Fix the regression bug that ConnectionProviders don't consider case-sensitivity for properties.

2020-11-24 Thread GitBox
SparkQA commented on pull request #30485: URL: https://github.com/apache/spark/pull/30485#issuecomment-733447277 **[Test build #131707 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131707/testReport)** for PR 30485 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #30427: [SPARK-33224][SS][WEBUI] Add watermark gap information into SS UI page

2020-11-24 Thread GitBox
SparkQA removed a comment on pull request #30427: URL: https://github.com/apache/spark/pull/30427#issuecomment-733283254 **[Test build #131704 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131704/testReport)** for PR 30427 at commit

[GitHub] [spark] SparkQA commented on pull request #30427: [SPARK-33224][SS][WEBUI] Add watermark gap information into SS UI page

2020-11-24 Thread GitBox
SparkQA commented on pull request #30427: URL: https://github.com/apache/spark/pull/30427#issuecomment-733445279 **[Test build #131704 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131704/testReport)** for PR 30427 at commit

[GitHub] [spark] HyukjinKwon commented on pull request #30487: [SPARK-33535][INFRA][TESTS] Export LANG to en_US.UTF-8 in run-tests-jenkins script

2020-11-24 Thread GitBox
HyukjinKwon commented on pull request #30487: URL: https://github.com/apache/spark/pull/30487#issuecomment-733444683 Oh, I see. nice. @wangyum FYI This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] dongjoon-hyun commented on pull request #30485: [SPARK-33533][SQL] Fix the regression bug that ConnectionProviders don't consider case-sensitivity for properties.

2020-11-24 Thread GitBox
dongjoon-hyun commented on pull request #30485: URL: https://github.com/apache/spark/pull/30485#issuecomment-73305 cc @HyukjinKwon , too~

[GitHub] [spark] dongjoon-hyun commented on pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-11-24 Thread GitBox
dongjoon-hyun commented on pull request #29950: URL: https://github.com/apache/spark/pull/29950#issuecomment-733444096 Retest this please.

[GitHub] [spark] maropu commented on a change in pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-24 Thread GitBox
maropu commented on a change in pull request #30421: URL: https://github.com/apache/spark/pull/30421#discussion_r530088471 ## File path: docs/sql-ref-syntax-dml-insert-into.md ## @@ -41,7 +41,7 @@ INSERT INTO [ TABLE ] table_identifier [ partition_spec ] * **partition_spec**

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30491: [SPARK-33492][SQL][FOLLOWUP] Use callback instead of passing Spark session and v2 relation

2020-11-24 Thread GitBox
dongjoon-hyun commented on a change in pull request #30491: URL: https://github.com/apache/spark/pull/30491#discussion_r530088008 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2Exec.scala ## @@ -213,15 +213,14 @@ case

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30491: [SPARK-33492][SQL][FOLLOWUP] Use callback instead of passing Spark session and v2 relation

2020-11-24 Thread GitBox
dongjoon-hyun commented on a change in pull request #30491: URL: https://github.com/apache/spark/pull/30491#discussion_r530087851 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2Exec.scala ## @@ -213,15 +213,14 @@ case

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30491: [SPARK-33492][SQL][FOLLOWUP] Use callback instead of passing Spark session and v2 relation

2020-11-24 Thread GitBox
dongjoon-hyun commented on a change in pull request #30491: URL: https://github.com/apache/spark/pull/30491#discussion_r530087552 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala ## @@ -170,11 +170,13 @@ class

[GitHub] [spark] wangyum commented on pull request #29358: [SPARK-32541][SQL]Add support msck repair table drop partitions

2020-11-24 Thread GitBox
wangyum commented on pull request #29358: URL: https://github.com/apache/spark/pull/29358#issuecomment-733441235 cc @sunchao

[GitHub] [spark] HyukjinKwon commented on pull request #30411: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-11-24 Thread GitBox
HyukjinKwon commented on pull request #30411: URL: https://github.com/apache/spark/pull/30411#issuecomment-733440592 I am supportive of this change FWIW. I think it's good to have this feature.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30490: [SPARK-33543][SQL] Migrate SHOW COLUMNS command to use UnresolvedTableOrView to resolve the identifier

2020-11-24 Thread GitBox
AmplabJenkins removed a comment on pull request #30490: URL: https://github.com/apache/spark/pull/30490#issuecomment-733433554

[GitHub] [spark] SparkQA commented on pull request #29066: [SPARK-23889][SQL] DataSourceV2: required sorting and clustering for writes

2020-11-24 Thread GitBox
SparkQA commented on pull request #29066: URL: https://github.com/apache/spark/pull/29066#issuecomment-733433901 **[Test build #131726 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131726/testReport)** for PR 29066 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-24 Thread GitBox
AmplabJenkins removed a comment on pull request #30421: URL: https://github.com/apache/spark/pull/30421#issuecomment-733419965

[GitHub] [spark] AmplabJenkins commented on pull request #30490: [SPARK-33543][SQL] Migrate SHOW COLUMNS command to use UnresolvedTableOrView to resolve the identifier

2020-11-24 Thread GitBox
AmplabJenkins commented on pull request #30490: URL: https://github.com/apache/spark/pull/30490#issuecomment-733433554

[GitHub] [spark] SparkQA commented on pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-24 Thread GitBox
SparkQA commented on pull request #30421: URL: https://github.com/apache/spark/pull/30421#issuecomment-733433490 **[Test build #131725 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131725/testReport)** for PR 30421 at commit

[GitHub] [spark] SparkQA commented on pull request #30484: [SPARK-33532][SQL] Remove unreachable branch in SpecificParquetRecordReaderBase.initialize method

2020-11-24 Thread GitBox
SparkQA commented on pull request #30484: URL: https://github.com/apache/spark/pull/30484#issuecomment-733433369 **[Test build #131724 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131724/testReport)** for PR 30484 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30492: [SPARK-33545][CORE] Support Fallback Storage during Worker decommission

2020-11-24 Thread GitBox
AmplabJenkins removed a comment on pull request #30492: URL: https://github.com/apache/spark/pull/30492#issuecomment-733433074

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30413: [SPARK-33252][PYTHON][DOCS] Migration to NumPy documentation style in MLlib (pyspark.mllib.*)

2020-11-24 Thread GitBox
AmplabJenkins removed a comment on pull request #30413: URL: https://github.com/apache/spark/pull/30413#issuecomment-733433237

[GitHub] [spark] AmplabJenkins commented on pull request #30413: [SPARK-33252][PYTHON][DOCS] Migration to NumPy documentation style in MLlib (pyspark.mllib.*)

2020-11-24 Thread GitBox
AmplabJenkins commented on pull request #30413: URL: https://github.com/apache/spark/pull/30413#issuecomment-733433237

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-11-24 Thread GitBox
AmplabJenkins removed a comment on pull request #28026: URL: https://github.com/apache/spark/pull/28026#issuecomment-733433043

[GitHub] [spark] SparkQA removed a comment on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-11-24 Thread GitBox
SparkQA removed a comment on pull request #28026: URL: https://github.com/apache/spark/pull/28026#issuecomment-733402973 **[Test build #131713 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131713/testReport)** for PR 28026 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29122: [SPARK-32320][PYSPARK] Remove mutable default arguments

2020-11-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29122: URL: https://github.com/apache/spark/pull/29122#issuecomment-733433052

[GitHub] [spark] AmplabJenkins commented on pull request #29122: [SPARK-32320][PYSPARK] Remove mutable default arguments

2020-11-24 Thread GitBox
AmplabJenkins commented on pull request #29122: URL: https://github.com/apache/spark/pull/29122#issuecomment-733433052

[GitHub] [spark] AmplabJenkins commented on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-11-24 Thread GitBox
AmplabJenkins commented on pull request #28026: URL: https://github.com/apache/spark/pull/28026#issuecomment-733433043

[GitHub] [spark] AmplabJenkins commented on pull request #30492: [SPARK-33545][CORE] Support Fallback Storage during Worker decommission

2020-11-24 Thread GitBox
AmplabJenkins commented on pull request #30492: URL: https://github.com/apache/spark/pull/30492#issuecomment-733433074

[GitHub] [spark] SparkQA commented on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-11-24 Thread GitBox
SparkQA commented on pull request #28026: URL: https://github.com/apache/spark/pull/28026#issuecomment-733432820 **[Test build #131713 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131713/testReport)** for PR 28026 at commit

[GitHub] [spark] cloud-fan closed pull request #30490: [SPARK-33543][SQL] Migrate SHOW COLUMNS command to use UnresolvedTableOrView to resolve the identifier

2020-11-24 Thread GitBox
cloud-fan closed pull request #30490: URL: https://github.com/apache/spark/pull/30490

[GitHub] [spark] cloud-fan commented on pull request #30490: [SPARK-33543][SQL] Migrate SHOW COLUMNS command to use UnresolvedTableOrView to resolve the identifier

2020-11-24 Thread GitBox
cloud-fan commented on pull request #30490: URL: https://github.com/apache/spark/pull/30490#issuecomment-733432706 GA passed, merging to master, thanks!

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29994: [DONOTMERGE][WHITESPACE] workflow exercise

2020-11-24 Thread GitBox
AmplabJenkins removed a comment on pull request #29994: URL: https://github.com/apache/spark/pull/29994#issuecomment-733432399

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30403: [SPARK-33448][SQL] Support CACHE/UNCACHE TABLE commands for v2 tables

2020-11-24 Thread GitBox
AmplabJenkins removed a comment on pull request #30403: URL: https://github.com/apache/spark/pull/30403#issuecomment-733432402

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30491: [SPARK-33492][SQL][FOLLOWUP] Use callback instead of passing Spark session and v2 relation

2020-11-24 Thread GitBox
AmplabJenkins removed a comment on pull request #30491: URL: https://github.com/apache/spark/pull/30491#issuecomment-733432400

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30492: [SPARK-33545][CORE] Support Fallback Storage during Worker decommission

2020-11-24 Thread GitBox
AmplabJenkins removed a comment on pull request #30492: URL: https://github.com/apache/spark/pull/30492#issuecomment-733260110

[GitHub] [spark] AmplabJenkins commented on pull request #30492: [SPARK-33545][CORE] Support Fallback Storage during Worker decommission

2020-11-24 Thread GitBox
AmplabJenkins commented on pull request #30492: URL: https://github.com/apache/spark/pull/30492#issuecomment-733432401

[GitHub] [spark] AmplabJenkins commented on pull request #30491: [SPARK-33492][SQL][FOLLOWUP] Use callback instead of passing Spark session and v2 relation

2020-11-24 Thread GitBox
AmplabJenkins commented on pull request #30491: URL: https://github.com/apache/spark/pull/30491#issuecomment-733432400

[GitHub] [spark] AmplabJenkins commented on pull request #30403: [SPARK-33448][SQL] Support CACHE/UNCACHE TABLE commands for v2 tables

2020-11-24 Thread GitBox
AmplabJenkins commented on pull request #30403: URL: https://github.com/apache/spark/pull/30403#issuecomment-733432408

[GitHub] [spark] AmplabJenkins commented on pull request #29994: [DONOTMERGE][WHITESPACE] workflow exercise

2020-11-24 Thread GitBox
AmplabJenkins commented on pull request #29994: URL: https://github.com/apache/spark/pull/29994#issuecomment-733432399

[GitHub] [spark] LuciferYang commented on pull request #30483: [WIP][SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2020-11-24 Thread GitBox
LuciferYang commented on pull request #30483: URL: https://github.com/apache/spark/pull/30483#issuecomment-733430977 > Since your fix is merged here, > could you fix Scala style and rebase to the master, @LuciferYang ? OK~ will do it later ~

[GitHub] [spark] liangz1 commented on a change in pull request #30471: [WIP][SPARK-33520][ML] make CrossValidator/TrainValidateSplit support Python backend estimator/model

2020-11-24 Thread GitBox
liangz1 commented on a change in pull request #30471: URL: https://github.com/apache/spark/pull/30471#discussion_r530076126 ## File path: python/pyspark/ml/tuning.py ## @@ -207,6 +210,205 @@ def _to_java_impl(self): return java_estimator, java_epms, java_evaluator

[GitHub] [spark] zsxwing commented on a change in pull request #30242: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.

2020-11-24 Thread GitBox
zsxwing commented on a change in pull request #30242: URL: https://github.com/apache/spark/pull/30242#discussion_r530075736 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvalPythonExec.scala ## @@ -137,3 +139,47 @@ trait EvalPythonExec extends

[GitHub] [spark] cloud-fan commented on a change in pull request #30440: [SPARK-33496][SQL]Improve error message of ANSI explicit cast

2020-11-24 Thread GitBox
cloud-fan commented on a change in pull request #30440: URL: https://github.com/apache/spark/pull/30440#discussion_r530075609 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ## @@ -98,6 +98,19 @@ object Cast { case _ =>

[GitHub] [spark] cloud-fan commented on pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-24 Thread GitBox
cloud-fan commented on pull request #30421: URL: https://github.com/apache/spark/pull/30421#issuecomment-733429639 Let's revisit the entire process: 1. the parser creates partition spec which is `Map[String, String]` 2. for v1 commands, we pass `Map[String, String]` to hive APIs.

[GitHub] [spark] liangz1 commented on a change in pull request #30471: [WIP][SPARK-33520][ML] make CrossValidator/TrainValidateSplit support Python backend estimator/model

2020-11-24 Thread GitBox
liangz1 commented on a change in pull request #30471: URL: https://github.com/apache/spark/pull/30471#discussion_r530073036 ## File path: python/pyspark/ml/tuning.py ## @@ -207,6 +210,205 @@ def _to_java_impl(self): return java_estimator, java_epms, java_evaluator

[GitHub] [spark] LuciferYang commented on a change in pull request #30483: [WIP][SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2020-11-24 Thread GitBox
LuciferYang commented on a change in pull request #30483: URL: https://github.com/apache/spark/pull/30483#discussion_r530072989 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -765,6 +765,11 @@ object SQLConf { .booleanConf

[GitHub] [spark] aokolnychyi commented on pull request #29066: [SPARK-23889][SQL] DataSourceV2: required sorting and clustering for writes

2020-11-24 Thread GitBox
aokolnychyi commented on pull request #29066: URL: https://github.com/apache/spark/pull/29066#issuecomment-733427234 I'd like to emphasize that all changes are in one place to simplify the review. I'll split the work into smaller PRs later.

[GitHub] [spark] aokolnychyi commented on pull request #29066: [SPARK-23889][SQL] DataSourceV2: required sorting and clustering for writes

2020-11-24 Thread GitBox
aokolnychyi commented on pull request #29066: URL: https://github.com/apache/spark/pull/29066#issuecomment-733425796 Okay, I went through the comments and I think they are all resolved except points related to tests. This PR is no longer WIP and is ready for a detailed review.

[GitHub] [spark] aokolnychyi commented on a change in pull request #29066: [SPARK-23889][SQL] DataSourceV2: required sorting and clustering for writes

2020-11-24 Thread GitBox
aokolnychyi commented on a change in pull request #29066: URL: https://github.com/apache/spark/pull/29066#discussion_r530070683 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2WriteRequirements.scala ## @@ -0,0 +1,107 @@ +/* + * Licensed

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-24 Thread GitBox
AngersZhuuuu commented on a change in pull request #30421: URL: https://github.com/apache/spark/pull/30421#discussion_r530070612 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -503,13 +503,32 @@ class AstBuilder extends

[GitHub] [spark] SparkQA removed a comment on pull request #29994: [DONOTMERGE][WHITESPACE] workflow exercise

2020-11-24 Thread GitBox
SparkQA removed a comment on pull request #29994: URL: https://github.com/apache/spark/pull/29994#issuecomment-733415136 **[Test build #131718 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131718/testReport)** for PR 29994 at commit

[GitHub] [spark] SparkQA commented on pull request #29994: [DONOTMERGE][WHITESPACE] workflow exercise

2020-11-24 Thread GitBox
SparkQA commented on pull request #29994: URL: https://github.com/apache/spark/pull/29994#issuecomment-733425057 **[Test build #131718 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131718/testReport)** for PR 29994 at commit

[GitHub] [spark] cloud-fan commented on pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-24 Thread GitBox
cloud-fan commented on pull request #30421: URL: https://github.com/apache/spark/pull/30421#issuecomment-733425083 cc @MaxGekk

[GitHub] [spark] cloud-fan commented on a change in pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-24 Thread GitBox
cloud-fan commented on a change in pull request #30421: URL: https://github.com/apache/spark/pull/30421#discussion_r530070129 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -503,13 +503,32 @@ class AstBuilder extends

[GitHub] [spark] cloud-fan commented on a change in pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-24 Thread GitBox
cloud-fan commented on a change in pull request #30421: URL: https://github.com/apache/spark/pull/30421#discussion_r530069998 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -503,13 +503,32 @@ class AstBuilder extends

[GitHub] [spark] aokolnychyi commented on a change in pull request #29066: [WIP][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-11-24 Thread GitBox
aokolnychyi commented on a change in pull request #29066: URL: https://github.com/apache/spark/pull/29066#discussion_r530069800 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/distributions/ClusteredDistribution.java ## @@ -0,0 +1,28 @@ +/* + *

[GitHub] [spark] aokolnychyi commented on a change in pull request #29066: [WIP][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-11-24 Thread GitBox
aokolnychyi commented on a change in pull request #29066: URL: https://github.com/apache/spark/pull/29066#discussion_r530069698 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ## @@ -186,6 +186,13 @@ abstract class

[GitHub] [spark] aokolnychyi commented on a change in pull request #29066: [WIP][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-11-24 Thread GitBox
aokolnychyi commented on a change in pull request #29066: URL: https://github.com/apache/spark/pull/29066#discussion_r530068527 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2WriteRequirements.scala ## @@ -0,0 +1,107 @@ +/* + * Licensed

[GitHub] [spark] cloud-fan commented on a change in pull request #30473: [SPARK-33430][SQL] Support namespaces in JDBC v2 Table Catalog

2020-11-24 Thread GitBox
cloud-fan commented on a change in pull request #30473: URL: https://github.com/apache/spark/pull/30473#discussion_r530068393 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/jdbc/JDBCTableCatalog.scala ## @@ -170,6 +174,125 @@ class

[GitHub] [spark] aokolnychyi commented on a change in pull request #29066: [WIP][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-11-24 Thread GitBox
aokolnychyi commented on a change in pull request #29066: URL: https://github.com/apache/spark/pull/29066#discussion_r530068371 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2WriteRequirements.scala ## @@ -0,0 +1,107 @@ +/* + * Licensed

[GitHub] [spark] aokolnychyi commented on a change in pull request #29066: [WIP][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-11-24 Thread GitBox
aokolnychyi commented on a change in pull request #29066: URL: https://github.com/apache/spark/pull/29066#discussion_r530068093 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2Writes.scala ## @@ -0,0 +1,102 @@ +/* + * Licensed to the

[GitHub] [spark] aokolnychyi commented on a change in pull request #29066: [WIP][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-11-24 Thread GitBox
aokolnychyi commented on a change in pull request #29066: URL: https://github.com/apache/spark/pull/29066#discussion_r530067640 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/SortDirection.java ## @@ -0,0 +1,22 @@ +/* + * Licensed to the

[GitHub] [spark] aokolnychyi commented on a change in pull request #29066: [WIP][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-11-24 Thread GitBox
aokolnychyi commented on a change in pull request #29066: URL: https://github.com/apache/spark/pull/29066#discussion_r530067569 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/expressions/NullOrdering.java ## @@ -0,0 +1,22 @@ +/* + * Licensed to the

[GitHub] [spark] SparkQA removed a comment on pull request #30492: [SPARK-33545][CORE] Support Fallback Storage during Worker decommission

2020-11-24 Thread GitBox
SparkQA removed a comment on pull request #30492: URL: https://github.com/apache/spark/pull/30492#issuecomment-733420602 **[Test build #131723 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131723/testReport)** for PR 30492 at commit

[GitHub] [spark] aokolnychyi commented on a change in pull request #29066: [WIP][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-11-24 Thread GitBox
aokolnychyi commented on a change in pull request #29066: URL: https://github.com/apache/spark/pull/29066#discussion_r530067492 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2Writes.scala ## @@ -0,0 +1,102 @@ +/* + * Licensed to the

[GitHub] [spark] maropu commented on a change in pull request #30486: [SPARK-33530][CORE] Support --archives and spark.archives option natively

2020-11-24 Thread GitBox
maropu commented on a change in pull request #30486: URL: https://github.com/apache/spark/pull/30486#discussion_r530054680 ## File path: core/src/main/scala/org/apache/spark/SparkContext.scala ## @@ -422,6 +426,8 @@ class SparkContext(config: SparkConf) extends Logging {

[GitHub] [spark] SparkQA commented on pull request #30492: [SPARK-33545][CORE] Support Fallback Storage during Worker decommission

2020-11-24 Thread GitBox
SparkQA commented on pull request #30492: URL: https://github.com/apache/spark/pull/30492#issuecomment-733421894 **[Test build #131723 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131723/testReport)** for PR 30492 at commit

[GitHub] [spark] aokolnychyi commented on a change in pull request #29066: [WIP][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-11-24 Thread GitBox
aokolnychyi commented on a change in pull request #29066: URL: https://github.com/apache/spark/pull/29066#discussion_r530067196 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/WriteBuilder.java ## @@ -23,17 +23,34 @@ import

[GitHub] [spark] SparkQA removed a comment on pull request #30468: [SPARK-33518][ML][WIP] Improve performance of ML ALS recommendForAll by GEMV

2020-11-24 Thread GitBox
SparkQA removed a comment on pull request #30468: URL: https://github.com/apache/spark/pull/30468#issuecomment-733397215 **[Test build #131711 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131711/testReport)** for PR 30468 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-24 Thread GitBox
SparkQA removed a comment on pull request #30421: URL: https://github.com/apache/spark/pull/30421#issuecomment-733419305 **[Test build #131722 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131722/testReport)** for PR 30421 at commit

[GitHub] [spark] SparkQA commented on pull request #30492: [SPARK-33545][CORE] Support Fallback Storage during Worker decommission

2020-11-24 Thread GitBox
SparkQA commented on pull request #30492: URL: https://github.com/apache/spark/pull/30492#issuecomment-733420602 **[Test build #131723 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131723/testReport)** for PR 30492 at commit

[GitHub] [spark] aokolnychyi commented on pull request #29066: [WIP][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-11-24 Thread GitBox
aokolnychyi commented on pull request #29066: URL: https://github.com/apache/spark/pull/29066#issuecomment-733420058 Sorry, it took a while to update this PR. @rdblue had time to play around with this change locally and addressed quite some review comments. I rebased, incorporated

[GitHub] [spark] AmplabJenkins commented on pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-24 Thread GitBox
AmplabJenkins commented on pull request #30421: URL: https://github.com/apache/spark/pull/30421#issuecomment-733419965

[GitHub] [spark] SparkQA commented on pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-24 Thread GitBox
SparkQA commented on pull request #30421: URL: https://github.com/apache/spark/pull/30421#issuecomment-733419950 **[Test build #131722 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131722/testReport)** for PR 30421 at commit

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30492: [SPARK-33545][CORE] Support Fallback Storage during Worker decommission

2020-11-24 Thread GitBox
dongjoon-hyun commented on a change in pull request #30492: URL: https://github.com/apache/spark/pull/30492#discussion_r530065239 ## File path: core/src/main/scala/org/apache/spark/storage/FallbackStorage.scala ## @@ -0,0 +1,170 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] LuciferYang commented on pull request #30487: [SPARK-33535][INFRA][TESTS] Export LANG to en_US.UTF-8 in run-tests-jenkins script

2020-11-24 Thread GitBox
LuciferYang commented on pull request #30487: URL: https://github.com/apache/spark/pull/30487#issuecomment-733419898 @dongjoon-hyun thx ~ This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] SparkQA commented on pull request #30478: [SPARK-33525][SQL] Update hive-service-rpc to 3.1.2

2020-11-24 Thread GitBox
SparkQA commented on pull request #30478: URL: https://github.com/apache/spark/pull/30478#issuecomment-733419309 **[Test build #131720 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131720/testReport)** for PR 30478 at commit

[GitHub] [spark] SparkQA commented on pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-24 Thread GitBox
SparkQA commented on pull request #30421: URL: https://github.com/apache/spark/pull/30421#issuecomment-733419305 **[Test build #131722 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131722/testReport)** for PR 30421 at commit

[GitHub] [spark] SparkQA commented on pull request #30440: [SPARK-33496][SQL]Improve error message of ANSI explicit cast

2020-11-24 Thread GitBox
SparkQA commented on pull request #30440: URL: https://github.com/apache/spark/pull/30440#issuecomment-733419129 **[Test build #131721 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131721/testReport)** for PR 30440 at commit

[GitHub] [spark] gengliangwang commented on pull request #30440: [SPARK-33496][SQL]Improve error message of ANSI explicit cast

2020-11-24 Thread GitBox
gengliangwang commented on pull request #30440: URL: https://github.com/apache/spark/pull/30440#issuecomment-733418918 retest this please. This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-24 Thread GitBox
AngersZhuuuu commented on a change in pull request #30421: URL: https://github.com/apache/spark/pull/30421#discussion_r530064053 ## File path: docs/sql-ref-syntax-dml-insert-into.md ## @@ -41,7 +41,7 @@ INSERT INTO [ TABLE ] table_identifier [ partition_spec ] *

[GitHub] [spark] AmplabJenkins commented on pull request #30468: [SPARK-33518][ML][WIP] Improve performance of ML ALS recommendForAll by GEMV

2020-11-24 Thread GitBox
AmplabJenkins commented on pull request #30468: URL: https://github.com/apache/spark/pull/30468#issuecomment-733418664 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] HyukjinKwon commented on pull request #30478: [SPARK-33525][SQL] Update hive-service-rpc to 3.1.2

2020-11-24 Thread GitBox
HyukjinKwon commented on pull request #30478: URL: https://github.com/apache/spark/pull/30478#issuecomment-733418441 retest this please This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-24 Thread GitBox
AngersZhuuuu commented on a change in pull request #30421: URL: https://github.com/apache/spark/pull/30421#discussion_r530063819 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -503,13 +503,32 @@ class AstBuilder extends

[GitHub] [spark] SparkQA commented on pull request #30468: [SPARK-33518][ML][WIP] Improve performance of ML ALS recommendForAll by GEMV

2020-11-24 Thread GitBox
SparkQA commented on pull request #30468: URL: https://github.com/apache/spark/pull/30468#issuecomment-733418349 **[Test build #131711 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131711/testReport)** for PR 30468 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #30482: [SPARK-33529][SQL] Handle '__HIVE_DEFAULT_PARTITION__' while resolving V2 partition specs

2020-11-24 Thread GitBox
AmplabJenkins commented on pull request #30482: URL: https://github.com/apache/spark/pull/30482#issuecomment-733418239 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #29066: [WIP][SPARK-23889] DataSourceV2: required sorting and clustering for writes

2020-11-24 Thread GitBox
SparkQA commented on pull request #29066: URL: https://github.com/apache/spark/pull/29066#issuecomment-733418271 **[Test build #131719 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131719/testReport)** for PR 29066 at commit

[GitHub] [spark] huaxingao commented on a change in pull request #29695: [SPARK-22390][SPARK-32833][SQL] [WIP]JDBC V2 Datasource aggregate push down

2020-11-24 Thread GitBox
huaxingao commented on a change in pull request #29695: URL: https://github.com/apache/spark/pull/29695#discussion_r530060859 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala ## @@ -73,33 +77,25 @@ object

[GitHub] [spark] SparkQA commented on pull request #30242: [SPARK-33277][PYSPARK][SQL] Use ContextAwareIterator to stop consuming after the task ends.

2020-11-24 Thread GitBox
SparkQA commented on pull request #30242: URL: https://github.com/apache/spark/pull/30242#issuecomment-733415179 **[Test build #131717 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131717/testReport)** for PR 30242 at commit

[GitHub] [spark] SparkQA commented on pull request #29994: [DONOTMERGE][WHITESPACE] workflow exercise

2020-11-24 Thread GitBox
SparkQA commented on pull request #29994: URL: https://github.com/apache/spark/pull/29994#issuecomment-733415136 **[Test build #131718 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131718/testReport)** for PR 29994 at commit

[GitHub] [spark] SparkQA commented on pull request #30470: [SPARK-33495][BUILD] Remove commons-logging.jar's dependency

2020-11-24 Thread GitBox
SparkQA commented on pull request #30470: URL: https://github.com/apache/spark/pull/30470#issuecomment-733415016 **[Test build #131715 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131715/testReport)** for PR 30470 at commit

[GitHub] [spark] SparkQA commented on pull request #30412: [SPARK-33480][SQL] Support char/varchar type

2020-11-24 Thread GitBox
SparkQA commented on pull request #30412: URL: https://github.com/apache/spark/pull/30412#issuecomment-733415048 **[Test build #131716 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131716/testReport)** for PR 30412 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30413: [SPARK-33252][PYTHON][DOCS] Migration to NumPy documentation style in MLlib (pyspark.mllib.*)

2020-11-24 Thread GitBox
AmplabJenkins removed a comment on pull request #30413: URL: https://github.com/apache/spark/pull/30413#issuecomment-733414775 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #30486: [SPARK-33530][CORE] Support --archives and spark.archives option natively

2020-11-24 Thread GitBox
SparkQA commented on pull request #30486: URL: https://github.com/apache/spark/pull/30486#issuecomment-733414971 **[Test build #131714 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131714/testReport)** for PR 30486 at commit

[GitHub] [spark] WeichenXu123 commented on a change in pull request #30471: [WIP][SPARK-33520][ML] make CrossValidator/TrainValidateSplit support Python backend estimator/model

2020-11-24 Thread GitBox
WeichenXu123 commented on a change in pull request #30471: URL: https://github.com/apache/spark/pull/30471#discussion_r530060435 ## File path: python/pyspark/ml/tuning.py ## @@ -207,6 +210,205 @@ def _to_java_impl(self): return java_estimator, java_epms,

[GitHub] [spark] AmplabJenkins commented on pull request #30413: [SPARK-33252][PYTHON][DOCS] Migration to NumPy documentation style in MLlib (pyspark.mllib.*)

2020-11-24 Thread GitBox
AmplabJenkins commented on pull request #30413: URL: https://github.com/apache/spark/pull/30413#issuecomment-733414775 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30490: [SPARK-33543][SQL] Migrate SHOW COLUMNS command to use UnresolvedTableOrView to resolve the identifier

2020-11-24 Thread GitBox
AmplabJenkins removed a comment on pull request #30490: URL: https://github.com/apache/spark/pull/30490#issuecomment-733414029 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-24 Thread GitBox
AngersZhuuuu commented on a change in pull request #30421: URL: https://github.com/apache/spark/pull/30421#discussion_r530059949 ## File path: docs/sql-ref-syntax-dml-insert-into.md ## @@ -41,7 +41,7 @@ INSERT INTO [ TABLE ] table_identifier [ partition_spec ] *
