[GitHub] [spark] AngersZhuuuu opened a new pull request #30496: [SPARK-33547][SQL] Add usage of typed literal in doc

2020-11-25 Thread GitBox


AngersZhuuuu opened a new pull request #30496:
URL: https://github.com/apache/spark/pull/30496


   ### What changes were proposed in this pull request?
   Following https://github.com/apache/spark/pull/30421#discussion_r530024114,
   add usage of typed literals to the doc.
   
   
   ### Why are the changes needed?
   Make the usage of typed literals clear to users.
   
   
   ### Does this PR introduce _any_ user-facing change?
   NO
   
   ### How was this patch tested?
   Not needed.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30496: [SPARK-33547][SQL] Add usage of typed literal in doc

2020-11-25 Thread GitBox


AngersZhuuuu commented on a change in pull request #30496:
URL: https://github.com/apache/spark/pull/30496#discussion_r530172194



##
File path: docs/sql-ref-literals.md
##
@@ -21,14 +21,74 @@ license: |
 
 A literal (also known as a constant) represents a fixed data value. Spark SQL supports the following literals:
 
+ * [Typed Literal](#typed-literal)
  * [String Literal](#string-literal)
  * [Binary Literal](#binary-literal)
  * [Null Literal](#null-literal)
  * [Boolean Literal](#boolean-literal)
  * [Numeric Literal](#numeric-literal)
+ * [Timestamp Literal](#timestamp-literal)

Review comment:
   This entry is missing from the current doc.








[GitHub] [spark] AngersZhuuuu commented on pull request #30496: [SPARK-33547][SQL] Add usage of typed literal in doc

2020-11-25 Thread GitBox


AngersZhuuuu commented on pull request #30496:
URL: https://github.com/apache/spark/pull/30496#issuecomment-733536446


   FYI @maropu, there are many duplicated examples between the typed literals and the
   corresponding literals. Any suggestions?






[GitHub] [spark] AmplabJenkins commented on pull request #30492: [SPARK-33545][CORE] Support Fallback Storage during Worker decommission

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30492:
URL: https://github.com/apache/spark/pull/30492#issuecomment-733536613










[GitHub] [spark] AmplabJenkins commented on pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30421:
URL: https://github.com/apache/spark/pull/30421#issuecomment-733536615










[GitHub] [spark] AmplabJenkins commented on pull request #30478: [SPARK-33525][SQL] Update hive-service-rpc to 3.1.2

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30478:
URL: https://github.com/apache/spark/pull/30478#issuecomment-733536614










[GitHub] [spark] AmplabJenkins commented on pull request #30488: [SPARK-33071][SPARK-33536][SQL] Avoid changing dataset_id of LogicalPlan in join() to not break DetectAmbiguousSelfJoin

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30488:
URL: https://github.com/apache/spark/pull/30488#issuecomment-733536612










[GitHub] [spark] SparkQA commented on pull request #30470: [SPARK-33495][BUILD] Remove commons-logging.jar's dependency

2020-11-25 Thread GitBox


SparkQA commented on pull request #30470:
URL: https://github.com/apache/spark/pull/30470#issuecomment-733536564


   **[Test build #131737 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131737/testReport)** for PR 30470 at commit [`bc3cb8b`](https://github.com/apache/spark/commit/bc3cb8b419bb985cdf98aaf172b20c900d40e806).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #30470: [SPARK-33495][BUILD] Remove commons-logging.jar's dependency

2020-11-25 Thread GitBox


SparkQA removed a comment on pull request #30470:
URL: https://github.com/apache/spark/pull/30470#issuecomment-733468984


   **[Test build #131737 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131737/testReport)** for PR 30470 at commit [`bc3cb8b`](https://github.com/apache/spark/commit/bc3cb8b419bb985cdf98aaf172b20c900d40e806).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #30421:
URL: https://github.com/apache/spark/pull/30421#issuecomment-733536615










[GitHub] [spark] SparkQA commented on pull request #30483: [WIP][SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2020-11-25 Thread GitBox


SparkQA commented on pull request #30483:
URL: https://github.com/apache/spark/pull/30483#issuecomment-733536900


   **[Test build #131754 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131754/testReport)** for PR 30483 at commit [`8bba51a`](https://github.com/apache/spark/commit/8bba51a2c65393e92a494a9539064d94ad24ec50).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30492: [SPARK-33545][CORE] Support Fallback Storage during Worker decommission

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #30492:
URL: https://github.com/apache/spark/pull/30492#issuecomment-733536613










[GitHub] [spark] SparkQA commented on pull request #30496: [SPARK-33547][SQL] Add usage of typed literal in doc

2020-11-25 Thread GitBox


SparkQA commented on pull request #30496:
URL: https://github.com/apache/spark/pull/30496#issuecomment-733537090


   **[Test build #131752 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131752/testReport)** for PR 30496 at commit [`e3de389`](https://github.com/apache/spark/commit/e3de389280d41c1f8c30569c512a71abff90799b).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30488: [SPARK-33071][SPARK-33536][SQL] Avoid changing dataset_id of LogicalPlan in join() to not break DetectAmbiguousSelfJoin

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #30488:
URL: https://github.com/apache/spark/pull/30488#issuecomment-733536612










[GitHub] [spark] SparkQA commented on pull request #30478: [SPARK-33525][SQL] Update hive-service-rpc to 3.1.2

2020-11-25 Thread GitBox


SparkQA commented on pull request #30478:
URL: https://github.com/apache/spark/pull/30478#issuecomment-733537096


   **[Test build #131755 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131755/testReport)** for PR 30478 at commit [`43d90ca`](https://github.com/apache/spark/commit/43d90cafaf0aa4c8c4a355070d5c71008f6f3ea9).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30440: [SPARK-33496][SQL]Improve error message of ANSI explicit cast

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #30440:
URL: https://github.com/apache/spark/pull/30440#issuecomment-733510495










[GitHub] [spark] SparkQA commented on pull request #30492: [SPARK-33545][CORE] Support Fallback Storage during Worker decommission

2020-11-25 Thread GitBox


SparkQA commented on pull request #30492:
URL: https://github.com/apache/spark/pull/30492#issuecomment-733536865


   **[Test build #131753 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131753/testReport)** for PR 30492 at commit [`025d9aa`](https://github.com/apache/spark/commit/025d9aadc49663521cb558237ee6b33a8f21f1e6).






[GitHub] [spark] SparkQA commented on pull request #30440: [SPARK-33496][SQL]Improve error message of ANSI explicit cast

2020-11-25 Thread GitBox


SparkQA commented on pull request #30440:
URL: https://github.com/apache/spark/pull/30440#issuecomment-733537202


   **[Test build #131756 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131756/testReport)** for PR 30440 at commit [`e762162`](https://github.com/apache/spark/commit/e762162311e04c20bb06f9a4735514547050b832).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30478: [SPARK-33525][SQL] Update hive-service-rpc to 3.1.2

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #30478:
URL: https://github.com/apache/spark/pull/30478#issuecomment-733513051










[GitHub] [spark] AmplabJenkins commented on pull request #30470: [SPARK-33495][BUILD] Remove commons-logging.jar's dependency

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30470:
URL: https://github.com/apache/spark/pull/30470#issuecomment-733537538










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30470: [SPARK-33495][BUILD] Remove commons-logging.jar's dependency

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #30470:
URL: https://github.com/apache/spark/pull/30470#issuecomment-733537538










[GitHub] [spark] AmplabJenkins commented on pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30421:
URL: https://github.com/apache/spark/pull/30421#issuecomment-733539495










[GitHub] [spark] SparkQA commented on pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-11-25 Thread GitBox


SparkQA commented on pull request #29950:
URL: https://github.com/apache/spark/pull/29950#issuecomment-733542143


   **[Test build #131729 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131729/testReport)** for PR 29950 at commit [`bbaae3e`](https://github.com/apache/spark/commit/bbaae3ed9b96ca517ec5435838ac37feb578c959).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #30496: [SPARK-33547][SQL] Add usage of typed literal in doc

2020-11-25 Thread GitBox


SparkQA commented on pull request #30496:
URL: https://github.com/apache/spark/pull/30496#issuecomment-733542921


   **[Test build #131752 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131752/testReport)** for PR 30496 at commit [`e3de389`](https://github.com/apache/spark/commit/e3de389280d41c1f8c30569c512a71abff90799b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #30496: [SPARK-33547][SQL] Add usage of typed literal in doc

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30496:
URL: https://github.com/apache/spark/pull/30496#issuecomment-733543113










[GitHub] [spark] AmplabJenkins commented on pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #29950:
URL: https://github.com/apache/spark/pull/29950#issuecomment-733543105










[GitHub] [spark] SparkQA commented on pull request #30472: [WIP][SPARK-32221] Avoid possible errors due to incorrect file size or type supplied in spark conf.

2020-11-25 Thread GitBox


SparkQA commented on pull request #30472:
URL: https://github.com/apache/spark/pull/30472#issuecomment-733543655


   **[Test build #131757 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131757/testReport)** for PR 30472 at commit [`534f2ff`](https://github.com/apache/spark/commit/534f2ffc3aa6f019e9c3f85b5f7d35be92c0c379).






[GitHub] [spark] SparkQA commented on pull request #30483: [WIP][SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2020-11-25 Thread GitBox


SparkQA commented on pull request #30483:
URL: https://github.com/apache/spark/pull/30483#issuecomment-733544254


   **[Test build #131754 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131754/testReport)** for PR 30483 at commit [`8bba51a`](https://github.com/apache/spark/commit/8bba51a2c65393e92a494a9539064d94ad24ec50).
* This patch **fails Java style tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `case class TruncateTable(`






[GitHub] [spark] AmplabJenkins commented on pull request #30483: [WIP][SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30483:
URL: https://github.com/apache/spark/pull/30483#issuecomment-733544301










[GitHub] [spark] viirya commented on a change in pull request #24173: [SPARK-27237][SS] Introduce State schema validation among query restart

2020-11-25 Thread GitBox


viirya commented on a change in pull request #24173:
URL: https://github.com/apache/spark/pull/24173#discussion_r530163299



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -1294,6 +1294,13 @@ object SQLConf {
       .createWithDefault(
         "org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider")
 
+  val STATE_SCHEMA_CHECK_ENABLED =
+    buildConf("spark.sql.streaming.stateStore.stateSchemaCheck")
+      .doc("When true, Spark will validate the state schema against schema on existing state and " +
+        "fail query if it's incompatible.")
+      .booleanConf

Review comment:
   .version("3.1.0")?

##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreCoordinator.scala
##
@@ -150,6 +172,25 @@ private class StateStoreCoordinator(override val rpcEnv: RpcEnv)
         storeIdsToRemove.mkString(", "))
       context.reply(true)
 
+    case ValidateSchema(providerId, keySchema, valueSchema, checkEnabled) =>
+      // normalize partition ID to validate only once for one state operator
+      val newProviderId = StateStoreProviderId.withNoPartitionInformation(providerId)
+
+      val result = schemaValidated.getOrElseUpdate(newProviderId, {
+        val checker = new StateSchemaCompatibilityChecker(newProviderId, hadoopConf)
+
+        // regardless of configuration, we check compatibility to at least write schema file
+        // if necessary
+        val ret = Try(checker.check(keySchema, valueSchema)).toEither.fold(Some(_), _ => None)

Review comment:
   If I remember correctly, we don't recommend using `Try` in Spark?
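
   For illustration only, here is a minimal sketch of how the same `Option[Throwable]` result could be produced with a plain `try`/`catch` and `NonFatal` instead of `Try(...).toEither.fold(Some(_), _ => None)`. The object name, the string-based `check` stand-in, and `validate` are hypothetical and are not taken from the patch; the real `checker.check` takes `StructType` arguments.

```scala
import scala.util.control.NonFatal

object ValidateWithoutTry {
  // Hypothetical stand-in for checker.check(keySchema, valueSchema):
  // returns normally when the schemas match and throws otherwise.
  def check(storedSchema: String, newSchema: String): Unit =
    if (storedSchema != newSchema) {
      throw new IllegalStateException(
        s"Incompatible schema: stored '$storedSchema', provided '$newSchema'")
    }

  // Capture a non-fatal failure as Some(exception); None means the check passed.
  // Fatal errors (for example OutOfMemoryError) still propagate, as they would with Try.
  def validate(storedSchema: String, newSchema: String): Option[Throwable] =
    try {
      check(storedSchema, newSchema)
      None
    } catch {
      case NonFatal(e) => Some(e)
    }

  def main(args: Array[String]): Unit = {
    println(validate("a INT", "a INT"))
    println(validate("a INT", "a STRING").map(_.getMessage))
  }
}
```

   Whether this form is preferable is a style decision for the reviewers; the sketch only shows that the two forms carry the same information.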

##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreCoordinator.scala
##
@@ -150,6 +172,25 @@ private class StateStoreCoordinator(override val rpcEnv: RpcEnv)
         storeIdsToRemove.mkString(", "))
       context.reply(true)
 
+    case ValidateSchema(providerId, keySchema, valueSchema, checkEnabled) =>
+      // normalize partition ID to validate only once for one state operator
+      val newProviderId = StateStoreProviderId.withNoPartitionInformation(providerId)
+

Review comment:
   I think you already do `withNoPartitionInformation` for the `providerId` before you call `validateSchema`?

##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateSchemaCompatibilityChecker.scala
##
@@ -0,0 +1,120 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.Path
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.execution.streaming.CheckpointFileManager
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.{StructField, StructType}
+
+case class StateSchemaNotCompatible(message: String) extends Exception(message)
+
+class StateSchemaCompatibilityChecker(
+    providerId: StateStoreProviderId,
+    hadoopConf: Configuration) extends Logging {
+
+  private val storeCpLocation = providerId.storeId.storeCheckpointLocation()
+  private val fm = CheckpointFileManager.create(storeCpLocation, hadoopConf)
+  private val schemaFileLocation = schemaFile(storeCpLocation)
+
+  fm.mkdirs(schemaFileLocation.getParent)
+
+  def check(keySchema: StructType, valueSchema: StructType): Unit = {
+    if (fm.exists(schemaFileLocation)) {
+      logDebug(s"Schema file for provider $providerId exists. Comparing with provided schema.")
+      val (storedKeySchema, storedValueSchema) = readSchemaFile()
+
+      def fieldCompatible(fieldOld: StructField, fieldNew: StructField): Boolean = {
+        // compatibility for nullable
+        // - same: OK
+        // - non-nullable -> nullable: OK
+        // - nullable -> non-nullable: Not compatible
+        (fieldOld.dataType == fieldNew.dataType) &&
+          ((fieldOld.nullable == fieldNew.nullable) ||
+            (!fieldOld.nullable && fieldNew.nullable))
+      }
+
+      def schemaCompatible(schemaOld: StructType, schemaNew: StructType): Boolean = {
+(schem
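
The quoted hunk is cut off at this point in the archive. Independently of the missing part, the nullability rule stated in the `fieldCompatible` comments can be exercised in isolation. The sketch below is an assumption-laden illustration: `Field` is a hypothetical simplified stand-in for Spark's `StructField`, with the data type reduced to a plain string, and the object name is invented for this example.

```scala
object NullabilityCompat {
  // Hypothetical simplified field; Spark's StructField carries a DataType, not a String.
  final case class Field(name: String, dataType: String, nullable: Boolean)

  // Same rule as in the quoted hunk: data types must match, and nullability
  // may stay the same or widen from non-nullable to nullable, never narrow.
  def fieldCompatible(fieldOld: Field, fieldNew: Field): Boolean =
    (fieldOld.dataType == fieldNew.dataType) &&
      ((fieldOld.nullable == fieldNew.nullable) ||
        (!fieldOld.nullable && fieldNew.nullable))

  def main(args: Array[String]): Unit = {
    val strict = Field("count", "long", nullable = false)
    val relaxed = Field("count", "long", nullable = true)
    println(fieldCompatible(strict, relaxed)) // non-nullable -> nullable: true
    println(fieldCompatible(relaxed, strict)) // nullable -> non-nullable: false
  }
}
```

As a side observation, the nullability part of the condition is logically equivalent to `!fieldOld.nullable || fieldNew.nullable`; the longer form in the patch mirrors the three cases listed in its comment.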

[GitHub] [spark] SparkQA commented on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-11-25 Thread GitBox


SparkQA commented on pull request #28026:
URL: https://github.com/apache/spark/pull/28026#issuecomment-733547825


   **[Test build #131736 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131736/testReport)** for PR 28026 at commit [`25ec746`](https://github.com/apache/spark/commit/25ec746753f29acd5e248d03db48211a3876a7c1).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #30483: [WIP][SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2020-11-25 Thread GitBox


SparkQA removed a comment on pull request #30483:
URL: https://github.com/apache/spark/pull/30483#issuecomment-733536900


   **[Test build #131754 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131754/testReport)** for PR 30483 at commit [`8bba51a`](https://github.com/apache/spark/commit/8bba51a2c65393e92a494a9539064d94ad24ec50).






[GitHub] [spark] SparkQA removed a comment on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-11-25 Thread GitBox


SparkQA removed a comment on pull request #28026:
URL: https://github.com/apache/spark/pull/28026#issuecomment-733467808


   **[Test build #131736 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131736/testReport)** for PR 28026 at commit [`25ec746`](https://github.com/apache/spark/commit/25ec746753f29acd5e248d03db48211a3876a7c1).






[GitHub] [spark] SparkQA removed a comment on pull request #30496: [SPARK-33547][SQL] Add usage of typed literal in doc

2020-11-25 Thread GitBox


SparkQA removed a comment on pull request #30496:
URL: https://github.com/apache/spark/pull/30496#issuecomment-733537090


   **[Test build #131752 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131752/testReport)** for PR 30496 at commit [`e3de389`](https://github.com/apache/spark/commit/e3de389280d41c1f8c30569c512a71abff90799b).






[GitHub] [spark] SparkQA removed a comment on pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-11-25 Thread GitBox


SparkQA removed a comment on pull request #29950:
URL: https://github.com/apache/spark/pull/29950#issuecomment-733450335


   **[Test build #131729 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131729/testReport)** for PR 29950 at commit [`bbaae3e`](https://github.com/apache/spark/commit/bbaae3ed9b96ca517ec5435838ac37feb578c959).






[GitHub] [spark] SparkQA commented on pull request #30472: [SPARK-32221][k8s] Avoid possible errors due to incorrect file size or type supplied in spark conf.

2020-11-25 Thread GitBox


SparkQA commented on pull request #30472:
URL: https://github.com/apache/spark/pull/30472#issuecomment-733550292


   **[Test build #131757 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131757/testReport)** for PR 30472 at commit [`534f2ff`](https://github.com/apache/spark/commit/534f2ffc3aa6f019e9c3f85b5f7d35be92c0c379).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #30472: [SPARK-32221][k8s] Avoid possible errors due to incorrect file size or type supplied in spark conf.

2020-11-25 Thread GitBox


SparkQA removed a comment on pull request #30472:
URL: https://github.com/apache/spark/pull/30472#issuecomment-733543655


   **[Test build #131757 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131757/testReport)** for PR 30472 at commit [`534f2ff`](https://github.com/apache/spark/commit/534f2ffc3aa6f019e9c3f85b5f7d35be92c0c379).






[GitHub] [spark] SparkQA commented on pull request #30480: [SPARK-32921][SHUFFLE][test-maven][test-hadoop2.7] MapOutputTracker extensions to support push-based shuffle

2020-11-25 Thread GitBox


SparkQA commented on pull request #30480:
URL: https://github.com/apache/spark/pull/30480#issuecomment-733551692


   **[Test build #131710 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131710/testReport)** for PR 30480 at commit [`cc1c077`](https://github.com/apache/spark/commit/cc1c077cdd3e808f97d2025ffab7545fce58c067).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #30480: [SPARK-32921][SHUFFLE][test-maven][test-hadoop2.7] MapOutputTracker extensions to support push-based shuffle

2020-11-25 Thread GitBox


SparkQA removed a comment on pull request #30480:
URL: https://github.com/apache/spark/pull/30480#issuecomment-733397123


   **[Test build #131710 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131710/testReport)** for PR 30480 at commit [`cc1c077`](https://github.com/apache/spark/commit/cc1c077cdd3e808f97d2025ffab7545fce58c067).






[GitHub] [spark] wangyum commented on pull request #30483: [WIP][SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2020-11-25 Thread GitBox


wangyum commented on pull request #30483:
URL: https://github.com/apache/spark/pull/30483#issuecomment-733556681


   @LuciferYang It would be great if we had some benchmark numbers.






[GitHub] [spark] luluorta commented on a change in pull request #30289: [SPARK-33141][SQL] Capture SQL configs when creating permanent views

2020-11-25 Thread GitBox


luluorta commented on a change in pull request #30289:
URL: https://github.com/apache/spark/pull/30289#discussion_r530134316



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
##
@@ -361,11 +379,38 @@ object ViewHelper {
 }
   }
 
+  /**
+   * Convert the view query SQL configs in `properties`.
+   */
+  private def generateQuerySQLConfigs(conf: SQLConf): Map[String, String] = {
+    val modifiedConfs = conf.getAllConfs.filter { case (k, _) =>
+      conf.isModifiable(k) && !isConfigBlacklisted(k)
+    }
+    val props = new mutable.HashMap[String, String]
+    if (modifiedConfs.nonEmpty) {
+      val confJson = compact(render(JsonProtocol.mapToJson(modifiedConfs)))
+      props.put(VIEW_QUERY_SQL_CONFIGS, confJson)

Review comment:
   Thanks for pointing this out. Splitting a large value string into small 
chunks seems to be a Hive-specific solution, so I changed it to store one config 
per table property entry, each with a "view.sqlConfig." prefix.
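
   For illustration only, the one-config-per-property layout described above 
could look roughly like the sketch below. The object and method names are 
hypothetical and are not taken from the PR; only the "view.sqlConfig." prefix 
comes from this comment.

```scala
// A minimal sketch, assuming captured configs are plain String -> String pairs.
// `ViewSqlConfigProps`, `toProperties` and `fromProperties` are hypothetical names.
object ViewSqlConfigProps {
  // Prefix quoted in the review comment above.
  val Prefix = "view.sqlConfig."

  // Store each captured config as its own table property entry.
  def toProperties(confs: Map[String, String]): Map[String, String] =
    confs.map { case (k, v) => (Prefix + k) -> v }

  // Recover the captured configs, ignoring unrelated table properties.
  def fromProperties(props: Map[String, String]): Map[String, String] =
    props.collect {
      case (k, v) if k.startsWith(Prefix) => k.stripPrefix(Prefix) -> v
    }
}
```

   With this layout no single property value grows with the number of 
captured configs, so no chunking of a large JSON string is needed.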








[GitHub] [spark] aminh73 commented on pull request #27380: [SPARK-30669][SS] Introduce AdmissionControl APIs for StructuredStreaming

2020-11-25 Thread GitBox


aminh73 commented on pull request #27380:
URL: https://github.com/apache/spark/pull/27380#issuecomment-733557893


   We need to use `maxOffsetsPerTrigger` in the Kafka source with 
`Trigger.Once()`, but it seems to read `allAvailable` in Spark 3. Is there a way 
to achieve rate limiting in this situation?






[GitHub] [spark] cloud-fan commented on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-11-25 Thread GitBox


cloud-fan commented on pull request #28026:
URL: https://github.com/apache/spark/pull/28026#issuecomment-733560590


   retest this please






[GitHub] [spark] Ngone51 commented on a change in pull request #30312: [SPARK-32917][SHUFFLE][CORE] Adds support for executors to push shuffle blocks after successful map task completion

2020-11-25 Thread GitBox


Ngone51 commented on a change in pull request #30312:
URL: https://github.com/apache/spark/pull/30312#discussion_r530141471



##
File path: core/src/main/scala/org/apache/spark/shuffle/ShuffleBlockPusher.scala
##
@@ -0,0 +1,462 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.shuffle
+
+import java.io.File
+import java.net.ConnectException
+import java.nio.ByteBuffer
+import java.util.concurrent.ExecutorService
+
+import scala.collection.mutable.{ArrayBuffer, HashMap, HashSet, Queue}
+
+import com.google.common.base.Throwables
+
+import org.apache.spark.{ShuffleDependency, SparkConf, SparkEnv}
+import org.apache.spark.annotation.Since
+import org.apache.spark.internal.Logging
+import org.apache.spark.internal.config._
+import org.apache.spark.launcher.SparkLauncher
+import org.apache.spark.network.buffer.{FileSegmentManagedBuffer, ManagedBuffer, NioManagedBuffer}
+import org.apache.spark.network.netty.SparkTransportConf
+import org.apache.spark.network.shuffle.BlockFetchingListener
+import org.apache.spark.network.shuffle.ErrorHandler.BlockPushErrorHandler
+import org.apache.spark.network.util.TransportConf
+import org.apache.spark.shuffle.ShuffleBlockPusher._
+import org.apache.spark.storage.{BlockId, BlockManagerId, ShufflePushBlockId}
+import org.apache.spark.util.{ThreadUtils, Utils}
+
+/**
+ * Used for pushing shuffle blocks to remote shuffle services when push shuffle is enabled.
+ * When push shuffle is enabled, it is created after the shuffle writer finishes writing the shuffle
+ * file and initiates the block push process.
+ *
+ * @param dataFile         mapper generated shuffle data file
+ * @param partitionLengths array of shuffle block size so we can tell shuffle block
+ *                         boundaries within the shuffle file
+ * @param dep              shuffle dependency to get shuffle ID and the location of remote shuffle
+ *                         services to push local shuffle blocks
+ * @param partitionId      map index of the shuffle map task
+ * @param conf             spark configuration
+ */
+@Since("3.1.0")
+private[spark] class ShuffleBlockPusher(
+    dataFile: File,
+    partitionLengths: Array[Long],
+    dep: ShuffleDependency[_, _, _],
+    partitionId: Int,
+    conf: SparkConf) extends Logging {

Review comment:
   Would passing these fields to `initiateBlockPush()` be enough?
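
   As a side note on the `partitionLengths` parameter documented above: block 
boundaries can be derived from the lengths as a running sum. The sketch below 
is only an illustration under that assumption; `BlockBoundariesSketch` and 
`blockBoundaries` are hypothetical names, not code from the PR.

```scala
// A minimal sketch, assuming blocks are laid out back to back in the data file.
// `BlockBoundariesSketch` and `blockBoundaries` are hypothetical names.
object BlockBoundariesSketch {
  // Returns (offset, length) for each shuffle block, where block i starts at
  // the sum of the lengths of blocks 0 until i.
  def blockBoundaries(partitionLengths: Array[Long]): Seq[(Long, Long)] = {
    // scanLeft yields n + 1 running totals; zip drops the final total.
    val offsets = partitionLengths.scanLeft(0L)(_ + _)
    offsets.zip(partitionLengths).toSeq
  }
}
```

   Because the helper takes the lengths as an argument, it works equally well 
whether the values are held as constructor fields or passed in per call.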

##
File path: core/src/test/scala/org/apache/spark/shuffle/ShuffleBlockPusherSuite.scala
##
@@ -0,0 +1,248 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.shuffle
+
+import java.io.File
+import java.net.ConnectException
+import java.nio.ByteBuffer
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.mockito.{Mock, MockitoAnnotations}
+import org.mockito.Answers.RETURNS_SMART_NULLS
+import org.mockito.ArgumentMatchers.any
+import org.mockito.Mockito._
+import org.mockito.invocation.InvocationOnMock
+import org.scalatest.BeforeAndAfterEach
+
+import org.apache.spark._
+import org.apache.spark.network.buffer.ManagedBuffer
+import org.apache.spark.network.shuffle.{BlockFetchingListener, BlockStoreClient}
+import org.apache.spark.network.shuffle.ErrorHandler.BlockPushErrorHandler
+import org.apache.spark.network.util.TransportConf
+import org.apache.spark.serializer.JavaSerializer
+import org.apache.spark.storage._
+
+class ShuffleBlockPusherSuite extends SparkFunSuite with BeforeAnd

[GitHub] [spark] SparkQA commented on pull request #30486: [SPARK-33530][CORE] Support --archives and spark.archives option natively

2020-11-25 Thread GitBox


SparkQA commented on pull request #30486:
URL: https://github.com/apache/spark/pull/30486#issuecomment-733564041


   **[Test build #131743 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131743/testReport)** for PR 30486 at commit [`15d8ed5`](https://github.com/apache/spark/commit/15d8ed51d99e57403aae3272d9975b96e7735ee2).
* This patch **fails SparkR unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #30442: [SPARK-33498][SQL] Datetime parsing should fail if the input string can't be parsed, or the pattern string is invalid

2020-11-25 Thread GitBox


SparkQA commented on pull request #30442:
URL: https://github.com/apache/spark/pull/30442#issuecomment-733564271


   **[Test build #131739 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131739/testReport)** for PR 30442 at commit [`140081d`](https://github.com/apache/spark/commit/140081d593f443054cf62776b20443e981b4c0c5).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #30486: [SPARK-33530][CORE] Support --archives and spark.archives option natively

2020-11-25 Thread GitBox


SparkQA removed a comment on pull request #30486:
URL: https://github.com/apache/spark/pull/30486#issuecomment-733488641


   **[Test build #131743 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131743/testReport)** for PR 30486 at commit [`15d8ed5`](https://github.com/apache/spark/commit/15d8ed51d99e57403aae3272d9975b96e7735ee2).






[GitHub] [spark] SparkQA removed a comment on pull request #30442: [SPARK-33498][SQL] Datetime parsing should fail if the input string can't be parsed, or the pattern string is invalid

2020-11-25 Thread GitBox


SparkQA removed a comment on pull request #30442:
URL: https://github.com/apache/spark/pull/30442#issuecomment-733472027


   **[Test build #131739 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131739/testReport)** for PR 30442 at commit [`140081d`](https://github.com/apache/spark/commit/140081d593f443054cf62776b20443e981b4c0c5).






[GitHub] [spark] zero323 commented on pull request #30413: [SPARK-33252][PYTHON][DOCS] Migration to NumPy documentation style in MLlib (pyspark.mllib.*)

2020-11-25 Thread GitBox


zero323 commented on pull request #30413:
URL: https://github.com/apache/spark/pull/30413#issuecomment-733564631


   > @zero323, the only reason I have chosen `cols` over `*cols` is that it felt 
odd to document this as just `*cols : tuple`, and I thought `` cols: str, 
:class:`Column` ... `` is clearer.
   
   That's true. I am also concerned about how far we can go without effectively 
duplicating annotations in natural language.






[GitHub] [spark] zero323 commented on pull request #30413: [SPARK-33252][PYTHON][DOCS] Migration to NumPy documentation style in MLlib (pyspark.mllib.*)

2020-11-25 Thread GitBox


zero323 commented on pull request #30413:
URL: https://github.com/apache/spark/pull/30413#issuecomment-733565168


   Thanks everyone!






[GitHub] [spark] cloud-fan commented on pull request #30493: [SPARK-33549][SQL] Remove configuration spark.sql.legacy.allowCastNumericToTimestamp

2020-11-25 Thread GitBox


cloud-fan commented on pull request #30493:
URL: https://github.com/apache/spark/pull/30493#issuecomment-733565341


   GA passed, merging to master, thanks!






[GitHub] [spark] cloud-fan closed pull request #30493: [SPARK-33549][SQL] Remove configuration spark.sql.legacy.allowCastNumericToTimestamp

2020-11-25 Thread GitBox


cloud-fan closed pull request #30493:
URL: https://github.com/apache/spark/pull/30493


   






[GitHub] [spark] zero323 commented on pull request #30382: [SPARK-33457][PYTHON] Adjust mypy configuration

2020-11-25 Thread GitBox


zero323 commented on pull request #30382:
URL: https://github.com/apache/spark/pull/30382#issuecomment-733566086


   Thanks @Fokko  and @HyukjinKwon!






[GitHub] [spark] AmplabJenkins commented on pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #29950:
URL: https://github.com/apache/spark/pull/29950#issuecomment-733567816










[GitHub] [spark] AmplabJenkins commented on pull request #30484: [SPARK-33532][SQL] Remove unreachable branch in SpecificParquetRecordReaderBase.initialize method

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30484:
URL: https://github.com/apache/spark/pull/30484#issuecomment-733567809










[GitHub] [spark] AmplabJenkins commented on pull request #30442: [SPARK-33498][SQL] Datetime parsing should fail if the input string can't be parsed, or the pattern string is invalid

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30442:
URL: https://github.com/apache/spark/pull/30442#issuecomment-733567807










[GitHub] [spark] AmplabJenkins commented on pull request #30212: [SPARK-33308][SQL] Refract current grouping analytics

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30212:
URL: https://github.com/apache/spark/pull/30212#issuecomment-733567835










[GitHub] [spark] AmplabJenkins commented on pull request #29066: [SPARK-23889][SQL] DataSourceV2: required sorting and clustering for writes

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #29066:
URL: https://github.com/apache/spark/pull/29066#issuecomment-733567814










[GitHub] [spark] AmplabJenkins commented on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #28026:
URL: https://github.com/apache/spark/pull/28026#issuecomment-733567824










[GitHub] [spark] AmplabJenkins commented on pull request #30472: [SPARK-32221][k8s] Avoid possible errors due to incorrect file size or type supplied in spark conf.

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30472:
URL: https://github.com/apache/spark/pull/30472#issuecomment-733567813










[GitHub] [spark] dongjoon-hyun commented on pull request #30486: [SPARK-33530][CORE] Support --archives and spark.archives option natively

2020-11-25 Thread GitBox


dongjoon-hyun commented on pull request #30486:
URL: https://github.com/apache/spark/pull/30486#issuecomment-733567925


   Retest this please.






[GitHub] [spark] AmplabJenkins commented on pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30421:
URL: https://github.com/apache/spark/pull/30421#issuecomment-733567804










[GitHub] [spark] AmplabJenkins commented on pull request #30486: [SPARK-33530][CORE] Support --archives and spark.archives option natively

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30486:
URL: https://github.com/apache/spark/pull/30486#issuecomment-733567818










[GitHub] [spark] AmplabJenkins commented on pull request #30480: [SPARK-32921][SHUFFLE][test-maven][test-hadoop2.7] MapOutputTracker extensions to support push-based shuffle

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30480:
URL: https://github.com/apache/spark/pull/30480#issuecomment-733567823










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30486: [SPARK-33530][CORE] Support --archives and spark.archives option natively

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #30486:
URL: https://github.com/apache/spark/pull/30486#issuecomment-733510493










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30212: [SPARK-33308][SQL] Refract current grouping analytics

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #30212:
URL: https://github.com/apache/spark/pull/30212#issuecomment-733567815










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30472: [SPARK-32221][k8s] Avoid possible errors due to incorrect file size or type supplied in spark conf.

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #30472:
URL: https://github.com/apache/spark/pull/30472#issuecomment-733567813










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30480: [SPARK-32921][SHUFFLE][test-maven][test-hadoop2.7] MapOutputTracker extensions to support push-based shuffle

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #30480:
URL: https://github.com/apache/spark/pull/30480#issuecomment-733567808










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29950: [SPARK-32945][SQL] Avoid collapsing projects if reaching max allowed common exprs

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #29950:
URL: https://github.com/apache/spark/pull/29950#issuecomment-733543105










[GitHub] [spark] dongjoon-hyun commented on pull request #30486: [SPARK-33530][CORE] Support --archives and spark.archives option natively

2020-11-25 Thread GitBox


dongjoon-hyun commented on pull request #30486:
URL: https://github.com/apache/spark/pull/30486#issuecomment-733568045


   The R failure is a flaky one.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #28026:
URL: https://github.com/apache/spark/pull/28026#issuecomment-733567803










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30484: [SPARK-33532][SQL] Remove unreachable branch in SpecificParquetRecordReaderBase.initialize method

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #30484:
URL: https://github.com/apache/spark/pull/30484#issuecomment-733510794










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30442: [SPARK-33498][SQL] Datetime parsing should fail if the input string can't be parsed, or the pattern string is invalid

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #30442:
URL: https://github.com/apache/spark/pull/30442#issuecomment-733567807










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29066: [SPARK-23889][SQL] DataSourceV2: required sorting and clustering for writes

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #29066:
URL: https://github.com/apache/spark/pull/29066#issuecomment-733510492










[GitHub] [spark] SparkQA commented on pull request #30492: [SPARK-33545][CORE] Support Fallback Storage during Worker decommission

2020-11-25 Thread GitBox


SparkQA commented on pull request #30492:
URL: https://github.com/apache/spark/pull/30492#issuecomment-733568379


   **[Test build #131758 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131758/testReport)**
 for PR 30492 at commit 
[`0b5cd67`](https://github.com/apache/spark/commit/0b5cd6734df903d736bf14f8e38223fe8db7a5f0).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30421: [SPARK-33474][SQL] Support TypeConstructed partition spec value

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #30421:
URL: https://github.com/apache/spark/pull/30421#issuecomment-733539495










[GitHub] [spark] SparkQA commented on pull request #30486: [SPARK-33530][CORE] Support --archives and spark.archives option natively

2020-11-25 Thread GitBox


SparkQA commented on pull request #30486:
URL: https://github.com/apache/spark/pull/30486#issuecomment-733568538


   **[Test build #131759 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131759/testReport)**
 for PR 30486 at commit 
[`15d8ed5`](https://github.com/apache/spark/commit/15d8ed51d99e57403aae3272d9975b96e7735ee2).






[GitHub] [spark] SparkQA commented on pull request #30483: [WIP][SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2020-11-25 Thread GitBox


SparkQA commented on pull request #30483:
URL: https://github.com/apache/spark/pull/30483#issuecomment-733568636


   **[Test build #131760 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131760/testReport)**
 for PR 30483 at commit 
[`3e2db1a`](https://github.com/apache/spark/commit/3e2db1a1cdec3df84e9ceb9cc64860b7f88c6720).






[GitHub] [spark] SparkQA commented on pull request #30472: [SPARK-32221][k8s] Avoid possible errors due to incorrect file size or type supplied in spark conf.

2020-11-25 Thread GitBox


SparkQA commented on pull request #30472:
URL: https://github.com/apache/spark/pull/30472#issuecomment-733568741


   **[Test build #131761 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131761/testReport)**
 for PR 30472 at commit 
[`2d676cd`](https://github.com/apache/spark/commit/2d676cdeaed89ffe89f00de41350e51122b559d1).






[GitHub] [spark] AmplabJenkins commented on pull request #30493: [SPARK-33549][SQL] Remove configuration spark.sql.legacy.allowCastNumericToTimestamp

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30493:
URL: https://github.com/apache/spark/pull/30493#issuecomment-733568871










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30483: [WIP][SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #30483:
URL: https://github.com/apache/spark/pull/30483#issuecomment-733544301










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30493: [SPARK-33549][SQL] Remove configuration spark.sql.legacy.allowCastNumericToTimestamp

2020-11-25 Thread GitBox


AmplabJenkins removed a comment on pull request #30493:
URL: https://github.com/apache/spark/pull/30493#issuecomment-733568871










[GitHub] [spark] SparkQA commented on pull request #28026: [SPARK-31257][SQL] Unify create table syntax

2020-11-25 Thread GitBox


SparkQA commented on pull request #28026:
URL: https://github.com/apache/spark/pull/28026#issuecomment-733569564


   **[Test build #131762 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131762/testReport)**
 for PR 28026 at commit 
[`25ec746`](https://github.com/apache/spark/commit/25ec746753f29acd5e248d03db48211a3876a7c1).






[GitHub] [spark] SparkQA commented on pull request #30484: [SPARK-33532][SQL] Remove unreachable branch in SpecificParquetRecordReaderBase.initialize method

2020-11-25 Thread GitBox


SparkQA commented on pull request #30484:
URL: https://github.com/apache/spark/pull/30484#issuecomment-733571617


   **[Test build #131733 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131733/testReport)**
 for PR 30484 at commit 
[`a0859e7`](https://github.com/apache/spark/commit/a0859e75e68d0fd248e39a648cdd715c8356a132).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29893: [SPARK-32976][SQL]Support column list in INSERT statement

2020-11-25 Thread GitBox


SparkQA commented on pull request #29893:
URL: https://github.com/apache/spark/pull/29893#issuecomment-733571838


   **[Test build #131750 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131750/testReport)**
 for PR 29893 at commit 
[`475e790`](https://github.com/apache/spark/commit/475e790f7eeae80c4123f93c74952eb24f056897).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #29893: [SPARK-32976][SQL]Support column list in INSERT statement

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #29893:
URL: https://github.com/apache/spark/pull/29893#issuecomment-733572124










[GitHub] [spark] AmplabJenkins commented on pull request #30484: [SPARK-33532][SQL] Remove unreachable branch in SpecificParquetRecordReaderBase.initialize method

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30484:
URL: https://github.com/apache/spark/pull/30484#issuecomment-733572799










[GitHub] [spark] viirya opened a new pull request #30497: [SPARK-33540][SQL] Subexpression elimination for interpreted predicate

2020-11-25 Thread GitBox


viirya opened a new pull request #30497:
URL: https://github.com/apache/spark/pull/30497


   
   
   ### What changes were proposed in this pull request?
   
   
   
   This patch proposes to support subexpression elimination for interpreted 
predicates.
   
   
   ### Why are the changes needed?
   
   
   Similar to interpreted projection, there are cases where the codegen 
predicate cannot be used, e.g. an overly complex schema or a non-codegen 
expression. When the same subexpressions occur repeatedly within a predicate 
expression, performance suffers because each occurrence is re-computed. 
Interpreted predicates should support subexpression elimination, as 
interpreted projection already does.
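
   To illustrate the idea only, here is a minimal Python sketch of subexpression elimination in an interpreted evaluator. It is not Spark's implementation: the `Col`, `Lit`, and `BinOp` classes, the `evaluate` function, and the `stats` counter are all hypothetical names made up for this example. The assumption is that expressions compare structurally, so a per-row cache keyed by the expression itself can return a previously computed result.

```python
from dataclasses import dataclass


# Hypothetical expression nodes; frozen dataclasses give structural
# equality and hashing, so equal subtrees map to the same cache key.
@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class BinOp:
    op: str  # one of "+", "*", ">", "and"
    left: object
    right: object


OPS = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
    ">": lambda a, b: a > b,
    "and": lambda a, b: a and b,  # both operands are evaluated eagerly here
}


def evaluate(expr, row, cache=None, stats=None):
    """Interpret expr against row; reuse results when a per-row cache is given."""
    if cache is not None and expr in cache:
        return cache[expr]
    if isinstance(expr, Col):
        result = row[expr.name]
    elif isinstance(expr, Lit):
        result = expr.value
    else:
        if stats is not None:
            stats["binops"] += 1  # count how many operators were really computed
        left = evaluate(expr.left, row, cache, stats)
        right = evaluate(expr.right, row, cache, stats)
        result = OPS[expr.op](left, right)
    if cache is not None:
        cache[expr] = result
    return result


# (a + b) * 2 appears twice in the predicate.
shared = BinOp("*", BinOp("+", Col("a"), Col("b")), Lit(2))
predicate = BinOp("and", BinOp(">", shared, Lit(10)), BinOp(">", Lit(100), shared))
row = {"a": 3, "b": 4}

plain = {"binops": 0}
cached = {"binops": 0}
plain_result = evaluate(predicate, row, None, plain)   # no elimination
cached_result = evaluate(predicate, row, {}, cached)   # fresh cache for this row
print(plain_result, plain["binops"], cached_result, cached["binops"])
```

   The cache has to be reset for every input row, since the cached values depend on the row being evaluated.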
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   No, this doesn't change user behavior.
   
   ### How was this patch tested?
   
   
   Unit test and benchmark.






[GitHub] [spark] AmplabJenkins commented on pull request #30493: [SPARK-33549][SQL] Remove configuration spark.sql.legacy.allowCastNumericToTimestamp

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30493:
URL: https://github.com/apache/spark/pull/30493#issuecomment-733575869










[GitHub] [spark] SparkQA commented on pull request #30497: [SPARK-33540][SQL] Subexpression elimination for interpreted predicate

2020-11-25 Thread GitBox


SparkQA commented on pull request #30497:
URL: https://github.com/apache/spark/pull/30497#issuecomment-733576388


   **[Test build #131763 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131763/testReport)**
 for PR 30497 at commit 
[`56c09ca`](https://github.com/apache/spark/commit/56c09cab1180a7ee676e69e435aed555fb98e501).






[GitHub] [spark] SparkQA commented on pull request #30440: [SPARK-33496][SQL]Improve error message of ANSI explicit cast

2020-11-25 Thread GitBox


SparkQA commented on pull request #30440:
URL: https://github.com/apache/spark/pull/30440#issuecomment-733576716


   **[Test build #131764 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131764/testReport)**
 for PR 30440 at commit 
[`bb5b219`](https://github.com/apache/spark/commit/bb5b219e3a337a4dbdf6923c19734c2643acd1fa).






[GitHub] [spark] SparkQA commented on pull request #30212: [SPARK-33308][SQL] Refract current grouping analytics

2020-11-25 Thread GitBox


SparkQA commented on pull request #30212:
URL: https://github.com/apache/spark/pull/30212#issuecomment-733576983


   **[Test build #131730 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131730/testReport)**
 for PR 30212 at commit 
[`516afc5`](https://github.com/apache/spark/commit/516afc56aa0c818ffb2d5d0915025a3a6387c8ff).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #30472: [SPARK-32221][k8s] Avoid possible errors due to incorrect file size or type supplied in spark conf.

2020-11-25 Thread GitBox


SparkQA commented on pull request #30472:
URL: https://github.com/apache/spark/pull/30472#issuecomment-733577201


   **[Test build #131761 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131761/testReport)**
 for PR 30472 at commit 
[`2d676cd`](https://github.com/apache/spark/commit/2d676cdeaed89ffe89f00de41350e51122b559d1).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29122: [SPARK-32320][PYSPARK] Remove mutable default arguments

2020-11-25 Thread GitBox


SparkQA commented on pull request #29122:
URL: https://github.com/apache/spark/pull/29122#issuecomment-733577461


   **[Test build #131765 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131765/testReport)**
 for PR 29122 at commit 
[`c75cd57`](https://github.com/apache/spark/commit/c75cd574abfbaa6b9f5f99079a1ba25418d55720).






[GitHub] [spark] AmplabJenkins commented on pull request #30472: [SPARK-32221][k8s] Avoid possible errors due to incorrect file size or type supplied in spark conf.

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30472:
URL: https://github.com/apache/spark/pull/30472#issuecomment-733577468










[GitHub] [spark] SparkQA commented on pull request #30408: [SPARK-33477][SQL] Hive Metastore support filter by date type

2020-11-25 Thread GitBox


SparkQA commented on pull request #30408:
URL: https://github.com/apache/spark/pull/30408#issuecomment-733578116


   **[Test build #131751 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131751/testReport)**
 for PR 30408 at commit 
[`29c489a`](https://github.com/apache/spark/commit/29c489ad5f753aaa3551489655073c9f6fc7b0c6).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #30212: [SPARK-33308][SQL] Refract current grouping analytics

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30212:
URL: https://github.com/apache/spark/pull/30212#issuecomment-733578781










[GitHub] [spark] AmplabJenkins commented on pull request #30408: [SPARK-33477][SQL] Hive Metastore support filter by date type

2020-11-25 Thread GitBox


AmplabJenkins commented on pull request #30408:
URL: https://github.com/apache/spark/pull/30408#issuecomment-733579150










[GitHub] [spark] LuciferYang commented on pull request #30483: [WIP][SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2020-11-25 Thread GitBox


LuciferYang commented on pull request #30483:
URL: https://github.com/apache/spark/pull/30483#issuecomment-733579848


   @wangyum this is a very good suggestion ~






[GitHub] [spark] yaooqinn commented on pull request #29893: [SPARK-32976][SQL]Support column list in INSERT statement

2020-11-25 Thread GitBox


yaooqinn commented on pull request #29893:
URL: https://github.com/apache/spark/pull/29893#issuecomment-733580257


   retest this please





