[GitHub] [spark] SparkQA commented on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled

2021-11-29 Thread GitBox


SparkQA commented on pull request #34750:
URL: https://github.com/apache/spark/pull/34750#issuecomment-982372984


   **[Test build #145756 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145756/testReport)** for PR 34750 at commit [`dabc3c5`](https://github.com/apache/spark/commit/dabc3c5d854b0e7f22eb0f65005e3bc4a3b83016).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled

2021-11-29 Thread GitBox


SparkQA commented on pull request #34750:
URL: https://github.com/apache/spark/pull/34750#issuecomment-982370952


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50221/
   



[GitHub] [spark] beliefer opened a new pull request #31847: [SPARK-34755][SQL] Support the utils for transform number format

2021-11-29 Thread GitBox


beliefer opened a new pull request #31847:
URL: https://github.com/apache/spark/pull/31847


   ### What changes were proposed in this pull request?
   Data type formatting functions such as `to_number` and `to_char` are very useful. 
Several mainstream databases support this syntax:
   **PostgreSQL**
   **Oracle**
   **Vertica**
   **Redshift**
   **DB2**
   **Teradata**
   **Snowflake**
   **Exasol**
   **Phoenix**
   **Singlestore**
   **Intersystems**
   The implementations differ considerably between `PostgreSQL`, `Oracle`, and 
`Phoenix`.
   This PR follows Oracle's implementation of `to_number`, which verifies its 
parameters strictly, and Phoenix's implementation of `to_number`, which uses 
BigDecimal.
   
   
   
   This PR supports the following patterns for numeric formatting:
   
   Pattern | Description
   -- | --
   9 | Value with the specified number of digits
   0 | Value with leading zeros
   . (period) | Decimal point
   , (comma) | Group (thousand) separator
   S | Sign anchored to number (uses locale)
   $ | Value with a leading dollar sign
   D | Decimal point (uses locale)
   G | Group separator (uses locale)
   
   
   ### Why are the changes needed?
   to_number and to_char are very useful for formatted currency to number 
conversion.
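
To make the pattern table above concrete, here is a minimal sketch in plain Python of strict, pattern-driven parsing. It is not the code in this PR: the signature, the supported subset (`9`, `0`, `.`, `,` and a leading `$`), and the error behaviour are all assumptions made for illustration, and Python's `Decimal` stands in for BigDecimal. The locale-aware patterns (`S`, `D`, `G`) are left out, and `9` and `0` are treated alike.

```python
from decimal import Decimal


def to_number(text: str, fmt: str) -> Decimal:
    """Parse `text` against a small, assumed subset of the numeric patterns."""
    # A leading '$' in the format requires a leading '$' in the input.
    if fmt.startswith("$"):
        if not text.startswith("$"):
            raise ValueError("expected a leading dollar sign")
        text, fmt = text[1:], fmt[1:]
    # Strict verification of the format itself.
    if any(ch not in "90.," for ch in fmt) or fmt.count(".") > 1:
        raise ValueError(f"unsupported format: {fmt!r}")
    # Group separators are only accepted when the format declares them.
    if "," in text and "," not in fmt:
        raise ValueError("unexpected group separator")
    fmt_int, _, fmt_frac = fmt.replace(",", "").partition(".")
    txt_int, _, txt_frac = text.replace(",", "").partition(".")
    digits = txt_int + txt_frac
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a number: {text!r}")
    # The input may not carry more digits than the format has placeholders.
    if len(txt_int) > len(fmt_int) or len(txt_frac) > len(fmt_frac):
        raise ValueError("more digits than the format allows")
    return Decimal(f"{txt_int or '0'}.{txt_frac or '0'}")
```

Under these assumptions, `to_number("$1,234.50", "$9,999.99")` yields a decimal equal to 1234.50, while `to_number("12345", "999")` is rejected because the input has more digits than the format declares.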
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   
   ### How was this patch tested?
   Jenkins test
   



[GitHub] [spark] SparkQA removed a comment on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


SparkQA removed a comment on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982354267


   **[Test build #145755 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145755/testReport)** for PR 34751 at commit [`fb1d62a`](https://github.com/apache/spark/commit/fb1d62af938107a43c6140c06c1681117ba965dd).



[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


SparkQA commented on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982364378


   **[Test build #145755 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145755/testReport)** for PR 34751 at commit [`fb1d62a`](https://github.com/apache/spark/commit/fb1d62af938107a43c6140c06c1681117ba965dd).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] yangwwei edited a comment on pull request #34672: [SPARK-37394][CORE] Skip registering with ESS if a customized shuffle manager is configured

2021-11-29 Thread GitBox


yangwwei edited a comment on pull request #34672:
URL: https://github.com/apache/spark/pull/34672#issuecomment-982363102


   @mridulm , @attilapiros , @tgravescs  could you pls help to review the 
changes again?
   Per @attilapiros 's suggestion, I have added a method in the ShuffleManager 
trait and this is allowed to be overridden when needed. The default returns 
true, so there is no behavior change. I have also updated the "[How this was 
tested](https://github.com/apache/spark/pull/34672#issue-785084737)" with more 
details about the tests I've done locally.
   
   Note, this is still an "incompatible" change to the 3rd party shuffle 
service implementations. Adding a method with a default implementation in a 
trait will require a re-compile of the RSS's server/client library.
   
   Thanks!
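
The shape of the change described above (a method with a default that returns true, which a custom shuffle manager may override) can be sketched in plain Python as an analogy to the Scala trait. The comment does not name the method, so `registers_with_external_shuffle_service`, `RemoteShuffleManager`, and `should_register_with_ess` are hypothetical names. The re-compile caveat is specific to JVM binary compatibility and has no Python counterpart.

```python
class ShuffleManager:
    """Stand-in for the Scala trait; only the new hook is modelled."""

    def registers_with_external_shuffle_service(self) -> bool:
        # Hypothetical method name. The default keeps the existing behaviour.
        return True


class SortShuffleManager(ShuffleManager):
    """Built-in style manager: inherits the default and keeps registering."""


class RemoteShuffleManager(ShuffleManager):
    """Hypothetical third-party manager that opts out of ESS registration."""

    def registers_with_external_shuffle_service(self) -> bool:
        return False


def should_register_with_ess(manager: ShuffleManager, ess_enabled: bool) -> bool:
    # Register only when the service is enabled and the manager wants it.
    return ess_enabled and manager.registers_with_external_shuffle_service()
```

With this layout, existing managers need no code change, and only an implementation that overrides the hook skips registration.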



[GitHub] [spark] yangwwei commented on pull request #34672: [SPARK-37394][CORE] Skip registering with ESS if a customized shuffle manager is configured

2021-11-29 Thread GitBox


yangwwei commented on pull request #34672:
URL: https://github.com/apache/spark/pull/34672#issuecomment-982363102


   @mridulm , @attilapiros , @tgravescs  could you pls help to review the 
changes again?
   Per @attilapiros 's suggestion, I have added a method in the ShuffleManager 
trait and this is allowed to be overridden when needed. The default returns 
true, so there is no behavior change. I have also updated the "how this was 
tested" with more details about the tests I've done locally.
   
   Note, this is still an "incompatible" change to the 3rd party shuffle 
service implementations. Adding a method with a default implementation in a 
trait will require a re-compile of the RSS's server/client library.
   
   Thanks!



[GitHub] [spark] guiyanakuang commented on a change in pull request #34743: [SPARK-37488][CORE] When `TaskLocation` is `HDFSCacheTaskLocation` or `HostTaskLocation`, check if executor is alive on the

2021-11-29 Thread GitBox


guiyanakuang commented on a change in pull request #34743:
URL: https://github.com/apache/spark/pull/34743#discussion_r759001312



##
File path: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala
##
@@ -291,6 +291,21 @@ class TaskSetManagerSuite
     assert(manager.resourceOffer("execA", "host1", ANY)._1.get.index === 0)
   }
 
+  test("skip unsatisfiable locality levels (the case TaskLocation is HostTaskLocation)") {

Review comment:
   Thanks for the reminder, I'll add it later





[GitHub] [spark] sadikovi commented on a change in pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source

2021-11-29 Thread GitBox


sadikovi commented on a change in pull request #34596:
URL: https://github.com/apache/spark/pull/34596#discussion_r75908



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala
##
@@ -160,6 +169,17 @@ class CSVInferSchema(val options: CSVOptions) extends Serializable {
   private def tryParseDouble(field: String): DataType = {
     if ((allCatch opt field.toDouble).isDefined || isInfOrNan(field)) {
       DoubleType
+    } else {
+      tryParseTimestampNTZ(field)
+    }
+  }
+
+  private def tryParseTimestampNTZ(field: String): DataType = {
+    // We can only parse the value as TimestampNTZType if it does not have zone-offset or
+    // time-zone component and can be parsed with the timestamp formatter.
+    // Otherwise, it is likely to be a timestamp with timezone.
+    if ((allCatch opt timestampNTZFormatter.parseWithoutTimeZone(field, true)).isDefined) {

Review comment:
   Could you elaborate a bit more? Thanks. 
   
   My understanding was that the config indicated whether the output of parsing 
should be treated as TimestampNTZ or TimestampLTZ.
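
The fallback order in the hunk above (try double first, then a timestamp without a zone, otherwise something else) can be illustrated with a small standalone Python sketch. It is a simplification and not Spark's inference code: `infer_type` and the returned type names are made up for illustration, `datetime.fromisoformat` stands in for the timestamp formatter, and the real chain has more steps than shown here.

```python
from datetime import datetime


def infer_type(field: str) -> str:
    """Infer a column type for one CSV field (simplified, assumed chain)."""
    # Step 1: anything float() accepts is a double (this includes inf/nan).
    try:
        float(field)
        return "double"
    except ValueError:
        pass
    # Step 2: try to parse the field as an ISO-8601 timestamp.
    try:
        parsed = datetime.fromisoformat(field)
    except ValueError:
        return "string"
    # Step 3: only a value without a zone offset qualifies as NTZ.
    return "timestamp_ntz" if parsed.tzinfo is None else "timestamp_ltz"
```

Here the decision between the two timestamp types depends only on whether the parsed value carries a zone offset, which mirrors the code comment in the hunk rather than any configuration setting.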





[GitHub] [spark] SparkQA commented on pull request #34656: [SPARK-37376][SQL] Introduce a new DataSource V2 interface HasPartitionKey

2021-11-29 Thread GitBox


SparkQA commented on pull request #34656:
URL: https://github.com/apache/spark/pull/34656#issuecomment-982357268


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50225/
   



[GitHub] [spark] LuciferYang commented on a change in pull request #34743: [SPARK-37488][CORE] When `TaskLocation` is `HDFSCacheTaskLocation` or `HostTaskLocation`, check if executor is alive on the h

2021-11-29 Thread GitBox


LuciferYang commented on a change in pull request #34743:
URL: https://github.com/apache/spark/pull/34743#discussion_r758996701



##
File path: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala
##
@@ -291,6 +291,21 @@ class TaskSetManagerSuite
     assert(manager.resourceOffer("execA", "host1", ANY)._1.get.index === 0)
   }
 
+  test("skip unsatisfiable locality levels (the case TaskLocation is HostTaskLocation)") {

Review comment:
   Maybe we should add `SPARK-37488` to the test name as prefix





[GitHub] [spark] SparkQA commented on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluster environmen

2021-11-29 Thread GitBox


SparkQA commented on pull request #34635:
URL: https://github.com/apache/spark/pull/34635#issuecomment-982356662


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50223/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34568: [SPARK-37287][SQL] Pull out dynamic partition and bucket sort from FileFormatWriter

2021-11-29 Thread GitBox


AmplabJenkins removed a comment on pull request #34568:
URL: https://github.com/apache/spark/pull/34568#issuecomment-982354923


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145740/
   



[GitHub] [spark] summaryzb commented on a change in pull request #34749: [SPARK-37493][CORE] show driver's gc time and duration time in executors page

2021-11-29 Thread GitBox


summaryzb commented on a change in pull request #34749:
URL: https://github.com/apache/spark/pull/34749#discussion_r758995760



##
File path: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
##
@@ -89,7 +89,44 @@ private[spark] class AppStatusStore(
     } else {
       base
     }
-    filtered.asScala.map(_.info).filter(_.id != FALLBACK_BLOCK_MANAGER_ID.executorId).toSeq
+    filtered.asScala.map(_.info)
+      .filter(_.id != FALLBACK_BLOCK_MANAGER_ID.executorId)
+      .map(replaceExec).toSeq
+  }
+
+  def replaceExec(origin: v1.ExecutorSummary): v1.ExecutorSummary = {

Review comment:
   make it private





[GitHub] [spark] SparkQA commented on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluster environmen

2021-11-29 Thread GitBox


SparkQA commented on pull request #34635:
URL: https://github.com/apache/spark/pull/34635#issuecomment-982355152


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50224/
   



[GitHub] [spark] AmplabJenkins commented on pull request #34568: [SPARK-37287][SQL] Pull out dynamic partition and bucket sort from FileFormatWriter

2021-11-29 Thread GitBox


AmplabJenkins commented on pull request #34568:
URL: https://github.com/apache/spark/pull/34568#issuecomment-982354923


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145740/
   



[GitHub] [spark] SparkQA commented on pull request #34753: [SPARK-37494][SQL] Unify v1 and v2 options output of `SHOW CREATE TABLE` command

2021-11-29 Thread GitBox


SparkQA commented on pull request #34753:
URL: https://github.com/apache/spark/pull/34753#issuecomment-982354219


   **[Test build #145754 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145754/testReport)** for PR 34753 at commit [`97e52cb`](https://github.com/apache/spark/commit/97e52cb1856fda7148fd05f553ec090a256527b5).



[GitHub] [spark] SparkQA commented on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled

2021-11-29 Thread GitBox


SparkQA commented on pull request #34750:
URL: https://github.com/apache/spark/pull/34750#issuecomment-982354345


   **[Test build #145756 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145756/testReport)** for PR 34750 at commit [`dabc3c5`](https://github.com/apache/spark/commit/dabc3c5d854b0e7f22eb0f65005e3bc4a3b83016).



[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


SparkQA commented on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982354267


   **[Test build #145755 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145755/testReport)** for PR 34751 at commit [`fb1d62a`](https://github.com/apache/spark/commit/fb1d62af938107a43c6140c06c1681117ba965dd).



[GitHub] [spark] SparkQA removed a comment on pull request #34568: [SPARK-37287][SQL] Pull out dynamic partition and bucket sort from FileFormatWriter

2021-11-29 Thread GitBox


SparkQA removed a comment on pull request #34568:
URL: https://github.com/apache/spark/pull/34568#issuecomment-982226440


   **[Test build #145740 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145740/testReport)** for PR 34568 at commit [`7fddb62`](https://github.com/apache/spark/commit/7fddb62cc5d6f93b9525162fdf4bd4602903e248).



[GitHub] [spark] AmplabJenkins commented on pull request #34752: [SPARK][STREAMING] minRatePerPartition should be multiplied with secsPerBatch

2021-11-29 Thread GitBox


AmplabJenkins commented on pull request #34752:
URL: https://github.com/apache/spark/pull/34752#issuecomment-982353991


   Can one of the admins verify this patch?



[GitHub] [spark] SparkQA commented on pull request #34568: [SPARK-37287][SQL] Pull out dynamic partition and bucket sort from FileFormatWriter

2021-11-29 Thread GitBox


SparkQA commented on pull request #34568:
URL: https://github.com/apache/spark/pull/34568#issuecomment-982353646


   **[Test build #145740 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145740/testReport)** for PR 34568 at commit [`7fddb62`](https://github.com/apache/spark/commit/7fddb62cc5d6f93b9525162fdf4bd4602903e248).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled

2021-11-29 Thread GitBox


AmplabJenkins removed a comment on pull request #34750:
URL: https://github.com/apache/spark/pull/34750#issuecomment-982353320


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145750/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34723: [SPARK-37492][SQL] Optimize Orc test code with withAllNativeOrcReaders

2021-11-29 Thread GitBox


AmplabJenkins removed a comment on pull request #34723:
URL: https://github.com/apache/spark/pull/34723#issuecomment-982353318


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145738/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-clus

2021-11-29 Thread GitBox


AmplabJenkins removed a comment on pull request #34635:
URL: https://github.com/apache/spark/pull/34635#issuecomment-982353319


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145752/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-11-29 Thread GitBox


AmplabJenkins removed a comment on pull request #34213:
URL: https://github.com/apache/spark/pull/34213#issuecomment-982353317


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50219/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


AmplabJenkins removed a comment on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982353202







[GitHub] [spark] AmplabJenkins commented on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluster envi

2021-11-29 Thread GitBox


AmplabJenkins commented on pull request #34635:
URL: https://github.com/apache/spark/pull/34635#issuecomment-982353319


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145752/
   



[GitHub] [spark] AmplabJenkins commented on pull request #34723: [SPARK-37492][SQL] Optimize Orc test code with withAllNativeOrcReaders

2021-11-29 Thread GitBox


AmplabJenkins commented on pull request #34723:
URL: https://github.com/apache/spark/pull/34723#issuecomment-982353318


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145738/
   



[GitHub] [spark] AmplabJenkins commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


AmplabJenkins commented on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982353203







[GitHub] [spark] AmplabJenkins commented on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-11-29 Thread GitBox


AmplabJenkins commented on pull request #34213:
URL: https://github.com/apache/spark/pull/34213#issuecomment-982353317


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50219/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled

2021-11-29 Thread GitBox


AmplabJenkins commented on pull request #34750:
URL: https://github.com/apache/spark/pull/34750#issuecomment-982353320


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145750/
   





[GitHub] [spark] summaryzb commented on a change in pull request #34749: [SPARK-37493][CORE] show driver's gc time and duration time in executors page

2021-11-29 Thread GitBox


summaryzb commented on a change in pull request #34749:
URL: https://github.com/apache/spark/pull/34749#discussion_r758993284



##
File path: core/src/main/scala/org/apache/spark/metrics/ExecutorMetricType.scala
##
@@ -137,7 +138,9 @@ case object GarbageCollectionMetrics extends ExecutorMetricType with Logging {
 
   override private[spark] def getMetricValues(memoryManager: MemoryManager): Array[Long] = {
     val gcMetrics = new Array[Long](names.length) // minorCount, minorTime, majorCount, majorTime
-    ManagementFactory.getGarbageCollectorMXBeans.asScala.foreach { mxBean =>
+    val mxBeans = ManagementFactory.getGarbageCollectorMXBeans.asScala
+    gcMetrics(4) = mxBeans.map(_.getCollectionTime).sum

Review comment:
   In the common case they are the same, but when nonBuiltInCollectors are in use,
   `gcMetrics(1) + gcMetrics(3)` is zero even though the actual total GC time is 
not zero.
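
   The difference described above can be sketched as follows. This is a minimal illustration, not Spark's implementation: `GcTimeSketch`, `summarize`, and the two name sets are hypothetical stand-ins, and the sketch assumes Scala 2.13+ for `scala.jdk.CollectionConverters`. It shows that a collector whose name is in neither set contributes to the total but not to the minor or major time.

```scala
import java.lang.management.ManagementFactory
import scala.jdk.CollectionConverters._

object GcTimeSketch {
  // Illustrative collector names only (assumption); the real lists live in Spark's metrics code.
  val youngGenNames: Set[String] =
    Set("Copy", "PS Scavenge", "ParNew", "G1 Young Generation")
  val oldGenNames: Set[String] =
    Set("MarkSweepCompact", "PS MarkSweep", "ConcurrentMarkSweep", "G1 Old Generation")

  /** Returns (minorTime, majorTime, totalTime) in ms from (collectorName, collectionTime) pairs. */
  def summarize(beans: Seq[(String, Long)]): (Long, Long, Long) = {
    val minor = beans.collect { case (name, time) if youngGenNames(name) => time }.sum
    val major = beans.collect { case (name, time) if oldGenNames(name) => time }.sum
    // The total counts every collector, whether or not its name is recognised.
    val total = beans.map(_._2).sum
    (minor, major, total)
  }

  def main(args: Array[String]): Unit = {
    // getCollectionTime may return -1 when undefined, so clamp it to zero.
    val beans = ManagementFactory.getGarbageCollectorMXBeans.asScala.toSeq
      .map(b => (b.getName, math.max(b.getCollectionTime, 0L)))
    println(summarize(beans))
  }
}
```

   With a collector outside both sets, the first two components stay at zero while the third does not, which is the case the comment describes.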







[GitHub] [spark] LuciferYang commented on a change in pull request #34745: [WIP][SPARK-37391][SQL] JdbcConnectionProvider must indicate if it needs lock

2021-11-29 Thread GitBox


LuciferYang commented on a change in pull request #34745:
URL: https://github.com/apache/spark/pull/34745#discussion_r758992879



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProviderSuite.scala
##
@@ -68,12 +69,20 @@ class ConnectionProviderSuite
   override def canHandle(driver: Driver, options: Map[String, String]): Boolean = true
   override def getConnection(driver: Driver, options: Map[String, String]): Connection =
 throw new RuntimeException()
+  override def needsModifySecurityConfiguration(

Review comment:
   Should we add a default implementation to `JdbcConnectionProvider` (returning 
false) and override it only where necessary?
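
   The suggestion above could look roughly like this. It is a sketch under assumptions: `ConnectionProviderSketch` is a hypothetical, simplified stand-in for `JdbcConnectionProvider`, the method signature is guessed from the truncated diff line, and the `keytab` option check is invented for illustration.

```scala
// Hypothetical, simplified stand-in for the provider interface; not Spark's actual API.
trait ConnectionProviderSketch {
  def name: String

  // Default implementation: most providers do not touch the security configuration.
  def needsModifySecurityConfiguration(options: Map[String, String]): Boolean = false
}

// Inherits the default and needs no extra code.
object BasicProviderSketch extends ConnectionProviderSketch {
  val name: String = "basic"
}

// Overrides the default only because it actually needs different behaviour.
object SecureProviderSketch extends ConnectionProviderSketch {
  val name: String = "secure"

  override def needsModifySecurityConfiguration(options: Map[String, String]): Boolean =
    options.contains("keytab") // illustrative condition only
}
```

   With a default in the trait, test doubles such as the one in `ConnectionProviderSuite` would not need to override the method at all.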
   
   







[GitHub] [spark] dchvn commented on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-11-29 Thread GitBox


dchvn commented on pull request #34213:
URL: https://github.com/apache/spark/pull/34213#issuecomment-982350720


   Thanks! @itholic @Yikun @HyukjinKwon 





[GitHub] [spark] SparkQA commented on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-11-29 Thread GitBox


SparkQA commented on pull request #34213:
URL: https://github.com/apache/spark/pull/34213#issuecomment-982349531


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50219/
   





[GitHub] [spark] SparkQA commented on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled

2021-11-29 Thread GitBox


SparkQA commented on pull request #34750:
URL: https://github.com/apache/spark/pull/34750#issuecomment-982348283


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50221/
   





[GitHub] [spark] MaxGekk commented on pull request #31847: [SPARK-34755][SQL] Support the utils for transform number format

2021-11-29 Thread GitBox


MaxGekk commented on pull request #31847:
URL: https://github.com/apache/spark/pull/31847#issuecomment-982348272


   Since these functions are broadly used in other systems, I believe it makes 
sense to support them in Spark, which would make migration to Spark easier. 
@beliefer Could you re-open this PR, please? @cloud-fan Do you agree?





[GitHub] [spark] huaxingao commented on pull request #34060: [SPARK-36850][SQL] Migrate CreateTableStatement to v2 command framework

2021-11-29 Thread GitBox


huaxingao commented on pull request #34060:
URL: https://github.com/apache/spark/pull/34060#issuecomment-982348104


   Thanks!





[GitHub] [spark] SparkQA removed a comment on pull request #34723: [SPARK-37492][SQL] Optimize Orc test code with withAllNativeOrcReaders

2021-11-29 Thread GitBox


SparkQA removed a comment on pull request #34723:
URL: https://github.com/apache/spark/pull/34723#issuecomment-982219652


   **[Test build #145738 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145738/testReport)**
 for PR 34723 at commit 
[`96bb4f5`](https://github.com/apache/spark/commit/96bb4f5a2c4e06d902062369bb26c05b47b4d8a5).





[GitHub] [spark] SparkQA commented on pull request #34723: [SPARK-37492][SQL] Optimize Orc test code with withAllNativeOrcReaders

2021-11-29 Thread GitBox


SparkQA commented on pull request #34723:
URL: https://github.com/apache/spark/pull/34723#issuecomment-982344119


   **[Test build #145738 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145738/testReport)**
 for PR 34723 at commit 
[`96bb4f5`](https://github.com/apache/spark/commit/96bb4f5a2c4e06d902062369bb26c05b47b4d8a5).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA removed a comment on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluster en

2021-11-29 Thread GitBox


SparkQA removed a comment on pull request #34635:
URL: https://github.com/apache/spark/pull/34635#issuecomment-982330102


   **[Test build #145752 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145752/testReport)**
 for PR 34635 at commit 
[`c69cb6e`](https://github.com/apache/spark/commit/c69cb6e9720e8e1f443aa2c52492c5639928435a).





[GitHub] [spark] SparkQA commented on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluster environmen

2021-11-29 Thread GitBox


SparkQA commented on pull request #34635:
URL: https://github.com/apache/spark/pull/34635#issuecomment-982342499


   **[Test build #145752 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145752/testReport)**
 for PR 34635 at commit 
[`c69cb6e`](https://github.com/apache/spark/commit/c69cb6e9720e8e1f443aa2c52492c5639928435a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] Peng-Lei opened a new pull request #34753: [SPARK-37494][SQL] Unify v1 and v2 options output of `SHOW CREATE TABLE` command

2021-11-29 Thread GitBox


Peng-Lei opened a new pull request #34753:
URL: https://github.com/apache/spark/pull/34753


   ### What changes were proposed in this pull request?
   1. Change the v1 `SHOW CREATE TABLE` command so that its options output 
matches v2, e.g. `'key' = 'value'`.
   2. Sort the options output.
   3. Sort the properties output.
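
   The output format listed above can be sketched as follows. This is only an illustration under assumptions: `ShowCreateOptionsSketch` and `renderOptions` are hypothetical names, the surrounding `OPTIONS (...)` layout is a guess, and quote escaping is deliberately left out.

```scala
object ShowCreateOptionsSketch {
  /** Renders options sorted by key, each as 'key' = 'value' (no quote escaping in this sketch). */
  def renderOptions(options: Map[String, String]): String =
    options.toSeq
      .sortBy(_._1) // deterministic output regardless of map iteration order
      .map { case (key, value) => s"'$key' = '$value'" }
      .mkString("OPTIONS (\n  ", ",\n  ", ")")
}
```

   Sorting by key keeps the generated DDL stable between runs, which also makes it easier to compare in tests.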
   
   
   ### Why are the changes needed?
   To match the v2 behavior, as discussed in 
[#comments](https://github.com/apache/spark/pull/34719#discussion_r758156350)
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. The `SHOW CREATE TABLE` output now lists properties and options in sorted 
order, and options are rendered as `'key' = 'value'`.
   
   ### How was this patch tested?
   Add test case.
   





[GitHub] [spark] SparkQA removed a comment on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


SparkQA removed a comment on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982329931


   **[Test build #145751 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145751/testReport)**
 for PR 34751 at commit 
[`8129ab1`](https://github.com/apache/spark/commit/8129ab1dd1e2e384caf5851a616e3b255953e9b0).





[GitHub] [spark] SparkQA removed a comment on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled

2021-11-29 Thread GitBox


SparkQA removed a comment on pull request #34750:
URL: https://github.com/apache/spark/pull/34750#issuecomment-982324974


   **[Test build #145750 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145750/testReport)**
 for PR 34750 at commit 
[`2bbb84d`](https://github.com/apache/spark/commit/2bbb84d30157e9b855f81c5448c326b34b302937).





[GitHub] [spark] HyukjinKwon commented on a change in pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source

2021-11-29 Thread GitBox


HyukjinKwon commented on a change in pull request #34596:
URL: https://github.com/apache/spark/pull/34596#discussion_r758980984



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala
##
@@ -38,6 +39,13 @@ class CSVInferSchema(val options: CSVOptions) extends Serializable {
     legacyFormat = FAST_DATE_FORMAT,
     isParsing = true)
 
+  private val timestampNTZFormatter = TimestampFormatter(
+    options.timestampNTZFormatInRead,
+    options.zoneId,
+    legacyFormat = FAST_DATE_FORMAT,
+    isParsing = true,
+    forTimestampNTZ = true)

Review comment:
   I'd defer to @MaxGekk to review this part.







[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


SparkQA commented on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982338326


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50222/
   





[GitHub] [spark] HyukjinKwon commented on a change in pull request #34596: [SPARK-37326][SQL] Support TimestampNTZ in CSV data source

2021-11-29 Thread GitBox


HyukjinKwon commented on a change in pull request #34596:
URL: https://github.com/apache/spark/pull/34596#discussion_r758979898



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala
##
@@ -160,6 +169,17 @@ class CSVInferSchema(val options: CSVOptions) extends Serializable {
   private def tryParseDouble(field: String): DataType = {
     if ((allCatch opt field.toDouble).isDefined || isInfOrNan(field)) {
       DoubleType
+    } else {
+      tryParseTimestampNTZ(field)
+    }
+  }
+
+  private def tryParseTimestampNTZ(field: String): DataType = {
+    // We can only parse the value as TimestampNTZType if it does not have zone-offset or
+    // time-zone component and can be parsed with the timestamp formatter.
+    // Otherwise, it is likely to be a timestamp with timezone.
+    if ((allCatch opt timestampNTZFormatter.parseWithoutTimeZone(field, true)).isDefined) {

Review comment:
   Should we maybe skip the parsing if `SQLConf.get.timestampType` is set 
to `TIMESTAMP_LTZ`, since parsing is a non-trivial operation?
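
   One way to read this suggestion is as a short-circuit in front of the parse attempt. The sketch below is an assumption-laden illustration, not the PR's code: `TimestampInferSketch`, `infer`, and the `defaultIsLtz` flag are hypothetical, and `java.time`'s `ISO_LOCAL_DATE_TIME` stands in for Spark's timestamp formatter.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import scala.util.Try

object TimestampInferSketch {
  sealed trait InferredType
  case object TimestampNTZ extends InferredType
  // Placeholder for "continue with the remaining inference steps".
  case object NotNTZ extends InferredType

  // Stand-in formatter (assumption): accepts only date-times without a zone or offset.
  private val ntzFormat: DateTimeFormatter = DateTimeFormatter.ISO_LOCAL_DATE_TIME

  /** Attempts the NTZ parse only when the session default is not TIMESTAMP_LTZ. */
  def infer(field: String, defaultIsLtz: Boolean): InferredType =
    // `&&` short-circuits, so the parse is never run when defaultIsLtz is true.
    if (!defaultIsLtz && Try(LocalDateTime.parse(field, ntzFormat)).isSuccess) TimestampNTZ
    else NotNTZ
}
```

   Because the flag is checked first, fields are never parsed with the NTZ formatter when the default timestamp type is `TIMESTAMP_LTZ`.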







[GitHub] [spark] SparkQA commented on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled

2021-11-29 Thread GitBox


SparkQA commented on pull request #34750:
URL: https://github.com/apache/spark/pull/34750#issuecomment-982337997


   **[Test build #145750 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145750/testReport)**
 for PR 34750 at commit 
[`2bbb84d`](https://github.com/apache/spark/commit/2bbb84d30157e9b855f81c5448c326b34b302937).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] dongjoon-hyun commented on pull request #34656: [SPARK-37376][SQL] Introduce a new DataSource V2 interface HasPartitionKey

2021-11-29 Thread GitBox


dongjoon-hyun commented on pull request #34656:
URL: https://github.com/apache/spark/pull/34656#issuecomment-982335508


   Although it looks good to me, gentle ping once more, @cloud-fan @rdblue 
@viirya @huaxingao .





[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


SparkQA commented on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982335210


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50220/
   





[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


SparkQA commented on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982334949


   **[Test build #145751 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145751/testReport)**
 for PR 34751 at commit 
[`8129ab1`](https://github.com/apache/spark/commit/8129ab1dd1e2e384caf5851a616e3b255953e9b0).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] HyukjinKwon closed pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-11-29 Thread GitBox


HyukjinKwon closed pull request #34213:
URL: https://github.com/apache/spark/pull/34213


   





[GitHub] [spark] HyukjinKwon commented on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-11-29 Thread GitBox


HyukjinKwon commented on pull request #34213:
URL: https://github.com/apache/spark/pull/34213#issuecomment-982334604


   Merged to master.





[GitHub] [spark] dongjoon-hyun commented on pull request #34656: [SPARK-37376][SQL] Introduce a new DataSource V2 interface HasPartitionKey

2021-11-29 Thread GitBox


dongjoon-hyun commented on pull request #34656:
URL: https://github.com/apache/spark/pull/34656#issuecomment-982334173


   Thank you, @sunchao !





[GitHub] [spark] dongjoon-hyun commented on pull request #34264: [SPARK-36462][K8S] Add the ability to selectively disable watching or polling

2021-11-29 Thread GitBox


dongjoon-hyun commented on pull request #34264:
URL: https://github.com/apache/spark/pull/34264#issuecomment-982333916


   BTW, in general, I agree with your demands and requirements in this PR. The 
only concerns are:
   - better backward compatibility
   - the visibility of these configurations





[GitHub] [spark] sungpeo opened a new pull request #34752: [SPARK][STREAMING] minRatePerPartition should be multiplied with secsPerBatch

2021-11-29 Thread GitBox


sungpeo opened a new pull request #34752:
URL: https://github.com/apache/spark/pull/34752


   
   
   ### What changes were proposed in this pull request?
   
   
   `maxRatePerPartition` means "max messages per partition per second",
   but `minRatePerPartition` does not; it is treated as "max messages per partition per batch".
   
   `minRatePerPartition` should be multiplied by `secsPerBatch`.
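
   The unit mismatch can be illustrated with a small sketch. It assumes the per-batch cap for a partition is the larger of the scaled rate limit and the minimum rate; `RateLimitSketch` and both function names are hypothetical and are not taken from the Spark source.

```scala
object RateLimitSketch {
  /** Per-batch cap when the minimum is used as a plain per-batch count. */
  def capWithPerBatchMin(ratePerSec: Long, minRate: Long, secsPerBatch: Double): Long =
    math.max((secsPerBatch * ratePerSec).toLong, minRate)

  /** Per-batch cap when the minimum is a per-second rate scaled by the batch duration. */
  def capWithPerSecondMin(ratePerSec: Long, minRatePerSec: Long, secsPerBatch: Double): Long =
    math.max((secsPerBatch * ratePerSec).toLong, (secsPerBatch * minRatePerSec).toLong)
}
```

   For a 10-second batch with an estimated rate of 0 and a minimum of 5, the first variant allows 5 messages per batch, whereas the second allows 50, which matches a "per second" reading of the setting.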
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





[GitHub] [spark] dongjoon-hyun commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


dongjoon-hyun commented on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982332947


   Thank you for the review, @viirya.





[GitHub] [spark] SparkQA commented on pull request #34656: [SPARK-37376][SQL] Introduce a new DataSource V2 interface HasPartitionKey

2021-11-29 Thread GitBox


SparkQA commented on pull request #34656:
URL: https://github.com/apache/spark/pull/34656#issuecomment-982331806


   **[Test build #145753 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145753/testReport)**
 for PR 34656 at commit 
[`6b920e1`](https://github.com/apache/spark/commit/6b920e1e3109d6f2c37150cb2ffd168790a35d3e).





[GitHub] [spark] sunchao commented on pull request #34656: [SPARK-37376][SQL] Introduce a new DataSource V2 interface HasPartitionKey

2021-11-29 Thread GitBox


sunchao commented on pull request #34656:
URL: https://github.com/apache/spark/pull/34656#issuecomment-982331198


   @dongjoon-hyun done





[GitHub] [spark] HyukjinKwon closed pull request #34732: [SPARK-37291][PYSPARK][FOLLOWUP] PySpark create SparkSession should pass initialSessionOptions

2021-11-29 Thread GitBox


HyukjinKwon closed pull request #34732:
URL: https://github.com/apache/spark/pull/34732


   





[GitHub] [spark] HyukjinKwon closed pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable

2021-11-29 Thread GitBox


HyukjinKwon closed pull request #34746:
URL: https://github.com/apache/spark/pull/34746


   





[GitHub] [spark] HyukjinKwon commented on pull request #34732: [SPARK-37291][PYSPARK][FOLLOWUP] PySpark create SparkSession should pass initialSessionOptions

2021-11-29 Thread GitBox


HyukjinKwon commented on pull request #34732:
URL: https://github.com/apache/spark/pull/34732#issuecomment-982330696


   Merged to master!





[GitHub] [spark] HyukjinKwon commented on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable

2021-11-29 Thread GitBox


HyukjinKwon commented on pull request #34746:
URL: https://github.com/apache/spark/pull/34746#issuecomment-982330232


   Merged to master.





[GitHub] [spark] SparkQA commented on pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluster environmen

2021-11-29 Thread GitBox


SparkQA commented on pull request #34635:
URL: https://github.com/apache/spark/pull/34635#issuecomment-982330102


   **[Test build #145752 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145752/testReport)** for PR 34635 at commit [`c69cb6e`](https://github.com/apache/spark/commit/c69cb6e9720e8e1f443aa2c52492c5639928435a).





[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


SparkQA commented on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982329931


   **[Test build #145751 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145751/testReport)** for PR 34751 at commit [`8129ab1`](https://github.com/apache/spark/commit/8129ab1dd1e2e384caf5851a616e3b255953e9b0).





[GitHub] [spark] SparkQA removed a comment on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


SparkQA removed a comment on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982324903


   **[Test build #145749 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145749/testReport)** for PR 34751 at commit [`f845a10`](https://github.com/apache/spark/commit/f845a10f24f23a18ec230b3b8581e3bce866cc61).





[GitHub] [spark] AmplabJenkins commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


AmplabJenkins commented on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982329599


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145749/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


AmplabJenkins removed a comment on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982329599


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145749/
   





[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


SparkQA commented on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982329574


   **[Test build #145749 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145749/testReport)** for PR 34751 at commit [`f845a10`](https://github.com/apache/spark/commit/f845a10f24f23a18ec230b3b8581e3bce866cc61).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `class ExecutorPodsPollingSnapshotSource(`
 * `class ExecutorPodsWatchSnapshotSource(`





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #34264: [SPARK-36462][K8S] Add the ability to selectively disable watching or polling

2021-11-29 Thread GitBox


dongjoon-hyun commented on a change in pull request #34264:
URL: https://github.com/apache/spark/pull/34264#discussion_r758971094



##
File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/scheduler/cluster/k8s/ExecutorPodsWatchSnapshotSource.scala
##
@@ -22,24 +22,30 @@ import io.fabric8.kubernetes.api.model.Pod
 import io.fabric8.kubernetes.client.{KubernetesClient, Watcher, WatcherException}
 import io.fabric8.kubernetes.client.Watcher.Action
 
+import org.apache.spark.SparkConf
+import org.apache.spark.deploy.k8s.Config.KUBERNETES_EXECUTOR_ENABLE_API_WATCHER
 import org.apache.spark.deploy.k8s.Constants._
 import org.apache.spark.internal.Logging
 import org.apache.spark.util.Utils
 
 private[spark] class ExecutorPodsWatchSnapshotSource(
 snapshotsStore: ExecutorPodsSnapshotsStore,
-kubernetesClient: KubernetesClient) extends Logging {
+kubernetesClient: KubernetesClient,
+conf: SparkConf) extends Logging {

Review comment:
   For this one, I made a PR to provide better backward compatibility and to help downstream projects' `ExternalClusterManager` implementations. Since these classes have been unchanged since 2.4.0, I believe we can declare them a `stable developer API` and maintain them more carefully.
   - https://github.com/apache/spark/pull/34751







[GitHub] [spark] sunchao commented on a change in pull request #34635: [SPARK-37205][YARN] Introduce a new config 'spark.yarn.am.tokenConfRegex' to support renewing delegation tokens in a multi-cluste

2021-11-29 Thread GitBox


sunchao commented on a change in pull request #34635:
URL: https://github.com/apache/spark/pull/34635#discussion_r758970173



##
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
##
@@ -340,6 +344,40 @@ private[spark] class Client(
 amContainer.setTokens(ByteBuffer.wrap(serializedCreds))
   }
 
+  /**
+   * Set configurations sent from AM to RM for renewing delegation tokens.
+   */
+  private def setTokenConf(amContainer: ContainerLaunchContext): Unit = {
+// SPARK-37205: this regex is used to grep a list of configurations and send them to YARN RM
+// for fetching delegation tokens. See YARN-5910 for more details.
+// The feature is only supported in Hadoop 3.x and up, hence the check below.
+val regex = sparkConf.get(config.AM_SEND_TOKEN_CONF)
+if (regex.nonEmpty && VersionUtils.isHadoop3) {
+  logInfo(s"Processing token conf (spark.yarn.am.sendTokenConf) with regex $regex")
+  val dob = new DataOutputBuffer();
+  val copy = new Configuration(false);
+  copy.clear();
+  hadoopConf.asScala.foreach { entry =>
+if (entry.getKey.matches(regex.get)) {
+  copy.set(entry.getKey, entry.getValue)
+  logInfo(s"Captured key: ${entry.getKey} -> value: ${entry.getValue}")
+}
+  }
+  copy.write(dob);
+
+  // since this method was added in Hadoop 2.9 and 3.0, we use reflection here to avoid

Review comment:
   Got it. I added the change.
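
For readers following this hunk, the key-filtering step quoted above can be sketched in isolation. This is only a minimal sketch, assuming a plain Scala `Map` in place of Hadoop's `Configuration`; `TokenConfFilter`, `selectByRegex`, and the sample keys and values are hypothetical names chosen for illustration and are not part of the PR.

```scala
object TokenConfFilter {
  // Keep only the entries whose key matches the regex in full.
  // String.matches anchors the pattern to the whole key, which is the
  // same call the quoted diff uses on entry.getKey.
  def selectByRegex(
      conf: Map[String, String],
      regex: Option[String]): Map[String, String] = regex match {
    case Some(r) => conf.filter { case (key, _) => key.matches(r) }
    case None => Map.empty // no regex configured: nothing is selected
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical Hadoop-style settings, for illustration only.
    val hadoopConf = Map(
      "dfs.nameservices" -> "ns1,ns2",
      "dfs.ha.namenodes.ns1" -> "nn1,nn2",
      "yarn.resourcemanager.address" -> "rm:8032")
    val selected =
      selectByRegex(hadoopConf, Some("^dfs\\.(nameservices|ha\\.namenodes\\..*)$"))
    selected.foreach { case (k, v) => println(s"$k -> $v") }
  }
}
```

Because `matches` tests the whole key, a pattern such as `dfs\..*` selects `dfs.nameservices` but not a key that merely contains `dfs.` somewhere in the middle.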







[GitHub] [spark] cloud-fan commented on a change in pull request #34719: [SPARK-37381][SQL] Unify v1 and v2 SHOW CREATE TABLE tests

2021-11-29 Thread GitBox


cloud-fan commented on a change in pull request #34719:
URL: https://github.com/apache/spark/pull/34719#discussion_r758970183



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/ShowCreateTableSuite.scala
##
@@ -0,0 +1,141 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command.v2
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.execution.command
+
+/**
+ * The class contains tests for the `SHOW CREATE TABLE` command to check V2 table catalogs.
+ */
+class ShowCreateTableSuite extends command.ShowCreateTableSuiteBase with CommandSuiteBase {
+  test("SPARK-33898: SHOW CREATE TABLE AS SERDE") {
+val db = "ns1"
+val table = "tbl"
+withNamespaceAndTable(db, table) { t =>
+  spark.sql(s"CREATE TABLE $t (id bigint, data string) $defaultUsing")
+  val e = intercept[AnalysisException] {
+sql(s"SHOW CREATE TABLE $t AS SERDE")
+  }
+  assert(e.message.contains(s"SHOW CREATE TABLE AS SERDE is not supported for v2 tables."))
+}
+  }
+
+  test("CTAS") {
+val db = "ns1"
+val table = "ddl_test"
+withNamespaceAndTable(db, table) { t =>
+  sql(
+s"""CREATE TABLE $t
+   |$defaultUsing
+   |PARTITIONED BY (a)
+   |COMMENT 'This is a comment'
+   |TBLPROPERTIES ('a' = '1')
+   |AS SELECT 1 AS a, "foo" AS b
+ """.stripMargin
+  )
+  val showDDL = getShowCreateDDL(s"SHOW CREATE TABLE $t")
+  assert(showDDL === Array(
+s"CREATE TABLE $t (",
+"`a` INT,",
+"`b` STRING)",
+defaultUsing,
+"PARTITIONED BY (a)",
+"COMMENT 'This is a comment'",
+"TBLPROPERTIES(",
+"'a' = '1')"
+  ))
+}
+  }
+
+  test("SPARK-33898: SHOW CREATE TABLE") {
+val db = "ns1"
+val table = "tbl"
+withNamespaceAndTable(db, table) { t =>
+  sql(
+s"""
+   |CREATE TABLE $t (
+   |  a bigint NOT NULL,
+   |  b bigint,
+   |  c bigint,
+   |  `extraCol` ARRAY,
+   |  `` STRUCT>
+   |)
+   |$defaultUsing
+   |OPTIONS (
+   |  from = 0,
+   |  to = 1,
+   |  via = 2)
+   |COMMENT 'This is a comment'
+   |TBLPROPERTIES ('prop1' = '1', 'prop2' = '2', 'prop3' = 3, 'prop4' = 4)
+   |PARTITIONED BY (a)
+   |LOCATION '/tmp'

Review comment:
   yes please.
   
   Let's fix all the issues found in this PR first; then we can come back and merge this.







[GitHub] [spark] dongjoon-hyun commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


dongjoon-hyun commented on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982326438


   cc @holdenk , @shrutig , @viirya 





[GitHub] [spark] Yikun commented on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable

2021-11-29 Thread GitBox


Yikun commented on pull request #34746:
URL: https://github.com/apache/spark/pull/34746#issuecomment-982325271


   ```
   ERROR [2.132s]: test_reuse_worker_of_parallelize_range (pyspark.tests.test_worker.WorkerReuseTest)
   --
   Traceback (most recent call last):
     File "/__w/spark/spark/python/pyspark/tests/test_worker.py", line 195, in test_reuse_worker_of_parallelize_range
       self.assertTrue(pid in previous_pids)
   AssertionError: False is not true
   ```
   
   This is a flaky test failure and is unrelated to this PR. It looks like it has failed many times before [1][2][3], so I submitted a JIRA: https://issues.apache.org/jira/browse/SPARK-37498. I will take a look when I get time.
   
   [1] https://github.com/apache/spark/runs/1182154542?check_suite_focus=true
   [2] https://github.com/apache/spark/pull/33657#issuecomment-893969310
   [3] https://github.com/Yikun/spark/runs/4362783540?check_suite_focus=true





[GitHub] [spark] SparkQA commented on pull request #34750: [SPARK-37495][PYTHON] Skip identical index checking of Series.compare when config 'compute.eager_check' is disabled

2021-11-29 Thread GitBox


SparkQA commented on pull request #34750:
URL: https://github.com/apache/spark/pull/34750#issuecomment-982324974


   **[Test build #145750 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145750/testReport)** for PR 34750 at commit [`2bbb84d`](https://github.com/apache/spark/commit/2bbb84d30157e9b855f81c5448c326b34b302937).





[GitHub] [spark] ChenMichael commented on a change in pull request #34684: [SPARK-37442][SQL] InMemoryRelation statistics bug causing broadcast join failures with AQE enabled

2021-11-29 Thread GitBox


ChenMichael commented on a change in pull request #34684:
URL: https://github.com/apache/spark/pull/34684#discussion_r758965590



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala
##
@@ -237,7 +238,23 @@ case class CachedRDDBuilder(
   }
 
   def isCachedColumnBuffersLoaded: Boolean = {
-_cachedColumnBuffers != null
+_cachedColumnBuffers != null && isCachedRDDLoaded
+  }
+
+  def isCachedRDDLoaded: Boolean = {

Review comment:
   Sure. added







[GitHub] [spark] SparkQA commented on pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


SparkQA commented on pull request #34751:
URL: https://github.com/apache/spark/pull/34751#issuecomment-982324903


   **[Test build #145749 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145749/testReport)** for PR 34751 at commit [`f845a10`](https://github.com/apache/spark/commit/f845a10f24f23a18ec230b3b8581e3bce866cc61).





[GitHub] [spark] Peng-Lei commented on a change in pull request #34719: [SPARK-37381][SQL] Unify v1 and v2 SHOW CREATE TABLE tests

2021-11-29 Thread GitBox


Peng-Lei commented on a change in pull request #34719:
URL: https://github.com/apache/spark/pull/34719#discussion_r758966959



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowCreateTableSuite.scala
##
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command.v1
+
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.catalog.CatalogTable
+import org.apache.spark.sql.execution.command
+
+/**
+ * This base suite contains unified tests for the `SHOW CREATE TABLE` command that checks V1
+ * table catalogs. The tests that cannot run for all V1 catalogs are located in more
+ * specific test suites:
+ *
+ *   - V1 In-Memory catalog: `org.apache.spark.sql.execution.command.v1.ShowCreateTableSuite`
+ *   - V1 Hive External catalog:
+ * `org.apache.spark.sql.hive.execution.command.ShowCreateTableSuite`
+ */
+trait ShowCreateTableSuiteBase extends command.ShowCreateTableSuiteBase
+with command.TestsV1AndV2Commands {
+
+  test("CATS") {

Review comment:
   I will address the comments together later.







[GitHub] [spark] Peng-Lei commented on a change in pull request #34719: [SPARK-37381][SQL] Unify v1 and v2 SHOW CREATE TABLE tests

2021-11-29 Thread GitBox


Peng-Lei commented on a change in pull request #34719:
URL: https://github.com/apache/spark/pull/34719#discussion_r758966706



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowCreateTableSuite.scala
##
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command.v1
+
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.catalog.CatalogTable
+import org.apache.spark.sql.execution.command
+
+/**
+ * This base suite contains unified tests for the `SHOW CREATE TABLE` command that checks V1
+ * table catalogs. The tests that cannot run for all V1 catalogs are located in more
+ * specific test suites:
+ *
+ *   - V1 In-Memory catalog: `org.apache.spark.sql.execution.command.v1.ShowCreateTableSuite`
+ *   - V1 Hive External catalog:
+ * `org.apache.spark.sql.hive.execution.command.ShowCreateTableSuite`
+ */
+trait ShowCreateTableSuiteBase extends command.ShowCreateTableSuiteBase
+with command.TestsV1AndV2Commands {
+
+  test("CATS") {

Review comment:
   Sorry, wrong title. I meant `Create Table As Select`, so it should be `CTAS` instead of `CATS`. `CATS` is short for `Contention-Aware Transaction Scheduling` in MySQL.







[GitHub] [spark] AmplabJenkins removed a comment on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable

2021-11-29 Thread GitBox


AmplabJenkins removed a comment on pull request #34746:
URL: https://github.com/apache/spark/pull/34746#issuecomment-982324060


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50217/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34741: [SPARK-37463][SQL] Read/Write Timestamp ntz from/to Orc uses UTC time zone

2021-11-29 Thread GitBox


AmplabJenkins removed a comment on pull request #34741:
URL: https://github.com/apache/spark/pull/34741#issuecomment-982324154


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145735/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34731: [SPARK-37153][PYTHON] Inline type hints for python/pyspark/profiler.py

2021-11-29 Thread GitBox


AmplabJenkins removed a comment on pull request #34731:
URL: https://github.com/apache/spark/pull/34731#issuecomment-982324056


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50216/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34657: [WIP] Support TimedeltaIndex in pandas API on Spark

2021-11-29 Thread GitBox


AmplabJenkins removed a comment on pull request #34657:
URL: https://github.com/apache/spark/pull/34657#issuecomment-982324057


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50218/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34741: [SPARK-37463][SQL] Read/Write Timestamp ntz from/to Orc uses UTC time zone

2021-11-29 Thread GitBox


AmplabJenkins commented on pull request #34741:
URL: https://github.com/apache/spark/pull/34741#issuecomment-982324154


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/145735/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable

2021-11-29 Thread GitBox


AmplabJenkins commented on pull request #34746:
URL: https://github.com/apache/spark/pull/34746#issuecomment-982324060


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50217/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34657: [WIP] Support TimedeltaIndex in pandas API on Spark

2021-11-29 Thread GitBox


AmplabJenkins commented on pull request #34657:
URL: https://github.com/apache/spark/pull/34657#issuecomment-982324057


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50218/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34731: [SPARK-37153][PYTHON] Inline type hints for python/pyspark/profiler.py

2021-11-29 Thread GitBox


AmplabJenkins commented on pull request #34731:
URL: https://github.com/apache/spark/pull/34731#issuecomment-982324056


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/50216/
   





[GitHub] [spark] dongjoon-hyun opened a new pull request #34751: [SPARK-37497][K8S] Promote `ExecutorPods[PollingSnapshot|WatchSnapshot]Source` to DeveloperApi

2021-11-29 Thread GitBox


dongjoon-hyun opened a new pull request #34751:
URL: https://github.com/apache/spark/pull/34751


   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





[GitHub] [spark] SparkQA removed a comment on pull request #34741: [SPARK-37463][SQL] Read/Write Timestamp ntz from/to Orc uses UTC time zone

2021-11-29 Thread GitBox


SparkQA removed a comment on pull request #34741:
URL: https://github.com/apache/spark/pull/34741#issuecomment-982193103


   **[Test build #145735 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145735/testReport)** for PR 34741 at commit [`ee11bf4`](https://github.com/apache/spark/commit/ee11bf4990bdc7bac5202d164ca144e9cbed6fa8).





[GitHub] [spark] SparkQA commented on pull request #34741: [SPARK-37463][SQL] Read/Write Timestamp ntz from/to Orc uses UTC time zone

2021-11-29 Thread GitBox


SparkQA commented on pull request #34741:
URL: https://github.com/apache/spark/pull/34741#issuecomment-982322853


   **[Test build #145735 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/145735/testReport)** for PR 34741 at commit [`ee11bf4`](https://github.com/apache/spark/commit/ee11bf4990bdc7bac5202d164ca144e9cbed6fa8).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `class UDFBasicProfiler(BasicProfiler):`
 * `case class PrettyPythonUDF(`





[GitHub] [spark] ChenMichael commented on a change in pull request #34684: [SPARK-37442][SQL] InMemoryRelation statistics bug causing broadcast join failures with AQE enabled

2021-11-29 Thread GitBox


ChenMichael commented on a change in pull request #34684:
URL: https://github.com/apache/spark/pull/34684#discussion_r758965590



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala
##
@@ -237,7 +238,23 @@ case class CachedRDDBuilder(
   }
 
   def isCachedColumnBuffersLoaded: Boolean = {
-    _cachedColumnBuffers != null
+    _cachedColumnBuffers != null && isCachedRDDLoaded
+  }
+
+  def isCachedRDDLoaded: Boolean = {

Review comment:
   Makes sense. Added







[GitHub] [spark] SparkQA commented on pull request #34657: [WIP] Support TimedeltaIndex in pandas API on Spark

2021-11-29 Thread GitBox


SparkQA commented on pull request #34657:
URL: https://github.com/apache/spark/pull/34657#issuecomment-982322524


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50218/
   





[GitHub] [spark] SparkQA commented on pull request #34213: [SPARK-36396][PYTHON] Implement DataFrame.cov

2021-11-29 Thread GitBox


SparkQA commented on pull request #34213:
URL: https://github.com/apache/spark/pull/34213#issuecomment-982319967


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50219/
   





[GitHub] [spark] Yikun commented on a change in pull request #34717: [SPARK-37465][PYTHON] Bump minimum pandas version to 1.0.5

2021-11-29 Thread GitBox


Yikun commented on a change in pull request #34717:
URL: https://github.com/apache/spark/pull/34717#discussion_r758898358



##
File path: python/docs/source/user_guide/sql/arrow_pandas.rst
##
@@ -387,7 +387,7 @@ working with timestamps in ``pandas_udf``\s to get the best performance, see
 Recommended Pandas and PyArrow Versions
 ~~~
 
-For usage with pyspark.sql, the minimum supported versions of Pandas is 0.23.2 and PyArrow is 1.0.0.
+For usage with pyspark.sql, the minimum supported versions of Pandas is 1.0.5 and PyArrow is 1.0.0.

Review comment:
   How about:
   
   For usage with pyspark.sql, the minimum supported version of Pandas is **1.0.5** and of PyArrow is 1.0.0. **Lower versions (there are known issues with v1.0.0 and v1.0.1; see [link](https://github.com/apache/spark/pull/34717)) or** higher versions may be used; however, compatibility and data correctness cannot be guaranteed and should be verified by the user.
   
   This may need more suggestions from a native speaker. T_T If necessary, we could do it in later commits in this PR or in a follow-up.
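
To make the minimum-version rule in the proposed wording concrete, here is a minimal sketch of a version comparison. The helper names `parse_version` and `meets_minimum` are hypothetical and are not PySpark APIs; only the 1.0.5 threshold comes from this PR, and the parsing deliberately ignores pre-release suffixes.

```python
# Hypothetical helpers for illustration only; not part of PySpark.
MINIMUM_PANDAS_VERSION = "1.0.5"  # threshold taken from the proposed doc text


def parse_version(version: str) -> tuple:
    """Return the first three numeric components of a version string.

    Non-numeric suffixes such as "rc1" are ignored (an assumption made
    to keep the sketch short), so "1.1.0rc1" parses as (1, 1, 0).
    """
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    # Pad short versions such as "1.0" to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MINIMUM_PANDAS_VERSION) -> bool:
    """Check whether the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)
```

Under these assumptions, `meets_minimum("1.0.1")` is false while `meets_minimum("1.3.4")` is true, which matches the "lower versions are not guaranteed" wording above.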







[GitHub] [spark] cloud-fan commented on a change in pull request #34684: [SPARK-37442][SQL] InMemoryRelation statistics bug causing broadcast join failures with AQE enabled

2021-11-29 Thread GitBox


cloud-fan commented on a change in pull request #34684:
URL: https://github.com/apache/spark/pull/34684#discussion_r758962344



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala
##
@@ -237,7 +238,23 @@ case class CachedRDDBuilder(
   }
 
   def isCachedColumnBuffersLoaded: Boolean = {
-    _cachedColumnBuffers != null
+    _cachedColumnBuffers != null && isCachedRDDLoaded
+  }
+
+  def isCachedRDDLoaded: Boolean = {

Review comment:
   OK, shall we at least mark `_cachedColumnBuffersAreLoaded` as `volatile`, in case two threads call `isCachedRDDLoaded` at the same time?







[GitHub] [spark] SparkQA commented on pull request #34746: [SPARK-37489][PYTHON] Skip hasnans check in numops if eager_check disable

2021-11-29 Thread GitBox


SparkQA commented on pull request #34746:
URL: https://github.com/apache/spark/pull/34746#issuecomment-982313723


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50217/
   





[GitHub] [spark] SparkQA commented on pull request #34731: [SPARK-37153][PYTHON] Inline type hints for python/pyspark/profiler.py

2021-11-29 Thread GitBox


SparkQA commented on pull request #34731:
URL: https://github.com/apache/spark/pull/34731#issuecomment-982309084


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50216/
   





[GitHub] [spark] LuciferYang commented on a change in pull request #34719: [SPARK-37381][SQL] Unify v1 and v2 SHOW CREATE TABLE tests

2021-11-29 Thread GitBox


LuciferYang commented on a change in pull request #34719:
URL: https://github.com/apache/spark/pull/34719#discussion_r758953291



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowCreateTableSuite.scala
##
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command.v1
+
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.catalog.CatalogTable
+import org.apache.spark.sql.execution.command
+
+/**
+ * This base suite contains unified tests for the `SHOW CREATE TABLE` command that checks V1
+ * table catalogs. The tests that cannot run for all V1 catalogs are located in more
+ * specific test suites:
+ *
+ *   - V1 In-Memory catalog: `org.apache.spark.sql.execution.command.v1.ShowCreateTableSuite`
+ *   - V1 Hive External catalog:
+ * `org.apache.spark.sql.hive.execution.command.ShowCreateTableSuite`
+ */
+trait ShowCreateTableSuiteBase extends command.ShowCreateTableSuiteBase
+with command.TestsV1AndV2Commands {
+
+  test("CATS") {

Review comment:
   I guess `CATS` is testing `Create Table As Select`?
   
   






