[GitHub] [spark] AmplabJenkins removed a comment on pull request #32558: [SPARK-34953][CORE][SQL] Add the code change for adding the DateType in the infer schema while reading in CSV and JSON

2021-05-30 Thread GitBox


AmplabJenkins removed a comment on pull request #32558:
URL: https://github.com/apache/spark/pull/32558#issuecomment-851202485


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139089/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] dongjoon-hyun closed pull request #32706: [SPARK-35507][INFRA] Add Python 3.9 in the docker image for GitHub Action

2021-05-30 Thread GitBox


dongjoon-hyun closed pull request #32706:
URL: https://github.com/apache/spark/pull/32706


   





[GitHub] [spark] dongjoon-hyun edited a comment on pull request #32706: [SPARK-35507][INFRA] Add Python 3.9 in the docker image for GitHub Action

2021-05-30 Thread GitBox


dongjoon-hyun edited a comment on pull request #32706:
URL: https://github.com/apache/spark/pull/32706#issuecomment-851202576


   This PR affects only GitHub Action.
   - In the first commit, all tests passed except the SparkR job.
   - In the second commit, we recovered only the SparkR job, and SparkR has already passed 
(https://github.com/dongjoon-hyun/spark/runs/2707469445?check_suite_focus=true).
   
   I'll merge this PR. 





[GitHub] [spark] dongjoon-hyun commented on pull request #32706: [SPARK-35507][INFRA] Add Python 3.9 in the docker image for GitHub Action

2021-05-30 Thread GitBox


dongjoon-hyun commented on pull request #32706:
URL: https://github.com/apache/spark/pull/32706#issuecomment-851202576


   This PR affects only GitHub Action.
   - In the first commit, all tests passed except the SparkR job.
   - In the second commit, we recovered only the SparkR job, and it has already passed 
(https://github.com/dongjoon-hyun/spark/runs/2707469445?check_suite_focus=true).
   
   I'll merge this PR. 





[GitHub] [spark] AmplabJenkins commented on pull request #32558: [SPARK-34953][CORE][SQL] Add the code change for adding the DateType in the infer schema while reading in CSV and JSON

2021-05-30 Thread GitBox


AmplabJenkins commented on pull request #32558:
URL: https://github.com/apache/spark/pull/32558#issuecomment-851202485


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139089/
   





[GitHub] [spark] SparkQA removed a comment on pull request #32558: [SPARK-34953][CORE][SQL] Add the code change for adding the DateType in the infer schema while reading in CSV and JSON

2021-05-30 Thread GitBox


SparkQA removed a comment on pull request #32558:
URL: https://github.com/apache/spark/pull/32558#issuecomment-851112957


   **[Test build #139089 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139089/testReport)**
 for PR 32558 at commit 
[`29a5e43`](https://github.com/apache/spark/commit/29a5e4331c4fa301a484e52206174663f97a).





[GitHub] [spark] SparkQA commented on pull request #32558: [SPARK-34953][CORE][SQL] Add the code change for adding the DateType in the infer schema while reading in CSV and JSON

2021-05-30 Thread GitBox


SparkQA commented on pull request #32558:
URL: https://github.com/apache/spark/pull/32558#issuecomment-851201158


   **[Test build #139089 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139089/testReport)**
 for PR 32558 at commit 
[`29a5e43`](https://github.com/apache/spark/commit/29a5e4331c4fa301a484e52206174663f97a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #32693: [SPARK-35556][SQL][TESTS] Avoid log NoSuchMethodError when HiveClientImpl.state close

2021-05-30 Thread GitBox


SparkQA commented on pull request #32693:
URL: https://github.com/apache/spark/pull/32693#issuecomment-851199336


   **[Test build #139097 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139097/testReport)**
 for PR 32693 at commit 
[`eb22ac9`](https://github.com/apache/spark/commit/eb22ac95330404325c245602c3efde9dbe2272b4).





[GitHub] [spark] SparkQA commented on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-30 Thread GitBox


SparkQA commented on pull request #32686:
URL: https://github.com/apache/spark/pull/32686#issuecomment-851198587


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43617/
   





[GitHub] [spark] wangyum edited a comment on pull request #32675: [SPARK-35531][SQL] Can not insert into hive bucket table if create table with upper case schema

2021-05-30 Thread GitBox


wangyum edited a comment on pull request #32675:
URL: https://github.com/apache/spark/pull/32675#issuecomment-851195220


   @cloud-fan I have added the stacktrace to PR description.





[GitHub] [spark] wangyum commented on pull request #32675: [SPARK-35531][SQL] Can not insert into hive bucket table if create table with upper case schema

2021-05-30 Thread GitBox


wangyum commented on pull request #32675:
URL: https://github.com/apache/spark/pull/32675#issuecomment-851195220


   @cloud-fan I have added the stacktrace to the PR description.





[GitHub] [spark] SparkQA commented on pull request #32706: [SPARK-35507][INFRA] Add Python 3.9 in the docker image for GitHub Action

2021-05-30 Thread GitBox


SparkQA commented on pull request #32706:
URL: https://github.com/apache/spark/pull/32706#issuecomment-851191449


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43616/
   





[GitHub] [spark] pan3793 commented on pull request #32395: [SPARK-35270][SQL][CORE] Remove the use of guava in order to upgrade guava version to 27

2021-05-30 Thread GitBox


pan3793 commented on pull request #32395:
URL: https://github.com/apache/spark/pull/32395#issuecomment-851187049


   It seems `spark-core` already shades Guava. For Hadoop 3.2, since Spark has 
already moved to the Hadoop shaded client, the only remaining dependency on 
Guava that I see is Curator. Based on 
https://cwiki.apache.org/confluence/display/CURATOR/TN13 , I think it's OK 
to bundle a higher version of Guava in the Spark hadoop-3.2 binary dist?





[GitHub] [spark] otterc commented on a change in pull request #32007: [SPARK-33350][SHUFFLE] Add support to DiskBlockManager to create merge directory and to get the local shuffle merged data

2021-05-30 Thread GitBox


otterc commented on a change in pull request #32007:
URL: https://github.com/apache/spark/pull/32007#discussion_r642215418



##
File path: core/src/main/scala/org/apache/spark/storage/DiskBlockManager.scala
##
@@ -153,6 +198,60 @@ private[spark] class DiskBlockManager(conf: SparkConf, deleteFilesOnStop: Boolea
 }
   }
 
+  /**
+   * Get the list of configured local dirs storing merged shuffle blocks created by executors
+   * if push based shuffle is enabled. Note that the files in this directory will be created
+   * by the external shuffle services. We only create the merge_manager directories and
+   * subdirectories here because currently the shuffle service doesn't have permission to
+   * create directories under application local directories.
+   */
+  private def createLocalDirsForMergedShuffleBlocks(conf: SparkConf): Array[File] = {

Review comment:
   @zhouyejoe The earlier comment must have been made because the PR didn't have 
the latest code, and it was needed for initializing `activeMergedShuffleDirs`. 
There is no need for `activeMergedShuffleDirs`. As mentioned in the other 
comment, it is not being used anywhere in `DiskBlockManager`. The dirs are being 
passed by the methods, so why does this need to return the files?
   cc @mridulm 







[GitHub] [spark] shahidki31 commented on pull request #32704: [SPARK-35567][SQL] Fix: Explain cost is not showing statistics for all the nodes

2021-05-30 Thread GitBox


shahidki31 commented on pull request #32704:
URL: https://github.com/apache/spark/pull/32704#issuecomment-851181036


   @cloud-fan Yes, `collectWithSubqueries` will include nested subqueries as 
well.
   
![image](https://user-images.githubusercontent.com/23054875/120143148-31b89880-c1fd-11eb-9ca0-871255b59b68.png)
   





[GitHub] [spark] cloud-fan commented on pull request #32704: [SPARK-35567][SQL] Fix: Explain cost is not showing statistics for all the nodes

2021-05-30 Thread GitBox


cloud-fan commented on pull request #32704:
URL: https://github.com/apache/spark/pull/32704#issuecomment-851178214


   Does this fix nested subqueries?





[GitHub] [spark] cloud-fan commented on pull request #32675: [SPARK-35531][SQL] Can not insert into hive bucket table if create table with upper case schema

2021-05-30 Thread GitBox


cloud-fan commented on pull request #32675:
URL: https://github.com/apache/spark/pull/32675#issuecomment-851177143


   Can you post the full stacktrace? I'm a bit curious about how/where the 
error happens.





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32707: [SPARK-31168][BUILD][FOLLOWUP] Update scala-2.12 profile

2021-05-30 Thread GitBox


AmplabJenkins removed a comment on pull request #32707:
URL: https://github.com/apache/spark/pull/32707#issuecomment-851175379


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43612/
   





[GitHub] [spark] SparkQA commented on pull request #32707: [SPARK-31168][BUILD][FOLLOWUP] Update scala-2.12 profile

2021-05-30 Thread GitBox


SparkQA commented on pull request #32707:
URL: https://github.com/apache/spark/pull/32707#issuecomment-851175353


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43612/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32707: [SPARK-31168][BUILD][FOLLOWUP] Update scala-2.12 profile

2021-05-30 Thread GitBox


AmplabJenkins commented on pull request #32707:
URL: https://github.com/apache/spark/pull/32707#issuecomment-851175379


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43612/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32705: [SPARK-35568][SQL] Avoid UnsupportedOperationException when enabling both AQE and DPP

2021-05-30 Thread GitBox


AmplabJenkins removed a comment on pull request #32705:
URL: https://github.com/apache/spark/pull/32705#issuecomment-851172007


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139088/
   





[GitHub] [spark] AmplabJenkins commented on pull request #31565: [SPARK-34438][SPARK SUBMIT] Check path component in isPython/isR, not full URI

2021-05-30 Thread GitBox


AmplabJenkins commented on pull request #31565:
URL: https://github.com/apache/spark/pull/31565#issuecomment-851172026


   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins commented on pull request #32705: [SPARK-35568][SQL] Avoid UnsupportedOperationException when enabling both AQE and DPP

2021-05-30 Thread GitBox


AmplabJenkins commented on pull request #32705:
URL: https://github.com/apache/spark/pull/32705#issuecomment-851172007


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139088/
   





[GitHub] [spark] cloud-fan closed pull request #32687: [SPARK-35545][SQL] Split SubqueryExpression's children field into outer attributes and join conditions

2021-05-30 Thread GitBox


cloud-fan closed pull request #32687:
URL: https://github.com/apache/spark/pull/32687


   





[GitHub] [spark] cloud-fan commented on pull request #32687: [SPARK-35545][SQL] Split SubqueryExpression's children field into outer attributes and join conditions

2021-05-30 Thread GitBox


cloud-fan commented on pull request #32687:
URL: https://github.com/apache/spark/pull/32687#issuecomment-851171920


   thanks, merging to master!





[GitHub] [spark] SparkQA removed a comment on pull request #32705: [SPARK-35568][SQL] Avoid UnsupportedOperationException when enabling both AQE and DPP

2021-05-30 Thread GitBox


SparkQA removed a comment on pull request #32705:
URL: https://github.com/apache/spark/pull/32705#issuecomment-851098166


   **[Test build #139088 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139088/testReport)**
 for PR 32705 at commit 
[`932edd7`](https://github.com/apache/spark/commit/932edd7808ba8ae9220658eff37c9c3af77eb09f).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32701: [SPARK-35562][DOC] Fix docs about Kubernetes

2021-05-30 Thread GitBox


AmplabJenkins removed a comment on pull request #32701:
URL: https://github.com/apache/spark/pull/32701#issuecomment-851171260


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43615/
   





[GitHub] [spark] SparkQA commented on pull request #32701: [SPARK-35562][DOC] Fix docs about Kubernetes

2021-05-30 Thread GitBox


SparkQA commented on pull request #32701:
URL: https://github.com/apache/spark/pull/32701#issuecomment-851171248


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43615/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32701: [SPARK-35562][DOC] Fix docs about Kubernetes

2021-05-30 Thread GitBox


AmplabJenkins commented on pull request #32701:
URL: https://github.com/apache/spark/pull/32701#issuecomment-851171260


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43615/
   





[GitHub] [spark] SparkQA commented on pull request #32705: [SPARK-35568][SQL] Avoid UnsupportedOperationException when enabling both AQE and DPP

2021-05-30 Thread GitBox


SparkQA commented on pull request #32705:
URL: https://github.com/apache/spark/pull/32705#issuecomment-851171058


   **[Test build #139088 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139088/testReport)**
 for PR 32705 at commit 
[`932edd7`](https://github.com/apache/spark/commit/932edd7808ba8ae9220658eff37c9c3af77eb09f).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-30 Thread GitBox


SparkQA commented on pull request #32686:
URL: https://github.com/apache/spark/pull/32686#issuecomment-851169203


   **[Test build #139096 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139096/testReport)**
 for PR 32686 at commit 
[`0542922`](https://github.com/apache/spark/commit/0542922a77d660af1797c0a6f0840d77d87c059a).





[GitHub] [spark] SparkQA commented on pull request #32706: [SPARK-35507][INFRA] Add Python 3.9 in the docker image for GitHub Action

2021-05-30 Thread GitBox


SparkQA commented on pull request #32706:
URL: https://github.com/apache/spark/pull/32706#issuecomment-851169175


   **[Test build #139095 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139095/testReport)**
 for PR 32706 at commit 
[`53adc9e`](https://github.com/apache/spark/commit/53adc9ef6befb092a812449a1949837d320f927c).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-30 Thread GitBox


AmplabJenkins removed a comment on pull request #32658:
URL: https://github.com/apache/spark/pull/32658#issuecomment-851168498


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43614/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32706: [SPARK-35507][INFRA] Add Python 3.9 in the docker image for GitHub Action

2021-05-30 Thread GitBox


AmplabJenkins removed a comment on pull request #32706:
URL: https://github.com/apache/spark/pull/32706#issuecomment-851168497


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43613/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-30 Thread GitBox


AmplabJenkins commented on pull request #32658:
URL: https://github.com/apache/spark/pull/32658#issuecomment-851168498


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43614/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32706: [SPARK-35507][INFRA] Add Python 3.9 in the docker image for GitHub Action

2021-05-30 Thread GitBox


AmplabJenkins commented on pull request #32706:
URL: https://github.com/apache/spark/pull/32706#issuecomment-851168497


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43613/
   





[GitHub] [spark] dongjoon-hyun commented on pull request #32706: [SPARK-35507][INFRA] Add Python 3.9 in the docker image for GitHub Action

2021-05-30 Thread GitBox


dongjoon-hyun commented on pull request #32706:
URL: https://github.com/apache/spark/pull/32706#issuecomment-851168280


   Thank you again!





[GitHub] [spark] viirya commented on pull request #32706: [SPARK-35507][INFRA] Add Python 3.9 in the docker image for GitHub Action

2021-05-30 Thread GitBox


viirya commented on pull request #32706:
URL: https://github.com/apache/spark/pull/32706#issuecomment-851167925


   Thanks @dongjoon-hyun. It sounds good to me.





[GitHub] [spark] SparkQA commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-30 Thread GitBox


SparkQA commented on pull request #32658:
URL: https://github.com/apache/spark/pull/32658#issuecomment-851166355


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43614/
   





[GitHub] [spark] SparkQA commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-30 Thread GitBox


SparkQA commented on pull request #32658:
URL: https://github.com/apache/spark/pull/32658#issuecomment-851165704


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43614/
   





[GitHub] [spark] SparkQA commented on pull request #32706: [SPARK-35507][INFRA] Add Python 3.9 in the docker image for GitHub Action

2021-05-30 Thread GitBox


SparkQA commented on pull request #32706:
URL: https://github.com/apache/spark/pull/32706#issuecomment-851164672


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43613/
   





[GitHub] [spark] SparkQA commented on pull request #32706: [SPARK-35507][INFRA] Add Python 3.9 in the docker image for GitHub Action

2021-05-30 Thread GitBox


SparkQA commented on pull request #32706:
URL: https://github.com/apache/spark/pull/32706#issuecomment-851163828


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43613/
   





[GitHub] [spark] SparkQA commented on pull request #32707: [SPARK-31168][BUILD][FOLLOWUP] Update scala-2.12 profile

2021-05-30 Thread GitBox


SparkQA commented on pull request #32707:
URL: https://github.com/apache/spark/pull/32707#issuecomment-851163291


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43612/
   





[GitHub] [spark] dongjoon-hyun closed pull request #32707: [SPARK-31168][BUILD][FOLLOWUP] Update scala-2.12 profile

2021-05-30 Thread GitBox


dongjoon-hyun closed pull request #32707:
URL: https://github.com/apache/spark/pull/32707


   





[GitHub] [spark] dongjoon-hyun commented on pull request #32707: [SPARK-31168][BUILD][FOLLOWUP] Update scala-2.12 profile

2021-05-30 Thread GitBox


dongjoon-hyun commented on pull request #32707:
URL: https://github.com/apache/spark/pull/32707#issuecomment-851161273


   I manually verified this with the following.
   
   **BEFORE**
   ```
   $ build/mvn help:evaluate -Pscala-2.12 -Dexpression=scala.version | grep "^2.12"
   Using `mvn` from path: /usr/local/bin/mvn
   2.12.10
   ```
   
   **AFTER**
   ```
   $ build/mvn help:evaluate -Pscala-2.12 -Dexpression=scala.version | grep "^2.12"
   Using `mvn` from path: /usr/local/bin/mvn
   2.12.14
   ```
   
   Merged to master!
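
The `grep` filter used in the check above can be tried without a Spark checkout. A minimal sketch, assuming the two output lines from the AFTER block as sample input; `sample_output` is a made-up stand-in for the `build/mvn` call and is not part of the PR:

```shell
# Hypothetical stand-in for the mvn output shown above; not a real build invocation.
sample_output() {
  printf '%s\n' 'Using `mvn` from path: /usr/local/bin/mvn' '2.12.14'
}

# Keep only lines that start with "2.12"; the dot is escaped so it matches literally.
scala_version=$(sample_output | grep '^2\.12')
echo "$scala_version"
```

The unescaped pattern `^2.12` used in the original command gives the same result on this output; escaping the dot only guards against an unrelated line that has some other character in that position.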





[GitHub] [spark] dongjoon-hyun commented on pull request #32706: [SPARK-35507][INFRA] Add Python 3.9 and upgrade R to 4.1.0 in the docker image for GitHub Action

2021-05-30 Thread GitBox


dongjoon-hyun commented on pull request #32706:
URL: https://github.com/apache/spark/pull/32706#issuecomment-851159207


   Thank you, @viirya. The SparkR linter and doc work correctly, but it seems 
that SparkR has some unit test issues with the latest `arrow`. I'll narrow down the 
scope of this PR by excluding the image update of the SparkR GitHub Action job.
   ```
   2. Failure (test_sparkSQL_arrow.R:71:3): createDataFrame/collect Arrow optimi
   collect(createDataFrame(rdf)) not equal to `expected`.
   Component “g”: 'tzone' attributes are inconsistent ('UTC' and '')
   ```





[GitHub] [spark] dongjoon-hyun commented on pull request #32707: [SPARK-31168][BUILD][FOLLOWUP] Update scala-2.12 profile

2021-05-30 Thread GitBox


dongjoon-hyun commented on pull request #32707:
URL: https://github.com/apache/spark/pull/32707#issuecomment-851156588


   Thank you, @viirya !





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32701: [SPARK-35562][DOC] Fix docs about Kubernetes

2021-05-30 Thread GitBox


AmplabJenkins removed a comment on pull request #32701:
URL: https://github.com/apache/spark/pull/32701#issuecomment-851153739


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139094/
   





[GitHub] [spark] SparkQA removed a comment on pull request #32701: [SPARK-35562][DOC] Fix docs about Kubernetes

2021-05-30 Thread GitBox


SparkQA removed a comment on pull request #32701:
URL: https://github.com/apache/spark/pull/32701#issuecomment-851152459


   **[Test build #139094 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139094/testReport)**
 for PR 32701 at commit 
[`e067bdb`](https://github.com/apache/spark/commit/e067bdb6fee7dceb4299917c9ef76af74e20720e).





[GitHub] [spark] SparkQA commented on pull request #32701: [SPARK-35562][DOC] Fix docs about Kubernetes

2021-05-30 Thread GitBox


SparkQA commented on pull request #32701:
URL: https://github.com/apache/spark/pull/32701#issuecomment-851153717


   **[Test build #139094 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139094/testReport)**
 for PR 32701 at commit 
[`e067bdb`](https://github.com/apache/spark/commit/e067bdb6fee7dceb4299917c9ef76af74e20720e).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins commented on pull request #32701: [SPARK-35562][DOC] Fix docs about Kubernetes

2021-05-30 Thread GitBox


AmplabJenkins commented on pull request #32701:
URL: https://github.com/apache/spark/pull/32701#issuecomment-851153739


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139094/
   





[GitHub] [spark] SparkQA commented on pull request #32701: [SPARK-35562][DOC] Fix docs about Kubernetes

2021-05-30 Thread GitBox


SparkQA commented on pull request #32701:
URL: https://github.com/apache/spark/pull/32701#issuecomment-851152459


   **[Test build #139094 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139094/testReport)**
 for PR 32701 at commit 
[`e067bdb`](https://github.com/apache/spark/commit/e067bdb6fee7dceb4299917c9ef76af74e20720e).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32701: [SPARK-35562][DOC] Fix docs about Kubernetes

2021-05-30 Thread GitBox


AmplabJenkins removed a comment on pull request #32701:
URL: https://github.com/apache/spark/pull/32701#issuecomment-850782008


   Can one of the admins verify this patch?





[GitHub] [spark] yaooqinn commented on pull request #32701: [SPARK-35562][DOC] Fix docs about Kubernetes

2021-05-30 Thread GitBox


yaooqinn commented on pull request #32701:
URL: https://github.com/apache/spark/pull/32701#issuecomment-851152150


   ok to test





[GitHub] [spark] SparkQA commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-30 Thread GitBox


SparkQA commented on pull request #32658:
URL: https://github.com/apache/spark/pull/32658#issuecomment-851151414


   **[Test build #139093 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139093/testReport)**
 for PR 32658 at commit 
[`3bda8db`](https://github.com/apache/spark/commit/3bda8db8e50d6550089b1cb1770d6cfe078bcaf8).





[GitHub] [spark] huskysun commented on a change in pull request #32701: [SPARK-35562][DOC] Fix docs about Kubernetes

2021-05-30 Thread GitBox


huskysun commented on a change in pull request #32701:
URL: https://github.com/apache/spark/pull/32701#discussion_r642193027



##
File path: docs/submitting-applications.md
##
@@ -146,7 +146,7 @@ export HADOOP_CONF_DIR=XXX
 ./bin/spark-submit \
   --class org.apache.spark.examples.SparkPi \
   --master k8s://xx.yy.zz.ww:443 \
-  --deploy-mode cluster \
+  --deploy-mode cluster \  # can be client for client mode

Review comment:
   @yaooqinn @dongjoon-hyun Thanks for the review. Yeah I should've changed 
L145 as well, from `# Run on a Kubernetes cluster in cluster deploy mode` to `# 
Run on a Kubernetes cluster`. I was trying to mimic L117. However I won't 
do that anymore, because:
   
   > you cannot add # after \ in bash
   
   You're right, `#` can't come after `\`. Then, L122 should also be fixed. I 
will revert this line, and also fix L122. Thanks.
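
Why the trailing comment breaks the command can be seen in a small sketch. It assumes a POSIX-style shell such as bash; `count_args` and the `Demo` class name are hypothetical and only stand in for `spark-submit` and its arguments:

```shell
# Hypothetical helper: records how many arguments it received.
count_args() { n=$#; }

# The backslash is the last character on the line, so the next line continues the command.
count_args --class Demo \
  --deploy-mode cluster
ok=$n

# Here the backslash escapes the following space instead of the newline.
# The escaped space becomes an extra argument, the comment ends the command,
# and any later option lines would run as separate commands.
count_args --class Demo \  # can be client for client mode
broken=$n

echo "ok=$ok broken=$broken"   # prints "ok=4 broken=3"
```

Moving the note to its own comment line above the command keeps the continuation intact.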







[GitHub] [spark] itholic commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-30 Thread GitBox


itholic commented on pull request #32658:
URL: https://github.com/apache/spark/pull/32658#issuecomment-851150735


   > @itholic the generated doc looks a bit weird:
   > 
   > ![Screen Shot 2021-05-28 at 1 09 38 
PM](https://user-images.githubusercontent.com/6477701/119928166-04c67480-bfb6-11eb-8449-428b01f2144a.png)
   > 
   > It includes `# noqa`
   > 
   > Can you double check and fix? Seems like we should fix other places for 
JSON, etc.
   
   Thanks, @HyukjinKwon . Just fixed them in every place.





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32703: [SPARK-35566][SS] Fix StateStoreRestoreExec output rows

2021-05-30 Thread GitBox


AmplabJenkins removed a comment on pull request #32703:
URL: https://github.com/apache/spark/pull/32703#issuecomment-851150536


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139087/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32703: [SPARK-35566][SS] Fix StateStoreRestoreExec output rows

2021-05-30 Thread GitBox


AmplabJenkins commented on pull request #32703:
URL: https://github.com/apache/spark/pull/32703#issuecomment-851150536


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139087/
   





[GitHub] [spark] itholic commented on a change in pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-30 Thread GitBox


itholic commented on a change in pull request #32658:
URL: https://github.com/apache/spark/pull/32658#discussion_r642194447



##
File path: docs/sql-data-sources-csv.md
##
@@ -38,3 +36,217 @@ Spark SQL provides `spark.read().csv("file_name")` to read 
a file or directory o
 
 
 
+
+## Data Source Option
+
+Data source options of CSV can be set via:
+* the `.option`/`.options` methods of
+  *  `DataFrameReader`
+  *  `DataFrameWriter`
+  *  `DataStreamReader`
+  *  `DataStreamWriter`
+* the built-in functions below
+  * `from_csv`
+  * `to_csv`
+  * `schema_of_csv`
+* `OPTIONS` clause at [CREATE TABLE USING 
DATA_SOURCE](sql-ref-syntax-ddl-create-table-datasource.html)
+
+
+
+  Property Name | Default | Meaning | Scope
+  
+sep
+,
+Sets a separator (one or more characters) for each field and 
value.
+read/write
+  
+  
+encoding
+UTF-8 for reading, not set for writing
+Specifies encoding (charset) for reading or writing CSV files
+read/write
+  
+  
+quote
+"
+Sets a single character used for escaping quoted values where the 
separator can be part of the value. If you would like to turn off quotations, 
you need to set an empty string. If an empty string is set, it uses 
`u0000` (null character) for writing, and it disables the quotation 
handling for reading.
+read/write
+  
+  
+quoteAll
+false
+A flag indicating whether all values should always be enclosed in 
quotes. It only escapes values containing a quote character by default.
+write
+  
+  
+escape
+\
+Sets a single character used for escaping quotes inside an already 
quoted value.
+read/write
+  
+  
+escapeQuotes
+true
+A flag indicating whether values containing quotes should always be 
enclosed in quotes. It escapes all values containing a quote character by 
default.
+write
+  
+  
+comment
+empty string
+Sets a single character used for skipping lines beginning with this 
character. It's disabled by default
+read
+  
+  
+header
+false
+For reading, uses the first line as names of columns. For writing, 
writes the names of columns as the first line. Note that if the given path is a 
RDD of Strings, this header option will remove all lines same with the header 
if exists.
+read/write
+  
+  
+inferSchema
+false
+Infers the input schema automatically from data. It requires one extra 
pass over the data.
+read
+  
+  
+enforceSchema
+true
+If it is set to true, the specified or inferred schema 
will be forcibly applied to datasource files, and headers in CSV files will be 
ignored. If the option is set to false, the schema will be 
validated against all headers in CSV files or the first header in RDD if the 
header option is set to true. Field names in the 
schema and column names in CSV headers are checked by their positions taking 
into account spark.sql.caseSensitive. Though the default value is 
true, it is recommended to disable the enforceSchema 
option to avoid incorrect results.
+read
+  
+  
+ignoreLeadingWhiteSpace
+false (for reading), true (for writing)
+A flag indicating whether or not leading whitespaces from values being 
read/written should be skipped.
+read/write
+  
+  
+ignoreTrailingWhiteSpace
+false (for reading), true (for writing)
+A flag indicating whether or not trailing whitespaces from values 
being read/written should be skipped.
+read/write
+  
+  
+nullValue
+empty string
+Sets the string representation of a null value. Since 2.0.1, this 
nullValue param applies to all supported types including the 
string type.
+read/write
+  
+  
+nanValue
+NaN
+Sets the string representation of a non-number value.
+read
+  
+  
+positiveInf
+Inf
+Sets the string representation of a positive infinity value.
+read
+  
+  
+negativeInf
+-Inf
+Sets the string representation of a negative infinity value.
+read
+  
+  
+dateFormat
+yyyy-MM-dd
+Sets the string that indicates a date format. Custom date formats 
follow the formats at [Datetime Patterns](https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html). 
This applies to date type.
+read/write
+  
+  
+timestampFormat
+yyyy-MM-dd'T'HH:mm:ss[.SSS][XXX]
+Sets the string that indicates a timestamp format. Custom date formats 
follow the formats at [Datetime Patterns](https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html). 
This applies to timestamp type.
+read/write
+  
+  
+maxColumns
+20480
+Defines a hard limit of how many columns a record can have.
+read
+  
+  
+maxCharsPerColumn
+-1
+Defines the maximum number of characters allowed for any given value 
being read. The default value -1 means unlimited length.
+read
+  
+  
+mode
+PERMISSIVE
+Allows a mode for dealing with corrupt records during parsing. Note 
that Spark 

[GitHub] [spark] SparkQA removed a comment on pull request #32703: [SPARK-35566][SS] Fix StateStoreRestoreExec output rows

2021-05-30 Thread GitBox


SparkQA removed a comment on pull request #32703:
URL: https://github.com/apache/spark/pull/32703#issuecomment-851081590


   **[Test build #139087 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139087/testReport)**
 for PR 32703 at commit 
[`a8135d8`](https://github.com/apache/spark/commit/a8135d85a46e48715ce60d8f5e4ac5f4dbf26b36).





[GitHub] [spark] SparkQA commented on pull request #32703: [SPARK-35566][SS] Fix StateStoreRestoreExec output rows

2021-05-30 Thread GitBox


SparkQA commented on pull request #32703:
URL: https://github.com/apache/spark/pull/32703#issuecomment-851150023


   **[Test build #139087 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139087/testReport)**
 for PR 32703 at commit 
[`a8135d8`](https://github.com/apache/spark/commit/a8135d85a46e48715ce60d8f5e4ac5f4dbf26b36).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32704: [SPARK-35567][SQL] Fix: Explain cost is not showing statistics for all the nodes

2021-05-30 Thread GitBox


AmplabJenkins removed a comment on pull request #32704:
URL: https://github.com/apache/spark/pull/32704#issuecomment-851149791


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139086/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32704: [SPARK-35567][SQL] Fix: Explain cost is not showing statistics for all the nodes

2021-05-30 Thread GitBox


AmplabJenkins commented on pull request #32704:
URL: https://github.com/apache/spark/pull/32704#issuecomment-851149791


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139086/
   





[GitHub] [spark] SparkQA removed a comment on pull request #32704: [SPARK-35567][SQL] Fix: Explain cost is not showing statistics for all the nodes

2021-05-30 Thread GitBox


SparkQA removed a comment on pull request #32704:
URL: https://github.com/apache/spark/pull/32704#issuecomment-851081580


   **[Test build #139086 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139086/testReport)**
 for PR 32704 at commit 
[`f1eced2`](https://github.com/apache/spark/commit/f1eced2782ec742d2dd04a122f15a4bf47ef237d).





[GitHub] [spark] SparkQA commented on pull request #32706: [SPARK-35507][INFRA] Add Python 3.9 and upgrade R to 4.1.0 in the docker image for GitHub Action

2021-05-30 Thread GitBox


SparkQA commented on pull request #32706:
URL: https://github.com/apache/spark/pull/32706#issuecomment-851149438


   **[Test build #139092 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139092/testReport)**
 for PR 32706 at commit 
[`4a5fa6a`](https://github.com/apache/spark/commit/4a5fa6a1901810df5163c4e026b15569dda7177c).





[GitHub] [spark] SparkQA commented on pull request #32707: [SPARK-31168][BUILD][FOLLOWUP] Update scala-2.12 profile

2021-05-30 Thread GitBox


SparkQA commented on pull request #32707:
URL: https://github.com/apache/spark/pull/32707#issuecomment-851149411


   **[Test build #139091 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139091/testReport)**
 for PR 32707 at commit 
[`53c55de`](https://github.com/apache/spark/commit/53c55de603ad55b3cdd4aeeac445a35aee68d34a).





[GitHub] [spark] SparkQA commented on pull request #32704: [SPARK-35567][SQL] Fix: Explain cost is not showing statistics for all the nodes

2021-05-30 Thread GitBox


SparkQA commented on pull request #32704:
URL: https://github.com/apache/spark/pull/32704#issuecomment-851149228


   **[Test build #139086 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139086/testReport)**
 for PR 32704 at commit 
[`f1eced2`](https://github.com/apache/spark/commit/f1eced2782ec742d2dd04a122f15a4bf47ef237d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] huskysun commented on a change in pull request #32701: [SPARK-35562][DOC] Fix docs about Kubernetes

2021-05-30 Thread GitBox


huskysun commented on a change in pull request #32701:
URL: https://github.com/apache/spark/pull/32701#discussion_r642193027



##
File path: docs/submitting-applications.md
##
@@ -146,7 +146,7 @@ export HADOOP_CONF_DIR=XXX
 ./bin/spark-submit \
   --class org.apache.spark.examples.SparkPi \
   --master k8s://xx.yy.zz.ww:443 \
-  --deploy-mode cluster \
+  --deploy-mode cluster \  # can be client for client mode

Review comment:
   @yaooqinn @dongjoon-hyun Thanks for the review. Yeah, I should've changed 
L145 as well, from `# Run on a Kubernetes cluster in cluster deploy mode` to `# 
Run on a Kubernetes cluster`. I was trying to mimic L117. However, I won't 
do that anymore, because:
   
   > you cannot add # after \ in bash
   
   You're right, `#` cannot come after `\`. So L122 should also be fixed. I 
will revert this line, and also fix L122. Thanks.
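
To illustrate the quoted point about `#` after `\`, here is a minimal sketch. The `count_args` helper and the `Example` class name are made up for this demo and are not part of the Spark docs. When the backslash is the last character on the line, the newline is escaped and the command continues. When it is followed by a space and a comment, it escapes the space instead, the comment ends the command, and the next line is parsed as a separate command.

```shell
# Hypothetical helper: print how many arguments were received.
count_args() { echo "$#"; }

# Case 1: backslash directly before the newline, so both lines
# are parsed as one command.
continued='count_args --deploy-mode cluster \
  --class Example'

# Case 2: backslash followed by spaces and a comment. The backslash
# escapes the space, the comment ends the command, and the second
# line runs on its own.
broken='count_args --deploy-mode cluster \  # can be client
  count_args --class Example'

continued_out=$(eval "$continued")
broken_out=$(eval "$broken")

printf 'continued:\n%s\n' "$continued_out"
printf 'broken:\n%s\n' "$broken_out"
```

In a real `spark-submit` invocation the second line would start with an option such as `--conf`, so the shell would try to run that option as a command and fail.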







[GitHub] [spark] AmplabJenkins removed a comment on pull request #31565: [SPARK-34438][SPARK SUBMIT] Check path component in isPython/isR, not full URI

2021-05-30 Thread GitBox


AmplabJenkins removed a comment on pull request #31565:
URL: https://github.com/apache/spark/pull/31565#issuecomment-851132964









[GitHub] [spark] AmplabJenkins commented on pull request #31565: [SPARK-34438][SPARK SUBMIT] Check path component in isPython/isR, not full URI

2021-05-30 Thread GitBox


AmplabJenkins commented on pull request #31565:
URL: https://github.com/apache/spark/pull/31565#issuecomment-851148815


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139090/
   





[GitHub] [spark] SparkQA removed a comment on pull request #31565: [SPARK-34438][SPARK SUBMIT] Check path component in isPython/isR, not full URI

2021-05-30 Thread GitBox


SparkQA removed a comment on pull request #31565:
URL: https://github.com/apache/spark/pull/31565#issuecomment-851113186


   **[Test build #139090 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139090/testReport)**
 for PR 31565 at commit 
[`cced137`](https://github.com/apache/spark/commit/cced1372715fd1654cbc40620a30116e28b245db).





[GitHub] [spark] SparkQA commented on pull request #31565: [SPARK-34438][SPARK SUBMIT] Check path component in isPython/isR, not full URI

2021-05-30 Thread GitBox


SparkQA commented on pull request #31565:
URL: https://github.com/apache/spark/pull/31565#issuecomment-851148106


   **[Test build #139090 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139090/testReport)**
 for PR 31565 at commit 
[`cced137`](https://github.com/apache/spark/commit/cced1372715fd1654cbc40620a30116e28b245db).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] dongjoon-hyun opened a new pull request #32707: [SPARK-31168][BUILD][FOLLOWUP] Update scala-2.12 profile

2021-05-30 Thread GitBox


dongjoon-hyun opened a new pull request #32707:
URL: https://github.com/apache/spark/pull/32707


   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32697: [SPARK-31168][BUILD] Upgrade Scala to 2.12.14

2021-05-30 Thread GitBox


dongjoon-hyun commented on a change in pull request #32697:
URL: https://github.com/apache/spark/pull/32697#discussion_r642190770



##
File path: pom.xml
##
@@ -162,7 +162,7 @@
 3.4.1
 
 3.2.2
-2.12.10
+2.12.14

Review comment:
   Oh, thanks. I missed that.







[GitHub] [spark] cloud-fan commented on pull request #32696: [SPARK-35194][SQL][FOLLOWUP] Recover build error with Scala 2.13 on GA

2021-05-30 Thread GitBox


cloud-fan commented on pull request #32696:
URL: https://github.com/apache/spark/pull/32696#issuecomment-851140959


   thanks!





[GitHub] [spark] dongjoon-hyun opened a new pull request #32706: [SPARK-35507][INFRA] Move Python 3.9 to the docker image

2021-05-30 Thread GitBox


dongjoon-hyun opened a new pull request #32706:
URL: https://github.com/apache/spark/pull/32706


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





[GitHub] [spark] AmplabJenkins commented on pull request #31565: [SPARK-34438][SPARK SUBMIT] Check path component in isPython/isR, not full URI

2021-05-30 Thread GitBox


AmplabJenkins commented on pull request #31565:
URL: https://github.com/apache/spark/pull/31565#issuecomment-851132964


   Can one of the admins verify this patch?





[GitHub] [spark] AngersZhuuuu commented on a change in pull request #32468: [SPARK-35335][SQL] Improve CoalesceShufflePartitions to avoid generating small files

2021-05-30 Thread GitBox


AngersZhuuuu commented on a change in pull request #32468:
URL: https://github.com/apache/spark/pull/32468#discussion_r642180154



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/CoalesceShufflePartitions.scala
##
@@ -83,10 +84,16 @@ case class CoalesceShufflePartitions(session: SparkSession) 
extends CustomShuffl
 // is not set, so to avoid perf regressions compared to no coalescing.
 val minPartitionNum = 
conf.getConf(SQLConf.COALESCE_PARTITIONS_MIN_PARTITION_NUM)
   .getOrElse(session.sparkContext.defaultParallelism)
+val minNumPartitions = if (isFinalStage) {

Review comment:
   minFinalStagePartitionNum?







[GitHub] [spark] AmplabJenkins removed a comment on pull request #31565: [SPARK-34438][SPARK SUBMIT] Check path component in isPython/isR, not full URI

2021-05-30 Thread GitBox


AmplabJenkins removed a comment on pull request #31565:
URL: https://github.com/apache/spark/pull/31565#issuecomment-851113533









[GitHub] [spark] AmplabJenkins removed a comment on pull request #32558: [SPARK-34953][CORE][SQL] Add the code change for adding the DateType in the infer schema while reading in CSV and JSON

2021-05-30 Thread GitBox


AmplabJenkins removed a comment on pull request #32558:
URL: https://github.com/apache/spark/pull/32558#issuecomment-851129415


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43610/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32558: [SPARK-34953][CORE][SQL] Add the code change for adding the DateType in the infer schema while reading in CSV and JSON

2021-05-30 Thread GitBox


AmplabJenkins commented on pull request #32558:
URL: https://github.com/apache/spark/pull/32558#issuecomment-851129415


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43610/
   





[GitHub] [spark] AmplabJenkins commented on pull request #31565: [SPARK-34438][SPARK SUBMIT] Check path component in isPython/isR, not full URI

2021-05-30 Thread GitBox


AmplabJenkins commented on pull request #31565:
URL: https://github.com/apache/spark/pull/31565#issuecomment-851129414


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43611/
   





[GitHub] [spark] zzcclp commented on a change in pull request #32697: [SPARK-31168][BUILD] Upgrade Scala to 2.12.14

2021-05-30 Thread GitBox


zzcclp commented on a change in pull request #32697:
URL: https://github.com/apache/spark/pull/32697#discussion_r642178486



##
File path: pom.xml
##
@@ -162,7 +162,7 @@
 3.4.1
 
 3.2.2
-2.12.10
+2.12.14

Review comment:
   @dongjoon-hyun please change the Scala version to 2.12.14 in the profile 
`scala-2.12`; otherwise there may be an `Error:scala: bad option: 
-P:silencer:globalFilters=.*deprecated.*` error when ticking the profile 
`scala-2.12`.







[GitHub] [spark] SparkQA commented on pull request #32558: [SPARK-34953][CORE][SQL] Add the code change for adding the DateType in the infer schema while reading in CSV and JSON

2021-05-30 Thread GitBox


SparkQA commented on pull request #32558:
URL: https://github.com/apache/spark/pull/32558#issuecomment-851126950


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43610/
   





[GitHub] [spark] wangyum edited a comment on pull request #28032: [SPARK-31264][SQL] Repartition by dynamic partition columns before insert partition table

2021-05-30 Thread GitBox


wangyum edited a comment on pull request #28032:
URL: https://github.com/apache/spark/pull/28032#issuecomment-851119698


   @HyukjinKwon I mainly want to make the whole cluster more stable. If a user 
does not add it manually, a large number of files may be generated. For 
example: Suppose there are 1 tasks, each task contains 500 separate 
partition values, and the number of files generated is 1 * 500. This PR 
mainly aims to avoid this case.
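
The file-count concern can be sketched with simple arithmetic. The numbers below are hypothetical and are not the figures from the comment. The "with repartition" case also assumes that every task holds the same 500 partition values and that repartitioning by the dynamic partition columns routes each value to a single writer.

```shell
# Hypothetical figures, for illustration only.
tasks=1000
partition_values_per_task=500

# Without repartitioning, each task may write one file per partition
# value it holds.
files_without_repartition=$((tasks * partition_values_per_task))

# Assumption: after repartitioning by the dynamic partition columns,
# each partition value is handled by one writer, so one file per value.
files_with_repartition=$partition_values_per_task

echo "without repartition: $files_without_repartition files"
echo "with repartition:    $files_with_repartition files"
```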





[GitHub] [spark] SparkQA commented on pull request #31565: [SPARK-34438][SPARK SUBMIT] Check path component in isPython/isR, not full URI

2021-05-30 Thread GitBox


SparkQA commented on pull request #31565:
URL: https://github.com/apache/spark/pull/31565#issuecomment-851126484


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43611/
   





[GitHub] [spark] SparkQA commented on pull request #32558: [SPARK-34953][CORE][SQL] Add the code change for adding the DateType in the infer schema while reading in CSV and JSON

2021-05-30 Thread GitBox


SparkQA commented on pull request #32558:
URL: https://github.com/apache/spark/pull/32558#issuecomment-851126356


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43610/
   





[GitHub] [spark] LuciferYang commented on pull request #32676: [SPARK-35532][TESTS] Ensure mllib and kafka-0-10 module can be maven test independently in Scala 2.13

2021-05-30 Thread GitBox


LuciferYang commented on pull request #32676:
URL: https://github.com/apache/spark/pull/32676#issuecomment-851126312


   thx @dongjoon-hyun 





[GitHub] [spark] SparkQA commented on pull request #31565: [SPARK-34438][SPARK SUBMIT] Check path component in isPython/isR, not full URI

2021-05-30 Thread GitBox


SparkQA commented on pull request #31565:
URL: https://github.com/apache/spark/pull/31565#issuecomment-851126220


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43611/
   





[GitHub] [spark] LuciferYang edited a comment on pull request #32669: [SPARK-35526][CORE][SQL][ML][MLLIB] Re-Cleanup `procedure syntax is deprecated` compilation warning in Scala 2.13

2021-05-30 Thread GitBox


LuciferYang edited a comment on pull request #32669:
URL: https://github.com/apache/spark/pull/32669#issuecomment-851126030


   > +1 for the idea, @LuciferYang .
   
   OK, I will open a new PR to do this.





[GitHub] [spark] LuciferYang commented on pull request #32669: [SPARK-35526][CORE][SQL][ML][MLLIB] Re-Cleanup `procedure syntax is deprecated` compilation warning in Scala 2.13

2021-05-30 Thread GitBox


LuciferYang commented on pull request #32669:
URL: https://github.com/apache/spark/pull/32669#issuecomment-851126030


   > +1 for the idea, @LuciferYang .
   OK, I will open a new PR to do this.





[GitHub] [spark] LuciferYang commented on pull request #32688: [SPARK-35550][BUILD] Upgrade Jackson to 2.12.3

2021-05-30 Thread GitBox


LuciferYang commented on pull request #32688:
URL: https://github.com/apache/spark/pull/32688#issuecomment-851125841


   thx all ~





[GitHub] [spark] beliefer commented on pull request #32694: [SPARK-35059][SQL] Group exception messages in hive/execution

2021-05-30 Thread GitBox


beliefer commented on pull request #32694:
URL: https://github.com/apache/spark/pull/32694#issuecomment-851124316


   ping @allisonwang-db 





[GitHub] [spark] huaxingao commented on pull request #32049: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-05-30 Thread GitBox


huaxingao commented on pull request #32049:
URL: https://github.com/apache/spark/pull/32049#issuecomment-851121633


   @cloud-fan @maropu I addressed the comments. Could you please take another 
look? Thanks a lot!





[GitHub] [spark] HyukjinKwon commented on a change in pull request #32704: [SPARK-35567][SQL] Fix: Explain cost is not showing statistics for all the nodes

2021-05-30 Thread GitBox


HyukjinKwon commented on a change in pull request #32704:
URL: https://github.com/apache/spark/pull/32704#discussion_r642172184



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala
##
@@ -256,13 +255,9 @@ class QueryExecution(
 
 // trigger to compute stats for logical plans
 try {
-  optimizedPlan.foreach(_.expressions.foreach(_.foreach {
-case subqueryExpression: SubqueryExpression =>
-  // trigger subquery's child plan stats propagation
-  subqueryExpression.plan.stats
-case _ =>
-  }))
-  optimizedPlan.stats
+  optimizedPlan.collectWithSubqueries {

Review comment:
   cc @cloud-fan @maryannxue FYI







[GitHub] [spark] wangyum commented on pull request #28032: [SPARK-31264][SQL] Repartition by dynamic partition columns before insert partition table

2021-05-30 Thread GitBox


wangyum commented on pull request #28032:
URL: https://github.com/apache/spark/pull/28032#issuecomment-851119698


   @HyukjinKwon I mainly want to make the whole cluster more stable. If a user 
does not add it manually, a large number of files may be generated. Please see 
this picture:
   
![image](https://user-images.githubusercontent.com/5399861/77612239-9bd30f00-6f62-11ea-9178-3bcd65aa4034.png)





[GitHub] [spark] dongjoon-hyun closed pull request #32398: [WIP] hive version upgraded from 2.3.7 to 2.3.8

2021-05-30 Thread GitBox


dongjoon-hyun closed pull request #32398:
URL: https://github.com/apache/spark/pull/32398


   





[GitHub] [spark] dongjoon-hyun commented on pull request #32398: [WIP] hive version upgraded from 2.3.7 to 2.3.8

2021-05-30 Thread GitBox


dongjoon-hyun commented on pull request #32398:
URL: https://github.com/apache/spark/pull/32398#issuecomment-851117258


   Hi, All.
   
   According to the above discussion, I'll close this PR for now.
   
   BTW, Apache Spark 3.1.2 is available, @bhupeshdhiman84 .
   - https://downloads.apache.org/spark/spark-3.1.2/
   - https://spark.apache.org/docs/3.1.2/





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32388: [SPARK-35258][SHUFFLE][YARN] Add new metrics to ExternalShuffleService for better monitoring

2021-05-30 Thread GitBox


dongjoon-hyun commented on a change in pull request #32388:
URL: https://github.com/apache/spark/pull/32388#discussion_r642167700



##
File path: 
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockHandler.java
##
@@ -264,6 +265,8 @@ private void checkAuth(TransportClient client, String 
appId) {
 private final Timer registerExecutorRequestLatencyMillis = new Timer();
 // Time latency for processing finalize shuffle merge request latency in ms
 private final Timer finalizeShuffleMergeLatencyMillis = new Timer();
+// Block transfer rate in blocks per second

Review comment:
   Is this valid when we do `getContinuousBlocksData`? To make it clear to the 
metric's audience, could you revise the definition you are aiming for?






