[GitHub] [spark] AmplabJenkins removed a comment on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770172822


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134670/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-770172821


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134671/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770172822


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134670/
   






[GitHub] [spark] AmplabJenkins commented on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-770172821


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134671/
   






[GitHub] [spark] SparkQA removed a comment on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


SparkQA removed a comment on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-770144006


   **[Test build #134671 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134671/testReport)** for PR 29542 at commit [`00dbac4`](https://github.com/apache/spark/commit/00dbac4dbf59e89d6d4d2afb51187dd5821b6ec6).






[GitHub] [spark] SparkQA commented on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


SparkQA commented on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-770170951


   **[Test build #134671 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134671/testReport)** for PR 29542 at commit [`00dbac4`](https://github.com/apache/spark/commit/00dbac4dbf59e89d6d4d2afb51187dd5821b6ec6).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


SparkQA removed a comment on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770143935


   **[Test build #134670 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134670/testReport)** for PR 31182 at commit [`fe08ea0`](https://github.com/apache/spark/commit/fe08ea000eecea1620c51320665b1138d3792681).






[GitHub] [spark] SparkQA commented on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


SparkQA commented on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770170638


   **[Test build #134670 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134670/testReport)** for PR 31182 at commit [`fe08ea0`](https://github.com/apache/spark/commit/fe08ea000eecea1620c51320665b1138d3792681).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30175: [SPARK-33274][SS] Fix job hang in cp mode when total cores less than total kafka partition

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #30175:
URL: https://github.com/apache/spark/pull/30175#issuecomment-770167643










[GitHub] [spark] AmplabJenkins removed a comment on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770167645


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134668/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770167645


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134668/
   






[GitHub] [spark] AmplabJenkins commented on pull request #30175: [SPARK-33274][SS] Fix job hang in cp mode when total cores less than total kafka partition

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #30175:
URL: https://github.com/apache/spark/pull/30175#issuecomment-770167644










[GitHub] [spark] SparkQA removed a comment on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


SparkQA removed a comment on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770134746


   **[Test build #134668 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134668/testReport)** for PR 31182 at commit [`976ae2d`](https://github.com/apache/spark/commit/976ae2df9aec7ff7997f83533163cb9ea38eada4).






[GitHub] [spark] SparkQA commented on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


SparkQA commented on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770164950


   **[Test build #134668 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134668/testReport)** for PR 31182 at commit [`976ae2d`](https://github.com/apache/spark/commit/976ae2df9aec7ff7997f83533163cb9ea38eada4).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #30175: [SPARK-33274][SS] Fix job hang in cp mode when total cores less than total kafka partition

2021-01-29 Thread GitBox


SparkQA removed a comment on pull request #30175:
URL: https://github.com/apache/spark/pull/30175#issuecomment-770133556


   **[Test build #134669 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134669/testReport)** for PR 30175 at commit [`51a9539`](https://github.com/apache/spark/commit/51a9539f1d5601a53dc4f22473127fbc8831ead0).






[GitHub] [spark] SparkQA commented on pull request #30175: [SPARK-33274][SS] Fix job hang in cp mode when total cores less than total kafka partition

2021-01-29 Thread GitBox


SparkQA commented on pull request #30175:
URL: https://github.com/apache/spark/pull/30175#issuecomment-770163533


   **[Test build #134669 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134669/testReport)** for PR 30175 at commit [`51a9539`](https://github.com/apache/spark/commit/51a9539f1d5601a53dc4f22473127fbc8831ead0).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30714: [SPARK-33739] [SQL] Jobs committed through the S3A Magic committer don't track bytes

2021-01-29 Thread GitBox


dongjoon-hyun commented on a change in pull request #30714:
URL: https://github.com/apache/spark/pull/30714#discussion_r567202719



##
File path: docs/cloud-integration.md
##
@@ -49,7 +49,6 @@ They cannot be used as a direct replacement for a cluster filesystem such as HDF
 
 Key differences are:
 
-* Changes to stored objects may not be immediately visible, both in directory listings and actual data access.

Review comment:
   Shall we remove [line 60](https://github.com/apache/spark/pull/30714/files#diff-a0f56e11a3171477d68df69ab5f93eb12c8cc899cceaceffdc52180c87d677c5R60) together?
   > 1. The output of work may not be immediately visible to a follow-on query.








[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30714: [SPARK-33739] [SQL] Jobs committed through the S3A Magic committer don't track bytes

2021-01-29 Thread GitBox


dongjoon-hyun commented on a change in pull request #30714:
URL: https://github.com/apache/spark/pull/30714#discussion_r567202719



##
File path: docs/cloud-integration.md
##
@@ -49,7 +49,6 @@ They cannot be used as a direct replacement for a cluster filesystem such as HDF
 
 Key differences are:
 
-* Changes to stored objects may not be immediately visible, both in directory listings and actual data access.

Review comment:
   Shall we remove line 60 together?
   > 1. The output of work may not be immediately visible to a follow-on query.








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-770161888


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39258/
   






[GitHub] [spark] AmplabJenkins commented on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-770161888


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39258/
   






[GitHub] [spark] dongjoon-hyun commented on pull request #31369: [SPARK-34270][SS] Combine StateStoreMetrics should not override StateStoreCustomMetric

2021-01-29 Thread GitBox


dongjoon-hyun commented on pull request #31369:
URL: https://github.com/apache/spark/pull/31369#issuecomment-770160257


   Merged to master/3.1/3.0/2.4.






[GitHub] [spark] SparkQA commented on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


SparkQA commented on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-770159334


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39258/
   






[GitHub] [spark] dongjoon-hyun closed pull request #31369: [SPARK-34270][SS] Combine StateStoreMetrics should not override StateStoreCustomMetric

2021-01-29 Thread GitBox


dongjoon-hyun closed pull request #31369:
URL: https://github.com/apache/spark/pull/31369


   






[GitHub] [spark] SparkQA commented on pull request #30175: [SPARK-33274][SS] Fix job hang in cp mode when total cores less than total kafka partition

2021-01-29 Thread GitBox


SparkQA commented on pull request #30175:
URL: https://github.com/apache/spark/pull/30175#issuecomment-770156499


   **[Test build #134672 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134672/testReport)** for PR 30175 at commit [`9531c60`](https://github.com/apache/spark/commit/9531c60e2856b5345fa165597e4d29e6e4f55ab0).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770143893










[GitHub] [spark] AmplabJenkins commented on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770156333


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39257/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30175: [SPARK-33274][SS] Fix job hang in cp mode when total cores less than total kafka partition

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #30175:
URL: https://github.com/apache/spark/pull/30175#issuecomment-770150092


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39256/
   






[GitHub] [spark] SparkQA commented on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


SparkQA commented on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-770150169


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39258/
   






[GitHub] [spark] AmplabJenkins commented on pull request #30175: [SPARK-33274][SS] Fix job hang in cp mode when total cores less than total kafka partition

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #30175:
URL: https://github.com/apache/spark/pull/30175#issuecomment-770150092


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39256/
   






[GitHub] [spark] SparkQA commented on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


SparkQA commented on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-770144006


   **[Test build #134671 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134671/testReport)** for PR 29542 at commit [`00dbac4`](https://github.com/apache/spark/commit/00dbac4dbf59e89d6d4d2afb51187dd5821b6ec6).






[GitHub] [spark] SparkQA commented on pull request #30175: [SPARK-33274][SS] Fix job hang in cp mode when total cores less than total kafka partition

2021-01-29 Thread GitBox


SparkQA commented on pull request #30175:
URL: https://github.com/apache/spark/pull/30175#issuecomment-770143954


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39256/
   






[GitHub] [spark] SparkQA commented on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


SparkQA commented on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770143935


   **[Test build #134670 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134670/testReport)** for PR 31182 at commit [`fe08ea0`](https://github.com/apache/spark/commit/fe08ea000eecea1620c51320665b1138d3792681).






[GitHub] [spark] AmplabJenkins commented on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770143893


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39255/
   






[GitHub] [spark] SparkQA commented on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


SparkQA commented on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770143886


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39255/
   






[GitHub] [spark] gengliangwang closed pull request #31392: [SPARK-34288] [WEBUI] Add a tip info for the `resources` column in the executors page

2021-01-29 Thread GitBox


gengliangwang closed pull request #31392:
URL: https://github.com/apache/spark/pull/31392


   






[GitHub] [spark] SparkQA commented on pull request #30175: [SPARK-33274][SS] Fix job hang in cp mode when total cores less than total kafka partition

2021-01-29 Thread GitBox


SparkQA commented on pull request #30175:
URL: https://github.com/apache/spark/pull/30175#issuecomment-770140339


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39256/
   






[GitHub] [spark] SparkQA commented on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


SparkQA commented on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770140295


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39255/
   






[GitHub] [spark] gengliangwang commented on pull request #31392: [SPARK-34288] [WEBUI] Add a tip info for the `resources` column in the executors page

2021-01-29 Thread GitBox


gengliangwang commented on pull request #31392:
URL: https://github.com/apache/spark/pull/31392#issuecomment-770140240


   Thanks, merging to master.






[GitHub] [spark] sunchao commented on a change in pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


sunchao commented on a change in pull request #31182:
URL: https://github.com/apache/spark/pull/31182#discussion_r567179363



##
File path: sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala
##
@@ -1316,6 +1316,30 @@ class CachedTableSuite extends QueryTest with SQLTestUtils
     }
   }
 
+  test("SPARK-34269: cache lookup with ORDER BY clause") {
+    withTable("t") {
+      withTempView("v1") {
+        sql("CREATE TABLE t (key bigint, value string) USING parquet")
+        sql("CACHE TABLE v1 AS SELECT * FROM t ORDER BY key")
+
+        val query = sql("SELECT * FROM t ORDER BY key")
+        assert(spark.sharedState.cacheManager.lookupCachedData(query).isDefined)
+      }
+    }
+  }
+
+  test("SPARK-34269: cache lookup with LIMIT clause") {
+    withTable("t") {
+      withTempView("v1") {
+        sql("CREATE TABLE t (key bigint, value string) USING parquet")
+        sql("CACHE TABLE v1 AS SELECT * FROM t LIMIT 10")
+
+        val query = sql("SELECT * FROM t LIMIT 10")

Review comment:
   Sure








[GitHub] [spark] viirya commented on a change in pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


viirya commented on a change in pull request #31182:
URL: https://github.com/apache/spark/pull/31182#discussion_r567178325



##
File path: sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala
##
@@ -1316,6 +1316,30 @@ class CachedTableSuite extends QueryTest with SQLTestUtils
     }
   }
 
+  test("SPARK-34269: cache lookup with ORDER BY clause") {
+    withTable("t") {
+      withTempView("v1") {
+        sql("CREATE TABLE t (key bigint, value string) USING parquet")
+        sql("CACHE TABLE v1 AS SELECT * FROM t ORDER BY key")
+
+        val query = sql("SELECT * FROM t ORDER BY key")
+        assert(spark.sharedState.cacheManager.lookupCachedData(query).isDefined)
+      }
+    }
+  }
+
+  test("SPARK-34269: cache lookup with LIMIT clause") {
+    withTable("t") {
+      withTempView("v1") {
+        sql("CREATE TABLE t (key bigint, value string) USING parquet")
+        sql("CACHE TABLE v1 AS SELECT * FROM t LIMIT 10")
+
+        val query = sql("SELECT * FROM t LIMIT 10")

Review comment:
   Seems two tests can be combined? Only the query is different.
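
   A minimal sketch of what the combined test could look like, assuming it lives in Spark's own test sources where `QueryTest` and `SharedSparkSession` are available. The suite name `CacheLookupSketchSuite` is hypothetical; in the PR the test body would simply sit inside `CachedTableSuite`.

```scala
import org.apache.spark.sql.QueryTest
import org.apache.spark.sql.test.SharedSparkSession

// Hypothetical suite name, used only to keep the sketch self-contained.
class CacheLookupSketchSuite extends QueryTest with SharedSparkSession {

  test("SPARK-34269: cache lookup with ORDER BY / LIMIT clause") {
    // The two original tests differ only in the trailing clause,
    // so the shared body is run once per clause.
    Seq("ORDER BY key", "LIMIT 10").foreach { clause =>
      withTable("t") {
        withTempView("v1") {
          sql("CREATE TABLE t (key bigint, value string) USING parquet")
          sql(s"CACHE TABLE v1 AS SELECT * FROM t $clause")

          // The same query text should resolve to the cached plan.
          val query = sql(s"SELECT * FROM t $clause")
          assert(spark.sharedState.cacheManager.lookupCachedData(query).isDefined)
        }
      }
    }
  }
}
```

   Looping inside a single `test` keeps the diff small; the trade-off is that a failure report no longer names the failing clause unless it is added to the assertion message.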








[GitHub] [spark] SparkQA commented on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-29 Thread GitBox


SparkQA commented on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770134746


   **[Test build #134668 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134668/testReport)** for PR 31182 at commit [`976ae2d`](https://github.com/apache/spark/commit/976ae2df9aec7ff7997f83533163cb9ea38eada4).






[GitHub] [spark] SparkQA commented on pull request #30175: [SPARK-33274][SS] Fix job hang in cp mode when total cores less than total kafka partition

2021-01-29 Thread GitBox


SparkQA commented on pull request #30175:
URL: https://github.com/apache/spark/pull/30175#issuecomment-770133556


   **[Test build #134669 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134669/testReport)** for PR 30175 at commit [`51a9539`](https://github.com/apache/spark/commit/51a9539f1d5601a53dc4f22473127fbc8831ead0).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30175: [SPARK-33274][SS] Fix job hang in cp mode when total cores less than total kafka partition

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #30175:
URL: https://github.com/apache/spark/pull/30175#issuecomment-770133281










[GitHub] [spark] AmplabJenkins commented on pull request #30175: [SPARK-33274][SS] Fix job hang in cp mode when total cores less than total kafka partition

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #30175:
URL: https://github.com/apache/spark/pull/30175#issuecomment-770133281










[GitHub] [spark] SparkQA removed a comment on pull request #30175: [SPARK-33274][SS] Fix job hang in cp mode when total cores less than total kafka partition

2021-01-29 Thread GitBox


SparkQA removed a comment on pull request #30175:
URL: https://github.com/apache/spark/pull/30175#issuecomment-770122585


   **[Test build #134667 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134667/testReport)** for PR 30175 at commit [`f25f715`](https://github.com/apache/spark/commit/f25f715bb63d3a63bb2b90996e8d4f86fb96).






[GitHub] [spark] SparkQA commented on pull request #30175: [SPARK-33274][SS] Fix job hang in cp mode when total cores less than total kafka partition

2021-01-29 Thread GitBox


SparkQA commented on pull request #30175:
URL: https://github.com/apache/spark/pull/30175#issuecomment-770124348


   **[Test build #134667 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134667/testReport)** for PR 30175 at commit [`f25f715`](https://github.com/apache/spark/commit/f25f715bb63d3a63bb2b90996e8d4f86fb96).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #30175: [SPARK-33274][SS] Fix job hang in cp mode when total cores less than total kafka partition

2021-01-29 Thread GitBox


SparkQA commented on pull request #30175:
URL: https://github.com/apache/spark/pull/30175#issuecomment-770122585


   **[Test build #134667 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134667/testReport)** for PR 30175 at commit [`f25f715`](https://github.com/apache/spark/commit/f25f715bb63d3a63bb2b90996e8d4f86fb96).






[GitHub] [spark] aokolnychyi commented on pull request #31355: [SPARK-34255][SQL] Support partitioning with static number on required distribution and ordering on V2 write

2021-01-29 Thread GitBox


aokolnychyi commented on pull request #31355:
URL: https://github.com/apache/spark/pull/31355#issuecomment-770091245


   Requiring a static number of partitions is a valid use case, so this change looks good to me. Thanks, @HeartSaVioR!
   
   In my comment on the original PR, I meant that most data sources (that's my guess) would want to control the parallelism based on the size of the incoming data. For example, we may want to have reasonably sized files while writing to a Hive table. In that case, the number of shuffle partitions will depend on the number of records in the incoming batch and the estimated size of each record. To achieve that, we could rely solely on AQE, or we could accept some info from the source and feed that to AQE (it is probably similar to @rdblue's idea on bytes per task).
   
   The question I was asking myself is whether knowing that we would want this more dynamic control would influence the approach we take when a data source requires a static number of partitions. For example, I was thinking about an interface like `RequiresStaticNumberOfPartitions` that would extend `Write`, or an interface that would extend `RequiresDistributionAndOrdering`, instead of adding the control directly to the existing interface. However, I am just throwing out ideas to see what everybody thinks.
   
   As I said in the beginning, this PR looks good to me too.
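
   To make the interface idea above concrete, here is a minimal sketch; it is not the approach taken in this PR. `Write` and `RequiresDistributionAndOrdering` below are simplified stand-ins for Spark's connector interfaces so that the snippet compiles on its own. The method `requiredNumPartitions`, the object `PartitionPlanner`, and the `String`-typed distribution and ordering are hypothetical names chosen for illustration.

```scala
// Simplified stand-ins for Spark's DSv2 write interfaces (assumption:
// the real ones use Distribution and SortOrder types, not strings).
trait Write

trait RequiresDistributionAndOrdering extends Write {
  def requiredDistribution: String
  def requiredOrdering: Seq[String]
}

// Hypothetical opt-in interface from the comment: only sources that need
// a fixed partition count implement it; the base interface stays unchanged.
trait RequiresStaticNumberOfPartitions extends RequiresDistributionAndOrdering {
  def requiredNumPartitions: Int
}

object PartitionPlanner {
  // Use the static count when the source asks for one; otherwise fall back
  // to a default (for example the session's shuffle partition setting).
  def numShufflePartitions(write: Write, defaultNumPartitions: Int): Int =
    write match {
      case w: RequiresStaticNumberOfPartitions => w.requiredNumPartitions
      case _ => defaultNumPartitions
    }
}
```

   With this shape, a later, more dynamic control (such as a size hint fed to AQE) could be added as another opt-in interface without touching sources that only need the static number.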






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31369: [SPARK-34270][SS] Combine StateStoreMetrics should not override StateStoreCustomMetric

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #31369:
URL: https://github.com/apache/spark/pull/31369#issuecomment-770087726


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134661/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-770087727


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134666/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-770087727


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134666/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31369: [SPARK-34270][SS] Combine StateStoreMetrics should not override StateStoreCustomMetric

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #31369:
URL: https://github.com/apache/spark/pull/31369#issuecomment-770087726


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134661/
   






[GitHub] [spark] aokolnychyi commented on a change in pull request #31355: [SPARK-34255][SQL] Support partitioning with static number on required distribution and ordering on V2 write

2021-01-29 Thread GitBox


aokolnychyi commented on a change in pull request #31355:
URL: https://github.com/apache/spark/pull/31355#discussion_r567124615



##
File path: sql/core/src/test/scala/org/apache/spark/sql/connector/WriteDistributionAndOrderingSuite.scala
##
@@ -372,17 +492,82 @@ class WriteDistributionAndOrderingSuite
         Seq.empty
       )
     )
-    val writePartitioning = RangePartitioning(writeOrdering, conf.numShufflePartitions)
+    val writePartitioning = orderedWritePartitioning(writeOrdering, targetNumPartitions)
 
     checkWriteRequirements(
       tableDistribution,
+      targetNumPartitions,
       tableOrdering,
       expectedWritePartitioning = writePartitioning,
       expectedWriteOrdering = writeOrdering,
       writeTransform = df => df.sortWithinPartitions("data", "id"),
       writeCommand = command)
   }
 
+  ignore("ordered distribution and sort with manual repartition: append") {

Review comment:
   Shall we add the JIRA here? Is SPARK-34184 the right one?








[GitHub] [spark] SparkQA removed a comment on pull request #31369: [SPARK-34270][SS] Combine StateStoreMetrics should not override StateStoreCustomMetric

2021-01-29 Thread GitBox


SparkQA removed a comment on pull request #31369:
URL: https://github.com/apache/spark/pull/31369#issuecomment-769953174


   **[Test build #134661 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134661/testReport)** for PR 31369 at commit [`f950772`](https://github.com/apache/spark/commit/f9507725d86c3da4bb799b987761eb340a708c8e).






[GitHub] [spark] SparkQA commented on pull request #31369: [SPARK-34270][SS] Combine StateStoreMetrics should not override StateStoreCustomMetric

2021-01-29 Thread GitBox


SparkQA commented on pull request #31369:
URL: https://github.com/apache/spark/pull/31369#issuecomment-770079221


   **[Test build #134661 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134661/testReport)** for PR 31369 at commit [`f950772`](https://github.com/apache/spark/commit/f9507725d86c3da4bb799b987761eb340a708c8e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


SparkQA removed a comment on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-770014623


   **[Test build #134666 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134666/testReport)** for PR 31367 at commit [`0932597`](https://github.com/apache/spark/commit/0932597ec268ca52b51890f54be22c90684acf22).






[GitHub] [spark] SparkQA commented on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


SparkQA commented on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-770077223


   **[Test build #134666 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134666/testReport)** for PR 31367 at commit [`0932597`](https://github.com/apache/spark/commit/0932597ec268ca52b51890f54be22c90684acf22).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-770064614


   
   Refer to this link for build results (access rights to CI server needed):
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39253/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-770064614


   
   Refer to this link for build results (access rights to CI server needed):
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39253/
   






[GitHub] [spark] SparkQA commented on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


SparkQA commented on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-770050146


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39253/
   






[GitHub] [spark] AmplabJenkins commented on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-770037314


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134663/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-770037314


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134663/
   






[GitHub] [spark] SparkQA removed a comment on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


SparkQA removed a comment on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-769953154


   **[Test build #134663 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134663/testReport)** for PR 29542 at commit [`b6f9355`](https://github.com/apache/spark/commit/b6f9355992ffe5203006394c73a578e5f16ecbca).






[GitHub] [spark] SparkQA commented on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


SparkQA commented on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-770034641


   **[Test build #134663 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134663/testReport)** for PR 29542 at commit [`b6f9355`](https://github.com/apache/spark/commit/b6f9355992ffe5203006394c73a578e5f16ecbca).
* This patch passes all tests.
* This patch **does not merge cleanly**.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


SparkQA commented on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-770033178


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39253/
   






[GitHub] [spark] SparkQA commented on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


SparkQA commented on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-770014623


   **[Test build #134666 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134666/testReport)** for PR 31367 at commit [`0932597`](https://github.com/apache/spark/commit/0932597ec268ca52b51890f54be22c90684acf22).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-770008079


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134660/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31369: [SPARK-34270][SS] Combine StateStoreMetrics should not override StateStoreCustomMetric

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #31369:
URL: https://github.com/apache/spark/pull/31369#issuecomment-770008082










[GitHub] [spark] AmplabJenkins removed a comment on pull request #31182: [SPARK-34108][SQL] Caching with permanent view doesn't work in certain cases

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770008081


   
   Refer to this link for build results (access rights to CI server needed):
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39249/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-770008079


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134660/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31182: [SPARK-34108][SQL] Caching with permanent view doesn't work in certain cases

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-770008081


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39249/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31369: [SPARK-34270][SS] Combine StateStoreMetrics should not override StateStoreCustomMetric

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #31369:
URL: https://github.com/apache/spark/pull/31369#issuecomment-770008083










[GitHub] [spark] SparkQA removed a comment on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


SparkQA removed a comment on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-769922020


   **[Test build #134660 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134660/testReport)** for PR 31367 at commit [`c16968f`](https://github.com/apache/spark/commit/c16968f36cb589fbe3d13f60f95240af516fc12f).






[GitHub] [spark] SparkQA commented on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


SparkQA commented on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-76567


   **[Test build #134660 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134660/testReport)** for PR 31367 at commit [`c16968f`](https://github.com/apache/spark/commit/c16968f36cb589fbe3d13f60f95240af516fc12f).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #31369: [SPARK-34270][SS] Combine StateStoreMetrics should not override StateStoreCustomMetric

2021-01-29 Thread GitBox


SparkQA commented on pull request #31369:
URL: https://github.com/apache/spark/pull/31369#issuecomment-76470


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39248/
   






[GitHub] [spark] SparkQA commented on pull request #31369: [SPARK-34270][SS] Combine StateStoreMetrics should not override StateStoreCustomMetric

2021-01-29 Thread GitBox


SparkQA commented on pull request #31369:
URL: https://github.com/apache/spark/pull/31369#issuecomment-769992452


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39251/
   






[GitHub] [spark] SparkQA commented on pull request #31182: [SPARK-34108][SQL] Caching with permanent view doesn't work in certain cases

2021-01-29 Thread GitBox


SparkQA commented on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-769990255


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39249/
   






[GitHub] [spark] LucaCanali commented on a change in pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


LucaCanali commented on a change in pull request #31367:
URL: https://github.com/apache/spark/pull/31367#discussion_r566863847



##
File path: python/pyspark/sql/tests/test_pandas_sqlmetrics.py
##
@@ -0,0 +1,65 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+import unittest
+
+from pyspark.sql.functions import pandas_udf
+from pyspark.testing.sqlutils import ReusedSQLTestCase, have_pandas, have_pyarrow, \
+    pandas_requirement_message, pyarrow_requirement_message
+
+
+@unittest.skipIf(
+    not have_pandas or not have_pyarrow,
+    pandas_requirement_message or pyarrow_requirement_message)  # type: ignore[arg-type]
+class PandasSQLMetrics(ReusedSQLTestCase):

Review comment:
   Good point. I am not sure how to do a Scala-side test for a Python UDF, 
though.








[GitHub] [spark] ueshin commented on a change in pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


ueshin commented on a change in pull request #31367:
URL: https://github.com/apache/spark/pull/31367#discussion_r567030338



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/python/CoGroupedArrowPythonRunner.scala
##
@@ -71,6 +83,9 @@ class CoGroupedArrowPythonRunner(
 }
 
 PythonUDFRunner.writeUDFs(dataOut, funcs, argOffsets)
+val deltaTime = System.nanoTime()-startTime
+pythonCodeSerializeTime += deltaTime
+pythonCodeSent += dataOut.size()

Review comment:
   ditto as https://github.com/apache/spark/pull/31367#discussion_r566439263

##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/python/PythonUDFRunner.scala
##
@@ -46,12 +54,23 @@ class PythonUDFRunner(
 new WriterThread(env, worker, inputIterator, partitionIndex, context) {
 
   protected override def writeCommand(dataOut: DataOutputStream): Unit = {
+val startTime = System.nanoTime()
 PythonUDFRunner.writeUDFs(dataOut, funcs, argOffsets)
+val deltaTime = System.nanoTime()-startTime
+pythonCodeSerializeTime += deltaTime
+pythonCodeSent += dataOut.size()

Review comment:
   ditto.

##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/python/CoGroupedArrowPythonRunner.scala
##
@@ -83,6 +98,7 @@ class CoGroupedArrowPythonRunner(
   writeGroup(nextRight, rightSchema, dataOut, "right")
 }
 dataOut.writeInt(0)
+pythonDataSent += dataOut.size()

Review comment:
   ditto.








[GitHub] [spark] LucaCanali edited a comment on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


LucaCanali edited a comment on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-769738766


   I have updated the PR with the proposed list of metrics. The list currently 
contains 10 metrics.
   I can see the need to have just a few important metrics.
   However, the Python UDF implementation is complex, with many moving parts, 
and metrics can help with troubleshooting corner cases too. In those 
circumstances, more metrics mean more flexibility and a better chance of 
finding the root cause of a problem.
   
   Another point for discussion is how accurate the metrics are in the current 
implementation. I have run a few tests to check that the measured values make 
sense and are in the ballpark of what was expected. There is room to do more 
tests with some corner cases to understand how the instrumentation copes there.
   
   In particular, measuring execution time can be challenging at times, because 
all measurements are taken from JVM/Scala code. I am aiming to do a 
“good enough to be useful” job for the timing metrics, rather than precise 
timing. I have put some hints about the nuances I found when testing in the 
metrics descriptions.
   
   I think the "time spent sending data" metric can be useful when 
troubleshooting cases where the performance problem is with sending large 
amounts of data to Python. Time spent executing is the key metric for 
understanding overall performance. The numbers of rows returned and processed 
are also useful metrics for understanding how much work has been done and how 
much still needs to be done when monitoring the progress of an active query.
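
The timing approach described above (measure from the sending side, accumulate per call) can be sketched roughly as follows. This is a standalone Python illustration, not the PR's Scala code: `UdfMetrics`, `send_rows`, and the field names are hypothetical, and an in-memory buffer stands in for the stream to the Python worker.

```python
import io
import time
from dataclasses import dataclass


@dataclass
class UdfMetrics:
    """Hypothetical accumulator; the field names are illustrative only."""
    serialize_time_ns: int = 0
    bytes_sent: int = 0
    rows_sent: int = 0


def send_rows(rows, out, metrics):
    """Write rows to `out`, timing the write from the sender's side only."""
    start = time.perf_counter_ns()
    before = out.tell()
    for row in rows:
        out.write(repr(row).encode("utf-8") + b"\n")
        metrics.rows_sent += 1
    # Accumulate the elapsed time and the bytes written by this call.
    metrics.serialize_time_ns += time.perf_counter_ns() - start
    metrics.bytes_sent += out.tell() - before


metrics = UdfMetrics()
buffer = io.BytesIO()  # stands in for the stream to the worker
send_rows([(1, "a"), (2, "b")], buffer, metrics)
send_rows([(3, "c")], buffer, metrics)
```

The sketch adds the change in stream position rather than the running total, so calling `send_rows` several times on the same stream does not count earlier bytes again. The elapsed time covers only what the sender observes, which is why such a metric is approximate rather than precise.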






[GitHub] [spark] SparkQA commented on pull request #31369: [SPARK-34270][SS] Combine StateStoreMetrics should not override StateStoreCustomMetric

2021-01-29 Thread GitBox


SparkQA commented on pull request #31369:
URL: https://github.com/apache/spark/pull/31369#issuecomment-769983228


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39251/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31364: [SPARK-34266][SQL][DOCS] Update comments for `SessionCatalog.refreshTable()` and `CatalogImpl.refreshTable()`

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #31364:
URL: https://github.com/apache/spark/pull/31364#issuecomment-769724796










[GitHub] [spark] AmplabJenkins commented on pull request #31364: [SPARK-34266][SQL][DOCS] Update comments for `SessionCatalog.refreshTable()` and `CatalogImpl.refreshTable()`

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #31364:
URL: https://github.com/apache/spark/pull/31364#issuecomment-769982934










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-769981642


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39250/
   






[GitHub] [spark] AmplabJenkins commented on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-769981642


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39250/
   






[GitHub] [spark] SparkQA commented on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


SparkQA commented on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-769981607


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39250/
   






[GitHub] [spark] SparkQA commented on pull request #31369: [SPARK-34270][SS] Combine StateStoreMetrics should not override StateStoreCustomMetric

2021-01-29 Thread GitBox


SparkQA commented on pull request #31369:
URL: https://github.com/apache/spark/pull/31369#issuecomment-769980719


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39248/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-769977920


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39247/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-769977920


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39247/
   






[GitHub] [spark] SparkQA commented on pull request #31182: [SPARK-34108][SQL] Caching with permanent view doesn't work in certain cases

2021-01-29 Thread GitBox


SparkQA commented on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-769975333


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39249/
   






[GitHub] [spark] SparkQA commented on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


SparkQA commented on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-769972539


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39250/
   






[GitHub] [spark] SparkQA commented on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


SparkQA commented on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-769955167


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39247/
   






[GitHub] [spark] dongjoon-hyun closed pull request #31393: [SPARK-34289][SQL] Parquet vectorized reader support column index

2021-01-29 Thread GitBox


dongjoon-hyun closed pull request #31393:
URL: https://github.com/apache/spark/pull/31393


   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31182: [SPARK-34108][SQL] Caching with permanent view doesn't work in certain cases

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-769952594


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134662/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31367: [SPARK-34265][PYTHON][SQL] Instrument Python UDF using SQL Metrics

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #31367:
URL: https://github.com/apache/spark/pull/31367#issuecomment-769952593


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134659/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31395: [SPARK-33599][SQL] Restore the assert-like in catalyst/analysis

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #31395:
URL: https://github.com/apache/spark/pull/31395#issuecomment-769952598


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39246/
   






[GitHub] [spark] SparkQA commented on pull request #31369: [SPARK-34270][SS] Combine StateStoreMetrics should not override StateStoreCustomMetric

2021-01-29 Thread GitBox


SparkQA commented on pull request #31369:
URL: https://github.com/apache/spark/pull/31369#issuecomment-769953174


   **[Test build #134661 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134661/testReport)** for PR 31369 at commit [`f950772`](https://github.com/apache/spark/commit/f9507725d86c3da4bb799b987761eb340a708c8e).






[GitHub] [spark] SparkQA commented on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-29 Thread GitBox


SparkQA commented on pull request #29542:
URL: https://github.com/apache/spark/pull/29542#issuecomment-769953154


   **[Test build #134663 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134663/testReport)** for PR 29542 at commit [`b6f9355`](https://github.com/apache/spark/commit/b6f9355992ffe5203006394c73a578e5f16ecbca).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #31369: [SPARK-34270][SS] Combine StateStoreMetrics should not override StateStoreCustomMetric

2021-01-29 Thread GitBox


AmplabJenkins removed a comment on pull request #31369:
URL: https://github.com/apache/spark/pull/31369#issuecomment-769952597


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134664/
   






[GitHub] [spark] AmplabJenkins commented on pull request #31182: [SPARK-34108][SQL] Caching with permanent view doesn't work in certain cases

2021-01-29 Thread GitBox


AmplabJenkins commented on pull request #31182:
URL: https://github.com/apache/spark/pull/31182#issuecomment-769952594


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134662/
   





