[GitHub] [spark] SparkQA commented on pull request #32899: [SPARK-35652][SQL][3.0] joinWith on two table generated from same one

2021-06-15 Thread GitBox


SparkQA commented on pull request #32899:
URL: https://github.com/apache/spark/pull/32899#issuecomment-862066764


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44369/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on pull request #32907: [SPARK-35757][CORE] Add bitwise AND operation and functionality for intersecting bloom filters

2021-06-15 Thread GitBox


SparkQA commented on pull request #32907:
URL: https://github.com/apache/spark/pull/32907#issuecomment-862066215


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44368/
   





[GitHub] [spark] SparkQA removed a comment on pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-15 Thread GitBox


SparkQA removed a comment on pull request #32921:
URL: https://github.com/apache/spark/pull/32921#issuecomment-862005335


   **[Test build #139834 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139834/testReport)**
 for PR 32921 at commit 
[`04ae0e3`](https://github.com/apache/spark/commit/04ae0e363e98cc0a8af1100ef11f08d9f2e47d1a).





[GitHub] [spark] SparkQA commented on pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-15 Thread GitBox


SparkQA commented on pull request #32921:
URL: https://github.com/apache/spark/pull/32921#issuecomment-862057877


   **[Test build #139834 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139834/testReport)**
 for PR 32921 at commit 
[`04ae0e3`](https://github.com/apache/spark/commit/04ae0e363e98cc0a8af1100ef11f08d9f2e47d1a).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] StefanXiepj opened a new pull request #32925: SPARK-35622: DataFrame's count function do not need groupBy and avoid shuffle

2021-06-15 Thread GitBox


StefanXiepj opened a new pull request #32925:
URL: https://github.com/apache/spark/pull/32925


   
   
   ### What changes were proposed in this pull request?
   
   Use `df.rdd.count()` in place of `df.count()`.
   
   ### Why are the changes needed?
   
   DataFrame's count function does not need a groupBy; using `df.rdd.count()` 
instead of `df.count()` avoids the shuffle.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Added a unit test.





[GitHub] [spark] allisonwang-db commented on a change in pull request #32787: [SPARK-35618][SQL] Resolve star expressions in subqueries using outer query plans

2021-06-15 Thread GitBox


allisonwang-db commented on a change in pull request #32787:
URL: https://github.com/apache/spark/pull/32787#discussion_r652363922



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveSubquerySuite.scala
##
@@ -177,4 +178,61 @@ class ResolveSubquerySuite extends AnalysisTest {
   condition = Some(sum('a) === sum('c)))
 assertAnalysisError(plan, Seq("Invalid expressions: [sum(a), sum(c)]"))
   }
+
+  test("SPARK-35618: lateral join with star expansion") {

Review comment:
   @maropu I looked into how regex expressions are resolved and the logic 
is actually different from star expressions. It won't throw exceptions when 
there is no match. Instead, it returns an empty sequence. So we can't tell if 
the regex expression is resolved by the current plan with an empty output, or 
it can't be resolved.
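
A minimal sketch of the ambiguity described in the comment above, using a simplified model rather than Spark's analyzer. All names here (`ResolutionSketch`, `expandStar`, `expandRegex`) are hypothetical, and star expansion is modeled with an `Option` instead of the exception that the real resolution throws:

```scala
// Hypothetical, simplified model; these are not Spark's analyzer APIs.
object ResolutionSketch {
  // Star-style expansion: an unknown target is reported as None, so
  // "cannot resolve here" stays distinct from "resolved to zero columns".
  def expandStar(target: String, plans: Map[String, Seq[String]]): Option[Seq[String]] =
    plans.get(target)

  // Regex-style expansion: a plan with no columns and a plan with no
  // matching columns both yield an empty sequence.
  def expandRegex(pattern: String, columns: Seq[String]): Seq[String] =
    columns.filter(_.matches(pattern))
}
```

With the `Option`-based variant a caller could fall back to the outer plan on `None`; the regex-style variant gives the caller nothing to branch on.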







[GitHub] [spark] sarutak commented on pull request #32924: [SPARK-35771][SQL] Format year-month intervals using type fields

2021-06-15 Thread GitBox


sarutak commented on pull request #32924:
URL: https://github.com/apache/spark/pull/32924#issuecomment-862052884


   cc: @MaxGekk 





[GitHub] [spark] sarutak opened a new pull request #32924: Format year-month intervals using type fields.

2021-06-15 Thread GitBox


sarutak opened a new pull request #32924:
URL: https://github.com/apache/spark/pull/32924


   ### What changes were proposed in this pull request?
   
   This PR proposes to format year-month interval to strings using the start 
and end fields of `YearMonthIntervalType`.
   
   ### Why are the changes needed?
   
   Currently, these fields are ignored, and any `YearMonthIntervalType` is 
formatted as `INTERVAL YEAR TO MONTH`.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   New test.
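
A rough sketch of the formatting idea in this description, assuming the two `Byte` fields of `YearMonthIntervalType` encode YEAR as 0 and MONTH as 1. That encoding, the object name `YearMonthIntervalSketch`, and the helper names are assumptions made for illustration, not the code in the PR:

```scala
// Hypothetical sketch; the field encoding (0 = YEAR, 1 = MONTH) is assumed.
object YearMonthIntervalSketch {
  val Year: Byte = 0
  val Month: Byte = 1

  private def fieldName(field: Byte): String =
    if (field == Year) "YEAR"
    else if (field == Month) "MONTH"
    else throw new IllegalArgumentException(s"Unknown field: $field")

  // Formats the type name from its start and end fields instead of
  // always returning "INTERVAL YEAR TO MONTH".
  def typeName(startField: Byte, endField: Byte): String = {
    require(startField <= endField, "startField must not come after endField")
    if (startField == endField) s"INTERVAL ${fieldName(startField)}"
    else s"INTERVAL ${fieldName(startField)} TO ${fieldName(endField)}"
  }
}
```

Under these assumptions, `typeName(Year, Year)` yields `INTERVAL YEAR`, while `typeName(Year, Month)` keeps the existing `INTERVAL YEAR TO MONTH` output.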





[GitHub] [spark] sumeetgajjar commented on pull request #32912: [SPARK-35429][CORE] Remove commons-httpclient from Hadoop-3.2 profile due to EOL and CVEs

2021-06-15 Thread GitBox


sumeetgajjar commented on pull request #32912:
URL: https://github.com/apache/spark/pull/32912#issuecomment-862050777


   Thank you @dongjoon-hyun @wangyum and @sunchao for the quick review and 
comments.





[GitHub] [spark] weixiuli opened a new pull request #32923: [SPARK-35783][SQL] Set the list of read columns in the task configuration to reduce reading of ORC data.

2021-06-15 Thread GitBox


weixiuli opened a new pull request #32923:
URL: https://github.com/apache/spark/pull/32923


   
   ### What changes were proposed in this pull request?
   Set the list of read columns in the task configuration to reduce reading of 
ORC data.
   ### Why are the changes needed?
   Currently, if the read column list is not set in the task configuration, the 
reader reads all columns in the ORC table. Setting the list of read columns in 
the task configuration therefore reduces the amount of ORC data read.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Existing unit tests.
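
As an illustration of the idea only, the sketch below derives a comma-separated list of column ids from the columns a query needs. The object and method names are hypothetical, and how the PR actually writes the list into the task configuration is not shown in this message:

```scala
// Hypothetical sketch; not the PR's code and not an ORC or Spark API.
object OrcReadColumnsSketch {
  // Maps each required column to its position in the file schema, drops
  // columns that the file does not contain, and joins the ids in order.
  def includeColumnIds(fileSchema: Seq[String], required: Seq[String]): String =
    required
      .map(column => fileSchema.indexOf(column))
      .filter(_ >= 0)
      .sorted
      .mkString(",")
}
```

A reader given such a list can skip the remaining columns instead of decoding the whole table.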
   





[GitHub] [spark] wangyum commented on pull request #32907: [SPARK-35757][CORE] Add bitwise AND operation and functionality for intersecting bloom filters

2021-06-15 Thread GitBox


wangyum commented on pull request #32907:
URL: https://github.com/apache/spark/pull/32907#issuecomment-862050091


   @kudhru Please fix the code style first:
   ```
   Scalastyle checks failed at following occurrences:
   [error] 
/home/jenkins/workspace/SparkPullRequestBuilder/common/sketch/src/test/scala/org/apache/spark/util/sketch/BloomFilterSuite.scala:102:
 File line length exceeds 100 characters
   [error] Total time: 46 s, completed Jun 15, 2021 10:21:32 PM
   ```





[GitHub] [spark] SparkQA removed a comment on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


SparkQA removed a comment on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-861936755


   **[Test build #139829 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139829/testReport)**
 for PR 32914 at commit 
[`1031593`](https://github.com/apache/spark/commit/10315931fff2ad06030cbc8e017f01e0be8593bc).





[GitHub] [spark] SparkQA commented on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


SparkQA commented on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-862048837


   **[Test build #139829 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139829/testReport)**
 for PR 32914 at commit 
[`1031593`](https://github.com/apache/spark/commit/10315931fff2ad06030cbc8e017f01e0be8593bc).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] kudhru commented on pull request #32907: [SPARK-35757][CORE] Add bitwise AND operation and functionality for intersecting bloom filters

2021-06-15 Thread GitBox


kudhru commented on pull request #32907:
URL: https://github.com/apache/spark/pull/32907#issuecomment-862048748


   I am a bit confused as to how and when this PR will be merged into the 
master branch. Could someone please clarify?





[GitHub] [spark] SparkQA removed a comment on pull request #32907: [SPARK-35757][CORE] Add bitwise AND operation and functionality for intersecting bloom filters

2021-06-15 Thread GitBox


SparkQA removed a comment on pull request #32907:
URL: https://github.com/apache/spark/pull/32907#issuecomment-862047283


   **[Test build #139844 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139844/testReport)**
 for PR 32907 at commit 
[`5cb4bd0`](https://github.com/apache/spark/commit/5cb4bd0966e25e0c2a374a43729ef3732027a23b).





[GitHub] [spark] SparkQA commented on pull request #32907: [SPARK-35757][CORE] Add bitwise AND operation and functionality for intersecting bloom filters

2021-06-15 Thread GitBox


SparkQA commented on pull request #32907:
URL: https://github.com/apache/spark/pull/32907#issuecomment-862048308


   **[Test build #139844 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139844/testReport)**
 for PR 32907 at commit 
[`5cb4bd0`](https://github.com/apache/spark/commit/5cb4bd0966e25e0c2a374a43729ef3732027a23b).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `case class MakeYMInterval(years: Expression, months: Expression)`
 * `case class YearMonthIntervalType(startField: Byte, endField: Byte) 
extends AtomicType `





[GitHub] [spark] AmplabJenkins commented on pull request #31992: [SPARK-34898][CORE] We should log SparkListenerExecutorMetricsUpdateEvent of `driver` appropriately when `spark.eventLog.logStageExe

2021-06-15 Thread GitBox


AmplabJenkins commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862048157


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44365/
   





[GitHub] [spark] SparkQA commented on pull request #31992: [SPARK-34898][CORE] We should log SparkListenerExecutorMetricsUpdateEvent of `driver` appropriately when `spark.eventLog.logStageExecutorM

2021-06-15 Thread GitBox


SparkQA commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862048138


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44365/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32801: [SPARK-12567][SQL] Add aes_encrypt and aes_decrypt builtin functions

2021-06-15 Thread GitBox


AmplabJenkins removed a comment on pull request #32801:
URL: https://github.com/apache/spark/pull/32801#issuecomment-862047380


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44364/
   





[GitHub] [spark] sarutak commented on pull request #32922: [SPARK-35774][SQL] Parse any year-month interval types in SQL

2021-06-15 Thread GitBox


sarutak commented on pull request #32922:
URL: https://github.com/apache/spark/pull/32922#issuecomment-862047487


   cc: @MaxGekk 





[GitHub] [spark] SparkQA commented on pull request #32801: [SPARK-12567][SQL] Add aes_encrypt and aes_decrypt builtin functions

2021-06-15 Thread GitBox


SparkQA commented on pull request #32801:
URL: https://github.com/apache/spark/pull/32801#issuecomment-862047355


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44364/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32801: [SPARK-12567][SQL] Add aes_encrypt and aes_decrypt builtin functions

2021-06-15 Thread GitBox


AmplabJenkins commented on pull request #32801:
URL: https://github.com/apache/spark/pull/32801#issuecomment-862047380


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44364/
   





[GitHub] [spark] SparkQA commented on pull request #32907: [SPARK-35757][CORE] Add bitwise AND operation and functionality for intersecting bloom filters

2021-06-15 Thread GitBox


SparkQA commented on pull request #32907:
URL: https://github.com/apache/spark/pull/32907#issuecomment-862047283


   **[Test build #139844 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139844/testReport)**
 for PR 32907 at commit 
[`5cb4bd0`](https://github.com/apache/spark/commit/5cb4bd0966e25e0c2a374a43729ef3732027a23b).





[GitHub] [spark] sarutak opened a new pull request #32922: [SPARK-35774][SQL] Parse any year-month interval types in SQL

2021-06-15 Thread GitBox


sarutak opened a new pull request #32922:
URL: https://github.com/apache/spark/pull/32922


   
   ### What changes were proposed in this pull request?
   
   This PR extends the parser rules to be able to parse the following types:
   
   * INTERVAL YEAR
   * INTERVAL YEAR TO MONTH
   * INTERVAL MONTH
   
   ### Why are the changes needed?
   
   For ANSI compliance.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   New assertion.
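
A small sketch of what recognizing these three type names could look like. It is a plain string matcher with hypothetical names (`IntervalTypeParserSketch`, `parse`) and an assumed field encoding (0 = YEAR, 1 = MONTH); the PR itself extends Spark's SQL parser rules, which are not reproduced here:

```scala
import java.util.Locale

// Hypothetical sketch; the field encoding (0 = YEAR, 1 = MONTH) is assumed.
object IntervalTypeParserSketch {
  val Year: Byte = 0
  val Month: Byte = 1

  // Returns the (start, end) fields for a supported type name, or None.
  def parse(sqlType: String): Option[(Byte, Byte)] =
    sqlType.trim.toUpperCase(Locale.ROOT).split("\\s+").toList match {
      case List("INTERVAL", "YEAR")                => Some((Year, Year))
      case List("INTERVAL", "YEAR", "TO", "MONTH") => Some((Year, Month))
      case List("INTERVAL", "MONTH")               => Some((Month, Month))
      case _                                       => None
    }
}
```

Any other input, such as `INTERVAL MONTH TO YEAR`, is rejected with `None` in this sketch.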





[GitHub] [spark] SparkQA removed a comment on pull request #31992: [SPARK-34898][CORE] We should log SparkListenerExecutorMetricsUpdateEvent of `driver` appropriately when `spark.eventLog.logStageE

2021-06-15 Thread GitBox


SparkQA removed a comment on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862044801


   **[Test build #139843 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139843/testReport)**
 for PR 31992 at commit 
[`87a079e`](https://github.com/apache/spark/commit/87a079e5fa159dad343adcf8ac3f158ff4870f6b).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #31992: [SPARK-34898][CORE] We should log SparkListenerExecutorMetricsUpdateEvent of `driver` appropriately when `spark.eventLog.log

2021-06-15 Thread GitBox


AmplabJenkins removed a comment on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862046824


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139843/
   





[GitHub] [spark] AmplabJenkins commented on pull request #31992: [SPARK-34898][CORE] We should log SparkListenerExecutorMetricsUpdateEvent of `driver` appropriately when `spark.eventLog.logStageExe

2021-06-15 Thread GitBox


AmplabJenkins commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862046824


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139843/
   





[GitHub] [spark] SparkQA commented on pull request #31992: [SPARK-34898][CORE] We should log SparkListenerExecutorMetricsUpdateEvent of `driver` appropriately when `spark.eventLog.logStageExecutorM

2021-06-15 Thread GitBox


SparkQA commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862046804


   **[Test build #139843 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139843/testReport)**
 for PR 31992 at commit 
[`87a079e`](https://github.com/apache/spark/commit/87a079e5fa159dad343adcf8ac3f158ff4870f6b).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32907: [SPARK-35757][CORE] Add bitwise AND operation and functionality for intersecting bloom filters

2021-06-15 Thread GitBox


AmplabJenkins removed a comment on pull request #32907:
URL: https://github.com/apache/spark/pull/32907#issuecomment-860798654









[GitHub] [spark] SparkQA removed a comment on pull request #32907: [SPARK-35757][CORE] Add bitwise AND operation and functionality for intersecting bloom filters

2021-06-15 Thread GitBox


SparkQA removed a comment on pull request #32907:
URL: https://github.com/apache/spark/pull/32907#issuecomment-862041054


   **[Test build #139840 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139840/testReport)**
 for PR 32907 at commit 
[`f1aec5e`](https://github.com/apache/spark/commit/f1aec5e39ef135edc39724efb3802eea8053ea37).





[GitHub] [spark] SparkQA commented on pull request #31992: [SPARK-34898][CORE] We should log SparkListenerExecutorMetricsUpdateEvent of `driver` appropriately when `spark.eventLog.logStageExecutorM

2021-06-15 Thread GitBox


SparkQA commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862044801


   **[Test build #139843 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139843/testReport)**
 for PR 31992 at commit 
[`87a079e`](https://github.com/apache/spark/commit/87a079e5fa159dad343adcf8ac3f158ff4870f6b).





[GitHub] [spark] AngersZhuuuu commented on pull request #31992: [SPARK-34898][CORE] We should log SparkListenerExecutorMetricsUpdateEvent of `driver` appropriately when `spark.eventLog.logStageExec

2021-06-15 Thread GitBox


AngersZhuuuu commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862044197


   > @AngersZhuuuu BTW did you disable GA in your fork repo? It should be 
enabled so that the PR leverages the GA resources in your forked repo.
   
   No, but this PR is a little long.





[GitHub] [spark] AngersZhuuuu commented on pull request #31992: [SPARK-34898][CORE] We should log SparkListenerExecutorMetricsUpdateEvent of `driver` appropriately when `spark.eventLog.logStageExec

2021-06-15 Thread GitBox


AngersZhuuuu commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862043865


   > @AngersZhuuuu, mind making the PR description unambiguous? What's "driver 
executor peakMemoryMetrics"?
   
   How about the current one?





[GitHub] [spark] AmplabJenkins commented on pull request #32907: [SPARK-35757][CORE] Add bitwise AND operation and functionality for intersecting bloom filters

2021-06-15 Thread GitBox


AmplabJenkins commented on pull request #32907:
URL: https://github.com/apache/spark/pull/32907#issuecomment-862042035


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139840/
   





[GitHub] [spark] SparkQA commented on pull request #32907: [SPARK-35757][CORE] Add bitwise AND operation and functionality for intersecting bloom filters

2021-06-15 Thread GitBox


SparkQA commented on pull request #32907:
URL: https://github.com/apache/spark/pull/32907#issuecomment-862042021


   **[Test build #139840 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139840/testReport)**
 for PR 32907 at commit 
[`f1aec5e`](https://github.com/apache/spark/commit/f1aec5e39ef135edc39724efb3802eea8053ea37).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #31992: [SPARK-34898][CORE] We should log SparkListenerExecutorMetricsUpdateEvent of `driver` appropriately when `spark.eventLog.logStageExecutorM

2021-06-15 Thread GitBox


SparkQA commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862041476


   **[Test build #139842 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139842/testReport)**
 for PR 31992 at commit 
[`6c81e2d`](https://github.com/apache/spark/commit/6c81e2dd74b7f49b3ba30b8618d1a502db1246dc).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


AmplabJenkins removed a comment on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-862040635









[GitHub] [spark] SparkQA commented on pull request #32919: [SPARK-35378][SQL][FOLLOWUP] Restore the command execution name for DataFrameWriterV2

2021-06-15 Thread GitBox


SparkQA commented on pull request #32919:
URL: https://github.com/apache/spark/pull/32919#issuecomment-862041045


   **[Test build #139838 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139838/testReport)**
 for PR 32919 at commit 
[`297c43d`](https://github.com/apache/spark/commit/297c43d5e9ed8586820f228a6d1693309a1a0b4d).





[GitHub] [spark] SparkQA commented on pull request #32899: [SPARK-35652][SQL][3.0] joinWith on two table generated from same one

2021-06-15 Thread GitBox


SparkQA commented on pull request #32899:
URL: https://github.com/apache/spark/pull/32899#issuecomment-862041113


   **[Test build #139841 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139841/testReport)**
 for PR 32899 at commit 
[`b5c7706`](https://github.com/apache/spark/commit/b5c77069001fc64cb01628554ab2ac7e4bc42c7c).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-15 Thread GitBox


AmplabJenkins removed a comment on pull request #32921:
URL: https://github.com/apache/spark/pull/32921#issuecomment-862040632









[GitHub] [spark] SparkQA commented on pull request #32907: [SPARK-35757][CORE] Add bitwise AND operation and functionality for intersecting bloom filters

2021-06-15 Thread GitBox


SparkQA commented on pull request #32907:
URL: https://github.com/apache/spark/pull/32907#issuecomment-862041054


   **[Test build #139840 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139840/testReport)**
 for PR 32907 at commit 
[`f1aec5e`](https://github.com/apache/spark/commit/f1aec5e39ef135edc39724efb3802eea8053ea37).





[GitHub] [spark] SparkQA commented on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


SparkQA commented on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-862041004


   **[Test build #139839 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139839/testReport)**
 for PR 32914 at commit 
[`7211c4b`](https://github.com/apache/spark/commit/7211c4b407e5bf8353af3d2cbf6f072c79d5a175).





[GitHub] [spark] AmplabJenkins commented on pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-15 Thread GitBox


AmplabJenkins commented on pull request #32921:
URL: https://github.com/apache/spark/pull/32921#issuecomment-862040632









[GitHub] [spark] AmplabJenkins commented on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


AmplabJenkins commented on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-862040639









[GitHub] [spark] SparkQA removed a comment on pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-15 Thread GitBox


SparkQA removed a comment on pull request #32921:
URL: https://github.com/apache/spark/pull/32921#issuecomment-862006513


   **[Test build #139835 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139835/testReport)**
 for PR 32921 at commit 
[`202be14`](https://github.com/apache/spark/commit/202be14f09cbacd94de9d0bc3b518c87292f3878).





[GitHub] [spark] SparkQA commented on pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-15 Thread GitBox


SparkQA commented on pull request #32921:
URL: https://github.com/apache/spark/pull/32921#issuecomment-862039808


   **[Test build #139835 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139835/testReport)**
 for PR 32921 at commit 
[`202be14`](https://github.com/apache/spark/commit/202be14f09cbacd94de9d0bc3b518c87292f3878).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] HyukjinKwon commented on pull request #31992: [SPARK-34898][CORE] We should log SparkListenerExecutorMetricsUpdateEvent of `driver` appropriately when `spark.eventLog.logStageExecu

2021-06-15 Thread GitBox


HyukjinKwon commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862038185


   cc @HeartSaVioR too FYI





[GitHub] [spark] HyukjinKwon commented on pull request #31992: [SPARK-34898][CORE] We should log SparkListenerExecutorMetricsUpdateEvent of `driver` appropriately when `spark.eventLog.logStageExecu

2021-06-15 Thread GitBox


HyukjinKwon commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862038027


   @AngersZhuuuu BTW, did you disable GitHub Actions (GA) in your forked repo? 
It should be enabled so that the PR can use the GA resources in your fork.





[GitHub] [spark] mridulm edited a comment on pull request #32385: [WIP][SPARK-35275][CORE] Add checksum for shuffle blocks and diagnose corruption

2021-06-15 Thread GitBox


mridulm edited a comment on pull request #32385:
URL: https://github.com/apache/spark/pull/32385#issuecomment-862024269


   lol, thanks for the links @Ngone51  :-)
   Glad I went through this once more anyway - it will help me better 
understand the sub-PRs!
   Will wait for the update before taking a look at #32401.





[GitHub] [spark] mridulm edited a comment on pull request #32385: [WIP][SPARK-35275][CORE] Add checksum for shuffle blocks and diagnose corruption

2021-06-15 Thread GitBox


mridulm edited a comment on pull request #32385:
URL: https://github.com/apache/spark/pull/32385#issuecomment-862024269


   lol, thanks for the links @Ngone51  :-)
   Glad I went through this once more anyway - it will help me better 
understand the sub-PRs!
   Will wait for the update before taking a look.





[GitHub] [spark] HyukjinKwon commented on pull request #31992: [SPARK-34898][CORE] We should send SparkListenerExecutorMetricsUpdateEventLog of `driver` appropriately

2021-06-15 Thread GitBox


HyukjinKwon commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862036980


   Also do you mean `SparkListenerExecutorMetricsUpdateEvent` by 
`SparkListenerExecutorMetricsUpdateEventLog`?





[GitHub] [spark] HyukjinKwon commented on pull request #31992: [SPARK-34898][CORE] We should send SparkListenerExecutorMetricsUpdateEventLog of `driver` appropriately

2021-06-15 Thread GitBox


HyukjinKwon commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862036119


   @AngersZhuuuu, mind making the PR description unambiguous? What's "driver 
executor peakMemoryMetrics"?





[GitHub] [spark] SparkQA commented on pull request #32801: [SPARK-12567][SQL] Add aes_encrypt and aes_decrypt builtin functions

2021-06-15 Thread GitBox


SparkQA commented on pull request #32801:
URL: https://github.com/apache/spark/pull/32801#issuecomment-862035784


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44364/
   





[GitHub] [spark] cloud-fan commented on pull request #32907: [SPARK-35757][CORE] Add bitwise AND operation and functionality for intersecting bloom filters

2021-06-15 Thread GitBox


cloud-fan commented on pull request #32907:
URL: https://github.com/apache/spark/pull/32907#issuecomment-862035609


   @kudhru I think you need to rebase this PR onto the latest master branch, 
and also sync the master branch of your Spark fork with the latest upstream 
master branch. Otherwise GitHub Actions won't work.





[GitHub] [spark] SparkQA commented on pull request #31992: [SPARK-34898][CORE] We should send SparkListenerExecutorMetricsUpdateEventLog of `driver` appropriately

2021-06-15 Thread GitBox


SparkQA commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862035537


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44365/
   





[GitHub] [spark] HyukjinKwon commented on a change in pull request #31992: [SPARK-34898][CORE] We should send SparkListenerExecutorMetricsUpdateEventLog of `driver` appropriately

2021-06-15 Thread GitBox


HyukjinKwon commented on a change in pull request #31992:
URL: https://github.com/apache/spark/pull/31992#discussion_r652346351



##
File path: 
core/src/test/scala/org/apache/spark/scheduler/EventLoggingListenerSuite.scala
##
@@ -618,6 +619,10 @@ class EventLoggingListenerSuite extends SparkFunSuite with 
LocalSparkContext wit
 assert(expected.stageInfo.stageId === actual.stageInfo.stageId)
   case (expected: SparkListenerTaskEnd, actual: SparkListenerTaskEnd) =>
 assert(expected.stageId === actual.stageId)
+  case (expected: SparkListenerExecutorMetricsUpdate,
+  actual: SparkListenerExecutorMetricsUpdate) =>

Review comment:
   ```suggestion
 actual: SparkListenerExecutorMetricsUpdate) =>
   ```







[GitHub] [spark] cloud-fan commented on pull request #32907: [SPARK-35757][CORE] Add bitwise AND operation and functionality for intersecting bloom filters

2021-06-15 Thread GitBox


cloud-fan commented on pull request #32907:
URL: https://github.com/apache/spark/pull/32907#issuecomment-862035082


   ok to test





[GitHub] [spark] cloud-fan commented on a change in pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-06-15 Thread GitBox


cloud-fan commented on a change in pull request #32401:
URL: https://github.com/apache/spark/pull/32401#discussion_r652345736



##
File path: 
core/src/main/java/org/apache/spark/shuffle/api/ShuffleMapOutputWriter.java
##
@@ -68,8 +72,11 @@
*for that partition id.
* 
* 2) An optional metadata blob that can be used by shuffle readers.
+   *
+   * @param checksums The checksum values for each partition if shuffle 
checksum enabled.
+   *  Otherwise, it's empty.
*/
-  MapOutputCommitMessage commitAllPartitions() throws IOException;
+  MapOutputCommitMessage commitAllPartitions(long[] checksums) throws 
IOException;

Review comment:
   TBH I don't think the current shuffle API provides enough abstraction to 
compute checksums. I'm OK with this change as the shuffle API is still private, 
but we should revisit the shuffle API later, so that checksums can be computed 
on the shuffle implementation side.
   
   The issue I see now is that Spark writes local spill files and then asks the 
shuffle implementation to "transfer" them. As a result, Spark has to compute 
the checksum itself while writing the spill files, to reduce the perf overhead.
   
   We can discuss it later.
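   
   To illustrate the point about computing the checksum during writing rather 
than in a separate pass, here is a minimal sketch. It is not code from this PR: 
the class `PartitionChecksumSketch`, the method `writeWithChecksums`, and the 
choice of CRC32 are assumptions made for illustration only. It only shows how a 
`long[]` with one value per partition, like the `checksums` parameter in the 
diff above, could be produced as a side effect of the write.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CheckedOutputStream;

// Hypothetical helper, not part of Spark: computes one checksum per partition
// while the partition bytes are written, so the data is not read a second time.
public class PartitionChecksumSketch {

  // Writes every partition to `sink` and returns one CRC32 value per partition.
  // CRC32 is an assumed algorithm; the thread does not say which one is used.
  static long[] writeWithChecksums(byte[][] partitions, OutputStream sink)
      throws IOException {
    long[] checksums = new long[partitions.length];
    for (int i = 0; i < partitions.length; i++) {
      // A fresh CRC32 per partition; the wrapper updates it on every write.
      CheckedOutputStream out = new CheckedOutputStream(sink, new CRC32());
      out.write(partitions[i]);
      out.flush(); // flush only: the shared sink stays open for later partitions
      checksums[i] = out.getChecksum().getValue();
    }
    return checksums;
  }

  public static void main(String[] args) throws IOException {
    byte[][] partitions = {
      "first".getBytes(StandardCharsets.UTF_8),
      "second".getBytes(StandardCharsets.UTF_8),
      new byte[0] // an empty partition still gets a checksum entry
    };
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    long[] checksums = writeWithChecksums(partitions, sink);
    for (int i = 0; i < checksums.length; i++) {
      System.out.println("partition " + i + " crc32=" + Long.toHexString(checksums[i]));
    }
  }
}
```

   In this sketch the array returned by `writeWithChecksums` is what a caller 
would hand to a commit step. Whether that step belongs in Spark or in the 
shuffle implementation is exactly the open question raised above.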







[GitHub] [spark] SparkQA removed a comment on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


SparkQA removed a comment on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-862002836


   **[Test build #139833 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139833/testReport)**
 for PR 32914 at commit 
[`5dddc66`](https://github.com/apache/spark/commit/5dddc6644c9e76f78ebd0d44a4c23ee80b0fd55c).





[GitHub] [spark] SparkQA commented on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


SparkQA commented on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-862033749


   **[Test build #139833 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139833/testReport)**
 for PR 32914 at commit 
[`5dddc66`](https://github.com/apache/spark/commit/5dddc6644c9e76f78ebd0d44a4c23ee80b0fd55c).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-15 Thread GitBox


SparkQA commented on pull request #32921:
URL: https://github.com/apache/spark/pull/32921#issuecomment-862032913


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44363/
   





[GitHub] [spark] kudhru closed pull request #32907: [SPARK-35757][CORE] Add bitwise AND operation and functionality for intersecting bloom filters

2021-06-15 Thread GitBox


kudhru closed pull request #32907:
URL: https://github.com/apache/spark/pull/32907


   





[GitHub] [spark] cloud-fan commented on pull request #32899: [SPARK-35652][SQL][3.0] joinWith on two table generated from same one

2021-06-15 Thread GitBox


cloud-fan commented on pull request #32899:
URL: https://github.com/apache/spark/pull/32899#issuecomment-862031371


   retest this please





[GitHub] [spark] cloud-fan commented on a change in pull request #31905: [SPARK-34806][SQL] Add Observation helper for Dataset.observe

2021-06-15 Thread GitBox


cloud-fan commented on a change in pull request #31905:
URL: https://github.com/apache/spark/pull/31905#discussion_r652340898



##
File path: sql/core/src/main/scala/org/apache/spark/sql/Observation.scala
##
@@ -0,0 +1,189 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+import java.util.UUID
+import java.util.concurrent.TimeUnit
+import java.util.concurrent.locks.{Condition, Lock, ReentrantLock}
+
+import org.apache.spark.sql.execution.QueryExecution
+import org.apache.spark.sql.util.QueryExecutionListener
+
+/**
+ * Not thread-safe.
+ * @param name
+ * @param sparkSession
+ */
+class Observation(name: String) {

Review comment:
   Then shall we have a variant of `waitCompleted` without a timeout?
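   
   A minimal sketch of what such a pair of variants could look like, built on 
the same `ReentrantLock`/`Condition` primitives that the diff imports. This is 
not the PR's `Observation` class: `CompletionSignal`, `markCompleted`, and both 
`waitCompleted` signatures here are hypothetical and only illustrate the idea.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the waiting logic; not the PR's implementation.
public class CompletionSignal {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition completed = lock.newCondition();
  private boolean done = false; // guarded by lock

  // Called once the observed work has finished; wakes up all waiters.
  public void markCompleted() {
    lock.lock();
    try {
      done = true;
      completed.signalAll();
    } finally {
      lock.unlock();
    }
  }

  // Variant without a timeout: blocks until markCompleted() has been called.
  public void waitCompleted() throws InterruptedException {
    lock.lock();
    try {
      while (!done) { // loop guards against spurious wakeups
        completed.await();
      }
    } finally {
      lock.unlock();
    }
  }

  // Variant with a timeout: returns false if the time elapsed before completion.
  public boolean waitCompleted(long time, TimeUnit unit) throws InterruptedException {
    long nanosLeft = unit.toNanos(time);
    lock.lock();
    try {
      while (!done) {
        if (nanosLeft <= 0L) {
          return false;
        }
        // awaitNanos returns an estimate of the remaining wait time.
        nanosLeft = completed.awaitNanos(nanosLeft);
      }
      return true;
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    CompletionSignal signal = new CompletionSignal();
    System.out.println(signal.waitCompleted(10, TimeUnit.MILLISECONDS));
    signal.markCompleted();
    signal.waitCompleted(); // returns immediately once completed
    System.out.println(signal.waitCompleted(10, TimeUnit.MILLISECONDS));
  }
}
```

   The untimed variant is simply the timed loop without the deadline check, so 
both can share the same lock and condition.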







[GitHub] [spark] SparkQA commented on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


SparkQA commented on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-862029828


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44361/
   





[GitHub] [spark] cloud-fan commented on pull request #32470: [SPARK-35712][SQL] Simplify ResolveAggregateFunctions

2021-06-15 Thread GitBox


cloud-fan commented on pull request #32470:
URL: https://github.com/apache/spark/pull/32470#issuecomment-862028396


   @viirya @maropu Can you take one more look?





[GitHub] [spark] mridulm commented on pull request #32385: [WIP][SPARK-35275][CORE] Add checksum for shuffle blocks and diagnose corruption

2021-06-15 Thread GitBox


mridulm commented on pull request #32385:
URL: https://github.com/apache/spark/pull/32385#issuecomment-862024269


   lol, thanks for the links @Ngone51  :-)
   Glad I went through this once more anyway - it will help me better 
understand the sub-PRs!





[GitHub] [spark] mridulm edited a comment on pull request #31992: [SPARK-34898][CORE] We should send SparkListenerExecutorMetricsUpdateEventLog of `driver` appropriately

2021-06-15 Thread GitBox


mridulm edited a comment on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862023314


   Thanks for the update @AngersZhuuuu. I am fine with merging this; I will 
keep it open for a couple of days in case there are other comments.
   +CC @zhouyejoe, @thejdeep PTAL





[GitHub] [spark] mridulm commented on pull request #31992: [SPARK-34898][CORE] We should send SparkListenerExecutorMetricsUpdateEventLog of `driver` appropriately

2021-06-15 Thread GitBox


mridulm commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862023314


   Thanks for the update @AngersZhuuuu
   +CC @zhouyejoe, @thejdeep PTAL





[GitHub] [spark] SparkQA commented on pull request #31992: [SPARK-34898][CORE] We should send SparkListenerExecutorMetricsUpdateEventLog of `driver` appropriately

2021-06-15 Thread GitBox


SparkQA commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862021515


   **[Test build #139837 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139837/testReport)**
 for PR 31992 at commit 
[`c9a4a67`](https://github.com/apache/spark/commit/c9a4a6789586771720739cd71354029b64e5b358).





[GitHub] [spark] SparkQA commented on pull request #32801: [SPARK-12567][SQL] Add aes_encrypt and aes_decrypt builtin functions

2021-06-15 Thread GitBox


SparkQA commented on pull request #32801:
URL: https://github.com/apache/spark/pull/32801#issuecomment-862021191


   **[Test build #139836 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139836/testReport)** for PR 32801 at commit [`996f787`](https://github.com/apache/spark/commit/996f787921860d002e275fe530af6b2ab429cf7e).





[GitHub] [spark] SparkQA commented on pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-15 Thread GitBox


SparkQA commented on pull request #32921:
URL: https://github.com/apache/spark/pull/32921#issuecomment-862021114


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44363/
   





[GitHub] [spark] mridulm commented on a change in pull request #32811: [SPARK-35671][SHUFFLE][CORE] Add support in the ESS to serve merged shuffle block meta and data to executors

2021-06-15 Thread GitBox


mridulm commented on a change in pull request #32811:
URL: https://github.com/apache/spark/pull/32811#discussion_r651973359



##
File path: 
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockHandler.java
##
@@ -294,18 +336,30 @@ public ShuffleMetrics() {
 private int index = 0;
 private final Function blockDataForIndexFn;
 private final int size;
+private boolean requestForMergedBlockChunks;
 
 ManagedBufferIterator(OpenBlocks msg) {
   String appId = msg.appId;
   String execId = msg.execId;
   String[] blockIds = msg.blockIds;
   String[] blockId0Parts = blockIds[0].split("_");
-  if (blockId0Parts.length == 4 && blockId0Parts[0].equals("shuffle")) {
+  if (blockId0Parts.length == 4 && (blockId0Parts[0].equals(SHUFFLE_BLOCK_ID) ||
+blockId0Parts[0].equals(SHUFFLE_CHUNK_ID))) {
 final int shuffleId = Integer.parseInt(blockId0Parts[1]);
-final int[] mapIdAndReduceIds = shuffleMapIdAndReduceIds(blockIds, shuffleId);
-size = mapIdAndReduceIds.length;
-blockDataForIndexFn = index -> blockManager.getBlockData(appId, execId, shuffleId,
-  mapIdAndReduceIds[index], mapIdAndReduceIds[index + 1]);
+requestForMergedBlockChunks = blockId0Parts[0].equals(SHUFFLE_CHUNK_ID);
+// For regular shuffle blocks, primaryId is mapId and secondaryIds are reduceIds.
+// For shuffle chunks, primaryIds is reduceId and secondaryIds are chunkIds.
+final int[] primaryIdAndSecondaryIds = shuffleMapIdAndReduceIds(blockIds, shuffleId);
+size = primaryIdAndSecondaryIds.length;
+blockDataForIndexFn = index -> {
+  if (requestForMergedBlockChunks) {
+return mergeManager.getMergedBlockData(msg.appId, shuffleId,
+  primaryIdAndSecondaryIds[index], primaryIdAndSecondaryIds[index + 1]);
+  } else {
+return blockManager.getBlockData(msg.appId, msg.execId, shuffleId,
+  primaryIdAndSecondaryIds[index], primaryIdAndSecondaryIds[index + 1]);
+  }
+};

Review comment:
   nit: Wondering if this is cleaner if we simply split this out into its 
own else block for block chunk ?
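
One possible reading of this nit, as a minimal sketch rather than the PR's actual code: choose the lookup once per block type, so the lambda body no longer branches on every call. The class name, the string-returning stand-ins for `blockManager.getBlockData` and `mergeManager.getMergedBlockData`, the helper `primaryAndSecondaryIds`, and the value `"shuffleChunk"` for `SHUFFLE_CHUNK_ID` are all assumptions made for illustration.

```java
import java.util.function.IntFunction;

/**
 * Sketch only. The two lookup methods stand in for blockManager.getBlockData and
 * mergeManager.getMergedBlockData and return strings so the routing is visible.
 */
final class ManagedBufferLookupSketch {
  static final String SHUFFLE_BLOCK_ID = "shuffle";
  // Assumed value: the diff shows only the constant's name.
  static final String SHUFFLE_CHUNK_ID = "shuffleChunk";

  static String getBlockData(int shuffleId, int mapId, int reduceId) {
    return "block(" + shuffleId + "," + mapId + "," + reduceId + ")";
  }

  static String getMergedBlockData(int shuffleId, int reduceId, int chunkId) {
    return "chunk(" + shuffleId + "," + reduceId + "," + chunkId + ")";
  }

  /** Flattens "type_shuffleId_primaryId_secondaryId" ids into [primary, secondary, ...]. */
  static int[] primaryAndSecondaryIds(String[] blockIds) {
    int[] ids = new int[2 * blockIds.length];
    for (int i = 0; i < blockIds.length; i++) {
      String[] parts = blockIds[i].split("_");
      ids[2 * i] = Integer.parseInt(parts[2]);
      ids[2 * i + 1] = Integer.parseInt(parts[3]);
    }
    return ids;
  }

  /** One branch per block type, decided once instead of inside the lambda. */
  static IntFunction<String> lookupFor(String[] blockIds) {
    String[] first = blockIds[0].split("_");
    if (first.length != 4) {
      throw new IllegalArgumentException("Unexpected block id format: " + blockIds[0]);
    }
    int shuffleId = Integer.parseInt(first[1]);
    int[] ids = primaryAndSecondaryIds(blockIds);
    if (first[0].equals(SHUFFLE_BLOCK_ID)) {
      // Regular shuffle blocks: primary id is the mapId, secondary id is the reduceId.
      return index -> getBlockData(shuffleId, ids[index], ids[index + 1]);
    } else if (first[0].equals(SHUFFLE_CHUNK_ID)) {
      // Merged shuffle chunks: primary id is the reduceId, secondary id is the chunkId.
      return index -> getMergedBlockData(shuffleId, ids[index], ids[index + 1]);
    } else {
      throw new IllegalArgumentException("Unexpected block id type: " + blockIds[0]);
    }
  }

  public static void main(String[] args) {
    System.out.println(lookupFor(new String[] {"shuffle_0_1_2"}).apply(0));
    System.out.println(lookupFor(new String[] {"shuffleChunk_0_3_4"}).apply(0));
  }
}
```

With this shape the `requestForMergedBlockChunks` field would still be set in the chunk branch if other code reads it; the sketch leaves that out.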

##
File path: 
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/OneForOneBlockFetcher.java
##
@@ -88,82 +94,125 @@ public OneForOneBlockFetcher(
 if (blockIds.length == 0) {
   throw new IllegalArgumentException("Zero-sized blockIds array");
 }
-if (!transportConf.useOldFetchProtocol() && isShuffleBlocks(blockIds)) {
+if (!transportConf.useOldFetchProtocol() && areShuffleBlocksOrChunks(blockIds)) {
   this.blockIds = new String[blockIds.length];
-  this.message = createFetchShuffleBlocksMsgAndBuildBlockIds(appId, execId, blockIds);
+  this.message = createFetchShuffleBlocksOrChunksMsg(appId, execId, blockIds);
 } else {
   this.blockIds = blockIds;
   this.message = new OpenBlocks(appId, execId, blockIds);
 }
   }
 
-  private boolean isShuffleBlocks(String[] blockIds) {
-for (String blockId : blockIds) {
-  if (!blockId.startsWith("shuffle_")) {
-return false;
-  }
+  /**
+   * Check if the array of block IDs are all shuffle block IDs. With push based shuffle,
+   * the shuffle block ID could be either unmerged shuffle block IDs or merged shuffle chunk
+   * IDs. For a given stream of shuffle blocks to be fetched in one request, they would be either
+   * all unmerged shuffle blocks or all merged shuffle chunks.
+   * @param blockIds block ID array
+   * @return whether the array contains only shuffle block IDs
+   */
+  private boolean areShuffleBlocksOrChunks(String[] blockIds) {
+if (Arrays.stream(blockIds).anyMatch(blockId -> !blockId.startsWith(SHUFFLE_BLOCK_PREFIX))) {
+  // It comes here because there is a blockId which doesn't have "shuffle_" prefix so we
+  // check if all the block ids are shuffle chunk Ids.
+  return Arrays.stream(blockIds).allMatch(blockId -> blockId.startsWith(SHUFFLE_CHUNK_PREFIX));
 }
 return true;
   }
 
+  /** Creates either a {@link FetchShuffleBlocks} or {@link FetchShuffleBlockChunks} message. */
+  private AbstractFetchShuffleBlocks createFetchShuffleBlocksOrChunksMsg(
+  String appId,
+  String execId,
+  String[] blockIds) {
+if (blockIds[0].startsWith(SHUFFLE_CHUNK_PREFIX)) {
+  return createFetchShuffleMsgAndBuildBlockIds(appId, execId, blockIds, true);
+} else {
+  return createFetchShuffleMsgAndBuildBlockIds(appId, execId, blockIds, false);
+}
+  }
+
   /**
-   * Create FetchShuffleBlocks message and rebuild internal blockIds by
+   * Create FetchShuffleBlocks/FetchShuffleBlockChunks message and rebuild internal blockIds by
* analyzing the pass in blockIds.
*/
-  private FetchShuffleBlocks createFetchShuffleBlocksMsgAndBuildBlockIds(
-  String appId, String execId, 
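
The check described in the Javadoc of `areShuffleBlocksOrChunks` above (every ID is an unmerged shuffle block, or every ID is a merged shuffle chunk, never a mix) can be restated as a single expression. This is a sketch under assumptions, not the PR's code: the class name is hypothetical and the value `"shuffleChunk_"` for `SHUFFLE_CHUNK_PREFIX` is assumed, since the diff shows only the constant's name.

```java
import java.util.Arrays;

final class ShuffleIdCheckSketch {
  static final String SHUFFLE_BLOCK_PREFIX = "shuffle_";
  // Assumed value: the diff shows only the constant's name.
  static final String SHUFFLE_CHUNK_PREFIX = "shuffleChunk_";

  /** True when all ids are shuffle blocks, or all ids are shuffle chunks. */
  static boolean areShuffleBlocksOrChunks(String[] blockIds) {
    return Arrays.stream(blockIds).allMatch(id -> id.startsWith(SHUFFLE_BLOCK_PREFIX))
        || Arrays.stream(blockIds).allMatch(id -> id.startsWith(SHUFFLE_CHUNK_PREFIX));
  }

  public static void main(String[] args) {
    System.out.println(areShuffleBlocksOrChunks(new String[] {"shuffle_0_1_2", "shuffle_0_1_3"}));
    System.out.println(areShuffleBlocksOrChunks(new String[] {"shuffle_0_1_2", "shuffleChunk_0_1_0"}));
  }
}
```

Under the assumed prefixes a chunk ID does not start with `shuffle_`, so a mixed array fails both `allMatch` calls. The result matches the `anyMatch`-then-`allMatch` form in the diff, including for an empty array, although the fetcher rejects empty arrays before this check runs.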

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32049: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-06-15 Thread GitBox


AmplabJenkins removed a comment on pull request #32049:
URL: https://github.com/apache/spark/pull/32049#issuecomment-862019804


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44362/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


AmplabJenkins removed a comment on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-862019805


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44359/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #32801: [SPARK-12567][SQL] Add aes_encrypt and aes_decrypt builtin functions

2021-06-15 Thread GitBox


AmplabJenkins removed a comment on pull request #32801:
URL: https://github.com/apache/spark/pull/32801#issuecomment-862019802


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44360/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32049: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-06-15 Thread GitBox


AmplabJenkins commented on pull request #32049:
URL: https://github.com/apache/spark/pull/32049#issuecomment-862019804


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44362/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32801: [SPARK-12567][SQL] Add aes_encrypt and aes_decrypt builtin functions

2021-06-15 Thread GitBox


AmplabJenkins commented on pull request #32801:
URL: https://github.com/apache/spark/pull/32801#issuecomment-862019802


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44360/
   





[GitHub] [spark] AmplabJenkins commented on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


AmplabJenkins commented on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-862019805


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44359/
   





[GitHub] [spark] SparkQA commented on pull request #32049: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-06-15 Thread GitBox


SparkQA commented on pull request #32049:
URL: https://github.com/apache/spark/pull/32049#issuecomment-862018031


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44362/
   





[GitHub] [spark] SparkQA commented on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


SparkQA commented on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-862017695


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44361/
   





[GitHub] [spark] c21 commented on pull request #32911: [SPARK-35760][SQL] Fix the max rows check for broadcast exchange

2021-06-15 Thread GitBox


c21 commented on pull request #32911:
URL: https://github.com/apache/spark/pull/32911#issuecomment-862016454


   Thank you all for the review!





[GitHub] [spark] AngersZhuuuu commented on pull request #31992: [SPARK-34898][CORE] We should send SparkListenerExecutorMetricsUpdateEventLog of `driver` appropriately

2021-06-15 Thread GitBox


AngersZh commented on pull request #31992:
URL: https://github.com/apache/spark/pull/31992#issuecomment-862010798


   > I don't know enough to have an opinion on this. I think the key questions 
are - what is the most consistent thing to do, and, are there any performance 
problems with adding this information to events?
   
   As Spark admins, we want to build a system that tracks the running status of 
apps in our cluster and lets users adjust the memory configuration if they 
have not set it reasonably, so we need to know the peak memory usage. 
Although we can get this information from the metrics system, we would then 
need to integrate the REST API's metrics data with the metrics system's 
information.
   This PR lets us get the driver's memory usage from the history server's 
REST API.
   
   





[GitHub] [spark] AngersZhuuuu commented on a change in pull request #31992: [SPARK-34898][CORE] We should send SparkListenerExecutorMetricsUpdateEventLog of `driver` appropriately

2021-06-15 Thread GitBox


AngersZh commented on a change in pull request #31992:
URL: https://github.com/apache/spark/pull/31992#discussion_r652323059



##
File path: 
core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala
##
@@ -249,6 +249,9 @@ private[spark] class EventLoggingListener(
   }
 
   override def onExecutorMetricsUpdate(event: SparkListenerExecutorMetricsUpdate): Unit = {
+if (event.execId == SparkContext.DRIVER_IDENTIFIER) {
+  logEvent(event)
+}

Review comment:
   > Currently, we have a single event for both driver and executor metrics 
update - differentiated by exec id.
   > I dont have strong opinions on this, but if we have a flag 
(`shouldLogStageExecutorMetrics`) controlling whether metrics are to be 
updated, we should consistently apply it IMO.
   
   @mridulm Following this comment, how about the current change?
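
For illustration only, the quoted suggestion to apply the `shouldLogStageExecutorMetrics` flag consistently could look like the sketch below, written in Java rather than the listener's Scala. The class, the `logged` list and the string payload are hypothetical stand-ins for `EventLoggingListener`, `logEvent` and the event type. `"driver"` is the value of `SparkContext.DRIVER_IDENTIFIER`.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: a stand-in for the listener, recording what would be logged. */
final class DriverMetricsLoggingSketch {
  // Same value as SparkContext.DRIVER_IDENTIFIER.
  static final String DRIVER_IDENTIFIER = "driver";

  private final boolean shouldLogStageExecutorMetrics;
  // Hypothetical sink standing in for logEvent(event).
  final List<String> logged = new ArrayList<>();

  DriverMetricsLoggingSketch(boolean shouldLogStageExecutorMetrics) {
    this.shouldLogStageExecutorMetrics = shouldLogStageExecutorMetrics;
  }

  /** Logs a metrics update only for the driver, and only when the flag is on. */
  void onExecutorMetricsUpdate(String execId, String payload) {
    if (shouldLogStageExecutorMetrics && DRIVER_IDENTIFIER.equals(execId)) {
      logged.add(execId + ":" + payload);
    }
  }

  public static void main(String[] args) {
    DriverMetricsLoggingSketch listener = new DriverMetricsLoggingSketch(true);
    listener.onExecutorMetricsUpdate("driver", "peakHeap=1g");
    listener.onExecutorMetricsUpdate("1", "peakHeap=2g");
    System.out.println(listener.logged);
  }
}
```

The point of the sketch is that the driver event and the executor metrics share one switch, so disabling the flag suppresses both.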







[GitHub] [spark] SparkQA commented on pull request #32801: [SPARK-12567][SQL] Add aes_encrypt and aes_decrypt builtin functions

2021-06-15 Thread GitBox


SparkQA commented on pull request #32801:
URL: https://github.com/apache/spark/pull/32801#issuecomment-862007710


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44360/
   





[GitHub] [spark] SparkQA commented on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


SparkQA commented on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-862007179


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44359/
   





[GitHub] [spark] SparkQA commented on pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-15 Thread GitBox


SparkQA commented on pull request #32921:
URL: https://github.com/apache/spark/pull/32921#issuecomment-862006513


   **[Test build #139835 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139835/testReport)** for PR 32921 at commit [`202be14`](https://github.com/apache/spark/commit/202be14f09cbacd94de9d0bc3b518c87292f3878).





[GitHub] [spark] yaooqinn commented on pull request #32610: [SPARK-35460][K8S] verify the content of`spark.kubernetes.executor.podNamePrefix` before post it to k8s api-server

2021-06-15 Thread GitBox


yaooqinn commented on pull request #32610:
URL: https://github.com/apache/spark/pull/32610#issuecomment-862006483


   kindly ping @dongjoon-hyun 





[GitHub] [spark] aokolnychyi commented on a change in pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-15 Thread GitBox


aokolnychyi commented on a change in pull request #32921:
URL: https://github.com/apache/spark/pull/32921#discussion_r652208722



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala
##
@@ -227,3 +228,14 @@ object ReuseSubquery extends Rule[SparkPlan] {
 }
   }
 }
+
+object PrepareScans extends Rule[SparkPlan] {
+  def apply(plan: SparkPlan): SparkPlan = {
+val scans = plan.collect {
+  case scan: BatchScanExec => scan
+}
+scans.foreach(_.prepare())

Review comment:
   I mention in the doc why I am using `prepare` but we can make this more 
specific to dynamic filters if needed.
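
As a rough sketch of the shape of the `PrepareScans` rule in the diff above (collect every scan node, then call `prepare` on each), written in Java rather than the rule's Scala. `PlanNode` and `ScanNode` are hypothetical stand-ins for `SparkPlan` and `BatchScanExec`, and returning the plan unchanged is an assumption, since the quoted diff stops before the rule's last line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PrepareScansSketch {
  /** Hypothetical minimal plan node; Spark's SparkPlan is far richer. */
  static class PlanNode {
    final String name;
    final List<PlanNode> children;

    PlanNode(String name, PlanNode... children) {
      this.name = name;
      this.children = Arrays.asList(children);
    }
  }

  /** Stand-in for BatchScanExec that records whether prepare() ran. */
  static final class ScanNode extends PlanNode {
    boolean prepared = false;

    ScanNode(String name) {
      super(name);
    }

    void prepare() {
      prepared = true;
    }
  }

  /** Collects every scan node in the tree, parents before children. */
  static List<ScanNode> collectScans(PlanNode plan) {
    List<ScanNode> scans = new ArrayList<>();
    collectInto(plan, scans);
    return scans;
  }

  private static void collectInto(PlanNode node, List<ScanNode> out) {
    if (node instanceof ScanNode) {
      out.add((ScanNode) node);
    }
    for (PlanNode child : node.children) {
      collectInto(child, out);
    }
  }

  /** Prepares every scan and returns the plan itself (assumed, see lead-in). */
  static PlanNode apply(PlanNode plan) {
    collectScans(plan).forEach(ScanNode::prepare);
    return plan;
  }

  public static void main(String[] args) {
    ScanNode scan = new ScanNode("scan");
    apply(new PlanNode("filter", scan));
    System.out.println(scan.prepared);
  }
}
```

Making the rule "more specific to dynamic filters", as the comment offers, would amount to narrowing which nodes `collectScans` picks up.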







[GitHub] [spark] SparkQA commented on pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-15 Thread GitBox


SparkQA commented on pull request #32921:
URL: https://github.com/apache/spark/pull/32921#issuecomment-862005335


   **[Test build #139834 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139834/testReport)** for PR 32921 at commit [`04ae0e3`](https://github.com/apache/spark/commit/04ae0e363e98cc0a8af1100ef11f08d9f2e47d1a).





[GitHub] [spark] SparkQA commented on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


SparkQA commented on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-862002836


   **[Test build #139833 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139833/testReport)** for PR 32914 at commit [`5dddc66`](https://github.com/apache/spark/commit/5dddc6644c9e76f78ebd0d44a4c23ee80b0fd55c).





[GitHub] [spark] SparkQA commented on pull request #32049: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-06-15 Thread GitBox


SparkQA commented on pull request #32049:
URL: https://github.com/apache/spark/pull/32049#issuecomment-862001794


   **[Test build #139832 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139832/testReport)** for PR 32049 at commit [`a5833ef`](https://github.com/apache/spark/commit/a5833ef7f551980ef48229932d9427a9e00af444).





[GitHub] [spark] HeartSaVioR commented on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


HeartSaVioR commented on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-862001604


   retest this, please





[GitHub] [spark] HeartSaVioR commented on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


HeartSaVioR commented on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-862001559


   > https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139830
   
   A StackOverflowError happened while compiling... I'll retrigger the build 
to see whether it's intermittent.





[GitHub] [spark] aokolnychyi commented on a change in pull request #32921: [WIP][SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-06-15 Thread GitBox


aokolnychyi commented on a change in pull request #32921:
URL: https://github.com/apache/spark/pull/32921#discussion_r652315411



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala
##
@@ -96,6 +96,7 @@ case class AdaptiveSparkPlanExec(
   @transient private val queryStageOptimizerRules: Seq[Rule[SparkPlan]] = Seq(
 PlanAdaptiveDynamicPruningFilters(this),
 ReuseAdaptiveSubquery(context.subqueryCache),
+PrepareScans,

Review comment:
   @sunchao, per our design doc discussion, removing this explicit call 
causes test failures. I'll check what is going on tomorrow.  







[GitHub] [spark] AmplabJenkins removed a comment on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


AmplabJenkins removed a comment on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-861999568


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44358/
   





[GitHub] [spark] asfgit closed pull request #32754: [SPARK-35613][CORE][SQL] Cache commonly occurring strings in SQLMetrics, JSONProtocol and AccumulatorV2 classes

2021-06-15 Thread GitBox


asfgit closed pull request #32754:
URL: https://github.com/apache/spark/pull/32754


   





[GitHub] [spark] AmplabJenkins commented on pull request #32914: [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric

2021-06-15 Thread GitBox


AmplabJenkins commented on pull request #32914:
URL: https://github.com/apache/spark/pull/32914#issuecomment-861999568


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44358/
   





[GitHub] [spark] mridulm commented on pull request #32754: [SPARK-35613][CORE][SQL] Cache commonly occurring strings in SQLMetrics, JSONProtocol and AccumulatorV2 classes

2021-06-15 Thread GitBox


mridulm commented on pull request #32754:
URL: https://github.com/apache/spark/pull/32754#issuecomment-861999217


   Merging to master





[GitHub] [spark] SparkQA commented on pull request #32801: [SPARK-12567][SQL] Add aes_encrypt and aes_decrypt builtin functions

2021-06-15 Thread GitBox


SparkQA commented on pull request #32801:
URL: https://github.com/apache/spark/pull/32801#issuecomment-861996098


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44360/
   




