date:20210330

[GitHub] [spark] SparkQA commented on pull request #31983: [SPARK-34882][SQL] Replace if with filter clause in RewriteDistinctAggregates

2021-03-30 Thread GitBox



SparkQA commented on pull request #31983:
URL: https://github.com/apache/spark/pull/31983#issuecomment-810821233


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41331/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] cloud-fan commented on a change in pull request #30965: [SPARK-33935][SQL] Fix CBO cost function

2021-03-30 Thread GitBox



cloud-fan commented on a change in pull request #30965:
URL: https://github.com/apache/spark/pull/30965#discussion_r604638838



##
File path: 
sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q19.sf100/simplified.txt
##
@@ -6,71 +6,71 @@ TakeOrderedAndProject 
[ext_price,brand,brand_id,i_manufact_id,i_manufact]
   WholeStageCodegen (12)
 HashAggregate 
[i_brand,i_brand_id,i_manufact_id,i_manufact,ss_ext_sales_price] [sum,sum]
   Project 
[ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact]
-SortMergeJoin [ss_customer_sk,c_customer_sk,ca_zip,s_zip]
-  InputAdapter
-WholeStageCodegen (5)
-  Sort [ss_customer_sk]
-InputAdapter
-  Exchange [ss_customer_sk] #2
-WholeStageCodegen (4)
-  Project 
[ss_customer_sk,ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact,s_zip]
-BroadcastHashJoin [ss_store_sk,s_store_sk]
-  Project 
[ss_customer_sk,ss_store_sk,ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact]
-BroadcastHashJoin 
[ss_sold_date_sk,d_date_sk]
-  Project 
[ss_sold_date_sk,ss_customer_sk,ss_store_sk,ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact]
-BroadcastHashJoin 
[ss_item_sk,i_item_sk]
-  Filter 
[ss_sold_date_sk,ss_item_sk,ss_customer_sk,ss_store_sk]
-ColumnarToRow
-  InputAdapter
-Scan parquet 
default.store_sales 
[ss_sold_date_sk,ss_item_sk,ss_customer_sk,ss_store_sk,ss_ext_sales_price]
+BroadcastHashJoin [ss_item_sk,i_item_sk]

Review comment:
   Thanks for looking into it!





[GitHub] [spark] SparkQA commented on pull request #32010: [SPARK-34908][SQL] Add test cases for char and varchar with functions

2021-03-30 Thread GitBox



SparkQA commented on pull request #32010:
URL: https://github.com/apache/spark/pull/32010#issuecomment-810820115


   **[Test build #136755 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136755/testReport)** for PR 32010 at commit [`89990af`](https://github.com/apache/spark/commit/89990af1104e533f7b1ad720475036d8ce0f1865).



[GitHub] [spark] tanelk commented on a change in pull request #30965: [SPARK-33935][SQL] Fix CBO cost function

2021-03-30 Thread GitBox



tanelk commented on a change in pull request #30965:
URL: https://github.com/apache/spark/pull/30965#discussion_r604636299



##
File path: 
sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q19.sf100/simplified.txt
##
@@ -6,71 +6,71 @@ TakeOrderedAndProject 
[ext_price,brand,brand_id,i_manufact_id,i_manufact]
   WholeStageCodegen (12)
 HashAggregate 
[i_brand,i_brand_id,i_manufact_id,i_manufact,ss_ext_sales_price] [sum,sum]
   Project 
[ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact]
-SortMergeJoin [ss_customer_sk,c_customer_sk,ca_zip,s_zip]
-  InputAdapter
-WholeStageCodegen (5)
-  Sort [ss_customer_sk]
-InputAdapter
-  Exchange [ss_customer_sk] #2
-WholeStageCodegen (4)
-  Project 
[ss_customer_sk,ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact,s_zip]
-BroadcastHashJoin [ss_store_sk,s_store_sk]
-  Project 
[ss_customer_sk,ss_store_sk,ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact]
-BroadcastHashJoin 
[ss_sold_date_sk,d_date_sk]
-  Project 
[ss_sold_date_sk,ss_customer_sk,ss_store_sk,ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact]
-BroadcastHashJoin 
[ss_item_sk,i_item_sk]
-  Filter 
[ss_sold_date_sk,ss_item_sk,ss_customer_sk,ss_store_sk]
-ColumnarToRow
-  InputAdapter
-Scan parquet 
default.store_sales 
[ss_sold_date_sk,ss_item_sk,ss_customer_sk,ss_store_sk,ss_ext_sales_price]
+BroadcastHashJoin [ss_item_sk,i_item_sk]

Review comment:
   I'll experiment with it a bit, but it might take a while.





[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-30 Thread GitBox



SparkQA commented on pull request #30144:
URL: https://github.com/apache/spark/pull/30144#issuecomment-810817452


   **[Test build #136754 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136754/testReport)** for PR 30144 at commit [`7224e01`](https://github.com/apache/spark/commit/7224e01acfe2eed282369cb4a96dadb0e401b627).



[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-30 Thread GitBox



AngersZh commented on a change in pull request #30144:
URL: https://github.com/apache/spark/pull/30144#discussion_r604635105



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/grouping.scala
##
@@ -212,3 +212,29 @@ object GroupingID {
 if (SQLConf.get.integerGroupingIdEnabled) IntegerType else LongType
   }
 }
+
+
+object GroupByOperator {

Review comment:
   > `GroupByOperator` -> `GroupingAnalytics`?
   
   Just changed





[GitHub] [spark] maropu commented on a change in pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-30 Thread GitBox



maropu commented on a change in pull request #30144:
URL: https://github.com/apache/spark/pull/30144#discussion_r604634949



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/grouping.scala
##
@@ -212,3 +212,29 @@ object GroupingID {
 if (SQLConf.get.integerGroupingIdEnabled) IntegerType else LongType
   }
 }
+
+
+object GroupByOperator {

Review comment:
   `GroupByOperator` -> `GroupingAnalytics`?





[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-30 Thread GitBox



SparkQA commented on pull request #30144:
URL: https://github.com/apache/spark/pull/30144#issuecomment-810815784


   **[Test build #136753 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136753/testReport)** for PR 30144 at commit [`005b697`](https://github.com/apache/spark/commit/005b6974d11ed37351f54de8dd43717f7b13aa71).



[GitHub] [spark] cloud-fan commented on a change in pull request #31932: [SPARK-34906] Refactor TreeNode's children handling methods into specialized traits

2021-03-30 Thread GitBox



cloud-fan commented on a change in pull request #31932:
URL: https://github.com/apache/spark/pull/31932#discussion_r604633980



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Covariance.scala
##
@@ -27,9 +28,10 @@ import org.apache.spark.sql.types._
  * When applied on empty data (i.e., count is zero), it returns NULL.
  */
 abstract class Covariance(x: Expression, y: Expression, nullOnDivideByZero: Boolean)

Review comment:
   we can simply do `val left: Expression, val right: Expression` here.
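
A minimal sketch of the suggestion above, using simplified stand-in types rather than Spark's real classes. The names `BinaryLike`, `Literal`, and `CovPopulation` here are assumptions made for illustration only. It shows that `val` constructor parameters can implement a trait's abstract `left`/`right` members directly, so no separate forwarding definitions are needed:

```scala
// Simplified, hypothetical stand-ins for illustration; not Spark's real classes.
trait Expression
final case class Literal(value: Double) extends Expression

// A trait that derives `children` from two abstract members.
trait BinaryLike[T] {
  def left: T
  def right: T
  final def children: Seq[T] = Seq(left, right)
}

// Declaring the constructor parameters as `val left` / `val right`
// implements the abstract members of BinaryLike directly.
abstract class Covariance(
    val left: Expression,
    val right: Expression,
    nullOnDivideByZero: Boolean)
  extends Expression with BinaryLike[Expression]

// Hypothetical concrete subclass, only here so the sketch can be instantiated.
final class CovPopulation(l: Expression, r: Expression)
  extends Covariance(l, r, nullOnDivideByZero = true)

object CovarianceDemo extends App {
  val cov = new CovPopulation(Literal(1.0), Literal(2.0))
  println(cov.children)
}
```

With this shape, subclasses only pass their two inputs to the superclass constructor, and `children` comes from the trait.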





[GitHub] [spark] SparkQA commented on pull request #30057: [SPARK-32838][SQL]Check DataSource insert command path with actual path

2021-03-30 Thread GitBox



SparkQA commented on pull request #30057:
URL: https://github.com/apache/spark/pull/30057#issuecomment-810815432


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41333/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-30 Thread GitBox



AmplabJenkins removed a comment on pull request #32011:
URL: https://github.com/apache/spark/pull/32011#issuecomment-81081


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41330/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #32010: [SPARK-34908][SQL] Add test cases for char and varchar with functions

2021-03-30 Thread GitBox



AmplabJenkins removed a comment on pull request #32010:
URL: https://github.com/apache/spark/pull/32010#issuecomment-810814434


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136747/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #32006: [SPARK-34909][SQL] Fix conversion of negative to unsigned in conv()

2021-03-30 Thread GitBox



AmplabJenkins removed a comment on pull request #32006:
URL: https://github.com/apache/spark/pull/32006#issuecomment-810814432


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136740/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #31470: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-03-30 Thread GitBox



AmplabJenkins removed a comment on pull request #31470:
URL: https://github.com/apache/spark/pull/31470#issuecomment-810814436


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136741/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #31989: [WIP][SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-03-30 Thread GitBox



AmplabJenkins removed a comment on pull request #31989:
URL: https://github.com/apache/spark/pull/31989#issuecomment-810814443


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136742/
   



[GitHub] [spark] AmplabJenkins commented on pull request #31470: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-03-30 Thread GitBox



AmplabJenkins commented on pull request #31470:
URL: https://github.com/apache/spark/pull/31470#issuecomment-810814436


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136741/
   



[GitHub] [spark] AmplabJenkins commented on pull request #32006: [SPARK-34909][SQL] Fix conversion of negative to unsigned in conv()

2021-03-30 Thread GitBox



AmplabJenkins commented on pull request #32006:
URL: https://github.com/apache/spark/pull/32006#issuecomment-810814432


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136740/
   



[GitHub] [spark] AmplabJenkins commented on pull request #31989: [WIP][SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-03-30 Thread GitBox



AmplabJenkins commented on pull request #31989:
URL: https://github.com/apache/spark/pull/31989#issuecomment-810814443


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136742/
   



[GitHub] [spark] AmplabJenkins commented on pull request #32010: [SPARK-34908][SQL] Add test cases for char and varchar with functions

2021-03-30 Thread GitBox



AmplabJenkins commented on pull request #32010:
URL: https://github.com/apache/spark/pull/32010#issuecomment-810814434


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136747/
   



[GitHub] [spark] AmplabJenkins commented on pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-30 Thread GitBox



AmplabJenkins commented on pull request #32011:
URL: https://github.com/apache/spark/pull/32011#issuecomment-81081


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41330/
   



[GitHub] [spark] AngersZhuuuu commented on pull request #32001: [SPARK-34902][SQL] Support cast between LongType & DayTimeIntervalType and IntegerType & YearMonthIntervalType

2021-03-30 Thread GitBox



AngersZh commented on pull request #32001:
URL: https://github.com/apache/spark/pull/32001#issuecomment-810814418


   > As @cloud-fan said, we have special functions that convert numbers to timestamps. I quickly looked at Oracle, and it has a similar function for intervals. For example, [NUMTODSINTERVAL](https://docs.oracle.com/cd/E11882_01/server.112/e41084/functions117.htm#SQLRF00682) converts a `NUM` to a `DAY TO SECOND INTERVAL`:
   > 
   > ```
   > NUMTODSINTERVAL(100, 'day')
   > ```
   > 
   > @AngersZh Could you look at other DBMSs and see how they cast intervals from/to numbers?
   
   Sure.
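
For comparison, a rough sketch of what a `NUMTODSINTERVAL`-style conversion could look like on the JVM. The helper names `numToDsInterval` and `toMicros` are hypothetical and are not part of Spark or Oracle. The microsecond conversion assumes a day-time interval is stored as a count of microseconds in a `Long`:

```scala
import java.time.Duration
import java.time.temporal.ChronoUnit

object IntervalSketch {
  // Hypothetical helper mirroring NUMTODSINTERVAL(n, unit):
  // the number is interpreted in the given unit, not cast directly.
  def numToDsInterval(n: Long, unit: String): Duration = {
    val chronoUnit = unit.toLowerCase match {
      case "day"    => ChronoUnit.DAYS
      case "hour"   => ChronoUnit.HOURS
      case "minute" => ChronoUnit.MINUTES
      case "second" => ChronoUnit.SECONDS
      case other    => throw new IllegalArgumentException(s"Unsupported unit: $other")
    }
    Duration.of(n, chronoUnit)
  }

  // Assumed physical representation: total microseconds as a Long,
  // with overflow raising ArithmeticException instead of wrapping.
  def toMicros(d: Duration): Long =
    Math.addExact(Math.multiplyExact(d.getSeconds, 1000000L), d.getNano / 1000L)
}

object IntervalSketchDemo extends App {
  val interval = IntervalSketch.numToDsInterval(100, "day")
  println(interval)
  println(IntervalSketch.toMicros(interval))
}
```

The explicit unit argument is the main design point: a bare cast from a number to an interval has to pick a unit implicitly, while a function like this makes the choice visible at the call site.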



[GitHub] [spark] SparkQA commented on pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-30 Thread GitBox



SparkQA commented on pull request #32011:
URL: https://github.com/apache/spark/pull/32011#issuecomment-810814224







[GitHub] [spark] AngersZhuuuu commented on pull request #31010: [SPARK-33976][SQL] Spark script TRANSFORM related change doc

2021-03-30 Thread GitBox



AngersZh commented on pull request #31010:
URL: https://github.com/apache/spark/pull/31010#issuecomment-810814163


   > `branch-3.0`/`3.1` does not have a doc for a TRANSFORM clause, so IMO it 
would be nice to write the common syntaxes of a TRANSFORM clause in this first 
PR and backport the doc into `branch-3.0`/`3.1`. Then, we can write the other 
improved syntaxes for master only in following PRs. WDYT?
   
   Good suggestion. Will start this a little later.



[GitHub] [spark] cloud-fan commented on a change in pull request #30965: [SPARK-33935][SQL] Fix CBO cost function

2021-03-30 Thread GitBox



cloud-fan commented on a change in pull request #30965:
URL: https://github.com/apache/spark/pull/30965#discussion_r604631512



##
File path: 
sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q19.sf100/simplified.txt
##
@@ -6,71 +6,71 @@ TakeOrderedAndProject 
[ext_price,brand,brand_id,i_manufact_id,i_manufact]
   WholeStageCodegen (12)
 HashAggregate 
[i_brand,i_brand_id,i_manufact_id,i_manufact,ss_ext_sales_price] [sum,sum]
   Project 
[ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact]
-SortMergeJoin [ss_customer_sk,c_customer_sk,ca_zip,s_zip]
-  InputAdapter
-WholeStageCodegen (5)
-  Sort [ss_customer_sk]
-InputAdapter
-  Exchange [ss_customer_sk] #2
-WholeStageCodegen (4)
-  Project 
[ss_customer_sk,ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact,s_zip]
-BroadcastHashJoin [ss_store_sk,s_store_sk]
-  Project 
[ss_customer_sk,ss_store_sk,ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact]
-BroadcastHashJoin 
[ss_sold_date_sk,d_date_sk]
-  Project 
[ss_sold_date_sk,ss_customer_sk,ss_store_sk,ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact]
-BroadcastHashJoin 
[ss_item_sk,i_item_sk]
-  Filter 
[ss_sold_date_sk,ss_item_sk,ss_customer_sk,ss_store_sk]
-ColumnarToRow
-  InputAdapter
-Scan parquet 
default.store_sales 
[ss_sold_date_sk,ss_item_sk,ss_customer_sk,ss_store_sk,ss_ext_sales_price]
+BroadcastHashJoin [ss_item_sk,i_item_sk]

Review comment:
   I think q19 exposes a problem. Previously this `BroadcastHashJoin` ran before the `SortMergeJoin`, which reduced the input data of the shuffle, because this `BroadcastHashJoin` has a filter on its right side that likely makes the join very selective.
   
   @tanelk, if the idea from @wzhfy doesn't look good to you, can you try some other ideas and see if we can fix this issue?





[GitHub] [spark] cloud-fan commented on a change in pull request #30965: [SPARK-33935][SQL] Fix CBO cost function

2021-03-30 Thread GitBox



cloud-fan commented on a change in pull request #30965:
URL: https://github.com/apache/spark/pull/30965#discussion_r604631512



##
File path: 
sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q19.sf100/simplified.txt
##
@@ -6,71 +6,71 @@ TakeOrderedAndProject 
[ext_price,brand,brand_id,i_manufact_id,i_manufact]
   WholeStageCodegen (12)
 HashAggregate 
[i_brand,i_brand_id,i_manufact_id,i_manufact,ss_ext_sales_price] [sum,sum]
   Project 
[ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact]
-SortMergeJoin [ss_customer_sk,c_customer_sk,ca_zip,s_zip]
-  InputAdapter
-WholeStageCodegen (5)
-  Sort [ss_customer_sk]
-InputAdapter
-  Exchange [ss_customer_sk] #2
-WholeStageCodegen (4)
-  Project 
[ss_customer_sk,ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact,s_zip]
-BroadcastHashJoin [ss_store_sk,s_store_sk]
-  Project 
[ss_customer_sk,ss_store_sk,ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact]
-BroadcastHashJoin 
[ss_sold_date_sk,d_date_sk]
-  Project 
[ss_sold_date_sk,ss_customer_sk,ss_store_sk,ss_ext_sales_price,i_brand_id,i_brand,i_manufact_id,i_manufact]
-BroadcastHashJoin 
[ss_item_sk,i_item_sk]
-  Filter 
[ss_sold_date_sk,ss_item_sk,ss_customer_sk,ss_store_sk]
-ColumnarToRow
-  InputAdapter
-Scan parquet 
default.store_sales 
[ss_sold_date_sk,ss_item_sk,ss_customer_sk,ss_store_sk,ss_ext_sales_price]
+BroadcastHashJoin [ss_item_sk,i_item_sk]

Review comment:
   I think q19 exposes a problem. Previously this `BroadcastHashJoin` ran before the `SortMergeJoin`, which reduced the input data of the shuffle, because this `BroadcastHashJoin` has a filter on its right side and can likely prune a lot of data.
   
   @tanelk, if the idea from @wzhfy doesn't look good to you, can you try some other ideas and see if we can fix this issue?





[GitHub] [spark] tanelk commented on pull request #30965: [SPARK-33935][SQL] Fix CBO cost function

2021-03-30 Thread GitBox



tanelk commented on pull request #30965:
URL: https://github.com/apache/spark/pull/30965#issuecomment-810810870


   @wzhfy and @cloud-fan 
   
   I'm not a fan of adding up the relative costs.
   
   A simple example, where the weight is 0.5:
   If this plan's size (bytes) is 2x larger, then no matter how many times more rows the other plan has, the other plan will always be considered better - `0.5*2 + 0.5*0.01 > 1`.
   This is basically the same situation, where one cost overwhelms the other.
   
   Perhaps this would be the best of both worlds:
   `(this.card / other.card) ^ cardWeight * (this.size / other.size) ^ (1 - cardWeight) < 1`.
   In short - multiply the relative costs instead of adding them.
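
The two comparisons can be sketched side by side. This is an illustration under stated assumptions, not the actual cost code in Spark: `PlanCost`, `betterThanAdditive`, and `betterThanMultiplicative` are hypothetical names, and the numbers are the ones from the example above (this plan is 2x larger in size but has 100x fewer rows, weight 0.5):

```scala
// Hypothetical stand-in for a plan's estimated cost; not Spark's real types.
final case class PlanCost(card: Double, size: Double)

object CostComparison {
  // Additive form: weighted sum of the relative costs.
  def betterThanAdditive(self: PlanCost, other: PlanCost, cardWeight: Double): Boolean =
    (self.card / other.card) * cardWeight +
      (self.size / other.size) * (1 - cardWeight) < 1

  // Multiplicative form: weighted geometric mean of the relative costs.
  def betterThanMultiplicative(self: PlanCost, other: PlanCost, cardWeight: Double): Boolean =
    math.pow(self.card / other.card, cardWeight) *
      math.pow(self.size / other.size, 1 - cardWeight) < 1
}

object CostComparisonDemo extends App {
  val thisPlan  = PlanCost(card = 1.0, size = 2.0)
  val otherPlan = PlanCost(card = 100.0, size = 1.0)

  // Additive: 0.5 * 0.01 + 0.5 * 2 = 1.005, so this plan is rejected.
  println(CostComparison.betterThanAdditive(thisPlan, otherPlan, 0.5))
  // Multiplicative: 0.01^0.5 * 2^0.5 is roughly 0.14, so this plan is preferred.
  println(CostComparison.betterThanMultiplicative(thisPlan, otherPlan, 0.5))
}
```

In the additive form a size ratio of 2 already uses up the whole budget at weight 0.5, so the cardinality ratio cannot compensate. In the multiplicative form a large enough improvement in one ratio can offset a regression in the other.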
   
   
   



[GitHub] [spark] cloud-fan closed pull request #31470: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-03-30 Thread GitBox



cloud-fan closed pull request #31470:
URL: https://github.com/apache/spark/pull/31470


   



[GitHub] [spark] cloud-fan commented on pull request #31470: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-03-30 Thread GitBox



cloud-fan commented on pull request #31470:
URL: https://github.com/apache/spark/pull/31470#issuecomment-810808178


   thanks, merging to master!



[GitHub] [spark] maropu commented on a change in pull request #31983: [SPARK-34882][SQL] Replace if with filter clause in RewriteDistinctAggregates

2021-03-30 Thread GitBox



maropu commented on a change in pull request #31983:
URL: https://github.com/apache/spark/pull/31983#discussion_r604626741



##
File path: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
##
@@ -2834,6 +2835,29 @@ class DataFrameSuite extends QueryTest
   df10.select(zip_with(col("array1"), col("array2"), (b1, b2) => 
reverseThenConcat2(b1, b2)))
 checkAnswer(test10, Row(Array(Row("cbaihg"), Row("fedlkj"))) :: Nil)
   }
+
+  test("SPARK-34882: Aggregate with multiple distinct null sensitive 
aggregators") {
+spark.udf.register("countNulls", udaf(new Aggregator[JLong, JLong, JLong] {

Review comment:
   Ah, okay. I misunderstood it.





[GitHub] [spark] SparkQA removed a comment on pull request #32010: [SPARK-34908][SQL] Add test cases for char and varchar with functions

2021-03-30 Thread GitBox



SparkQA removed a comment on pull request #32010:
URL: https://github.com/apache/spark/pull/32010#issuecomment-810739013


   **[Test build #136747 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136747/testReport)** for PR 32010 at commit [`7d367e3`](https://github.com/apache/spark/commit/7d367e38e625a1007e1922ac3fb17da9d17647d6).



[GitHub] [spark] cloud-fan commented on pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-30 Thread GitBox



cloud-fan commented on pull request #31451:
URL: https://github.com/apache/spark/pull/31451#issuecomment-810807243


   @viirya how about the history server? I'm a bit worried about the event log 
with v2 metrics.



[GitHub] [spark] SparkQA commented on pull request #32010: [SPARK-34908][SQL] Add test cases for char and varchar with functions

2021-03-30 Thread GitBox



SparkQA commented on pull request #32010:
URL: https://github.com/apache/spark/pull/32010#issuecomment-810806977


   **[Test build #136747 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136747/testReport)**
 for PR 32010 at commit 
[`7d367e3`](https://github.com/apache/spark/commit/7d367e38e625a1007e1922ac3fb17da9d17647d6).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] SparkQA removed a comment on pull request #31470: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-03-30 Thread GitBox



SparkQA removed a comment on pull request #31470:
URL: https://github.com/apache/spark/pull/31470#issuecomment-810691959


   **[Test build #136741 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136741/testReport)**
 for PR 31470 at commit 
[`f0c7ce4`](https://github.com/apache/spark/commit/f0c7ce423009e9465ec614c9e4c64781229e1f19).



[GitHub] [spark] SparkQA commented on pull request #31470: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-03-30 Thread GitBox



SparkQA commented on pull request #31470:
URL: https://github.com/apache/spark/pull/31470#issuecomment-810806264


   **[Test build #136741 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136741/testReport)**
 for PR 31470 at commit 
[`f0c7ce4`](https://github.com/apache/spark/commit/f0c7ce423009e9465ec614c9e4c64781229e1f19).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] gatorsmile commented on pull request #31886: [SPARK-34795][SQL][TESTS] Adds a new job in GitHub Actions to check the output of TPC-DS queries

2021-03-30 Thread GitBox



gatorsmile commented on pull request #31886:
URL: https://github.com/apache/spark/pull/31886#issuecomment-810802704


   This is awesome! We should have done it 5 years ago. :-) 



[GitHub] [spark] maropu commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

2021-03-30 Thread GitBox



maropu commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r604621607



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/TryCast.scala
##
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import org.apache.spark.sql.catalyst.expressions.codegen._
+import org.apache.spark.sql.catalyst.expressions.codegen.Block._
+import org.apache.spark.sql.types.DataType
+
+/**
+ * A special version of [[AnsiCast]]. It performs the same operation (i.e. converts a value of
+ * one data type into another data type), but returns a NULL value instead of raising an error
+ * when the conversion can not be performed.
+ *
+ * When cast from/to timezone related types, we need timeZoneId, which will be resolved with
+ * session local timezone by an analyzer [[ResolveTimeZone]].
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(expr AS type) - Casts the value `expr` to the target data type `type`. " +
+"This expression is identical to CAST with configuration `spark.sql.ansi.enabled` as " +
+"true, except it returns NULL instead of raising an error. Note that the behavior of this " +
+"expression doesn't depend on configuration `spark.sql.ansi.enabled`.",
+  examples = """
+Examples:
+  > SELECT _FUNC_('10' as int);
+   10
+  > SELECT _FUNC_(1234567890123L as int);
+   null
+  """,
+  since = "3.2.0",
+  group = "conversion_funcs")

Review comment:
   Nice! Thanks for letting me know.





[GitHub] [spark] timarmstrong commented on pull request #32006: [SPARK-34909][SQL] Fix conversion of negative to unsigned in conv()

2021-03-30 Thread GitBox



timarmstrong commented on pull request #32006:
URL: https://github.com/apache/spark/pull/32006#issuecomment-810801024


   Thanks for the reviews!



[GitHub] [spark] wangyum commented on pull request #31984: [SPARK-34884][SQL] Improve DPP evaluation to make filtering side must can broadcast by size or broadcast by hint

2021-03-30 Thread GitBox



wangyum commented on pull request #31984:
URL: https://github.com/apache/spark/pull/31984#issuecomment-810800714


   Benchmark result(spark.sql.adaptive.enabled=false):
   
   SQL | Before(spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly=true) | After(spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly=false)
   -- | -- | --
   58 | 144 | 21
   73 | 8 | 7
   83 | 25 | 14
   
   



[GitHub] [spark] viirya commented on pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-30 Thread GitBox



viirya commented on pull request #31451:
URL: https://github.com/apache/spark/pull/31451#issuecomment-810799188


   @cloud-fan I captured a screenshot and attached it to the description. The DS v2 
source uses the same custom metrics that I added in `SQLAppStatusListenerSuite`.



[GitHub] [spark] SparkQA removed a comment on pull request #32006: [SPARK-34909][SQL] Fix conversion of negative to unsigned in conv()

2021-03-30 Thread GitBox



SparkQA removed a comment on pull request #32006:
URL: https://github.com/apache/spark/pull/32006#issuecomment-810691742


   **[Test build #136740 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136740/testReport)**
 for PR 32006 at commit 
[`3e25454`](https://github.com/apache/spark/commit/3e254540a7e3f77c3b5db4bacb17f3b9332bf8de).



[GitHub] [spark] SparkQA commented on pull request #32006: [SPARK-34909][SQL] Fix conversion of negative to unsigned in conv()

2021-03-30 Thread GitBox



SparkQA commented on pull request #32006:
URL: https://github.com/apache/spark/pull/32006#issuecomment-810796518


   **[Test build #136740 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136740/testReport)**
 for PR 32006 at commit 
[`3e25454`](https://github.com/apache/spark/commit/3e254540a7e3f77c3b5db4bacb17f3b9332bf8de).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] viirya commented on pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-30 Thread GitBox



viirya commented on pull request #31451:
URL: https://github.com/apache/spark/pull/31451#issuecomment-810792543


   Okay. Let me set up a simple test DS v2 source locally and capture some screenshots 
of the web UI.



[GitHub] [spark] SparkQA removed a comment on pull request #31989: [WIP][SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-03-30 Thread GitBox



SparkQA removed a comment on pull request #31989:
URL: https://github.com/apache/spark/pull/31989#issuecomment-810695572


   **[Test build #136742 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136742/testReport)**
 for PR 31989 at commit 
[`25bbd47`](https://github.com/apache/spark/commit/25bbd4772a08b0c19d1cd305ef82d26b922a21e9).



[GitHub] [spark] SparkQA commented on pull request #31989: [WIP][SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-03-30 Thread GitBox



SparkQA commented on pull request #31989:
URL: https://github.com/apache/spark/pull/31989#issuecomment-810790360


   **[Test build #136742 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136742/testReport)**
 for PR 31989 at commit 
[`25bbd47`](https://github.com/apache/spark/commit/25bbd4772a08b0c19d1cd305ef82d26b922a21e9).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] cloud-fan closed pull request #31680: [SPARK-34568][SQL] When SparkContext's conf not enable hive, we should respect `enableHiveSupport()` when build SparkSession too

2021-03-30 Thread GitBox



cloud-fan closed pull request #31680:
URL: https://github.com/apache/spark/pull/31680


   



[GitHub] [spark] cloud-fan commented on pull request #31680: [SPARK-34568][SQL] When SparkContext's conf not enable hive, we should respect `enableHiveSupport()` when build SparkSession too

2021-03-30 Thread GitBox



cloud-fan commented on pull request #31680:
URL: https://github.com/apache/spark/pull/31680#issuecomment-810789864


   thanks, merging to master!



[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

2021-03-30 Thread GitBox



SparkQA commented on pull request #31982:
URL: https://github.com/apache/spark/pull/31982#issuecomment-810789707


   **[Test build #136752 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136752/testReport)**
 for PR 31982 at commit 
[`9266934`](https://github.com/apache/spark/commit/92669341d5eb849c853104c2a8052ec81fb2a4e5).



[GitHub] [spark] cloud-fan commented on a change in pull request #31653: [SPARK-33832][SQL] v2. move OptimzieSkewedJoin to query stage preparation

2021-03-30 Thread GitBox



cloud-fan commented on a change in pull request #31653:
URL: https://github.com/apache/spark/pull/31653#discussion_r604614313



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
##
@@ -251,48 +253,129 @@ object OptimizeSkewedJoin extends CustomShuffleReaderRule {
   }
   }
 
+  /**
+   * A potential stage is from Exchange down.  Actual [[QueryStageExec]] nodes are created
+   * by [[AdaptiveSparkPlanExec.newQueryStage]] bounded by previously created [[QueryStageExec]]
+   * nodes below.
+   * Todo: need better way to identify which join the log msgs below refer to.  Tags?
+   */
+  private def handlePotentialQueryStage(plan: SparkPlan): SparkPlan = {
+val shuffleStages = collectShuffleStages(plan)
+val s = ExplainUtils.getAQELogPrefix(shuffleStages)
+
+if (shuffleStages.length != 2 && !conf.adaptiveForceIfShuffle) {
+  /* Consider Case II.  Shuffle above SMJ1.  We should see 3 SQSE nodes but
+   with adaptiveForceIfShuffle() we should be able to add a new shuffle
+   above SMJ2 to enable skew mitigation of SMJ2.  W/o ability to add a new
+   shuffle skew mitigation is still possible in some cases - to be handled later.
+
+   Add a test for this.
+   See test("skew in deeply nested join - test ShuffleAddedException") and
+   add a similar test with just 2 joins */
+  logInfo(s"OptimizeSkewedJoin: rule is not applied since" +
+s" shuffleStages.length=${shuffleStages.length} != 2 and " +
+s"${SQLConf.ADAPTIVE_FORCE_IF_SHUFFLE.key}=false; $s")
+  return plan
+}
+val numShufflesBefore = plan.collect {
+  case e: ShuffleExchangeExec => e
+}.length
+val mitigatedPlan = optimizeSkewJoin(plan)
+if (mitigatedPlan eq plan) {
+  return plan
+}
+val executedPlan = ensureRequirements.apply(mitigatedPlan)
+val numNewShuffles = executedPlan.collect {
+  case e: ShuffleExchangeExec => e
+}.length - numShufflesBefore
+if(numNewShuffles > 0) {
+  if (conf.adaptiveForceIfShuffle) {
+logInfo(s"OptimizeSkewedJoin: rule is applied. " +
+  s"$numNewShuffles additional shuffles will be introduced; $s")
+executedPlan // make sure to return plan with new shuffles
+  } else {
+logInfo(s"OptimizeSkewedJoin: rule is not applied due" +
+  s" to $numNewShuffles additional shuffles will be introduced; $s")
+plan
+  }
+} else {
+  executedPlan
+}
+  }
+
+  def collectShuffleStages(plan: SparkPlan): Seq[ShuffleQueryStageExec] = plan match {
+case stage: ShuffleQueryStageExec => Seq(stage)
+case _ => plan.children.flatMap(collectShuffleStages)
+  }
+  /**
+   * Now this runs as part of queryStagePreparationRules() which means it runs over the whole plan
+   * which may have any number of ExchangeExec nodes, i.e. multiple "query stages"

Review comment:
   Maybe we don't have to be optimal in the first version. We can optimize 
all the leaf SMJs, and revert all of them if extra shuffles are introduced. The 
optimal solution is to find out which SMJ caused the extra shuffle and only 
revert it. We can do it later.
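
The all-or-nothing fallback suggested in this comment can be sketched in plain Scala. This is only an illustration under assumptions: `Plan`, `Scan`, `Shuffle`, `Join`, and `optimizeOrRevert` are hypothetical toy names, not Spark's `SparkPlan` classes or anything from the PR. The idea is to apply the optimization everywhere, re-establish requirements, and fall back to the original plan if the shuffle count grew.

```scala
// Toy plan model; hypothetical stand-ins for illustration, not Spark classes.
sealed trait Plan { def children: Seq[Plan] }
case class Scan(name: String) extends Plan { val children: Seq[Plan] = Nil }
case class Shuffle(child: Plan) extends Plan { val children: Seq[Plan] = Seq(child) }
case class Join(left: Plan, right: Plan, skewHandled: Boolean = false) extends Plan {
  val children: Seq[Plan] = Seq(left, right)
}

object RevertOnExtraShuffle {
  // Count every Shuffle node in the tree.
  def countShuffles(p: Plan): Int =
    (p match { case _: Shuffle => 1; case _ => 0 }) + p.children.map(countShuffles).sum

  // Keep the optimized plan only if it has no more shuffles than the original;
  // otherwise revert everything (the non-optimal "first version" behaviour).
  def optimizeOrRevert(
      plan: Plan,
      optimize: Plan => Plan,
      ensureRequirements: Plan => Plan): Plan = {
    val candidate = ensureRequirements(optimize(plan))
    if (countShuffles(candidate) > countShuffles(plan)) plan else candidate
  }

  def main(args: Array[String]): Unit = {
    val plan = Join(Shuffle(Scan("a")), Shuffle(Scan("b")))
    val markSkew: Plan => Plan = {
      case j: Join => j.copy(skewHandled = true)
      case other => other
    }
    // No extra shuffle is needed, so the optimized plan is kept.
    println(optimizeOrRevert(plan, markSkew, (p: Plan) => p))
    // The requirement step adds a shuffle, so the original plan is returned.
    println(optimizeOrRevert(plan, markSkew, (p: Plan) => Shuffle(p)))
  }
}
```

Finding which individual join caused the extra shuffle, and reverting only that one, would need per-join bookkeeping that this sketch deliberately leaves out.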





[GitHub] [spark] MaxGekk commented on pull request #32001: [SPARK-34902][SQL] Support cast between LongType & DayTimeIntervalType and IntegerType & YearMonthIntervalType

2021-03-30 Thread GitBox



MaxGekk commented on pull request #32001:
URL: https://github.com/apache/spark/pull/32001#issuecomment-810787472


   As @cloud-fan said, we have special functions that convert numbers to 
timestamps. I quickly looked at Oracle; it has a similar function for intervals. 
For example, [NUMTODSINTERVAL](https://docs.oracle.com/cd/E11882_01/server.112/e41084/functions117.htm#SQLRF00682) 
converts a `NUM` to a `DAY TO SECOND INTERVAL`:
   ```
   NUMTODSINTERVAL(100, 'day')
   ```
   @AngersZh Could you look at other DBMSs and see how they cast intervals 
from/to numbers?
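
To make the explicit-unit idea concrete, here is a minimal Scala sketch modeled loosely on the `NUMTODSINTERVAL(100, 'day')` call quoted above. `numToDayTimeInterval` is a hypothetical helper, not a Spark or Oracle function, and `java.time.Duration` is assumed as a stand-in for a day-time interval value.

```scala
import java.time.Duration

// Hypothetical helper for illustration only; not part of Spark.
object NumToInterval {
  // The caller must name the unit, so the number is never silently read as microseconds.
  def numToDayTimeInterval(n: Long, unit: String): Duration =
    unit.toLowerCase match {
      case "day"    => Duration.ofDays(n)
      case "hour"   => Duration.ofHours(n)
      case "minute" => Duration.ofMinutes(n)
      case "second" => Duration.ofSeconds(n)
      case other    => throw new IllegalArgumentException(s"Unsupported unit: $other")
    }

  def main(args: Array[String]): Unit = {
    println(numToDayTimeInterval(100, "day"))
    println(numToDayTimeInterval(100, "second"))
  }
}
```

The same number, 100, yields very different intervals depending on the unit, which is why a function with an explicit unit argument may be easier for end users to reason about than a bare cast.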



[GitHub] [spark] gengliangwang commented on a change in pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

2021-03-30 Thread GitBox



gengliangwang commented on a change in pull request #31982:
URL: https://github.com/apache/spark/pull/31982#discussion_r604612597



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/TryCast.scala
##
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import org.apache.spark.sql.catalyst.expressions.codegen._
+import org.apache.spark.sql.catalyst.expressions.codegen.Block._
+import org.apache.spark.sql.types.DataType
+
+/**
+ * A special version of [[AnsiCast]]. It performs the same operation (i.e. converts a value of
+ * one data type into another data type), but returns a NULL value instead of raising an error
+ * when the conversion can not be performed.
+ *
+ * When cast from/to timezone related types, we need timeZoneId, which will be resolved with
+ * session local timezone by an analyzer [[ResolveTimeZone]].
+ */
+@ExpressionDescription(
+  usage = "_FUNC_(expr AS type) - Casts the value `expr` to the target data type `type`. " +
+"This expression is identical to CAST with configuration `spark.sql.ansi.enabled` as " +
+"true, except it returns NULL instead of raising an error. Note that the behavior of this " +
+"expression doesn't depend on configuration `spark.sql.ansi.enabled`.",
+  examples = """
+Examples:
+  > SELECT _FUNC_('10' as int);
+   10
+  > SELECT _FUNC_(1234567890123L as int);
+   null
+  """,
+  since = "3.2.0",
+  group = "conversion_funcs")

Review comment:
   Actually, I plan to create docs for CAST, TRY_CAST, and ANSI CAST, grouping 
them into one section.
   I have created https://issues.apache.org/jira/browse/SPARK-34917 and 
https://issues.apache.org/jira/browse/SPARK-34918
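
For readers following the thread, the "returns NULL instead of raising an error" rule from the quoted `usage` text can be modeled in plain Scala without Spark. This is a sketch under assumptions: `tryCastStringToInt` and `tryCastLongToInt` are hypothetical helper names, `None` stands in for SQL NULL, and only the two cases from the `examples` block are covered.

```scala
import scala.util.Try

// Hypothetical helpers for illustration; they are not the PR's TryCast expression.
object TryCastSketch {
  // String -> Int: malformed input yields None instead of throwing.
  def tryCastStringToInt(s: String): Option[Int] =
    Try(s.toInt).toOption

  // Long -> Int: a value outside the Int range yields None instead of an overflow error.
  def tryCastLongToInt(v: Long): Option[Int] =
    if (v >= Int.MinValue && v <= Int.MaxValue) Some(v.toInt) else None

  def main(args: Array[String]): Unit = {
    println(tryCastStringToInt("10"))          // Some(10)
    println(tryCastLongToInt(1234567890123L))  // None
  }
}
```

An ANSI-style cast would correspond to throwing in the branches where this sketch returns `None`.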





[GitHub] [spark] SparkQA commented on pull request #30057: [SPARK-32838][SQL]Check DataSource insert command path with actual path

2021-03-30 Thread GitBox



SparkQA commented on pull request #30057:
URL: https://github.com/apache/spark/pull/30057#issuecomment-810783077


   **[Test build #136751 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136751/testReport)**
 for PR 30057 at commit 
[`81b1bd8`](https://github.com/apache/spark/commit/81b1bd817a30b5a31026d5841c2ba7189598e3b4).



[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-30 Thread GitBox



SparkQA commented on pull request #30144:
URL: https://github.com/apache/spark/pull/30144#issuecomment-810783018


   **[Test build #136750 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136750/testReport)**
 for PR 30144 at commit 
[`f5763e8`](https://github.com/apache/spark/commit/f5763e8580ebb70a2c89679852e1e2301d58641d).



[GitHub] [spark] cloud-fan commented on pull request #32001: [SPARK-34902][SQL] Support cast between LongType & DayTimeIntervalType and IntegerType & YearMonthIntervalType

2021-03-30 Thread GitBox



cloud-fan commented on pull request #32001:
URL: https://github.com/apache/spark/pull/32001#issuecomment-810782597


   > this conversion could be safe
   
   It's not about whether the conversion is safe. It's about how to make the behavior easy to 
understand for end-users.
   
   CAST is a standard SQL operator, and I don't think it makes sense for 
casting an integral value to a day-time value to treat the input as 
microseconds simply because the precision in Spark is microseconds. This 
behavior should not be vendor-specific.
   
   Ideally, the data source should have native support for interval types to store 
them. If it does not, we can provide some new functions to convert between 
int/long and year-month/day-time intervals, similar to these timestamp 
functions. We can also do some research and see how other databases handle this 
case.



[GitHub] [spark] SparkQA commented on pull request #31983: [SPARK-34882][SQL] Replace if with filter clause in RewriteDistinctAggregates

2021-03-30 Thread GitBox



SparkQA commented on pull request #31983:
URL: https://github.com/apache/spark/pull/31983#issuecomment-810782549


   **[Test build #136749 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136749/testReport)**
 for PR 31983 at commit 
[`2530e89`](https://github.com/apache/spark/commit/2530e89304c874bee785eefcb8b8c09648046d17).



[GitHub] [spark] AmplabJenkins removed a comment on pull request #32009: [SPARK-34914][CORE] Local scheduler backend support update token

2021-03-30 Thread GitBox



AmplabJenkins removed a comment on pull request #32009:
URL: https://github.com/apache/spark/pull/32009#issuecomment-810780681


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136746/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #31204: [SPARK-26399][WEBUI][CORE] Add new stage-level REST APIs and parameters

2021-03-30 Thread GitBox



AmplabJenkins removed a comment on pull request #31204:
URL: https://github.com/apache/spark/pull/31204#issuecomment-810780686


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136744/
   



[GitHub] [spark] AmplabJenkins commented on pull request #32009: [SPARK-34914][CORE] Local scheduler backend support update token

2021-03-30 Thread GitBox



AmplabJenkins commented on pull request #32009:
URL: https://github.com/apache/spark/pull/32009#issuecomment-810780681


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136746/
   



[GitHub] [spark] AmplabJenkins commented on pull request #31204: [SPARK-26399][WEBUI][CORE] Add new stage-level REST APIs and parameters

2021-03-30 Thread GitBox



AmplabJenkins commented on pull request #31204:
URL: https://github.com/apache/spark/pull/31204#issuecomment-810780686


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136744/
   



[GitHub] [spark] AngersZhuuuu commented on pull request #32006: [SPARK-34909][SQL] Fix conversion of negative to unsigned in conv()

2021-03-30 Thread GitBox



AngersZhuuuu commented on pull request #32006:
URL: https://github.com/apache/spark/pull/32006#issuecomment-810775087


   Good catch!



[GitHub] [spark] SparkQA removed a comment on pull request #31204: [SPARK-26399][WEBUI][CORE] Add new stage-level REST APIs and parameters

2021-03-30 Thread GitBox



SparkQA removed a comment on pull request #31204:
URL: https://github.com/apache/spark/pull/31204#issuecomment-810715229


   **[Test build #136744 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136744/testReport)**
 for PR 31204 at commit 
[`13e3692`](https://github.com/apache/spark/commit/13e36921cf9898ab83da8b8bc802b8a3edb36a29).



[GitHub] [spark] SparkQA commented on pull request #31204: [SPARK-26399][WEBUI][CORE] Add new stage-level REST APIs and parameters

2021-03-30 Thread GitBox



SparkQA commented on pull request #31204:
URL: https://github.com/apache/spark/pull/31204#issuecomment-810773894


   **[Test build #136744 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136744/testReport)**
 for PR 31204 at commit 
[`13e3692`](https://github.com/apache/spark/commit/13e36921cf9898ab83da8b8bc802b8a3edb36a29).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] MaxGekk closed pull request #31996: [SPARK-34896][SQL] Return day-time interval from dates subtraction

2021-03-30 Thread GitBox



MaxGekk closed pull request #31996:
URL: https://github.com/apache/spark/pull/31996


   



[GitHub] [spark] MaxGekk commented on pull request #31996: [SPARK-34896][SQL] Return day-time interval from dates subtraction

2021-03-30 Thread GitBox



MaxGekk commented on pull request #31996:
URL: https://github.com/apache/spark/pull/31996#issuecomment-810769945


   Thank you @cloud-fan @AngersZhuuuu for your review. Merging to master.



[GitHub] [spark] SparkQA removed a comment on pull request #32009: [SPARK-34914][CORE] Local scheduler backend support update token

2021-03-30 Thread GitBox



SparkQA removed a comment on pull request #32009:
URL: https://github.com/apache/spark/pull/32009#issuecomment-810719159


   **[Test build #136746 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136746/testReport)**
 for PR 32009 at commit 
[`369e08b`](https://github.com/apache/spark/commit/369e08b2e39b09868238db00b14db4a0eb526ddc).



[GitHub] [spark] SparkQA commented on pull request #32009: [SPARK-34914][CORE] Local scheduler backend support update token

2021-03-30 Thread GitBox



SparkQA commented on pull request #32009:
URL: https://github.com/apache/spark/pull/32009#issuecomment-810769301


   **[Test build #136746 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136746/testReport)**
 for PR 32009 at commit 
[`369e08b`](https://github.com/apache/spark/commit/369e08b2e39b09868238db00b14db4a0eb526ddc).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] ulysses-you commented on pull request #32009: [SPARK-34914][CORE] Local scheduler backend support update token

2021-03-30 Thread GitBox



ulysses-you commented on pull request #32009:
URL: https://github.com/apache/spark/pull/32009#issuecomment-810767033


   thank you for taking a look @HyukjinKwon @yaooqinn 



[GitHub] [spark] cloud-fan commented on pull request #31993: [SPARK-34897][SQL] Add workaround to error message when OrcUtils.requestedColumnIds fails

2021-03-30 Thread GitBox



cloud-fan commented on pull request #31993:
URL: https://github.com/apache/spark/pull/31993#issuecomment-810765696


   Sorry, I may be missing something. Why is it only a problem in nested column pruning but not in column pruning?



[GitHub] [spark] tanelk commented on a change in pull request #31983: [SPARK-34882][SQL] Replace if with filter clause in RewriteDistinctAggregates

2021-03-30 Thread GitBox



tanelk commented on a change in pull request #31983:
URL: https://github.com/apache/spark/pull/31983#discussion_r604597457



##
File path: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
##
@@ -2834,6 +2835,29 @@ class DataFrameSuite extends QueryTest
   df10.select(zip_with(col("array1"), col("array2"), (b1, b2) => reverseThenConcat2(b1, b2)))
 checkAnswer(test10, Row(Array(Row("cbaihg"), Row("fedlkj"))) :: Nil)
   }
+
+  test("SPARK-34882: Aggregate with multiple distinct null sensitive aggregators") {
+spark.udf.register("countNulls", udaf(new Aggregator[JLong, JLong, JLong] {

Review comment:
   I added the `withUserDefinedFunction`, but I did not understand the first question.
   I added this UDAF because the built-in aggregates that are "null sensitive" (`First` and `Last`) gave unstable test results in the `SQLQueryTestSuite`.
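
   To make the distinction concrete, here is a minimal plain-Scala sketch with no Spark involved. `NullSensitiveSketch`, `first` and `countNulls` are hypothetical names and are not the code in this PR. It assumes, without the thread stating so, that the instability comes from row order: a `First`-style aggregate depends on which row arrives first, while a null-counting aggregate is still null sensitive but gives the same answer for any ordering.

```scala
// Plain-Scala illustration only; all names here are hypothetical.
object NullSensitiveSketch {
  type JLong = java.lang.Long

  // Order-dependent: returns whatever row happens to come first (possibly null).
  def first(rows: Seq[JLong]): JLong = rows.headOption.orNull

  // Order-independent, yet null sensitive: nulls change the result.
  def countNulls(rows: Seq[JLong]): Long = rows.count(_ == null).toLong

  def main(args: Array[String]): Unit = {
    val rows: Seq[JLong] = Seq(null, JLong.valueOf(1L), JLong.valueOf(2L))
    // The same data in two orders gives two different `first` results.
    println(s"first: ${first(rows)} vs ${first(rows.reverse)}")
    // Counting nulls does not depend on the order of the rows.
    println(s"countNulls: ${countNulls(rows)} vs ${countNulls(rows.reverse)}")
  }
}
```

   An aggregate of the second kind can exercise the null-handling path of a rewrite rule while keeping expected results deterministic.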





[GitHub] [spark] SparkQA commented on pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-30 Thread GitBox



SparkQA commented on pull request #32011:
URL: https://github.com/apache/spark/pull/32011#issuecomment-810761488


   **[Test build #136748 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136748/testReport)**
 for PR 32011 at commit 
[`642d7c0`](https://github.com/apache/spark/commit/642d7c09f604c2912042e1b863a6750d86184170).



[GitHub] [spark] cloud-fan closed pull request #32006: [SPARK-34909][SQL] Fix conversion of negative to unsigned in conv()

2021-03-30 Thread GitBox



cloud-fan closed pull request #32006:
URL: https://github.com/apache/spark/pull/32006


   



[GitHub] [spark] cloud-fan commented on pull request #32006: [SPARK-34909][SQL] Fix conversion of negative to unsigned in conv()

2021-03-30 Thread GitBox



cloud-fan commented on pull request #32006:
URL: https://github.com/apache/spark/pull/32006#issuecomment-810761260


   thanks, merging to master/3.1/3.0/2.4!



[GitHub] [spark] HyukjinKwon commented on pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-30 Thread GitBox



HyukjinKwon commented on pull request #32011:
URL: https://github.com/apache/spark/pull/32011#issuecomment-810761149


   cc @dongjoon-hyun, @gengliangwang and @maropu FYI



[GitHub] [spark] HyukjinKwon opened a new pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-30 Thread GitBox



HyukjinKwon opened a new pull request #32011:
URL: https://github.com/apache/spark/pull/32011


   ### What changes were proposed in this pull request?
   
   This PR proposes to cache Maven, SBT and Scala in all jobs that use them. 
For simplicity, we use the same key `build-` and just cache all SBT, Maven and 
Scala. The cache is not very large.
   
   ### Why are the changes needed?
   
   To speed up the build.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No, dev-only.
   
   ### How was this patch tested?
   
   It will be tested in this PR's GA jobs.



[GitHub] [spark] cloud-fan commented on a change in pull request #32006: [SPARK-34909][SQL] Fix conversion of negative to unsigned in conv()

2021-03-30 Thread GitBox



cloud-fan commented on a change in pull request #32006:
URL: https://github.com/apache/spark/pull/32006#discussion_r604593841



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/NumberConverter.scala
##
@@ -52,7 +32,7 @@ object NumberConverter {
 java.util.Arrays.fill(value, 0.asInstanceOf[Byte])
 var i = value.length - 1
 while (tmpV != 0) {
-  val q = unsignedLongDiv(tmpV, radix)
+  val q = java.lang.Long.divideUnsigned(tmpV, radix)

Review comment:
   Yes, we should use the standard Java API as much as we can.
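
   As an illustration of what the standard API provides, here is a minimal sketch and not the PR's implementation. `UnsignedDivSketch` and `toUnsignedString` are hypothetical names. It uses `java.lang.Long.divideUnsigned` and `java.lang.Long.remainderUnsigned`, both available since Java 8, to read the bits of a negative `Long` as an unsigned 64-bit value and emit its digits in a given radix.

```scala
// Illustrative sketch only; names are hypothetical and this is not NumberConverter.
object UnsignedDivSketch {
  // Renders the bits of `v`, interpreted as an unsigned 64-bit value, in `radix`.
  def toUnsignedString(v: Long, radix: Int): String = {
    if (v == 0L) return "0"
    val sb = new StringBuilder
    var tmp = v
    while (tmp != 0L) {
      // Unsigned quotient and remainder, so negative inputs need no special casing.
      val q = java.lang.Long.divideUnsigned(tmp, radix.toLong)
      val r = java.lang.Long.remainderUnsigned(tmp, radix.toLong)
      sb.append(Character.forDigit(r.toInt, radix))
      tmp = q
    }
    // Digits were produced least significant first.
    sb.reverse.toString
  }

  def main(args: Array[String]): Unit = {
    // Signed division treats -1 as a small negative number; unsigned division
    // treats the same bits as the largest 64-bit value.
    println(-1L / 16L)
    println(java.lang.Long.divideUnsigned(-1L, 16L))
    println(toUnsignedString(-1L, 16))
  }
}
```

   The JDK also offers `java.lang.Long.toUnsignedString(long, int)`, which is a convenient cross-check for a hand-written loop like this one.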





[GitHub] [spark] cloud-fan commented on pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-30 Thread GitBox



cloud-fan commented on pull request #31451:
URL: https://github.com/apache/spark/pull/31451#issuecomment-810758921


   @viirya Can we write a simple DS v2 with metrics and try it locally? Then we 
can get some screenshots of the web UI, and also verify the history server.



[GitHub] [spark] AmplabJenkins removed a comment on pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-30 Thread GitBox



AmplabJenkins removed a comment on pull request #31451:
URL: https://github.com/apache/spark/pull/31451#issuecomment-810755166


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136738/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #32010: [SPARK-34908][SQL] Add test cases for char and varchar with functions

2021-03-30 Thread GitBox



AmplabJenkins removed a comment on pull request #32010:
URL: https://github.com/apache/spark/pull/32010#issuecomment-810755134


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41329/
   



[GitHub] [spark] AmplabJenkins commented on pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-30 Thread GitBox



AmplabJenkins commented on pull request #31451:
URL: https://github.com/apache/spark/pull/31451#issuecomment-810755166


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136738/
   



[GitHub] [spark] AmplabJenkins commented on pull request #32010: [SPARK-34908][SQL] Add test cases for char and varchar with functions

2021-03-30 Thread GitBox



AmplabJenkins commented on pull request #32010:
URL: https://github.com/apache/spark/pull/32010#issuecomment-810755134


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41329/
   



[GitHub] [spark] SparkQA commented on pull request #32010: [SPARK-34908][SQL] Add test cases for char and varchar with functions

2021-03-30 Thread GitBox



SparkQA commented on pull request #32010:
URL: https://github.com/apache/spark/pull/32010#issuecomment-810755051







[GitHub] [spark] HyukjinKwon closed pull request #32005: [SPARK-34907][TESTS] Add main class that detects and runs all benchmarks

2021-03-30 Thread GitBox



HyukjinKwon closed pull request #32005:
URL: https://github.com/apache/spark/pull/32005


   



[GitHub] [spark] HyukjinKwon commented on pull request #32005: [SPARK-34907][TESTS] Add main class that detects and runs all benchmarks

2021-03-30 Thread GitBox



HyukjinKwon commented on pull request #32005:
URL: https://github.com/apache/spark/pull/32005#issuecomment-810754861


   Merged to master.



[GitHub] [spark] HyukjinKwon commented on pull request #32005: [SPARK-34907][TESTS] Add main class that detects and runs all benchmarks

2021-03-30 Thread GitBox



HyukjinKwon commented on pull request #32005:
URL: https://github.com/apache/spark/pull/32005#issuecomment-810754656


   Thanks guys. Let me merge this in first and proceed (it won't break or 
affect anything in our CI anyway). I am working on SPARK-34821 now. Let's see 
how it goes!



[GitHub] [spark] SparkQA removed a comment on pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-30 Thread GitBox



SparkQA removed a comment on pull request #31451:
URL: https://github.com/apache/spark/pull/31451#issuecomment-810647889


   **[Test build #136738 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136738/testReport)**
 for PR 31451 at commit 
[`d5d8678`](https://github.com/apache/spark/commit/d5d867880ebb57c49ac422251ba50bbabf1159d1).



[GitHub] [spark] SparkQA commented on pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-30 Thread GitBox



SparkQA commented on pull request #31451:
URL: https://github.com/apache/spark/pull/31451#issuecomment-810745261


   **[Test build #136738 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136738/testReport)**
 for PR 31451 at commit 
[`d5d8678`](https://github.com/apache/spark/commit/d5d867880ebb57c49ac422251ba50bbabf1159d1).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] yaooqinn commented on pull request #31804: [SPARK-34710][SQL] Add tableType column for SHOW TABLES to distinguish view and tables

2021-03-30 Thread GitBox



yaooqinn commented on pull request #31804:
URL: https://github.com/apache/spark/pull/31804#issuecomment-810740574


   cc @cloud-fan @HyukjinKwon PTAL, thanks



[GitHub] [spark] AmplabJenkins removed a comment on pull request #32009: [SPARK-34914][CORE] Local scheduler backend support update token

2021-03-30 Thread GitBox



AmplabJenkins removed a comment on pull request #32009:
URL: https://github.com/apache/spark/pull/32009#issuecomment-810739994


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41328/
   



[GitHub] [spark] SparkQA commented on pull request #32009: [SPARK-34914][CORE] Local scheduler backend support update token

2021-03-30 Thread GitBox



SparkQA commented on pull request #32009:
URL: https://github.com/apache/spark/pull/32009#issuecomment-810739984


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41328/
   



[GitHub] [spark] AmplabJenkins commented on pull request #32009: [SPARK-34914][CORE] Local scheduler backend support update token

2021-03-30 Thread GitBox



AmplabJenkins commented on pull request #32009:
URL: https://github.com/apache/spark/pull/32009#issuecomment-810739994


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41328/
   



[GitHub] [spark] SparkQA commented on pull request #32010: [SPARK-34908][SQL] Add test cases for char and varchar with functions

2021-03-30 Thread GitBox



SparkQA commented on pull request #32010:
URL: https://github.com/apache/spark/pull/32010#issuecomment-810739013


   **[Test build #136747 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136747/testReport)**
 for PR 32010 at commit 
[`7d367e3`](https://github.com/apache/spark/commit/7d367e38e625a1007e1922ac3fb17da9d17647d6).



[GitHub] [spark] AmplabJenkins removed a comment on pull request #31204: [SPARK-26399][WEBUI][CORE] Add new stage-level REST APIs and parameters

2021-03-30 Thread GitBox



AmplabJenkins removed a comment on pull request #31204:
URL: https://github.com/apache/spark/pull/31204#issuecomment-810738653


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41326/
   



[GitHub] [spark] yaooqinn opened a new pull request #32010: [SPARK-34908][SQL] Add test cases for char and varchar with functions

2021-03-30 Thread GitBox



yaooqinn opened a new pull request #32010:
URL: https://github.com/apache/spark/pull/32010


   
   
   ### What changes were proposed in this pull request?
   
   
   Using char and varchar with the string functions and some other expressions might be confusing and ambiguous. In this PR we add test cases for char and varchar with these operations to reveal their behavior and see if we can come up with a general pattern for them.
   
   ### Why are the changes needed?
   
   test coverage
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   no
   
   ### How was this patch tested?
   
   new tests



[GitHub] [spark] SparkQA commented on pull request #31204: [SPARK-26399][WEBUI][CORE] Add new stage-level REST APIs and parameters

2021-03-30 Thread GitBox



SparkQA commented on pull request #31204:
URL: https://github.com/apache/spark/pull/31204#issuecomment-810738637


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41326/
   



[GitHub] [spark] AmplabJenkins commented on pull request #31204: [SPARK-26399][WEBUI][CORE] Add new stage-level REST APIs and parameters

2021-03-30 Thread GitBox



AmplabJenkins commented on pull request #31204:
URL: https://github.com/apache/spark/pull/31204#issuecomment-810738653


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41326/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #29087: [SPARK-28227][SQL] Support projection, aggregate/window functions, and lateral view in the TRANSFORM clause

2021-03-30 Thread GitBox



AmplabJenkins removed a comment on pull request #29087:
URL: https://github.com/apache/spark/pull/29087#issuecomment-810737603


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136739/
   



[GitHub] [spark] AmplabJenkins commented on pull request #29087: [SPARK-28227][SQL] Support projection, aggregate/window functions, and lateral view in the TRANSFORM clause

2021-03-30 Thread GitBox



AmplabJenkins commented on pull request #29087:
URL: https://github.com/apache/spark/pull/29087#issuecomment-810737603


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136739/
   



[GitHub] [spark] SparkQA commented on pull request #32009: [SPARK-34914][CORE] Local scheduler backend support update token

2021-03-30 Thread GitBox



SparkQA commented on pull request #32009:
URL: https://github.com/apache/spark/pull/32009#issuecomment-810737551


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41328/
   



[GitHub] [spark] SparkQA removed a comment on pull request #29087: [SPARK-28227][SQL] Support projection, aggregate/window functions, and lateral view in the TRANSFORM clause

2021-03-30 Thread GitBox



SparkQA removed a comment on pull request #29087:
URL: https://github.com/apache/spark/pull/29087#issuecomment-810648089


   **[Test build #136739 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136739/testReport)** for PR 29087 at commit [`1278705`](https://github.com/apache/spark/commit/12787053aec9d015506d5c59c58e91dd23d5bb82).



[GitHub] [spark] SparkQA commented on pull request #29087: [SPARK-28227][SQL] Support projection, aggregate/window functions, and lateral view in the TRANSFORM clause

2021-03-30 Thread GitBox



SparkQA commented on pull request #29087:
URL: https://github.com/apache/spark/pull/29087#issuecomment-810737064


   **[Test build #136739 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136739/testReport)** for PR 29087 at commit [`1278705`](https://github.com/apache/spark/commit/12787053aec9d015506d5c59c58e91dd23d5bb82).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.



[GitHub] [spark] SparkQA commented on pull request #31204: [SPARK-26399][WEBUI][CORE] Add new stage-level REST APIs and parameters

2021-03-30 Thread GitBox



SparkQA commented on pull request #31204:
URL: https://github.com/apache/spark/pull/31204#issuecomment-810735545


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41326/
   


