[GitHub] [spark] c21 commented on a change in pull request #31958: [SPARK-34862][SQL] Support nested column in ORC vectorized reader

2021-03-31 Thread GitBox
c21 commented on a change in pull request #31958: URL: https://github.com/apache/spark/pull/31958#discussion_r604642748 ## File path: project/MimaExcludes.scala ## @@ -40,7 +40,22 @@ object MimaExcludes { ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.c

[GitHub] [spark] SparkQA commented on pull request #31983: [SPARK-34882][SQL] Replace if with filter clause in RewriteDistinctAggregates

2021-03-31 Thread GitBox
SparkQA commented on pull request #31983: URL: https://github.com/apache/spark/pull/31983#issuecomment-810827186 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41331/ -- This is an automated message from the A

[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
SparkQA commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810827595 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41332/ -- This

[GitHub] [spark] SparkQA commented on pull request #31776: [SPARK-34661][SQL] Clean up `OriginalType` and `DecimalMetadata ` usage in Parquet related code

2021-03-31 Thread GitBox
SparkQA commented on pull request #31776: URL: https://github.com/apache/spark/pull/31776#issuecomment-810829057 **[Test build #136745 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136745/testReport)** for PR 31776 at commit [`9a7ec8c`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #31776: [SPARK-34661][SQL] Clean up `OriginalType` and `DecimalMetadata ` usage in Parquet related code

2021-03-31 Thread GitBox
SparkQA removed a comment on pull request #31776: URL: https://github.com/apache/spark/pull/31776#issuecomment-810717272 **[Test build #136745 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136745/testReport)** for PR 31776 at commit [`9a7ec8c`](https://gi

[GitHub] [spark] wangshuo128 commented on a change in pull request #31968: [SPARK-34873][SQL] Avoid wrapped in withNewExecutionId twice when run SQL with side effects

2021-03-31 Thread GitBox
wangshuo128 commented on a change in pull request #31968: URL: https://github.com/apache/spark/pull/31968#discussion_r604647620 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -223,11 +224,18 @@ class Dataset[T] private[sql]( @transient private

[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

2021-03-31 Thread GitBox
SparkQA commented on pull request #31982: URL: https://github.com/apache/spark/pull/31982#issuecomment-810834893 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41335/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #30057: [SPARK-32838][SQL]Check DataSource insert command path with actual path

2021-03-31 Thread GitBox
SparkQA commented on pull request #30057: URL: https://github.com/apache/spark/pull/30057#issuecomment-810835344 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41333/ -- This is an automated message from the A

[GitHub] [spark] SparkQA commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

2021-03-31 Thread GitBox
SparkQA commented on pull request #31982: URL: https://github.com/apache/spark/pull/31982#issuecomment-810840886 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41335/ -- This is an automated message from the A

[GitHub] [spark] cloud-fan commented on a change in pull request #31958: [SPARK-34862][SQL] Support nested column in ORC vectorized reader

2021-03-31 Thread GitBox
cloud-fan commented on a change in pull request #31958: URL: https://github.com/apache/spark/pull/31958#discussion_r604657478 ## File path: project/MimaExcludes.scala ## @@ -40,7 +40,22 @@ object MimaExcludes { ProblemFilters.exclude[MissingClassProblem]("org.apache.spark

[GitHub] [spark] viirya commented on pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-31 Thread GitBox
viirya commented on pull request #31451: URL: https://github.com/apache/spark/pull/31451#issuecomment-810844094 > @viirya how about the history server? I'm a bit worried about the event log with v2 metrics. Oh, there is a problem on ser/de `aggregateMethod`... -- This is an automa

[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
SparkQA commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810844272 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41334/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA removed a comment on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
SparkQA removed a comment on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810783018 **[Test build #136750 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136750/testReport)** for PR 30144 at commit [`f5763e8`](https://gi

[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
SparkQA commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810844906 **[Test build #136750 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136750/testReport)** for PR 30144 at commit [`f5763e8`](https://github.co

[GitHub] [spark] cloud-fan commented on pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-31 Thread GitBox
cloud-fan commented on pull request #31451: URL: https://github.com/apache/spark/pull/31451#issuecomment-810849844 @viirya I think this is hard to fix. The history server is a different JVM and may not have the data source implementation classes loaded, we can't run the aggregate code in t

[GitHub] [spark] AmplabJenkins commented on pull request #30057: [SPARK-32838][SQL]Check DataSource insert command path with actual path

2021-03-31 Thread GitBox
AmplabJenkins commented on pull request #30057: URL: https://github.com/apache/spark/pull/30057#issuecomment-810849939 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41333/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #31776: [SPARK-34661][SQL] Clean up `OriginalType` and `DecimalMetadata ` usage in Parquet related code

2021-03-31 Thread GitBox
AmplabJenkins commented on pull request #31776: URL: https://github.com/apache/spark/pull/31776#issuecomment-810849938 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136745/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

2021-03-31 Thread GitBox
AmplabJenkins commented on pull request #31982: URL: https://github.com/apache/spark/pull/31982#issuecomment-810849942 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41335/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
AmplabJenkins commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810849940 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For q

[GitHub] [spark] AmplabJenkins commented on pull request #31983: [SPARK-34882][SQL] Replace if with filter clause in RewriteDistinctAggregates

2021-03-31 Thread GitBox
AmplabJenkins commented on pull request #31983: URL: https://github.com/apache/spark/pull/31983#issuecomment-810849943 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41331/ -- T

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31983: [SPARK-34882][SQL] Replace if with filter clause in RewriteDistinctAggregates

2021-03-31 Thread GitBox
AmplabJenkins removed a comment on pull request #31983: URL: https://github.com/apache/spark/pull/31983#issuecomment-810849943 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41331/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30057: [SPARK-32838][SQL]Check DataSource insert command path with actual path

2021-03-31 Thread GitBox
AmplabJenkins removed a comment on pull request #30057: URL: https://github.com/apache/spark/pull/30057#issuecomment-810849939 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41333/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31776: [SPARK-34661][SQL] Clean up `OriginalType` and `DecimalMetadata ` usage in Parquet related code

2021-03-31 Thread GitBox
AmplabJenkins removed a comment on pull request #31776: URL: https://github.com/apache/spark/pull/31776#issuecomment-810849938 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136745/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31982: [SPARK-34881][SQL] New SQL Function: TRY_CAST

2021-03-31 Thread GitBox
AmplabJenkins removed a comment on pull request #31982: URL: https://github.com/apache/spark/pull/31982#issuecomment-810849942 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41335/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
AmplabJenkins removed a comment on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810849940 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] AmplabJenkins commented on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
AmplabJenkins commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810851420 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41334/ -- T

[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
SparkQA commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810851327 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41334/ -- This is an automated message from the A

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
AmplabJenkins removed a comment on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810851420 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41334/

[GitHub] [spark] cloud-fan commented on a change in pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-31 Thread GitBox
cloud-fan commented on a change in pull request #31451: URL: https://github.com/apache/spark/pull/31451#discussion_r604669214 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetricInfo.scala ## @@ -27,4 +27,5 @@ import org.apache.spark.annotation

[GitHub] [spark] cloud-fan commented on a change in pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-31 Thread GitBox
cloud-fan commented on a change in pull request #31451: URL: https://github.com/apache/spark/pull/31451#discussion_r604669872 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetricInfo.scala ## @@ -27,4 +27,5 @@ import org.apache.spark.annotation

[GitHub] [spark] viirya commented on pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-31 Thread GitBox
viirya commented on pull request #31451: URL: https://github.com/apache/spark/pull/31451#issuecomment-810854755 For a `CustomMetric` defined outside Spark, even we record the class name, we still cannot instantiate the class. -- This is an automated message from the Apache Git Se
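
The point about recording only a class name can be illustrated with a small analogy. This is a sketch in Python rather than the JVM code under review, and the module path `thirdparty_datasource.metrics.RowsReadMetric` is a hypothetical name standing in for a metric class defined outside Spark. A process that knows only the name can rebuild the object when the defining code is available to it, and not otherwise.

```python
import importlib


def instantiate_by_name(qualified_name):
    """Try to build an object from a recorded 'module.Class' string.

    Returns None when the defining module is not available in this process,
    which mirrors a reader process that lacks the plugin's classes.
    """
    module_name, _, class_name = qualified_name.rpartition(".")
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The name was recorded, but the code behind it is not installed here.
        return None
    cls = getattr(module, class_name, None)
    return cls() if cls is not None else None


# A class that ships with the runtime can be rebuilt from its name alone.
builtin = instantiate_by_name("collections.OrderedDict")

# Hypothetical external class: the name is known, the implementation is not.
external = instantiate_by_name("thirdparty_datasource.metrics.RowsReadMetric")
```

Here `builtin` is a usable object while `external` is `None`, which is the situation the comment describes for a reader process that does not have the data source implementation loaded.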

[GitHub] [spark] maropu commented on a change in pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
maropu commented on a change in pull request #30144: URL: https://github.com/apache/spark/pull/30144#discussion_r604635816 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/grouping.scala ## @@ -212,3 +212,29 @@ object GroupingID { if (SQ

[GitHub] [spark] AmplabJenkins commented on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
AmplabJenkins commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810859118 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41336/ -- T

[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
SparkQA commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810859079 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
AmplabJenkins removed a comment on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810859118 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41336/

[GitHub] [spark] srowen closed pull request #32008: [SPARK-34911][SQL] Fix code not close issue in monitoring.md

2021-03-31 Thread GitBox
srowen closed pull request #32008: URL: https://github.com/apache/spark/pull/32008 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [spark] srowen commented on pull request #32008: [SPARK-34911][SQL] Fix code not close issue in monitoring.md

2021-03-31 Thread GitBox
srowen commented on pull request #32008: URL: https://github.com/apache/spark/pull/32008#issuecomment-810859563 Merged to master -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comm

[GitHub] [spark] HyukjinKwon commented on a change in pull request #30965: [SPARK-33935][SQL] Fix CBO cost function

2021-03-31 Thread GitBox
HyukjinKwon commented on a change in pull request #30965: URL: https://github.com/apache/spark/pull/30965#discussion_r604675022 ## File path: sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q19.sf100/simplified.txt ## @@ -6,71 +6,71 @@ TakeOrderedAndProjec

[GitHub] [spark] maropu commented on a change in pull request #32010: [SPARK-34908][SQL] Add test cases for char and varchar with functions

2021-03-31 Thread GitBox
maropu commented on a change in pull request #32010: URL: https://github.com/apache/spark/pull/32010#discussion_r604675956 ## File path: sql/core/src/test/resources/sql-tests/inputs/charvarchar.sql ## @@ -61,7 +61,60 @@ desc formatted char_part; MSCK REPAIR TABLE char_part; d

[GitHub] [spark] maropu commented on a change in pull request #30965: [SPARK-33935][SQL] Fix CBO cost function

2021-03-31 Thread GitBox
maropu commented on a change in pull request #30965: URL: https://github.com/apache/spark/pull/30965#discussion_r604678370 ## File path: sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q19.sf100/simplified.txt ## @@ -6,71 +6,71 @@ TakeOrderedAndProject [e

[GitHub] [spark] viirya commented on a change in pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-31 Thread GitBox
viirya commented on a change in pull request #31451: URL: https://github.com/apache/spark/pull/31451#discussion_r604680021 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetricInfo.scala ## @@ -27,4 +27,5 @@ import org.apache.spark.annotation.De

[GitHub] [spark] cloud-fan commented on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
cloud-fan commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810864610 Can we enrich the PR description a bit more? Now we support `group by a, cube(b, c)`, but what's its semantic? Does it have a equivalent representation using `GROUPING SETS`?
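
One way to answer the `GROUPING SETS` question is to expand the clause by hand. The sketch below assumes the usual partial-cube reading, in which the plain `GROUP BY` columns are prepended to every grouping set that `CUBE` generates; the excerpt above does not confirm that the PR implements exactly this, and the helper names `cube` and `partial_cube` are illustrative only.

```python
from itertools import combinations


def cube(columns):
    """Enumerate every subset of `columns`, largest first, as CUBE does."""
    return [
        subset
        for size in range(len(columns), -1, -1)
        for subset in combinations(columns, size)
    ]


def partial_cube(plain_columns, cube_columns):
    """Prefix each CUBE grouping set with the plain GROUP BY columns."""
    return [tuple(plain_columns) + subset for subset in cube(cube_columns)]


# GROUP BY a, CUBE(b, c)
grouping_sets = partial_cube(["a"], ["b", "c"])
print(grouping_sets)
# [('a', 'b', 'c'), ('a', 'b'), ('a', 'c'), ('a',)]
```

Under that assumption, `GROUP BY a, CUBE(b, c)` would correspond to `GROUP BY GROUPING SETS ((a, b, c), (a, b), (a, c), (a))`: column `a` appears in every set, and only `b` and `c` are rolled up.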

[GitHub] [spark] cloud-fan commented on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
cloud-fan commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810866475 Please update the SQL doc accordingly as well. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] cloud-fan commented on a change in pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-31 Thread GitBox
cloud-fan commented on a change in pull request #31451: URL: https://github.com/apache/spark/pull/31451#discussion_r604683543 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetricInfo.scala ## @@ -27,4 +27,5 @@ import org.apache.spark.annotation

[GitHub] [spark] cloud-fan commented on a change in pull request #31968: [SPARK-34873][SQL] Avoid wrapped in withNewExecutionId twice when run SQL with side effects

2021-03-31 Thread GitBox
cloud-fan commented on a change in pull request #31968: URL: https://github.com/apache/spark/pull/31968#discussion_r604687761 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -223,11 +224,18 @@ class Dataset[T] private[sql]( @transient private[s

[GitHub] [spark] SparkQA commented on pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-31 Thread GitBox
SparkQA commented on pull request #32011: URL: https://github.com/apache/spark/pull/32011#issuecomment-810873918 **[Test build #136748 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136748/testReport)** for PR 32011 at commit [`642d7c0`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-31 Thread GitBox
SparkQA removed a comment on pull request #32011: URL: https://github.com/apache/spark/pull/32011#issuecomment-810761488 **[Test build #136748 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136748/testReport)** for PR 32011 at commit [`642d7c0`](https://gi

[GitHub] [spark] xuanyuanking commented on pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-03-31 Thread GitBox
xuanyuanking commented on pull request #31944: URL: https://github.com/apache/spark/pull/31944#issuecomment-810876034 ``` They want to dynamically adjust the size of the cluster based on how far they're falling behind the latest. With this PR, they can be exposed to this metrics through

[GitHub] [spark] wangshuo128 commented on a change in pull request #31968: [SPARK-34873][SQL] Avoid wrapped in withNewExecutionId twice when run SQL with side effects

2021-03-31 Thread GitBox
wangshuo128 commented on a change in pull request #31968: URL: https://github.com/apache/spark/pull/31968#discussion_r604700219 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -223,11 +224,18 @@ class Dataset[T] private[sql]( @transient private

[GitHub] [spark] sadhen commented on pull request #31735: [SPARK-34799][PYTHON][SQL] Return User-defined types from Pandas UDF

2021-03-31 Thread GitBox
sadhen commented on pull request #31735: URL: https://github.com/apache/spark/pull/31735#issuecomment-810886187 > @HyukjinKwon: > Furthermore, we will probably have to do it for toPandas and createDataFrame with Arrow optimization on. It should be best to think about these cases as well

[GitHub] [spark] sadhen edited a comment on pull request #31735: [SPARK-34799][PYTHON][SQL] Return User-defined types from Pandas UDF

2021-03-31 Thread GitBox
sadhen edited a comment on pull request #31735: URL: https://github.com/apache/spark/pull/31735#issuecomment-810886187 > @HyukjinKwon: > Furthermore, we will probably have to do it for toPandas and createDataFrame with Arrow optimization on. It should be best to think about these cases

[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
SparkQA commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810888422 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41337/ -- This is an automated message from the Apache

[GitHub] [spark] AmplabJenkins commented on pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-31 Thread GitBox
AmplabJenkins commented on pull request #32011: URL: https://github.com/apache/spark/pull/32011#issuecomment-810889593 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136748/ -- This

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-31 Thread GitBox
AmplabJenkins removed a comment on pull request #32011: URL: https://github.com/apache/spark/pull/32011#issuecomment-810889593 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136748/ -

[GitHub] [spark] viirya commented on a change in pull request #31451: [SPARK-34338][SQL] Report metrics from Datasource v2 scan

2021-03-31 Thread GitBox
viirya commented on a change in pull request #31451: URL: https://github.com/apache/spark/pull/31451#discussion_r604707280 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetricInfo.scala ## @@ -27,4 +27,5 @@ import org.apache.spark.annotation.De

[GitHub] [spark] SparkQA commented on pull request #31989: [SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-03-31 Thread GitBox
SparkQA commented on pull request #31989: URL: https://github.com/apache/spark/pull/31989#issuecomment-810891685 **[Test build #136756 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136756/testReport)** for PR 31989 at commit [`da08bd4`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #31735: [SPARK-34799][PYTHON][SQL] Return User-defined types from Pandas UDF

2021-03-31 Thread GitBox
SparkQA commented on pull request #31735: URL: https://github.com/apache/spark/pull/31735#issuecomment-810891918 **[Test build #136757 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136757/testReport)** for PR 31735 at commit [`dca35df`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #31735: [SPARK-34799][PYTHON][SQL] Return User-defined types from Pandas UDF

2021-03-31 Thread GitBox
SparkQA commented on pull request #31735: URL: https://github.com/apache/spark/pull/31735#issuecomment-810894666 **[Test build #136757 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136757/testReport)** for PR 31735 at commit [`dca35df`](https://github.co

[GitHub] [spark] AmplabJenkins commented on pull request #31735: [SPARK-34799][PYTHON][SQL] Return User-defined types from Pandas UDF

2021-03-31 Thread GitBox
AmplabJenkins commented on pull request #31735: URL: https://github.com/apache/spark/pull/31735#issuecomment-810894691 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136757/ -- This

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31735: [SPARK-34799][PYTHON][SQL] Return User-defined types from Pandas UDF

2021-03-31 Thread GitBox
AmplabJenkins removed a comment on pull request #31735: URL: https://github.com/apache/spark/pull/31735#issuecomment-810894691 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136757/ -

[GitHub] [spark] SparkQA removed a comment on pull request #31735: [SPARK-34799][PYTHON][SQL] Return User-defined types from Pandas UDF

2021-03-31 Thread GitBox
SparkQA removed a comment on pull request #31735: URL: https://github.com/apache/spark/pull/31735#issuecomment-810891918 **[Test build #136757 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136757/testReport)** for PR 31735 at commit [`dca35df`](https://gi

[GitHub] [spark] SparkQA commented on pull request #32010: [SPARK-34908][SQL] Add test cases for char and varchar with functions

2021-03-31 Thread GitBox
SparkQA commented on pull request #32010: URL: https://github.com/apache/spark/pull/32010#issuecomment-810895659 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41338/ -- This is an automated message from the Apache

[GitHub] [spark] AmplabJenkins commented on pull request #32010: [SPARK-34908][SQL] Add test cases for char and varchar with functions

2021-03-31 Thread GitBox
AmplabJenkins commented on pull request #32010: URL: https://github.com/apache/spark/pull/32010#issuecomment-810899572 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41338/ -- T

[GitHub] [spark] SparkQA commented on pull request #32010: [SPARK-34908][SQL] Add test cases for char and varchar with functions

2021-03-31 Thread GitBox
SparkQA commented on pull request #32010: URL: https://github.com/apache/spark/pull/32010#issuecomment-810899541 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41338/ -- This is an automated message from the A

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32010: [SPARK-34908][SQL] Add test cases for char and varchar with functions

2021-03-31 Thread GitBox
AmplabJenkins removed a comment on pull request #32010: URL: https://github.com/apache/spark/pull/32010#issuecomment-810899572 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41338/

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #30144: [SPARK-33229][SQL] Support GROUP BY use Separate columns and CUBE/ROLLUP

2021-03-31 Thread GitBox
AngersZhuuuu commented on a change in pull request #30144: URL: https://github.com/apache/spark/pull/30144#discussion_r604725030 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/grouping.scala ## @@ -212,3 +212,29 @@ object GroupingID {

[GitHub] [spark] SparkQA commented on pull request #30057: [SPARK-32838][SQL]Check DataSource insert command path with actual path

2021-03-31 Thread GitBox
SparkQA commented on pull request #30057: URL: https://github.com/apache/spark/pull/30057#issuecomment-810907774 **[Test build #136751 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136751/testReport)** for PR 30057 at commit [`81b1bd8`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #30057: [SPARK-32838][SQL]Check DataSource insert command path with actual path

2021-03-31 Thread GitBox
SparkQA removed a comment on pull request #30057: URL: https://github.com/apache/spark/pull/30057#issuecomment-810783077 **[Test build #136751 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136751/testReport)** for PR 30057 at commit [`81b1bd8`](https://gi

[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL] Support partial grouping analytics and mixed grouping analytics

2021-03-31 Thread GitBox
SparkQA commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810910708 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41337/ -- This is an automated message from the A

[GitHub] [spark] gengliangwang commented on a change in pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-31 Thread GitBox
gengliangwang commented on a change in pull request #32011: URL: https://github.com/apache/spark/pull/32011#discussion_r604731377 ## File path: .github/workflows/build_and_test.yml ## @@ -367,6 +367,17 @@ jobs: steps: - name: Checkout Spark repository uses: act

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-31 Thread GitBox
HyukjinKwon commented on a change in pull request #32011: URL: https://github.com/apache/spark/pull/32011#discussion_r604732228 ## File path: .github/workflows/build_and_test.yml ## @@ -367,6 +367,17 @@ jobs: steps: - name: Checkout Spark repository uses: actio

[GitHub] [spark] gengliangwang commented on a change in pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-31 Thread GitBox
gengliangwang commented on a change in pull request #32011: URL: https://github.com/apache/spark/pull/32011#discussion_r604739085 ## File path: .github/workflows/build_and_test.yml ## @@ -367,6 +367,17 @@ jobs: steps: - name: Checkout Spark repository uses: act

[GitHub] [spark] maropu commented on a change in pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-31 Thread GitBox
maropu commented on a change in pull request #32011: URL: https://github.com/apache/spark/pull/32011#discussion_r604739239 ## File path: .github/workflows/build_and_test.yml ## @@ -367,6 +367,17 @@ jobs: steps: - name: Checkout Spark repository uses: actions/ch

[GitHub] [spark] SparkQA commented on pull request #31735: [SPARK-34799][PYTHON][SQL] Return User-defined types from Pandas UDF

2021-03-31 Thread GitBox
SparkQA commented on pull request #31735: URL: https://github.com/apache/spark/pull/31735#issuecomment-810925118 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries

[GitHub] [spark] AmplabJenkins commented on pull request #30057: [SPARK-32838][SQL]Check DataSource insert command path with actual path

2021-03-31 Thread GitBox
AmplabJenkins commented on pull request #30057: URL: https://github.com/apache/spark/pull/30057#issuecomment-810928959 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136751/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #30144: [SPARK-33229][SQL] Support partial grouping analytics and mixed grouping analytics

2021-03-31 Thread GitBox
AmplabJenkins commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810928961 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41337/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #31735: [SPARK-34799][PYTHON][SQL] Return User-defined types from Pandas UDF

2021-03-31 Thread GitBox
AmplabJenkins commented on pull request #31735: URL: https://github.com/apache/spark/pull/31735#issuecomment-810928960 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41340/ -- T

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30057: [SPARK-32838][SQL]Check DataSource insert command path with actual path

2021-03-31 Thread GitBox
AmplabJenkins removed a comment on pull request #30057: URL: https://github.com/apache/spark/pull/30057#issuecomment-810928959 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136751/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30144: [SPARK-33229][SQL] Support partial grouping analytics and mixed grouping analytics

2021-03-31 Thread GitBox
AmplabJenkins removed a comment on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810928961 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41337/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31735: [SPARK-34799][PYTHON][SQL] Return User-defined types from Pandas UDF

2021-03-31 Thread GitBox
AmplabJenkins removed a comment on pull request #31735: URL: https://github.com/apache/spark/pull/31735#issuecomment-810928960 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41340/

[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL] Support partial grouping analytics and mixed grouping analytics

2021-03-31 Thread GitBox
SparkQA commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810930672 **[Test build #136758 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136758/testReport)** for PR 30144 at commit [`84de8b6`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #31989: [SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-03-31 Thread GitBox
SparkQA commented on pull request #31989: URL: https://github.com/apache/spark/pull/31989#issuecomment-810930949 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41339/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #31989: [SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-03-31 Thread GitBox
SparkQA commented on pull request #31989: URL: https://github.com/apache/spark/pull/31989#issuecomment-810936283 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41339/ -- This is an automated message from the A

[GitHub] [spark] AmplabJenkins commented on pull request #31989: [SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-03-31 Thread GitBox
AmplabJenkins commented on pull request #31989: URL: https://github.com/apache/spark/pull/31989#issuecomment-810936321 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41339/ -- T

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31989: [SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-03-31 Thread GitBox
AmplabJenkins removed a comment on pull request #31989: URL: https://github.com/apache/spark/pull/31989#issuecomment-810936321 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41339/

[GitHub] [spark] AngersZhuuuu commented on pull request #30144: [SPARK-33229][SQL] Support partial grouping analytics and mixed grouping analytics

2021-03-31 Thread GitBox
AngersZhuuuu commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810937600 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For qu

[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL] Support partial grouping analytics and mixed grouping analytics

2021-03-31 Thread GitBox
SparkQA commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-810938681 **[Test build #136759 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136759/testReport)** for PR 30144 at commit [`9f03c88`](https://github.com

[GitHub] [spark] ulysses-you opened a new pull request #32012: [SPARK-34919][SQL] Change partitioning to SinglePartition if partition number is 1

2021-03-31 Thread GitBox
ulysses-you opened a new pull request #32012: URL: https://github.com/apache/spark/pull/32012 ### What changes were proposed in this pull request? Change partitioning to `SinglePartition`. ### Why are the changes needed? For node `Repartition` and `RepartitionByE

[GitHub] [spark] ulysses-you commented on pull request #32012: [SPARK-34919][SQL] Change partitioning to SinglePartition if partition number is 1

2021-03-31 Thread GitBox
ulysses-you commented on pull request #32012: URL: https://github.com/apache/spark/pull/32012#issuecomment-810943595 cc @maropu @cloud-fan @yaooqinn -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] gengliangwang commented on pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-31 Thread GitBox
gengliangwang commented on pull request #32011: URL: https://github.com/apache/spark/pull/32011#issuecomment-810949277 Thanks, merging to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] gengliangwang closed pull request #32011: [SPARK-34915][INFRA] Cache Maven, SBT and Scala in all jobs that use them

2021-03-31 Thread GitBox
gengliangwang closed pull request #32011: URL: https://github.com/apache/spark/pull/32011 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service,

[GitHub] [spark] gaborgsomogyi commented on a change in pull request #32009: [SPARK-34914][CORE] Local scheduler backend support update token

2021-03-31 Thread GitBox
gaborgsomogyi commented on a change in pull request #32009: URL: https://github.com/apache/spark/pull/32009#discussion_r604759649 ## File path: core/src/main/scala/org/apache/spark/scheduler/local/LocalSchedulerBackend.scala ## @@ -178,4 +188,23 @@ private[spark] class LocalSc

[GitHub] [spark] gaborgsomogyi commented on a change in pull request #32009: [SPARK-34914][CORE] Local scheduler backend support update token

2021-03-31 Thread GitBox
gaborgsomogyi commented on a change in pull request #32009: URL: https://github.com/apache/spark/pull/32009#discussion_r604772518 ## File path: core/src/main/scala/org/apache/spark/scheduler/local/LocalSchedulerBackend.scala ## @@ -178,4 +188,23 @@ private[spark] class LocalSc

[GitHub] [spark] wangyum opened a new pull request #32013: [WIP][SPARK-34920][SQL] Add SQLSTATE and ERRORCODE to SQL exception

2021-03-31 Thread GitBox
wangyum opened a new pull request #32013: URL: https://github.com/apache/spark/pull/32013 ### What changes were proposed in this pull request? This PR adds SQLSTATE and ERRORCODE to SQL exceptions. ### Why are the changes needed? 1. Follow the SQL standard. 2. Some JDBC/ODB

[GitHub] [spark] SparkQA commented on pull request #32012: [SPARK-34919][SQL] Change partitioning to SinglePartition if partition number is 1

2021-03-31 Thread GitBox
SparkQA commented on pull request #32012: URL: https://github.com/apache/spark/pull/32012#issuecomment-810966384 **[Test build #136761 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136761/testReport)** for PR 32012 at commit [`8ff0d0c`](https://github.com
