[GitHub] [spark] AmplabJenkins removed a comment on pull request #32028: [SPARK-24931][INFRA] Fix the GA failure related to R linter

2021-04-01 Thread GitBox
AmplabJenkins removed a comment on pull request #32028: URL: https://github.com/apache/spark/pull/32028#issuecomment-812068248 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136818/

[GitHub] [spark] AmplabJenkins commented on pull request #31908: [SPARK-34808][SQL] Removes outer join if it only has DISTINCT on streamed side

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #31908: URL: https://github.com/apache/spark/pull/31908#issuecomment-812068243 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136814/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #32028: [SPARK-24931][INFRA] Fix the GA failure related to R linter

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #32028: URL: https://github.com/apache/spark/pull/32028#issuecomment-812068248 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136818/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #32018: URL: https://github.com/apache/spark/pull/32018#issuecomment-812068245 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136821/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #32026: URL: https://github.com/apache/spark/pull/32026#issuecomment-812068244 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136820/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-812068246 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136815/ -- This

[GitHub] [spark] LeonardoZV edited a comment on pull request #31771: [SPARK-34652][AVRO] Support SchemaRegistry in from_avro method

2021-04-01 Thread GitBox
LeonardoZV edited a comment on pull request #31771: URL: https://github.com/apache/spark/pull/31771#issuecomment-812063145 My humble opinion: I think Spark should fully support Confluent SR. The use of event-driven architecture is skyrocketing and big companies like to control event

[GitHub] [spark] LeonardoZV edited a comment on pull request #31771: [SPARK-34652][AVRO] Support SchemaRegistry in from_avro method

2021-04-01 Thread GitBox
LeonardoZV edited a comment on pull request #31771: URL: https://github.com/apache/spark/pull/31771#issuecomment-812063145 My humble opinion: I think Spark should somehow support Confluent SR. The use of event-driven architecture is skyrocketing and big companies like to control

[GitHub] [spark] LeonardoZV edited a comment on pull request #31771: [SPARK-34652][AVRO] Support SchemaRegistry in from_avro method

2021-04-01 Thread GitBox
LeonardoZV edited a comment on pull request #31771: URL: https://github.com/apache/spark/pull/31771#issuecomment-812063145 My humble opinion: I think Spark should somehow support Confluent SR. The use of event-driven architecture is skyrocketing and big companies like to control

[GitHub] [spark] LeonardoZV commented on pull request #31771: [SPARK-34652][AVRO] Support SchemaRegistry in from_avro method

2021-04-01 Thread GitBox
LeonardoZV commented on pull request #31771: URL: https://github.com/apache/spark/pull/31771#issuecomment-812063145 My humble opinion: I think Spark should somehow support Confluent SR. The use of event-driven architecture is skyrocketing and big companies like to control event

[GitHub] [spark] baohe-zhang commented on pull request #31871: [SPARK-34779][CORE] ExecutorMetricsPoller should keep stage entry in stageTCMP until a heartbeat occurs

2021-04-01 Thread GitBox
baohe-zhang commented on pull request #31871: URL: https://github.com/apache/spark/pull/31871#issuecomment-812058160 @Ngone51 Could you retest this pr? I think the test failures are not related to the changes of this pr. Thanks! -- This is an automated message from the Apache Git

[GitHub] [spark] SparkQA removed a comment on pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox
SparkQA removed a comment on pull request #32026: URL: https://github.com/apache/spark/pull/32026#issuecomment-811959297 **[Test build #136820 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136820/testReport)** for PR 32026 at commit

[GitHub] [spark] SparkQA commented on pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox
SparkQA commented on pull request #32026: URL: https://github.com/apache/spark/pull/32026#issuecomment-812052185 **[Test build #136820 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136820/testReport)** for PR 32026 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox
SparkQA removed a comment on pull request #32026: URL: https://github.com/apache/spark/pull/32026#issuecomment-811886707 **[Test build #136813 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136813/testReport)** for PR 32026 at commit

[GitHub] [spark] SparkQA commented on pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox
SparkQA commented on pull request #32026: URL: https://github.com/apache/spark/pull/32026#issuecomment-812051251 **[Test build #136813 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136813/testReport)** for PR 32026 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #31908: [SPARK-34808][SQL] Removes outer join if it only has DISTINCT on streamed side

2021-04-01 Thread GitBox
SparkQA removed a comment on pull request #31908: URL: https://github.com/apache/spark/pull/31908#issuecomment-811886851 **[Test build #136814 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136814/testReport)** for PR 31908 at commit

[GitHub] [spark] SparkQA commented on pull request #31908: [SPARK-34808][SQL] Removes outer join if it only has DISTINCT on streamed side

2021-04-01 Thread GitBox
SparkQA commented on pull request #31908: URL: https://github.com/apache/spark/pull/31908#issuecomment-812048132 **[Test build #136814 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136814/testReport)** for PR 31908 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #32028: [SPARK-24931][INFRA] Fix the GA failure related to R linter

2021-04-01 Thread GitBox
SparkQA removed a comment on pull request #32028: URL: https://github.com/apache/spark/pull/32028#issuecomment-811959198 **[Test build #136818 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136818/testReport)** for PR 32028 at commit

[GitHub] [spark] SparkQA commented on pull request #32028: [SPARK-24931][INFRA] Fix the GA failure related to R linter

2021-04-01 Thread GitBox
SparkQA commented on pull request #32028: URL: https://github.com/apache/spark/pull/32028#issuecomment-812047661 **[Test build #136818 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136818/testReport)** for PR 32028 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox
SparkQA removed a comment on pull request #32018: URL: https://github.com/apache/spark/pull/32018#issuecomment-811959345 **[Test build #136821 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136821/testReport)** for PR 32018 at commit

[GitHub] [spark] SparkQA commented on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox
SparkQA commented on pull request #32018: URL: https://github.com/apache/spark/pull/32018#issuecomment-812046522 **[Test build #136821 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136821/testReport)** for PR 32018 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox
SparkQA removed a comment on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-811889230 **[Test build #136815 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136815/testReport)** for PR 30145 at commit

[GitHub] [spark] SparkQA commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox
SparkQA commented on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-812045245 **[Test build #136815 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136815/testReport)** for PR 30145 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30144: [SPARK-33229][SQL] Support partial grouping analytics and concatenated grouping analytics

2021-04-01 Thread GitBox
AmplabJenkins removed a comment on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-812034816 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136812/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31935: [SPARK-34789][TEST] Introduce Jetty based construct for integration tests where HTTP server is used

2021-04-01 Thread GitBox
AmplabJenkins removed a comment on pull request #31935: URL: https://github.com/apache/spark/pull/31935#issuecomment-812034818 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41403/

[GitHub] [spark] AmplabJenkins commented on pull request #30144: [SPARK-33229][SQL] Support partial grouping analytics and concatenated grouping analytics

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-812034816 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136812/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #31935: [SPARK-34789][TEST] Introduce Jetty based construct for integration tests where HTTP server is used

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #31935: URL: https://github.com/apache/spark/pull/31935#issuecomment-812034818 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41403/ --

[GitHub] [spark] SparkQA commented on pull request #31935: [SPARK-34789][TEST] Introduce Jetty based construct for integration tests where HTTP server is used

2021-04-01 Thread GitBox
SparkQA commented on pull request #31935: URL: https://github.com/apache/spark/pull/31935#issuecomment-812031719 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41403/ -- This is an automated message from the

[GitHub] [spark] yijiacui-db commented on a change in pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-01 Thread GitBox
yijiacui-db commented on a change in pull request #31944: URL: https://github.com/apache/spark/pull/31944#discussion_r605797547 ## File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchStream.scala ## @@ -133,6 +137,10 @@

[GitHub] [spark] SparkQA commented on pull request #31935: [SPARK-34789][TEST] Introduce Jetty based construct for integration tests where HTTP server is used

2021-04-01 Thread GitBox
SparkQA commented on pull request #31935: URL: https://github.com/apache/spark/pull/31935#issuecomment-812028630 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41403/ -- This is an automated message from the Apache

[GitHub] [spark] eddyxu commented on a change in pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox
eddyxu commented on a change in pull request #32026: URL: https://github.com/apache/spark/pull/32026#discussion_r605778192 ## File path: python/pyspark/sql/types.py ## @@ -764,6 +764,21 @@ def __eq__(self, other): return type(self) == type(other) +def

[GitHub] [spark] SparkQA removed a comment on pull request #30144: [SPARK-33229][SQL] Support partial grouping analytics and concatenated grouping analytics

2021-04-01 Thread GitBox
SparkQA removed a comment on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-811852723 **[Test build #136812 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136812/testReport)** for PR 30144 at commit

[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL] Support partial grouping analytics and concatenated grouping analytics

2021-04-01 Thread GitBox
SparkQA commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-812011158 **[Test build #136812 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136812/testReport)** for PR 30144 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32027: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-04-01 Thread GitBox
AmplabJenkins removed a comment on pull request #32027: URL: https://github.com/apache/spark/pull/32027#issuecomment-812003404 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41400/

[GitHub] [spark] SparkQA commented on pull request #32027: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-04-01 Thread GitBox
SparkQA commented on pull request #32027: URL: https://github.com/apache/spark/pull/32027#issuecomment-812003372 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41400/ -- This is an automated message from the

[GitHub] [spark] AmplabJenkins commented on pull request #32027: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #32027: URL: https://github.com/apache/spark/pull/32027#issuecomment-812003404 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41400/ --

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox
AmplabJenkins removed a comment on pull request #32018: URL: https://github.com/apache/spark/pull/32018#issuecomment-811998761 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41401/

[GitHub] [spark] SparkQA commented on pull request #31935: [SPARK-34789][TEST] Introduce Jetty based construct for integration tests where HTTP server is used

2021-04-01 Thread GitBox
SparkQA commented on pull request #31935: URL: https://github.com/apache/spark/pull/31935#issuecomment-811998813 **[Test build #136823 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136823/testReport)** for PR 31935 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31638: [SPARK-34526][SS] Skip checking glob path in FileStreamSink.hasMetadata

2021-04-01 Thread GitBox
AmplabJenkins removed a comment on pull request #31638: URL: https://github.com/apache/spark/pull/31638#issuecomment-811998230 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41402/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30144: [SPARK-33229][SQL] Support partial grouping analytics and concatenated grouping analytics

2021-04-01 Thread GitBox
AmplabJenkins removed a comment on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-811998229 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136811/

[GitHub] [spark] SparkQA commented on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox
SparkQA commented on pull request #32018: URL: https://github.com/apache/spark/pull/32018#issuecomment-811998715 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41401/ -- This is an automated message from the

[GitHub] [spark] AmplabJenkins commented on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #32018: URL: https://github.com/apache/spark/pull/32018#issuecomment-811998761 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41401/ --

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32028: [HOTFIX][INFRA] Fix the GA failure related to R linter

2021-04-01 Thread GitBox
AmplabJenkins removed a comment on pull request #32028: URL: https://github.com/apache/spark/pull/32028#issuecomment-811998232 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41399/

[GitHub] [spark] AmplabJenkins commented on pull request #30144: [SPARK-33229][SQL] Support partial grouping analytics and concatenated grouping analytics

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-811998229 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136811/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #32028: [HOTFIX][INFRA] Fix the GA failure related to R linter

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #32028: URL: https://github.com/apache/spark/pull/32028#issuecomment-811998232 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41399/ --

[GitHub] [spark] SparkQA commented on pull request #32027: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-04-01 Thread GitBox
SparkQA commented on pull request #32027: URL: https://github.com/apache/spark/pull/32027#issuecomment-811998224 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41400/ -- This is an automated message from the Apache

[GitHub] [spark] AmplabJenkins commented on pull request #31638: [SPARK-34526][SS] Skip checking glob path in FileStreamSink.hasMetadata

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #31638: URL: https://github.com/apache/spark/pull/31638#issuecomment-811998230 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41402/ --

[GitHub] [spark] SparkQA commented on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox
SparkQA commented on pull request #32018: URL: https://github.com/apache/spark/pull/32018#issuecomment-811995765 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41401/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #32028: [HOTFIX][INFRA] Fix the GA failure related to R linter

2021-04-01 Thread GitBox
SparkQA commented on pull request #32028: URL: https://github.com/apache/spark/pull/32028#issuecomment-811994772 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41399/ -- This is an automated message from the

[GitHub] [spark] gatorsmile commented on pull request #32013: [WIP][SPARK-34920][SQL] Add SQLSTATE and ERRORCODE to SQL exception

2021-04-01 Thread GitBox
gatorsmile commented on pull request #32013: URL: https://github.com/apache/spark/pull/32013#issuecomment-811991519 Thank you for your PR! I like the idea of adding ERROR CODE and SQLSTATE. In the next 1-2 weeks, we plan to start the public discussion about ERROR CODE and SQLSTATE

[GitHub] [spark] SparkQA commented on pull request #31638: [SPARK-34526][SS] Skip checking glob path in FileStreamSink.hasMetadata

2021-04-01 Thread GitBox
SparkQA commented on pull request #31638: URL: https://github.com/apache/spark/pull/31638#issuecomment-811989770 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41402/ --

[GitHub] [spark] SparkQA commented on pull request #32028: [HOTFIX][INFRA] Fix the GA failure related to R linter

2021-04-01 Thread GitBox
SparkQA commented on pull request #32028: URL: https://github.com/apache/spark/pull/32028#issuecomment-811989658 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41399/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA removed a comment on pull request #30144: [SPARK-33229][SQL] Support partial grouping analytics and concatenated grouping analytics

2021-04-01 Thread GitBox
SparkQA removed a comment on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-811824813 **[Test build #136811 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136811/testReport)** for PR 30144 at commit

[GitHub] [spark] SparkQA commented on pull request #30144: [SPARK-33229][SQL] Support partial grouping analytics and concatenated grouping analytics

2021-04-01 Thread GitBox
SparkQA commented on pull request #30144: URL: https://github.com/apache/spark/pull/30144#issuecomment-811979424 **[Test build #136811 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136811/testReport)** for PR 30144 at commit

[GitHub] [spark] attilapiros commented on pull request #31935: [SPARK-34789][TEST] Introduce Jetty based construct for integration tests where HTTP server is used

2021-04-01 Thread GitBox
attilapiros commented on pull request #31935: URL: https://github.com/apache/spark/pull/31935#issuecomment-811975341 jenkins retest this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] AmplabJenkins commented on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #32018: URL: https://github.com/apache/spark/pull/32018#issuecomment-811969015 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41398/ --

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox
AmplabJenkins removed a comment on pull request #32018: URL: https://github.com/apache/spark/pull/32018#issuecomment-811969015 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41398/

[GitHub] [spark] SparkQA commented on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox
SparkQA commented on pull request #32018: URL: https://github.com/apache/spark/pull/32018#issuecomment-811968962 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41398/ -- This is an automated message from the

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox
AmplabJenkins removed a comment on pull request #32026: URL: https://github.com/apache/spark/pull/32026#issuecomment-811887267 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] SparkQA commented on pull request #31638: [SPARK-34526][SS] Skip checking glob path in FileStreamSink.hasMetadata

2021-04-01 Thread GitBox
SparkQA commented on pull request #31638: URL: https://github.com/apache/spark/pull/31638#issuecomment-811959583 **[Test build #136822 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136822/testReport)** for PR 31638 at commit

[GitHub] [spark] SparkQA commented on pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox
SparkQA commented on pull request #32026: URL: https://github.com/apache/spark/pull/32026#issuecomment-811959297 **[Test build #136820 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136820/testReport)** for PR 32026 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32025: [SPARK-34935][SQL] CREATE TABLE LIKE should respect the reserved table properties

2021-04-01 Thread GitBox
AmplabJenkins removed a comment on pull request #32025: URL: https://github.com/apache/spark/pull/32025#issuecomment-811958685 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136809/

[GitHub] [spark] SparkQA commented on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox
SparkQA commented on pull request #32018: URL: https://github.com/apache/spark/pull/32018#issuecomment-811959345 **[Test build #136821 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136821/testReport)** for PR 32018 at commit

[GitHub] [spark] SparkQA commented on pull request #32027: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-04-01 Thread GitBox
SparkQA commented on pull request #32027: URL: https://github.com/apache/spark/pull/32027#issuecomment-811959223 **[Test build #136819 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136819/testReport)** for PR 32027 at commit

[GitHub] [spark] SparkQA commented on pull request #32028: [HOTFIX][INFRA] Fix the GA failure related to R linter

2021-04-01 Thread GitBox
SparkQA commented on pull request #32028: URL: https://github.com/apache/spark/pull/32028#issuecomment-811959198 **[Test build #136818 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136818/testReport)** for PR 32028 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #32025: [SPARK-34935][SQL] CREATE TABLE LIKE should respect the reserved table properties

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #32025: URL: https://github.com/apache/spark/pull/32025#issuecomment-811958685 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136809/ -- This

[GitHub] [spark] sarutak commented on pull request #32028: [HOTFIX][INFRA] Fix the GA failure related to R linter

2021-04-01 Thread GitBox
sarutak commented on pull request #32028: URL: https://github.com/apache/spark/pull/32028#issuecomment-811952833 cc @dongjoon-hyun @HyukjinKwon -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] Ngone51 commented on pull request #31470: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-04-01 Thread GitBox
Ngone51 commented on pull request #31470: URL: https://github.com/apache/spark/pull/31470#issuecomment-811951002 Resubmitted https://github.com/apache/spark/pull/32027 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] Ngone51 commented on pull request #32027: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-04-01 Thread GitBox
Ngone51 commented on pull request #32027: URL: https://github.com/apache/spark/pull/32027#issuecomment-811950605 This commit 2541fd6 fixes the test failure. Previously, the test failed due to: `FlatMapCoGroupsInPandas` has a self flatMap. So, when `DeduplicateRelations` applies on

[GitHub] [spark] AngersZhuuuu commented on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox
AngersZhuuuu commented on pull request #32018: URL: https://github.com/apache/spark/pull/32018#issuecomment-811949440 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] sarutak opened a new pull request #32028: [HOTFIX][INFRA] Fix the GA failure related to R

2021-04-01 Thread GitBox
sarutak opened a new pull request #32028: URL: https://github.com/apache/spark/pull/32028 ### What changes were proposed in this pull request? This PR fixes the GA failure related to R which happens on some PRs (#32023, #32025). The reason seems `Rscript -e

[GitHub] [spark] SparkQA commented on pull request #32018: [SPARK-34926][SQL] PartitioningUtils.getPathFragment() should respect partition value is null

2021-04-01 Thread GitBox
SparkQA commented on pull request #32018: URL: https://github.com/apache/spark/pull/32018#issuecomment-811949039 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41398/ -- This is an automated message from the Apache

[GitHub] [spark] sadhen commented on a change in pull request #31735: [SPARK-34799][PYTHON][SQL] Return User-defined types from Pandas UDF

2021-04-01 Thread GitBox
sadhen commented on a change in pull request #31735: URL: https://github.com/apache/spark/pull/31735#discussion_r605705570 ## File path: python/pyspark/sql/pandas/types.py ## @@ -74,6 +74,8 @@ def to_arrow_type(dt): arrow_type = pa.struct(fields) elif type(dt) ==

[GitHub] [spark] sadhen commented on a change in pull request #31735: [SPARK-34799][PYTHON][SQL] Return User-defined types from Pandas UDF

2021-04-01 Thread GitBox
sadhen commented on a change in pull request #31735: URL: https://github.com/apache/spark/pull/31735#discussion_r605705270 ## File path: python/pyspark/sql/pandas/serializers.py ## @@ -153,14 +157,15 @@ def _create_batch(self, series): from pyspark.sql.pandas.types

[GitHub] [spark] sadhen commented on a change in pull request #31735: [SPARK-34799][PYTHON][SQL] Return User-defined types from Pandas UDF

2021-04-01 Thread GitBox
sadhen commented on a change in pull request #31735: URL: https://github.com/apache/spark/pull/31735#discussion_r605705005 ## File path: python/pyspark/sql/pandas/serializers.py ## @@ -153,14 +157,15 @@ def _create_batch(self, series): from pyspark.sql.pandas.types

[GitHub] [spark] xuanyuanking commented on pull request #31638: [SPARK-34526][SS] Skip checking glob path in FileStreamSink.hasMetadata

2021-04-01 Thread GitBox
xuanyuanking commented on pull request #31638: URL: https://github.com/apache/spark/pull/31638#issuecomment-811948288 Thanks for the help and discussion. I revived the first commit with more logs following the recent comment. Please check whether it makes sense for you now. -- This is

[GitHub] [spark] Ngone51 commented on pull request #32027: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-04-01 Thread GitBox
Ngone51 commented on pull request #32027: URL: https://github.com/apache/spark/pull/32027#issuecomment-811945333 cc @cloud-fan @maropu @HyukjinKwon -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [spark] Ngone51 opened a new pull request #32027: [SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join

2021-04-01 Thread GitBox
Ngone51 opened a new pull request #32027: URL: https://github.com/apache/spark/pull/32027 ### What changes were proposed in this pull request? This PR introduces a new analysis rule `DeduplicateRelations`, which deduplicates any duplicate relations in a plan first and

[GitHub] [spark] SparkQA removed a comment on pull request #32025: [SPARK-34935][SQL] CREATE TABLE LIKE should respect the reserved table properties

2021-04-01 Thread GitBox
SparkQA removed a comment on pull request #32025: URL: https://github.com/apache/spark/pull/32025#issuecomment-811792687 **[Test build #136809 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136809/testReport)** for PR 32025 at commit

[GitHub] [spark] SparkQA commented on pull request #32025: [SPARK-34935][SQL] CREATE TABLE LIKE should respect the reserved table properties

2021-04-01 Thread GitBox
SparkQA commented on pull request #32025: URL: https://github.com/apache/spark/pull/32025#issuecomment-811931047 **[Test build #136809 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136809/testReport)** for PR 32025 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox
AmplabJenkins removed a comment on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-811929001 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41397/

[GitHub] [spark] AmplabJenkins commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-811929001 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41397/ --

[GitHub] [spark] SparkQA commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox
SparkQA commented on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-811928956 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41397/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox
SparkQA commented on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-811924128 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41397/ -- This is an automated message from the Apache

[GitHub] [spark] wangyum commented on a change in pull request #31920: [SPARK-33604][SQL] Group exception messages in sql/execution

2021-04-01 Thread GitBox
wangyum commented on a change in pull request #31920: URL: https://github.com/apache/spark/pull/31920#discussion_r605674802 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala ## @@ -1087,4 +1087,42 @@ private[spark] object

[GitHub] [spark] SparkQA commented on pull request #32018: [SPARK-34926][SQL] ExternalCatalogUtils.escapePathName should support null

2021-04-01 Thread GitBox
SparkQA commented on pull request #32018: URL: https://github.com/apache/spark/pull/32018#issuecomment-81191 **[Test build #136817 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136817/testReport)** for PR 32018 at commit

[GitHub] [spark] srowen commented on pull request #31942: [SPARK-34834][NETWORK] Fix a potential Netty memory leak in TransportResponseHandler.

2021-04-01 Thread GitBox
srowen commented on pull request #31942: URL: https://github.com/apache/spark/pull/31942#issuecomment-811920826 If we want to just fix N cases, not all of them, that clearly need to close, as a narrower fix, that's fine. It looks like there is at least 1 other we can fix with a

[GitHub] [spark] SparkQA commented on pull request #32018: [SPARK-34926][SQL] ExternalCatalogUtils.escapePathName should support null

2021-04-01 Thread GitBox
SparkQA commented on pull request #32018: URL: https://github.com/apache/spark/pull/32018#issuecomment-811920107 **[Test build #136816 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136816/testReport)** for PR 32018 at commit

[GitHub] [spark] xuanyuanking commented on a change in pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-01 Thread GitBox
xuanyuanking commented on a change in pull request #31944: URL: https://github.com/apache/spark/pull/31944#discussion_r605670964 ## File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchStream.scala ## @@ -133,6 +137,10 @@

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31908: [SPARK-34808][SQL] Removes outer join if it only has DISTINCT on streamed side

2021-04-01 Thread GitBox
AmplabJenkins removed a comment on pull request #31908: URL: https://github.com/apache/spark/pull/31908#issuecomment-811919611 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41396/

[GitHub] [spark] AmplabJenkins commented on pull request #31908: [SPARK-34808][SQL] Removes outer join if it only has DISTINCT on streamed side

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #31908: URL: https://github.com/apache/spark/pull/31908#issuecomment-811919611 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/41396/ --

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox
AmplabJenkins removed a comment on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-811919610 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136808/

[GitHub] [spark] AmplabJenkins commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox
AmplabJenkins commented on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-811919610 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136808/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox
SparkQA removed a comment on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-811759537 **[Test build #136808 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136808/testReport)** for PR 30145 at commit

[GitHub] [spark] xuanyuanking commented on a change in pull request #31944: [SPARK-34854][SQL][SS] Expose source metrics via progress report and add Kafka use-case to report delay.

2021-04-01 Thread GitBox
xuanyuanking commented on a change in pull request #31944: URL: https://github.com/apache/spark/pull/31944#discussion_r605661850 ## File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchStream.scala ## @@ -218,3 +226,36 @@

[GitHub] [spark] sadhen commented on a change in pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox
sadhen commented on a change in pull request #32026: URL: https://github.com/apache/spark/pull/32026#discussion_r60566 ## File path: python/pyspark/sql/pandas/types.py ## @@ -346,3 +348,29 @@ def _convert_dict_to_map_items(s): :return: pandas.Series of lists of (key,

[GitHub] [spark] MaxGekk commented on a change in pull request #32018: [SPARK-34926][SQL] ExternalCatalogUtils.escapePathName should support null

2021-04-01 Thread GitBox
MaxGekk commented on a change in pull request #32018: URL: https://github.com/apache/spark/pull/32018#discussion_r605662843 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala ## @@ -350,7 +350,12 @@ object

[GitHub] [spark] cloud-fan commented on a change in pull request #32022: [SPARK-34932][SQL] Ignore the groupBy expressions in GROUP BY ... GROUPING SETS

2021-04-01 Thread GitBox
cloud-fan commented on a change in pull request #32022: URL: https://github.com/apache/spark/pull/32022#discussion_r605662903 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -914,19 +914,19 @@ class AstBuilder extends

[GitHub] [spark] sadhen commented on a change in pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-01 Thread GitBox
sadhen commented on a change in pull request #32026: URL: https://github.com/apache/spark/pull/32026#discussion_r60566 ## File path: python/pyspark/sql/pandas/types.py ## @@ -346,3 +348,29 @@ def _convert_dict_to_map_items(s): :return: pandas.Series of lists of (key,

[GitHub] [spark] SparkQA commented on pull request #30145: [SPARK-33233][SQL]CUBE/ROLLUP/GROUPING SETS support GROUP BY ordinal

2021-04-01 Thread GitBox
SparkQA commented on pull request #30145: URL: https://github.com/apache/spark/pull/30145#issuecomment-811912358 **[Test build #136808 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136808/testReport)** for PR 30145 at commit
