[GitHub] [spark] SparkQA commented on pull request #32816: [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle

2021-08-15 Thread GitBox
SparkQA commented on pull request #32816: URL: https://github.com/apache/spark/pull/32816#issuecomment-899239127 **[Test build #142493 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142493/testReport)** for PR 32816 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
AmplabJenkins commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899233222 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/142488/ -- This

[GitHub] [spark] SparkQA commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
SparkQA commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899232440 **[Test build #142488 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142488/testReport)** for PR 33683 at commit

[GitHub] [spark] SparkQA commented on pull request #32467: [WIP] simplify correlated subquery resolution

2021-08-15 Thread GitBox
SparkQA commented on pull request #32467: URL: https://github.com/apache/spark/pull/32467#issuecomment-899226609 **[Test build #142492 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142492/testReport)** for PR 32467 at commit

[GitHub] [spark] SparkQA commented on pull request #32468: [SPARK-35335][SQL] Coalesce shuffle partition as much as possible for REPARTITION_BY_NONE

2021-08-15 Thread GitBox
SparkQA commented on pull request #32468: URL: https://github.com/apache/spark/pull/32468#issuecomment-899226575 **[Test build #142491 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142491/testReport)** for PR 32468 at commit

[GitHub] [spark] SparkQA commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-08-15 Thread GitBox
SparkQA commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-899226545 **[Test build #142490 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142490/testReport)** for PR 32473 at commit

[GitHub] [spark] SparkQA commented on pull request #32475: [SPARK-34775][SQL] Push down limit through window when partitionSpec is not empty

2021-08-15 Thread GitBox
SparkQA commented on pull request #32475: URL: https://github.com/apache/spark/pull/32475#issuecomment-899226513 **[Test build #142489 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142489/testReport)** for PR 32475 at commit

[GitHub] [spark] viirya commented on a change in pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
viirya commented on a change in pull request #33683: URL: https://github.com/apache/spark/pull/33683#discussion_r689247394 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1792,7 +1792,85 @@ hence the number is not same as the number of original input rows.

[GitHub] [spark] viirya commented on a change in pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
viirya commented on a change in pull request #33683: URL: https://github.com/apache/spark/pull/33683#discussion_r689246669 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1792,7 +1792,85 @@ hence the number is not same as the number of original input rows.

[GitHub] [spark] SparkQA commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
SparkQA commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899224072 **[Test build #142488 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142488/testReport)** for PR 33683 at commit

[GitHub] [spark] SparkQA commented on pull request #33736: [SPARK-35991][SQL] Add PlanStability suite for TPCH

2021-08-15 Thread GitBox
SparkQA commented on pull request #33736: URL: https://github.com/apache/spark/pull/33736#issuecomment-899224037 **[Test build #142487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142487/testReport)** for PR 33736 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33639: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-08-15 Thread GitBox
AmplabJenkins removed a comment on pull request #33639: URL: https://github.com/apache/spark/pull/33639#issuecomment-899223783 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/142474/

[GitHub] [spark] AmplabJenkins commented on pull request #33639: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-08-15 Thread GitBox
AmplabJenkins commented on pull request #33639: URL: https://github.com/apache/spark/pull/33639#issuecomment-899223783 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/142474/

[GitHub] [spark] viirya commented on a change in pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
viirya commented on a change in pull request #33683: URL: https://github.com/apache/spark/pull/33683#discussion_r689245759 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1792,7 +1792,85 @@ hence the number is not same as the number of original input rows.

[GitHub] [spark] SparkQA removed a comment on pull request #33639: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-08-15 Thread GitBox
SparkQA removed a comment on pull request #33639: URL: https://github.com/apache/spark/pull/33639#issuecomment-899135753 **[Test build #142474 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142474/testReport)** for PR 33639 at commit

[GitHub] [spark] SparkQA commented on pull request #33639: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-08-15 Thread GitBox
SparkQA commented on pull request #33639: URL: https://github.com/apache/spark/pull/33639#issuecomment-899222795 **[Test build #142474 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142474/testReport)** for PR 33639 at commit

[GitHub] [spark] viirya commented on a change in pull request #33728: [SPARK-36498][SQL] Reorder inner fields of the input query in byName V2 write

2021-08-15 Thread GitBox
viirya commented on a change in pull request #33728: URL: https://github.com/apache/spark/pull/33728#discussion_r689244868 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala ## @@ -74,6 +66,160 @@ object

[GitHub] [spark] SparkQA removed a comment on pull request #33659: [SPARK-36433][WEBUI] Fix log message in WebUI

2021-08-15 Thread GitBox
SparkQA removed a comment on pull request #33659: URL: https://github.com/apache/spark/pull/33659#issuecomment-899206140 **[Test build #142481 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142481/testReport)** for PR 33659 at commit

[GitHub] [spark] xuanyuanking commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
xuanyuanking commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899219122 ``` The section "State Store and task locality" introduces State Store as well. ``` Thanks @gengliangwang for the detailed check! Yes, I moved the `State Store and

[GitHub] [spark] xuanyuanking commented on a change in pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
xuanyuanking commented on a change in pull request #33683: URL: https://github.com/apache/spark/pull/33683#discussion_r689241848 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1814,6 +1814,82 @@ Specifically for built-in HDFS state store provider, users

[GitHub] [spark] xuanyuanking commented on a change in pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
xuanyuanking commented on a change in pull request #33683: URL: https://github.com/apache/spark/pull/33683#discussion_r689241802 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1814,6 +1814,82 @@ Specifically for built-in HDFS state store provider, users

[GitHub] [spark] xuanyuanking commented on a change in pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
xuanyuanking commented on a change in pull request #33683: URL: https://github.com/apache/spark/pull/33683#discussion_r689241748 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1814,6 +1814,82 @@ Specifically for built-in HDFS state store provider, users

[GitHub] [spark] xuanyuanking commented on a change in pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
xuanyuanking commented on a change in pull request #33683: URL: https://github.com/apache/spark/pull/33683#discussion_r689241527 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1814,6 +1814,82 @@ Specifically for built-in HDFS state store provider, users

[GitHub] [spark] xuanyuanking commented on a change in pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
xuanyuanking commented on a change in pull request #33683: URL: https://github.com/apache/spark/pull/33683#discussion_r689241474 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1814,6 +1814,82 @@ Specifically for built-in HDFS state store provider, users

[GitHub] [spark] cloud-fan commented on a change in pull request #33728: [SPARK-36498][SQL] Reorder inner fields of the input query in byName V2 write

2021-08-15 Thread GitBox
cloud-fan commented on a change in pull request #33728: URL: https://github.com/apache/spark/pull/33728#discussion_r689241054 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala ## @@ -74,6 +66,160 @@ object

[GitHub] [spark] LuciferYang commented on pull request #30483: [SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2021-08-15 Thread GitBox
LuciferYang commented on pull request #30483: URL: https://github.com/apache/spark/pull/30483#issuecomment-899209904 > In other words, can you spin off ORC-only PR? OK, I will create a new Jira and give an ORC-only PR ~

[GitHub] [spark] gengliangwang commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
gengliangwang commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899208164 @xuanyuanking The section "State Store and task locality" introduces State Store as well. Shall we combine the new content with it? Or at least move the new content above

[GitHub] [spark] SparkQA commented on pull request #33659: [SPARK-36433][WEBUI] Fix log message in WebUI

2021-08-15 Thread GitBox
SparkQA commented on pull request #33659: URL: https://github.com/apache/spark/pull/33659#issuecomment-899207994 **[Test build #142481 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142481/testReport)** for PR 33659 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #33659: [SPARK-36433][WEBUI] Fix log message in WebUI

2021-08-15 Thread GitBox
AmplabJenkins commented on pull request #33659: URL: https://github.com/apache/spark/pull/33659#issuecomment-899208020 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/142481/

[GitHub] [spark] gengliangwang commented on a change in pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
gengliangwang commented on a change in pull request #33683: URL: https://github.com/apache/spark/pull/33683#discussion_r689232945 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1814,6 +1814,82 @@ Specifically for built-in HDFS state store provider, users

[GitHub] [spark] SparkQA commented on pull request #32467: [WIP] simplify correlated subquery resolution

2021-08-15 Thread GitBox
SparkQA commented on pull request #32467: URL: https://github.com/apache/spark/pull/32467#issuecomment-899206942 **[Test build #142486 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142486/testReport)** for PR 32467 at commit

[GitHub] [spark] SparkQA commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-08-15 Thread GitBox
SparkQA commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-899206882 **[Test build #142484 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142484/testReport)** for PR 32473 at commit

[GitHub] [spark] SparkQA commented on pull request #32468: [SPARK-35335][SQL] Coalesce shuffle partition as much as possible for REPARTITION_BY_NONE

2021-08-15 Thread GitBox
SparkQA commented on pull request #32468: URL: https://github.com/apache/spark/pull/32468#issuecomment-899206830 **[Test build #142485 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142485/testReport)** for PR 32468 at commit

[GitHub] [spark] SparkQA commented on pull request #32475: [SPARK-34775][SQL] Push down limit through window when partitionSpec is not empty

2021-08-15 Thread GitBox
SparkQA commented on pull request #32475: URL: https://github.com/apache/spark/pull/32475#issuecomment-899206797 **[Test build #142483 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142483/testReport)** for PR 32475 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33659: [SPARK-36433][WEBUI] Fix log message in WebUI

2021-08-15 Thread GitBox
AmplabJenkins removed a comment on pull request #33659: URL: https://github.com/apache/spark/pull/33659#issuecomment-899205863 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33673: [SPARK-36448][SQL] Exceptions in NoSuchItemException.scala have to be case classes

2021-08-15 Thread GitBox
AmplabJenkins removed a comment on pull request #33673: URL: https://github.com/apache/spark/pull/33673#issuecomment-894514980

[GitHub] [spark] SparkQA commented on pull request #33639: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-08-15 Thread GitBox
SparkQA commented on pull request #33639: URL: https://github.com/apache/spark/pull/33639#issuecomment-899206152 **[Test build #142482 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142482/testReport)** for PR 33639 at commit

[GitHub] [spark] SparkQA commented on pull request #33659: [SPARK-36433][WEBUI] Fix log message in WebUI

2021-08-15 Thread GitBox
SparkQA commented on pull request #33659: URL: https://github.com/apache/spark/pull/33659#issuecomment-899206140 **[Test build #142481 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142481/testReport)** for PR 33659 at commit

[GitHub] [spark] gengliangwang commented on a change in pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
gengliangwang commented on a change in pull request #33683: URL: https://github.com/apache/spark/pull/33683#discussion_r689232329 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1814,6 +1814,82 @@ Specifically for built-in HDFS state store provider, users

[GitHub] [spark] SparkQA commented on pull request #33664: [SPARK-36444][SQL] Remove OptimizeSubqueries from batch of PartitionPruning

2021-08-15 Thread GitBox
SparkQA commented on pull request #33664: URL: https://github.com/apache/spark/pull/33664#issuecomment-899206041 **[Test build #142480 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142480/testReport)** for PR 33664 at commit

[GitHub] [spark] SparkQA commented on pull request #33673: [SPARK-36448][SQL] Exceptions in NoSuchItemException.scala have to be case classes

2021-08-15 Thread GitBox
SparkQA commented on pull request #33673: URL: https://github.com/apache/spark/pull/33673#issuecomment-899206032 **[Test build #142479 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142479/testReport)** for PR 33673 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #33659: [SPARK-36433][WEBUI] Fix log message in WebUI

2021-08-15 Thread GitBox
AmplabJenkins commented on pull request #33659: URL: https://github.com/apache/spark/pull/33659#issuecomment-899205863 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33639: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-08-15 Thread GitBox
AmplabJenkins removed a comment on pull request #33639: URL: https://github.com/apache/spark/pull/33639#issuecomment-899156995 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46982/

[GitHub] [spark] SparkQA commented on pull request #33639: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-08-15 Thread GitBox
SparkQA commented on pull request #33639: URL: https://github.com/apache/spark/pull/33639#issuecomment-899203980 **[Test build #142478 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142478/testReport)** for PR 33639 at commit

[GitHub] [spark] LuciferYang commented on a change in pull request #30483: [SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2021-08-15 Thread GitBox
LuciferYang commented on a change in pull request #30483: URL: https://github.com/apache/spark/pull/30483#discussion_r689230185 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileMeta.scala ## @@ -0,0 +1,54 @@ +/* + * Licensed to the

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33736: [SPARK-35991][SQL] Add PlanStability suite for TPCH

2021-08-15 Thread GitBox
AmplabJenkins removed a comment on pull request #33736: URL: https://github.com/apache/spark/pull/33736#issuecomment-899191944

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30483: [SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2021-08-15 Thread GitBox
AmplabJenkins removed a comment on pull request #30483: URL: https://github.com/apache/spark/pull/30483#issuecomment-896063198 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/142273/

[GitHub] [spark] AmplabJenkins commented on pull request #33736: [SPARK-35991][SQL] Add PlanStability suite for TPCH

2021-08-15 Thread GitBox
AmplabJenkins commented on pull request #33736: URL: https://github.com/apache/spark/pull/33736#issuecomment-899202485 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46985/

[GitHub] [spark] SparkQA commented on pull request #33736: [SPARK-35991][SQL] Add PlanStability suite for TPCH

2021-08-15 Thread GitBox
SparkQA commented on pull request #33736: URL: https://github.com/apache/spark/pull/33736#issuecomment-899193842 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46985/

[GitHub] [spark] LuciferYang edited a comment on pull request #33722: [SPARK-36495][SQL] Use type match to simplify methods in CatalystTypeConverter

2021-08-15 Thread GitBox
LuciferYang edited a comment on pull request #33722: URL: https://github.com/apache/spark/pull/33722#issuecomment-899193020 thx @HyukjinKwon @huaxingao

[GitHub] [spark] LuciferYang commented on pull request #33722: [SPARK-36495][SQL] Use type match to simplify methods in CatalystTypeConverter

2021-08-15 Thread GitBox
LuciferYang commented on pull request #33722: URL: https://github.com/apache/spark/pull/33722#issuecomment-899193020 thx @HyukjinKwon

[GitHub] [spark] AmplabJenkins commented on pull request #33736: [SPARK-35991][SQL] Add PlanStability suite for TPCH

2021-08-15 Thread GitBox
AmplabJenkins commented on pull request #33736: URL: https://github.com/apache/spark/pull/33736#issuecomment-899191944 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46984/

[GitHub] [spark] SparkQA commented on pull request #33736: [SPARK-35991][SQL] Add PlanStability suite for TPCH

2021-08-15 Thread GitBox
SparkQA commented on pull request #33736: URL: https://github.com/apache/spark/pull/33736#issuecomment-899191927 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46984/

[GitHub] [spark] SparkQA commented on pull request #33736: [SPARK-35991][SQL] Add PlanStability suite for TPCH

2021-08-15 Thread GitBox
SparkQA commented on pull request #33736: URL: https://github.com/apache/spark/pull/33736#issuecomment-899184571 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46985/

[GitHub] [spark] SparkQA commented on pull request #33736: [SPARK-35991][SQL] Add PlanStability suite for TPCH

2021-08-15 Thread GitBox
SparkQA commented on pull request #33736: URL: https://github.com/apache/spark/pull/33736#issuecomment-899181875 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46984/

[GitHub] [spark] viirya commented on a change in pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
viirya commented on a change in pull request #33683: URL: https://github.com/apache/spark/pull/33683#discussion_r689210382 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1814,6 +1814,82 @@ Specifically for built-in HDFS state store provider, users can

[GitHub] [spark] viirya commented on a change in pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
viirya commented on a change in pull request #33683: URL: https://github.com/apache/spark/pull/33683#discussion_r689209780 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1814,6 +1814,82 @@ Specifically for built-in HDFS state store provider, users can

[GitHub] [spark] viirya commented on pull request #33691: [SPARK-36465][SS] Dynamic gap duration in session window

2021-08-15 Thread GitBox
viirya commented on pull request #33691: URL: https://github.com/apache/spark/pull/33691#issuecomment-899174503 Thanks @HeartSaVioR @HyukjinKwon!

[GitHub] [spark] HeartSaVioR commented on a change in pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
HeartSaVioR commented on a change in pull request #33683: URL: https://github.com/apache/spark/pull/33683#discussion_r689203232 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1814,6 +1814,82 @@ Specifically for built-in HDFS state store provider, users

[GitHub] [spark] SparkQA commented on pull request #33736: [SPARK-35991][SQL] Add PlanStability suite for TPCH

2021-08-15 Thread GitBox
SparkQA commented on pull request #33736: URL: https://github.com/apache/spark/pull/33736#issuecomment-899171541 **[Test build #142477 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142477/testReport)** for PR 33736 at commit

[GitHub] [spark] SparkQA commented on pull request #33736: [SPARK-35991][SQL] Add PlanStability suite for TPCH

2021-08-15 Thread GitBox
SparkQA commented on pull request #33736: URL: https://github.com/apache/spark/pull/33736#issuecomment-899170324 **[Test build #142476 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142476/testReport)** for PR 33736 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
AmplabJenkins removed a comment on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899157737

[GitHub] [spark] AmplabJenkins commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
AmplabJenkins commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899168846 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46983/

[GitHub] [spark] HeartSaVioR closed pull request #33691: [SPARK-36465][SS] Dynamic gap duration in session window

2021-08-15 Thread GitBox
HeartSaVioR closed pull request #33691: URL: https://github.com/apache/spark/pull/33691

[GitHub] [spark] HeartSaVioR commented on pull request #33691: [SPARK-36465][SS] Dynamic gap duration in session window

2021-08-15 Thread GitBox
HeartSaVioR commented on pull request #33691: URL: https://github.com/apache/spark/pull/33691#issuecomment-899163590 Thanks! Merging to master/branch-3.2.

[GitHub] [spark] SparkQA commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
SparkQA commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899162779 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46983/

[GitHub] [spark] AmplabJenkins commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
AmplabJenkins commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899157737 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46981/

[GitHub] [spark] SparkQA commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
SparkQA commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899157720 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46981/

[GitHub] [spark] AmplabJenkins commented on pull request #33639: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-08-15 Thread GitBox
AmplabJenkins commented on pull request #33639: URL: https://github.com/apache/spark/pull/33639#issuecomment-899156995 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/46982/

[GitHub] [spark] SparkQA commented on pull request #33639: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-08-15 Thread GitBox
SparkQA commented on pull request #33639: URL: https://github.com/apache/spark/pull/33639#issuecomment-899156983 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46982/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
AmplabJenkins removed a comment on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899155166 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/142475/

[GitHub] [spark] AmplabJenkins commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
AmplabJenkins commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899155166 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/142475/

[GitHub] [spark] SparkQA removed a comment on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
SparkQA removed a comment on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899150258 **[Test build #142475 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142475/testReport)** for PR 33683 at commit

[GitHub] [spark] SparkQA commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
SparkQA commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899155028 **[Test build #142475 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142475/testReport)** for PR 33683 at commit

[GitHub] [spark] SparkQA commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
SparkQA commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899152554 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46983/ -- This is an automated message from the Apache

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
AmplabJenkins removed a comment on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899138151 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/142473/

[GitHub] [spark] SparkQA commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
SparkQA commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899150258 **[Test build #142475 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142475/testReport)** for PR 33683 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
SparkQA removed a comment on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899135775 **[Test build #142473 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142473/testReport)** for PR 33683 at commit

[GitHub] [spark] SparkQA commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
SparkQA commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899145052 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46981/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #33639: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-08-15 Thread GitBox
SparkQA commented on pull request #33639: URL: https://github.com/apache/spark/pull/33639#issuecomment-899144981 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46982/ -- This is an automated message from the Apache

[GitHub] [spark] xuanyuanking commented on a change in pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
xuanyuanking commented on a change in pull request #33683: URL: https://github.com/apache/spark/pull/33683#discussion_r689169441 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1814,6 +1814,23 @@ Specifically for built-in HDFS state store provider, users

[GitHub] [spark] xuanyuanking commented on a change in pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
xuanyuanking commented on a change in pull request #33683: URL: https://github.com/apache/spark/pull/33683#discussion_r689169372 ## File path: docs/structured-streaming-programming-guide.md ## @@ -1814,6 +1814,23 @@ Specifically for built-in HDFS state store provider, users

[GitHub] [spark] AmplabJenkins commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
AmplabJenkins commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899138151 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/142473/ -- This

[GitHub] [spark] SparkQA commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
SparkQA commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899138062 **[Test build #142473 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142473/testReport)** for PR 33683 at commit

[GitHub] [spark] SparkQA commented on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
SparkQA commented on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899135775 **[Test build #142473 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142473/testReport)** for PR 33683 at commit

[GitHub] [spark] SparkQA commented on pull request #33639: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-08-15 Thread GitBox
SparkQA commented on pull request #33639: URL: https://github.com/apache/spark/pull/33639#issuecomment-899135753 **[Test build #142474 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/142474/testReport)** for PR 33639 at commit

[GitHub] [spark] github-actions[bot] closed pull request #31899: [SPARK-34525][SQL][DOCS] Update documentation for various DDLs to reflect alternative key value notation

2021-08-15 Thread GitBox
github-actions[bot] closed pull request #31899: URL: https://github.com/apache/spark/pull/31899 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] github-actions[bot] closed pull request #32438: [WIP][SPARK-33236][SHUFFLE][CORE] Enable Push-based shuffle service to store state in NM level DB for work preserving restart

2021-08-15 Thread GitBox
github-actions[bot] closed pull request #32438: URL: https://github.com/apache/spark/pull/32438 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] github-actions[bot] commented on pull request #32467: [WIP] simplify correlated subquery resolution

2021-08-15 Thread GitBox
github-actions[bot] commented on pull request #32467: URL: https://github.com/apache/spark/pull/32467#issuecomment-899132816 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue

[GitHub] [spark] github-actions[bot] commented on pull request #32431: [SPARK-35173][SQL][PYTHON] Add multiple columns adding support

2021-08-15 Thread GitBox
github-actions[bot] commented on pull request #32431: URL: https://github.com/apache/spark/pull/32431#issuecomment-899132820 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue

[GitHub] [spark] xuanyuanking edited a comment on pull request #33683: [SPARK-36041][SS][DOCS] Introduce the RocksDBStateStoreProvider in the programming guide

2021-08-15 Thread GitBox
xuanyuanking edited a comment on pull request #33683: URL: https://github.com/apache/spark/pull/33683#issuecomment-899084300 Ah yes, I forgot to push. Sorry for the delay, I’ll update this soon! -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] AmplabJenkins commented on pull request #33747: [SPARK-36453][SQL] Improve consistency processing floating point special literals

2021-08-15 Thread GitBox
AmplabJenkins commented on pull request #33747: URL: https://github.com/apache/spark/pull/33747#issuecomment-899125695 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] planga82 commented on pull request #33747: [SPARK-36453][SQL] Improve consistency processing floating point special literals

2021-08-15 Thread GitBox
planga82 commented on pull request #33747: URL: https://github.com/apache/spark/pull/33747#issuecomment-899122615 CC @HyukjinKwon Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] planga82 opened a new pull request #33747: [SPARK-36453][SQL] Improve consistency processing floating point special literals

2021-08-15 Thread GitBox
planga82 opened a new pull request #33747: URL: https://github.com/apache/spark/pull/33747 ### What changes were proposed in this pull request? Special literals in floating point are not consistent between cast and json expressions ``` scala> spark.sql("SELECT

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30483: [SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2021-08-15 Thread GitBox
dongjoon-hyun commented on a change in pull request #30483: URL: https://github.com/apache/spark/pull/30483#discussion_r689147442 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileMeta.scala ## @@ -0,0 +1,54 @@ +/* + * Licensed to the

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #30483: [SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2021-08-15 Thread GitBox
dongjoon-hyun edited a comment on pull request #30483: URL: https://github.com/apache/spark/pull/30483#issuecomment-899089526 I've been looking around the code. The most serious blocker is the following, because both the Apache Spark and Parquet communities are reluctant to advertise the deprecated

[GitHub] [spark] dongjoon-hyun commented on pull request #30483: [SPARK-33449][SQL] Add File Metadata cache support for Parquet and Orc

2021-08-15 Thread GitBox
dongjoon-hyun commented on pull request #30483: URL: https://github.com/apache/spark/pull/30483#issuecomment-899089526 I've been looking around the code. The most serious blocker is the following, because both the Apache Spark and Parquet communities are reluctant to advertise the deprecated API. So,

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33746: [SPARK-36514][SQL] Support to set the meta conf in HiveMetastoreClient

2021-08-15 Thread GitBox
dongjoon-hyun commented on a change in pull request #33746: URL: https://github.com/apache/spark/pull/33746#discussion_r689119850 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -265,6 +265,10 @@ private[hive] class

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33746: [SPARK-36514][SQL] Support to set the meta conf in HiveMetastoreClient

2021-08-15 Thread GitBox
dongjoon-hyun commented on a change in pull request #33746: URL: https://github.com/apache/spark/pull/33746#discussion_r689119732 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala ## @@ -118,6 +118,13 @@ private[spark] object HiveUtils extends
