[GitHub] [spark] SparkQA removed a comment on pull request #33170: [SPARK-35967][SQL] Update nullability based on column statistics

2021-07-01 Thread GitBox
SparkQA removed a comment on pull request #33170: URL: https://github.com/apache/spark/pull/33170#issuecomment-872258097 **[Test build #140520 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140520/testReport)** for PR 33170 at commit

[GitHub] [spark] cloud-fan commented on a change in pull request #32816: [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle

2021-07-01 Thread GitBox
cloud-fan commented on a change in pull request #32816: URL: https://github.com/apache/spark/pull/32816#discussion_r662493000 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala ## @@ -91,13 +91,22 @@ case class

[GitHub] [spark] SparkQA removed a comment on pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range

2021-07-01 Thread GitBox
SparkQA removed a comment on pull request #32959: URL: https://github.com/apache/spark/pull/32959#issuecomment-872394025 **[Test build #140526 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140526/testReport)** for PR 32959 at commit

[GitHub] [spark] viirya commented on a change in pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing

2021-07-01 Thread GitBox
viirya commented on a change in pull request #33172: URL: https://github.com/apache/spark/pull/33172#discussion_r662492949 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -525,10 +525,35 @@ object SQLConf { .booleanConf

[GitHub] [spark] SparkQA removed a comment on pull request #32933: [SPARK-35785][SS] Cleanup support for RocksDB instance

2021-07-01 Thread GitBox
SparkQA removed a comment on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-872258324 **[Test build #140521 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140521/testReport)** for PR 32933 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #33160: [SPARK-35959][BUILD] Add a new Maven profile "no-shaded-hadoop-client" for Hadoop versions older than 3.2.2/3.3.1

2021-07-01 Thread GitBox
SparkQA removed a comment on pull request #33160: URL: https://github.com/apache/spark/pull/33160#issuecomment-872438107 **[Test build #140531 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140531/testReport)** for PR 33160 at commit

[GitHub] [spark] SparkQA commented on pull request #33170: [SPARK-35967][SQL] Update nullability based on column statistics

2021-07-01 Thread GitBox
SparkQA commented on pull request #33170: URL: https://github.com/apache/spark/pull/33170#issuecomment-872444628 **[Test build #140520 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140520/testReport)** for PR 33170 at commit

[GitHub] [spark] SparkQA commented on pull request #33160: [SPARK-35959][BUILD] Add a new Maven profile "no-shaded-hadoop-client" for Hadoop versions older than 3.2.2/3.3.1

2021-07-01 Thread GitBox
SparkQA commented on pull request #33160: URL: https://github.com/apache/spark/pull/33160#issuecomment-872443490 **[Test build #140531 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140531/testReport)** for PR 33160 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range

2021-07-01 Thread GitBox
AmplabJenkins commented on pull request #32959: URL: https://github.com/apache/spark/pull/32959#issuecomment-872441521 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140526/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #32933: [SPARK-35785][SS] Cleanup support for RocksDB instance

2021-07-01 Thread GitBox
AmplabJenkins commented on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-872441605 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140521/ -- This

[GitHub] [spark] SparkQA commented on pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range

2021-07-01 Thread GitBox
SparkQA commented on pull request #32959: URL: https://github.com/apache/spark/pull/32959#issuecomment-872440996 **[Test build #140526 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140526/testReport)** for PR 32959 at commit

[GitHub] [spark] gengliangwang commented on a change in pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
gengliangwang commented on a change in pull request #33176: URL: https://github.com/apache/spark/pull/33176#discussion_r662487908 ## File path: sql/core/src/test/scala/org/apache/spark/sql/TimestampTypeSuite.scala ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] SparkQA commented on pull request #32933: [SPARK-35785][SS] Cleanup support for RocksDB instance

2021-07-01 Thread GitBox
SparkQA commented on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-872440399 **[Test build #140521 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140521/testReport)** for PR 32933 at commit

[GitHub] [spark] cloud-fan commented on pull request #32816: [SPARK-33832][SQL] Support optimize skewed join even if introduce extra shuffle

2021-07-01 Thread GitBox
cloud-fan commented on pull request #32816: URL: https://github.com/apache/spark/pull/32816#issuecomment-872439396 I think using the `CostEvaluator` to accept extra shuffles introduced by skew join handling is a good idea. However, the current framework is too simple: we just give up the

[GitHub] [spark] gengliangwang commented on a change in pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
gengliangwang commented on a change in pull request #33176: URL: https://github.com/apache/spark/pull/33176#discussion_r662485470 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -2502,7 +2502,7 @@ class AstBuilder extends

[GitHub] [spark] SparkQA commented on pull request #33160: [SPARK-35959][BUILD] Add a new Maven profile "no-shaded-hadoop-client" for Hadoop versions older than 3.2.2/3.3.1

2021-07-01 Thread GitBox
SparkQA commented on pull request #33160: URL: https://github.com/apache/spark/pull/33160#issuecomment-872438107 **[Test build #140531 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140531/testReport)** for PR 33160 at commit

[GitHub] [spark] SparkQA commented on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing

2021-07-01 Thread GitBox
SparkQA commented on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872433669 **[Test build #140530 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140530/testReport)** for PR 33172 at commit

[GitHub] [spark] SparkQA commented on pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
SparkQA commented on pull request #33176: URL: https://github.com/apache/spark/pull/33176#issuecomment-872433609 **[Test build #140529 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140529/testReport)** for PR 33176 at commit

[GitHub] [spark] gengliangwang commented on a change in pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
gengliangwang commented on a change in pull request #33176: URL: https://github.com/apache/spark/pull/33176#discussion_r662480054 ## File path: sql/core/src/test/scala/org/apache/spark/sql/TimestampTypeSuite.scala ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing

2021-07-01 Thread GitBox
AmplabJenkins removed a comment on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872431363 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45038/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
AmplabJenkins removed a comment on pull request #33176: URL: https://github.com/apache/spark/pull/33176#issuecomment-872431364 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45039/

[GitHub] [spark] AmplabJenkins commented on pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
AmplabJenkins commented on pull request #33176: URL: https://github.com/apache/spark/pull/33176#issuecomment-872431364 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45039/ --

[GitHub] [spark] AmplabJenkins commented on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing

2021-07-01 Thread GitBox
AmplabJenkins commented on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872431363 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45038/ --

[GitHub] [spark] cloud-fan closed pull request #32972: [SPARK-35756][SQL] unionByName supports struct having same col names but different sequence

2021-07-01 Thread GitBox
cloud-fan closed pull request #32972: URL: https://github.com/apache/spark/pull/32972 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] cloud-fan commented on pull request #32972: [SPARK-35756][SQL] unionByName supports struct having same col names but different sequence

2021-07-01 Thread GitBox
cloud-fan commented on pull request #32972: URL: https://github.com/apache/spark/pull/32972#issuecomment-872430251 thanks, merging to master! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] cloud-fan commented on a change in pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
cloud-fan commented on a change in pull request #33176: URL: https://github.com/apache/spark/pull/33176#discussion_r662474917 ## File path: sql/core/src/test/scala/org/apache/spark/sql/TimestampTypeSuite.scala ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] cloud-fan commented on a change in pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
cloud-fan commented on a change in pull request #33176: URL: https://github.com/apache/spark/pull/33176#discussion_r662474147 ## File path: sql/core/src/test/scala/org/apache/spark/sql/TimestampTypeSuite.scala ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software

[GitHub] [spark] SparkQA commented on pull request #33095: [SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-07-01 Thread GitBox
SparkQA commented on pull request #33095: URL: https://github.com/apache/spark/pull/33095#issuecomment-872426863 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45041/ -- This is an automated message from the Apache

[GitHub] [spark] cloud-fan commented on a change in pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
cloud-fan commented on a change in pull request #33176: URL: https://github.com/apache/spark/pull/33176#discussion_r662472747 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -2502,7 +2502,7 @@ class AstBuilder extends

[GitHub] [spark] SparkQA commented on pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range

2021-07-01 Thread GitBox
SparkQA commented on pull request #32959: URL: https://github.com/apache/spark/pull/32959#issuecomment-872425510 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45040/ -- This is an automated message from the Apache

[GitHub] [spark] rahulsmahadev commented on a change in pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming

2021-07-01 Thread GitBox
rahulsmahadev commented on a change in pull request #33093: URL: https://github.com/apache/spark/pull/33093#discussion_r662471236 ## File path: sql/core/src/test/scala/org/apache/spark/sql/streaming/FlatMapGroupsWithStateSuite.scala ## @@ -1268,12 +1269,298 @@ class

[GitHub] [spark] sunchao commented on pull request #33160: [SPARK-35959][BUILD] Add a new Maven profile "no-shaded-hadoop-client" for Hadoop versions older than 3.2.2/3.3.1

2021-07-01 Thread GitBox
sunchao commented on pull request #33160: URL: https://github.com/apache/spark/pull/33160#issuecomment-872422611 > To verify via CI, could you make the profile active by default? After testing, we should remove it. Thanks @dongjoon-hyun . Will do. -- This is an automated message

[GitHub] [spark] cloud-fan commented on a change in pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing

2021-07-01 Thread GitBox
cloud-fan commented on a change in pull request #33172: URL: https://github.com/apache/spark/pull/33172#discussion_r662468622 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -525,10 +525,35 @@ object SQLConf { .booleanConf

[GitHub] [spark] dbtsai commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-07-01 Thread GitBox
dbtsai commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-872421733 +1 to merge it as it is now if there is no major issue, and we can work on the follow-up later to reduce the scope. -- This is an automated message from the Apache Git Service. To

[GitHub] [spark] gengliangwang commented on a change in pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
gengliangwang commented on a change in pull request #33176: URL: https://github.com/apache/spark/pull/33176#discussion_r662467516 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -2820,6 +2821,23 @@ object SQLConf {

[GitHub] [spark] SparkQA commented on pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
SparkQA commented on pull request #33176: URL: https://github.com/apache/spark/pull/33176#issuecomment-872420311 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45039/ --

[GitHub] [spark] dongjoon-hyun commented on pull request #33160: [SPARK-35959][BUILD] Add a new Maven profile "no-shaded-hadoop-client" for older Hadoop 3.x versions

2021-07-01 Thread GitBox
dongjoon-hyun commented on pull request #33160: URL: https://github.com/apache/spark/pull/33160#issuecomment-872420110 FYI, if you enable it by default, the dependency files are required to be updated accordingly. -- This is an automated message from the Apache Git Service. To respond

[GitHub] [spark] SparkQA commented on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing

2021-07-01 Thread GitBox
SparkQA commented on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872410445 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45038/ -- This is an automated message from the

[GitHub] [spark] SparkQA removed a comment on pull request #32084: [SPARK-34980][SQL] Support coalesce partition through union in AQE

2021-07-01 Thread GitBox
SparkQA removed a comment on pull request #32084: URL: https://github.com/apache/spark/pull/32084#issuecomment-872214551 **[Test build #140519 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140519/testReport)** for PR 32084 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #33095: [SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-07-01 Thread GitBox
SparkQA removed a comment on pull request #33095: URL: https://github.com/apache/spark/pull/33095#issuecomment-872401189 **[Test build #140528 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140528/testReport)** for PR 33095 at commit

[GitHub] [spark] sunchao commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-07-01 Thread GitBox
sunchao commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-872407140 This PR looks good to me now. Also curious if this can be merged after branch-cut. It'd also be great if @cloud-fan can take one more look. -- This is an automated message

[GitHub] [spark] aokolnychyi commented on pull request #31700: [SPARK-34183][SS] DataSource V2: Support required distribution and ordering in SS

2021-07-01 Thread GitBox
aokolnychyi commented on pull request #31700: URL: https://github.com/apache/spark/pull/31700#issuecomment-872407269 Okay, I'll update the PR by the end of the week and then we can decide whether it is something we want to have in 3.2.0. I am fine not including this change but the feature

[GitHub] [spark] viirya commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-07-01 Thread GitBox
viirya commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-872405669 also cc @gengliangwang -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] AmplabJenkins commented on pull request #32084: [SPARK-34980][SQL] Support coalesce partition through union in AQE

2021-07-01 Thread GitBox
AmplabJenkins commented on pull request #32084: URL: https://github.com/apache/spark/pull/32084#issuecomment-872404616 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140519/ -- This

[GitHub] [spark] xkrogen commented on a change in pull request #33101: [SPARK-35907][CORE] Instead of File#mkdirs, Files#createDirectories is expected.

2021-07-01 Thread GitBox
xkrogen commented on a change in pull request #33101: URL: https://github.com/apache/spark/pull/33101#discussion_r662452295 ## File path: core/src/main/scala/org/apache/spark/util/Utils.scala ## @@ -285,9 +285,10 @@ private[spark] object Utils extends Logging { */ def

[GitHub] [spark] viirya commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-07-01 Thread GitBox
viirya commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-872403558 Thanks @aokolnychyi! I am not sure whether we can still merge this in after the branch cut. If not, maybe we can have this in first, if there are no major comments/concerns, and

[GitHub] [spark] rahulsmahadev commented on a change in pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming

2021-07-01 Thread GitBox
rahulsmahadev commented on a change in pull request #33093: URL: https://github.com/apache/spark/pull/33093#discussion_r662451499 ## File path: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala ## @@ -280,6 +280,51 @@ class KeyValueGroupedDataset[K, V]

[GitHub] [spark] SparkQA commented on pull request #32084: [SPARK-34980][SQL] Support coalesce partition through union in AQE

2021-07-01 Thread GitBox
SparkQA commented on pull request #32084: URL: https://github.com/apache/spark/pull/32084#issuecomment-872403234 **[Test build #140519 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140519/testReport)** for PR 32084 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #33095: [SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-07-01 Thread GitBox
AmplabJenkins commented on pull request #33095: URL: https://github.com/apache/spark/pull/33095#issuecomment-872402017 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140528/ -- This

[GitHub] [spark] SparkQA commented on pull request #33095: [SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-07-01 Thread GitBox
SparkQA commented on pull request #33095: URL: https://github.com/apache/spark/pull/33095#issuecomment-872401996 **[Test build #140528 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140528/testReport)** for PR 33095 at commit

[GitHub] [spark] SparkQA commented on pull request #33095: [SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-07-01 Thread GitBox
SparkQA commented on pull request #33095: URL: https://github.com/apache/spark/pull/33095#issuecomment-872401189 **[Test build #140528 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140528/testReport)** for PR 33095 at commit

[GitHub] [spark] xinrong-databricks commented on a change in pull request #33095: [SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-07-01 Thread GitBox
xinrong-databricks commented on a change in pull request #33095: URL: https://github.com/apache/spark/pull/33095#discussion_r662447247 ## File path: python/pyspark/pandas/tests/data_type_ops/test_boolean_ops.py ## @@ -572,10 +578,28 @@ def test_isnull(self): def

[GitHub] [spark] xinrong-databricks commented on a change in pull request #33095: [SPARK-35339][PYTHON] Improve unit tests for data-type-based basic operations

2021-07-01 Thread GitBox
xinrong-databricks commented on a change in pull request #33095: URL: https://github.com/apache/spark/pull/33095#discussion_r662446564 ## File path: python/pyspark/pandas/tests/data_type_ops/test_boolean_ops.py ## @@ -572,10 +578,28 @@ def test_isnull(self): def

[GitHub] [spark] SparkQA commented on pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
SparkQA commented on pull request #33176: URL: https://github.com/apache/spark/pull/33176#issuecomment-872396248 **[Test build #140527 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140527/testReport)** for PR 33176 at commit

[GitHub] [spark] SparkQA commented on pull request #32959: [SPARK-35780][SQL] Support DATE/TIMESTAMP literals across the full range

2021-07-01 Thread GitBox
SparkQA commented on pull request #32959: URL: https://github.com/apache/spark/pull/32959#issuecomment-872394025 **[Test build #140526 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140526/testReport)** for PR 32959 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming

2021-07-01 Thread GitBox
AmplabJenkins removed a comment on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-872393828 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140518/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33140: [SPARK-35881][SQL] Add support for columnar execution of final query stage in AdaptiveSparkPlanExec

2021-07-01 Thread GitBox
AmplabJenkins removed a comment on pull request #33140: URL: https://github.com/apache/spark/pull/33140#issuecomment-872393829 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45037/

[GitHub] [spark] AmplabJenkins commented on pull request #33140: [SPARK-35881][SQL] Add support for columnar execution of final query stage in AdaptiveSparkPlanExec

2021-07-01 Thread GitBox
AmplabJenkins commented on pull request #33140: URL: https://github.com/apache/spark/pull/33140#issuecomment-872393829 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45037/ --

[GitHub] [spark] AmplabJenkins commented on pull request #33173: [SPARK-35971][SQL] Rename the type name of TimestampNTZType as "timestamp_ntz"

2021-07-01 Thread GitBox
AmplabJenkins commented on pull request #33173: URL: https://github.com/apache/spark/pull/33173#issuecomment-872393826 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140515/ -- This

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33173: [SPARK-35971][SQL] Rename the type name of TimestampNTZType as "timestamp_ntz"

2021-07-01 Thread GitBox
AmplabJenkins removed a comment on pull request #33173: URL: https://github.com/apache/spark/pull/33173#issuecomment-872393826 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140515/

[GitHub] [spark] AmplabJenkins commented on pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming

2021-07-01 Thread GitBox
AmplabJenkins commented on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-872393828 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140518/ -- This

[GitHub] [spark] MaxGekk commented on pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
MaxGekk commented on pull request #33176: URL: https://github.com/apache/spark/pull/33176#issuecomment-872388132 @gengliangwang Could you mention the default value of `spark.sql.timestampType` in the PR description? In general it LGTM. -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing

2021-07-01 Thread GitBox
SparkQA commented on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872388015 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45038/ -- This is an automated message from the Apache

[GitHub] [spark] aokolnychyi commented on pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-07-01 Thread GitBox
aokolnychyi commented on pull request #32921: URL: https://github.com/apache/spark/pull/32921#issuecomment-872386867 @viirya, I missed updating the PR description when I updated the title. Done. -- This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] aokolnychyi commented on a change in pull request #32921: [SPARK-35779][SQL] Dynamic filtering for Data Source V2

2021-07-01 Thread GitBox
aokolnychyi commented on a change in pull request #32921: URL: https://github.com/apache/spark/pull/32921#discussion_r662433500 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/BatchScanExec.scala ## @@ -17,38 +17,96 @@ package

[GitHub] [spark] MaxGekk commented on a change in pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
MaxGekk commented on a change in pull request #33176: URL: https://github.com/apache/spark/pull/33176#discussion_r662430089 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -2820,6 +2821,23 @@ object SQLConf { .booleanConf

[GitHub] [spark] MaxGekk commented on a change in pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
MaxGekk commented on a change in pull request #33176: URL: https://github.com/apache/spark/pull/33176#discussion_r662429113 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -2820,6 +2821,23 @@ object SQLConf { .booleanConf

[GitHub] [spark] SparkQA removed a comment on pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming

2021-07-01 Thread GitBox
SparkQA removed a comment on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-872184705 **[Test build #140518 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140518/testReport)** for PR 33093 at commit

[GitHub] [spark] SparkQA commented on pull request #33093: [SPARK-35897][SS] Support user defined initial state with flatMapGroupsWithState in Structured Streaming

2021-07-01 Thread GitBox
SparkQA commented on pull request #33093: URL: https://github.com/apache/spark/pull/33093#issuecomment-872380609 **[Test build #140518 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140518/testReport)** for PR 33093 at commit

[GitHub] [spark] gengliangwang commented on a change in pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
gengliangwang commented on a change in pull request #33176: URL: https://github.com/apache/spark/pull/33176#discussion_r662426271 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -2820,6 +2821,23 @@ object SQLConf {

[GitHub] [spark] gengliangwang opened a new pull request #33176: [SPARK-35975][SQL] New configuration `spark.sql.timestampType` for the default timestamp type

2021-07-01 Thread GitBox
gengliangwang opened a new pull request #33176: URL: https://github.com/apache/spark/pull/33176 ### What changes were proposed in this pull request? Add a new configuration `spark.sql.timestampType`, which configures the default timestamp type of Spark SQL, including SQL DDL

[GitHub] [spark] yaooqinn commented on pull request #33169: [SPARK-35966][SQL] Port HIVE-17952: Fix license headers to avoid dangling javadoc warnings

2021-07-01 Thread GitBox
yaooqinn commented on pull request #33169: URL: https://github.com/apache/spark/pull/33169#issuecomment-872369760 ps, my bad. I will be more careful from now on. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use

[GitHub] [spark] SparkQA removed a comment on pull request #33173: [SPARK-35971][SQL] Rename the type name of TimestampNTZType as "timestamp_ntz"

2021-07-01 Thread GitBox
SparkQA removed a comment on pull request #33173: URL: https://github.com/apache/spark/pull/33173#issuecomment-872171959 **[Test build #140515 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140515/testReport)** for PR 33173 at commit

[GitHub] [spark] SparkQA commented on pull request #33173: [SPARK-35971][SQL] Rename the type name of TimestampNTZType as "timestamp_ntz"

2021-07-01 Thread GitBox
SparkQA commented on pull request #33173: URL: https://github.com/apache/spark/pull/33173#issuecomment-872368472 **[Test build #140515 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140515/testReport)** for PR 33173 at commit

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #33169: [SPARK-35966][SQL] Port HIVE-17952: Fix license headers to avoid dangling javadoc warnings

2021-07-01 Thread GitBox
dongjoon-hyun edited a comment on pull request #33169: URL: https://github.com/apache/spark/pull/33169#issuecomment-872367391 @yaooqinn According to your message, you entered `33169` which is invalid. You should input `y` or `n`. > Would you like to use the modified body? (y/n): 33169

[GitHub] [spark] dongjoon-hyun commented on pull request #33169: [SPARK-35966][SQL] Port HIVE-17952: Fix license headers to avoid dangling javadoc warnings

2021-07-01 Thread GitBox
dongjoon-hyun commented on pull request #33169: URL: https://github.com/apache/spark/pull/33169#issuecomment-872367391 @yaooqinn According to your message, you entered `33169` which is invalid. You should input `y` or `n`. > Would you like to use the modified body? (y/n): 33169 --

[GitHub] [spark] SparkQA commented on pull request #33140: [SPARK-35881][SQL] Add support for columnar execution of final query stage in AdaptiveSparkPlanExec

2021-07-01 Thread GitBox
SparkQA commented on pull request #33140: URL: https://github.com/apache/spark/pull/33140#issuecomment-872366040 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45037/ -- This is an automated message from the

[GitHub] [spark] viirya commented on a change in pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing

2021-07-01 Thread GitBox
viirya commented on a change in pull request #33172: URL: https://github.com/apache/spark/pull/33172#discussion_r662409931 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -525,8 +525,31 @@ object SQLConf { .booleanConf

[GitHub] [spark] SparkQA commented on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing

2021-07-01 Thread GitBox
SparkQA commented on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872349676 **[Test build #140525 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140525/testReport)** for PR 33172 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32832: [SPARK-35686][SQL] Not allow using auto-generated alias when creating view

2021-07-01 Thread GitBox
AmplabJenkins removed a comment on pull request #32832: URL: https://github.com/apache/spark/pull/32832#issuecomment-872348801 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140513/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32933: [SPARK-35785][SS] Cleanup support for RocksDB instance

2021-07-01 Thread GitBox
AmplabJenkins removed a comment on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-872348803 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45034/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32084: [SPARK-34980][SQL] Support coalesce partition through union in AQE

2021-07-01 Thread GitBox
AmplabJenkins removed a comment on pull request #32084: URL: https://github.com/apache/spark/pull/32084#issuecomment-872348799 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45032/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing

2021-07-01 Thread GitBox
AmplabJenkins removed a comment on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872348800 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] AmplabJenkins commented on pull request #32084: [SPARK-34980][SQL] Support coalesce partition through union in AQE

2021-07-01 Thread GitBox
AmplabJenkins commented on pull request #32084: URL: https://github.com/apache/spark/pull/32084#issuecomment-872348799 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45032/ --

[GitHub] [spark] AmplabJenkins commented on pull request #32933: [SPARK-35785][SS] Cleanup support for RocksDB instance

2021-07-01 Thread GitBox
AmplabJenkins commented on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-872348803 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45034/ --

[GitHub] [spark] AmplabJenkins commented on pull request #32832: [SPARK-35686][SQL] Not allow using auto-generated alias when creating view

2021-07-01 Thread GitBox
AmplabJenkins commented on pull request #32832: URL: https://github.com/apache/spark/pull/32832#issuecomment-872348801 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140513/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing

2021-07-01 Thread GitBox
AmplabJenkins commented on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872348804 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] gengliangwang commented on pull request #33169: [SPARK-35966][SQL] Port HIVE-17952: Fix license headers to avoid dangling javadoc warnings

2021-07-01 Thread GitBox
gengliangwang commented on pull request #33169: URL: https://github.com/apache/spark/pull/33169#issuecomment-872342975 +1, late LGTM! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] SparkQA commented on pull request #33140: [SPARK-35881][SQL] Add support for columnar execution of final query stage in AdaptiveSparkPlanExec

2021-07-01 Thread GitBox
SparkQA commented on pull request #33140: URL: https://github.com/apache/spark/pull/33140#issuecomment-872341708 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45037/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing

2021-07-01 Thread GitBox
SparkQA commented on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872340571 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45036/ --

[GitHub] [spark] SparkQA commented on pull request #33172: [SPARK-35968][SQL] Make sure partitions are not too small in AQE partition coalescing

2021-07-01 Thread GitBox
SparkQA commented on pull request #33172: URL: https://github.com/apache/spark/pull/33172#issuecomment-872332218 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45035/ -- This is an automated message from the

[GitHub] [spark] dongjoon-hyun closed pull request #33171: [SPARK-35969][K8S] Make the pod prefix more readable and tallied with K8S DNS Label Names

2021-07-01 Thread GitBox
dongjoon-hyun closed pull request #33171: URL: https://github.com/apache/spark/pull/33171 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [spark] yaooqinn commented on pull request #33169: [SPARK-35966][SQL] Port HIVE-17952: Fix license headers to avoid dangling javadoc warnings

2021-07-01 Thread GitBox
yaooqinn commented on pull request #33169: URL: https://github.com/apache/spark/pull/33169#issuecomment-872330042 ``` ./dev/merge_spark_pr.py git rev-parse --abbrev-ref HEAD Which pull request would you like to merge? (e.g. 34): 33169

[GitHub] [spark] dongjoon-hyun commented on pull request #33168: [SPARK-35965][DOCS] Add doc for ORC nested column vectorized reader

2021-07-01 Thread GitBox
dongjoon-hyun commented on pull request #33168: URL: https://github.com/apache/spark/pull/33168#issuecomment-872328243 Thank you, @c21 and @HyukjinKwon ! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above

[GitHub] [spark] venkata91 commented on a change in pull request #33034: WIP: [SPARK-32923][CORE][SHUFFLE] Handle indeterminate stage retries for push-based shuffle

2021-07-01 Thread GitBox
venkata91 commented on a change in pull request #33034: URL: https://github.com/apache/spark/pull/33034#discussion_r662376427 ## File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java ## @@ -222,7 +223,7 @@ public void

[GitHub] [spark] dongjoon-hyun commented on pull request #33169: [SPARK-35966][SQL] Port HIVE-17952: Fix license headers to avoid dangling javadoc warnings

2021-07-01 Thread GitBox
dongjoon-hyun commented on pull request #33169: URL: https://github.com/apache/spark/pull/33169#issuecomment-872327176 cc @gengliangwang -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] dongjoon-hyun commented on pull request #33169: [SPARK-35966][SQL] Port HIVE-17952: Fix license headers to avoid dangling javadoc warnings

2021-07-01 Thread GitBox
dongjoon-hyun commented on pull request #33169: URL: https://github.com/apache/spark/pull/33169#issuecomment-872326422 Hi, @yaooqinn. Did you use `dev/merge_spark_pr.py`? The merge script helps you clean up the PR description. -

[GitHub] [spark] SparkQA commented on pull request #32933: [SPARK-35785][SS] Cleanup support for RocksDB instance

2021-07-01 Thread GitBox
SparkQA commented on pull request #32933: URL: https://github.com/apache/spark/pull/32933#issuecomment-872325046 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45034/ -- This is an automated message from the
