[GitHub] [spark] SparkQA commented on pull request #31213: [SPARK-34141][SQL] Remove side effect from ExtractGenerator

2021-01-30 Thread GitBox
SparkQA commented on pull request #31213: URL: https://github.com/apache/spark/pull/31213#issuecomment-770343037 **[Test build #134689 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134689/testReport)** for PR 31213 at commit [`82ca8f5`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #31355: [SPARK-34255][SQL] Support partitioning with static number on required distribution and ordering on V2 write

2021-01-30 Thread GitBox
SparkQA commented on pull request #31355: URL: https://github.com/apache/spark/pull/31355#issuecomment-770341216 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39273/

[GitHub] [spark] viirya edited a comment on pull request #31398: [SPARK-34297][SQL][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
viirya edited a comment on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770334535 Yes, I agree. This is not only related to SS; there is also a SQL change, especially in the public API. I'm okay with making the API change separately later. Currently I t

[GitHub] [spark] viirya edited a comment on pull request #31398: [SPARK-34297][SQL][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
viirya edited a comment on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770334535 Yes, I agree. This is not only related to SS; there is also a SQL change, especially in the public API. I'm okay with making the API change separately later. But I think w

[GitHub] [spark] SparkQA removed a comment on pull request #31398: [SPARK-34297][SQL][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
SparkQA removed a comment on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770311985 **[Test build #134684 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134684/testReport)** for PR 31398 at commit [`eebf9c6`](https://gi

[GitHub] [spark] SparkQA commented on pull request #31355: [SPARK-34255][SQL] Support partitioning with static number on required distribution and ordering on V2 write

2021-01-30 Thread GitBox
SparkQA commented on pull request #31355: URL: https://github.com/apache/spark/pull/31355#issuecomment-770337518 **[Test build #134687 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134687/testReport)** for PR 31355 at commit [`be94bb4`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #31273: [WIP][Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase

2021-01-30 Thread GitBox
SparkQA commented on pull request #31273: URL: https://github.com/apache/spark/pull/31273#issuecomment-770337439 **[Test build #134688 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134688/testReport)** for PR 31273 at commit [`4e77301`](https://github.com

[GitHub] [spark] ulysses-you commented on pull request #31372: [SPARK-34272][SQL] Pretty SQL should check NonSQLExpression

2021-01-30 Thread GitBox
ulysses-you commented on pull request #31372: URL: https://github.com/apache/spark/pull/31372#issuecomment-770336377 I believe it's right that `NonSQLExpression` has no `sql` but just `toString`. E.g., the serializer of encode/decode is the special expression of Spark. But the quest

[GitHub] [spark] AmplabJenkins commented on pull request #31398: [SPARK-34297][SQL][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770336335 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134684/

[GitHub] [spark] viirya commented on pull request #31398: [SPARK-34297][SQL][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
viirya commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770334535 Yes, I agree. This is not only related to SS; there is also a SQL change, especially in the public API.

[GitHub] [spark] SparkQA commented on pull request #31398: [SPARK-34297][SQL][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
SparkQA commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770334554 **[Test build #134684 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134684/testReport)** for PR 31398 at commit [`eebf9c6`](https://github.co

[GitHub] [spark] aokolnychyi commented on a change in pull request #31355: [SPARK-34255][SQL] Support partitioning with static number on required distribution and ordering on V2 write

2021-01-30 Thread GitBox
aokolnychyi commented on a change in pull request #31355: URL: https://github.com/apache/spark/pull/31355#discussion_r567373288 ## File path: sql/core/src/test/scala/org/apache/spark/sql/connector/WriteDistributionAndOrderingSuite.scala ## @@ -372,17 +492,82 @@ class WriteDist

[GitHub] [spark] HeartSaVioR commented on a change in pull request #31355: [SPARK-34255][SQL] Support partitioning with static number on required distribution and ordering on V2 write

2021-01-30 Thread GitBox
HeartSaVioR commented on a change in pull request #31355: URL: https://github.com/apache/spark/pull/31355#discussion_r567372512 ## File path: sql/core/src/test/scala/org/apache/spark/sql/connector/WriteDistributionAndOrderingSuite.scala ## @@ -372,17 +492,82 @@ class WriteDist

[GitHub] [spark] HeartSaVioR commented on pull request #31355: [SPARK-34255][SQL] Support partitioning with static number on required distribution and ordering on V2 write

2021-01-30 Thread GitBox
HeartSaVioR commented on pull request #31355: URL: https://github.com/apache/spark/pull/31355#issuecomment-770332998 I agree it may not be a good idea to add the method directly in RequiresDistributionAndOrdering, if we are sure to expand the possibility of controlling partitions. I'm open

[GitHub] [spark] HeartSaVioR commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
HeartSaVioR commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770332409 I think the most important part of this PR is the addition of metrics in public API. As the addition is going to affect the public API, I think we'll need to ping folks work

[GitHub] [spark] SparkQA commented on pull request #30542: [SPARK-33594][SQL] Forbid binary type as partition column

2021-01-30 Thread GitBox
SparkQA commented on pull request #30542: URL: https://github.com/apache/spark/pull/30542#issuecomment-770326094 **[Test build #134686 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134686/testReport)** for PR 30542 at commit [`bd928f3`](https://github.com

[GitHub] [spark] HyukjinKwon commented on a change in pull request #31207: [SPARK-34136][PYTHON][SQL] Add support for complex literals in PySpark

2021-01-30 Thread GitBox
HyukjinKwon commented on a change in pull request #31207: URL: https://github.com/apache/spark/pull/31207#discussion_r567365177 ## File path: python/pyspark/sql/functions.py ## @@ -91,13 +92,58 @@ def lit(col): Creates a :class:`Column` of literal value. .. versiona

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31273: [WIP][Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase

2021-01-30 Thread GitBox
AmplabJenkins removed a comment on pull request #31273: URL: https://github.com/apache/spark/pull/31273#issuecomment-770324995 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31400: [SPARK-34299][SQL] Clean up ResolveSessionCatalog's isTempView and isTempFunction

2021-01-30 Thread GitBox
AmplabJenkins removed a comment on pull request #31400: URL: https://github.com/apache/spark/pull/31400#issuecomment-770310973

[GitHub] [spark] AmplabJenkins commented on pull request #31400: [SPARK-34299][SQL] Clean up ResolveSessionCatalog's isTempView and isTempFunction

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31400: URL: https://github.com/apache/spark/pull/31400#issuecomment-770324996 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134682/

[GitHub] [spark] AmplabJenkins commented on pull request #31273: [WIP][Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31273: URL: https://github.com/apache/spark/pull/31273#issuecomment-770324995

[GitHub] [spark] SparkQA removed a comment on pull request #31273: [WIP][Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase

2021-01-30 Thread GitBox
SparkQA removed a comment on pull request #31273: URL: https://github.com/apache/spark/pull/31273#issuecomment-770315763 **[Test build #134685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134685/testReport)** for PR 31273 at commit [`0fbf1cf`](https://gi

[GitHub] [spark] SparkQA commented on pull request #31273: [WIP][Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase

2021-01-30 Thread GitBox
SparkQA commented on pull request #31273: URL: https://github.com/apache/spark/pull/31273#issuecomment-770324692 **[Test build #134685 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134685/testReport)** for PR 31273 at commit [`0fbf1cf`](https://github.co

[GitHub] [spark] HyukjinKwon commented on a change in pull request #31401: [SPARK-34300][PYSPARK][DOCS][MINOR] Fix some typos and syntax issues in docstrings and output of `dev/lint-python`

2021-01-30 Thread GitBox
HyukjinKwon commented on a change in pull request #31401: URL: https://github.com/apache/spark/pull/31401#discussion_r567363021 ## File path: python/pyspark/sql/avro/functions.py ## @@ -37,7 +37,7 @@ def from_avro(data, jsonFormatSchema, options=None): Parameters --

[GitHub] [spark] SparkQA removed a comment on pull request #31400: [SPARK-34299][SQL] Clean up ResolveSessionCatalog's isTempView and isTempFunction

2021-01-30 Thread GitBox
SparkQA removed a comment on pull request #31400: URL: https://github.com/apache/spark/pull/31400#issuecomment-770298984 **[Test build #134682 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134682/testReport)** for PR 31400 at commit [`50b9aa1`](https://gi

[GitHub] [spark] HyukjinKwon commented on pull request #31400: [SPARK-34299][SQL] Clean up ResolveSessionCatalog's isTempView and isTempFunction

2021-01-30 Thread GitBox
HyukjinKwon commented on pull request #31400: URL: https://github.com/apache/spark/pull/31400#issuecomment-770321856 Merged to master.

[GitHub] [spark] SparkQA commented on pull request #31400: [SPARK-34299][SQL] Clean up ResolveSessionCatalog's isTempView and isTempFunction

2021-01-30 Thread GitBox
SparkQA commented on pull request #31400: URL: https://github.com/apache/spark/pull/31400#issuecomment-770321857 **[Test build #134682 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134682/testReport)** for PR 31400 at commit [`50b9aa1`](https://github.co

[GitHub] [spark] HyukjinKwon closed pull request #31400: [SPARK-34299][SQL] Clean up ResolveSessionCatalog's isTempView and isTempFunction

2021-01-30 Thread GitBox
HyukjinKwon closed pull request #31400: URL: https://github.com/apache/spark/pull/31400

[GitHub] [spark] HyukjinKwon commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
HyukjinKwon commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770321496 cc @HeartSaVioR, @gaborgsomogyi, @xuanyuanking FYI

[GitHub] [spark] HyukjinKwon commented on a change in pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-30 Thread GitBox
HyukjinKwon commented on a change in pull request #29542: URL: https://github.com/apache/spark/pull/29542#discussion_r567360974 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java ## @@ -92,67 +88,23 @@

[GitHub] [spark] HyukjinKwon commented on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-30 Thread GitBox
HyukjinKwon commented on pull request #29542: URL: https://github.com/apache/spark/pull/29542#issuecomment-770321147 Yeah, because there was a resource leak problem (https://github.com/apache/spark/pull/29542#pullrequestreview-478269264) that's fixed in Parquet 1.11.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
AmplabJenkins removed a comment on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770320405 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39271/

[GitHub] [spark] AmplabJenkins commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770320405 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39271/

[GitHub] [spark] shangxinli commented on pull request #31393: [SPARK-34289][SQL] Parquet vectorized reader support column index

2021-01-30 Thread GitBox
shangxinli commented on pull request #31393: URL: https://github.com/apache/spark/pull/31393#issuecomment-770317824 Nice work!

[GitHub] [spark] SparkQA commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
SparkQA commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770316688 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39271/

[GitHub] [spark] SparkQA commented on pull request #31273: [WIP][Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase

2021-01-30 Thread GitBox
SparkQA commented on pull request #31273: URL: https://github.com/apache/spark/pull/31273#issuecomment-770315763 **[Test build #134685 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134685/testReport)** for PR 31273 at commit [`0fbf1cf`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31273: [WIP][Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase

2021-01-30 Thread GitBox
AmplabJenkins removed a comment on pull request #31273: URL: https://github.com/apache/spark/pull/31273#issuecomment-770311366

[GitHub] [spark] AmplabJenkins commented on pull request #31273: [WIP][Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31273: URL: https://github.com/apache/spark/pull/31273#issuecomment-770315633 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39270/

[GitHub] [spark] SparkQA commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
SparkQA commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770315237 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39271/

[GitHub] [spark] SparkQA removed a comment on pull request #31399: [SPARK-34259][SQL] Don't attempt to parse file-based partitions as special timestamps

2021-01-30 Thread GitBox
SparkQA removed a comment on pull request #31399: URL: https://github.com/apache/spark/pull/31399#issuecomment-770279357 **[Test build #134679 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134679/testReport)** for PR 31399 at commit [`c48c4ba`](https://gi

[GitHub] [spark] SparkQA removed a comment on pull request #31273: [WIP][Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase

2021-01-30 Thread GitBox
SparkQA removed a comment on pull request #31273: URL: https://github.com/apache/spark/pull/31273#issuecomment-770305305 **[Test build #134683 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134683/testReport)** for PR 31273 at commit [`ac663aa`](https://gi

[GitHub] [spark] SparkQA commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
SparkQA commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770311985 **[Test build #134684 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134684/testReport)** for PR 31398 at commit [`eebf9c6`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #31273: [WIP][Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31273: URL: https://github.com/apache/spark/pull/31273#issuecomment-770311366 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134683/

[GitHub] [spark] SparkQA commented on pull request #31273: [WIP][Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase

2021-01-30 Thread GitBox
SparkQA commented on pull request #31273: URL: https://github.com/apache/spark/pull/31273#issuecomment-770311319 **[Test build #134683 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134683/testReport)** for PR 31273 at commit [`ac663aa`](https://github.co

[GitHub] [spark] AmplabJenkins commented on pull request #31400: [SPARK-34299][SQL] Clean up ResolveSessionCatalog's isTempView and isTempFunction

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31400: URL: https://github.com/apache/spark/pull/31400#issuecomment-770310973 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39269/

[GitHub] [spark] AmplabJenkins commented on pull request #31399: [SPARK-34259][SQL] Don't attempt to parse file-based partitions as special timestamps

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31399: URL: https://github.com/apache/spark/pull/31399#issuecomment-770310972 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134679/

[GitHub] [spark] SparkQA commented on pull request #31399: [SPARK-34259][SQL] Don't attempt to parse file-based partitions as special timestamps

2021-01-30 Thread GitBox
SparkQA commented on pull request #31399: URL: https://github.com/apache/spark/pull/31399#issuecomment-770309878 **[Test build #134679 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134679/testReport)** for PR 31399 at commit [`c48c4ba`](https://github.co

[GitHub] [spark] dongjoon-hyun closed pull request #31352: [SPARK-34269][SQL][TESTS][FOLLOWUP] Test a subquery with view in aggregate's grouping expression

2021-01-30 Thread GitBox
dongjoon-hyun closed pull request #31352: URL: https://github.com/apache/spark/pull/31352

[GitHub] [spark] github-actions[bot] commented on pull request #29392: [SPARK-32574][CORE] Race condition in FsHistoryProvider listing iteration

2021-01-30 Thread GitBox
github-actions[bot] commented on pull request #29392: URL: https://github.com/apache/spark/pull/29392#issuecomment-770306238 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue ma

[GitHub] [spark] SparkQA commented on pull request #31400: [SPARK-34299][SQL] Clean up ResolveSessionCatalog's isTempView and isTempFunction

2021-01-30 Thread GitBox
SparkQA commented on pull request #31400: URL: https://github.com/apache/spark/pull/31400#issuecomment-770306218 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39269/

[GitHub] [spark] SparkQA commented on pull request #31273: [WIP][Spark-34152][SQL] Make CreateViewStatement.child to be LogicalPlan's children so that it's resolved in analyze phase

2021-01-30 Thread GitBox
SparkQA commented on pull request #31273: URL: https://github.com/apache/spark/pull/31273#issuecomment-770305305 **[Test build #134683 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134683/testReport)** for PR 31273 at commit [`ac663aa`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
AmplabJenkins removed a comment on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770285132

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31352: [SPARK-34269][SQL][TESTS][FOLLOWUP] Test a subquery with view in aggregate's grouping expression

2021-01-30 Thread GitBox
AmplabJenkins removed a comment on pull request #31352: URL: https://github.com/apache/spark/pull/31352#issuecomment-770285131

[GitHub] [spark] AmplabJenkins commented on pull request #31352: [SPARK-34269][SQL][TESTS][FOLLOWUP] Test a subquery with view in aggregate's grouping expression

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31352: URL: https://github.com/apache/spark/pull/31352#issuecomment-770304990 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134678/

[GitHub] [spark] AmplabJenkins commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770304991

[GitHub] [spark] SparkQA commented on pull request #31400: [SPARK-34299][SQL] Clean up ResolveSessionCatalog's isTempView and isTempFunction

2021-01-30 Thread GitBox
SparkQA commented on pull request #31400: URL: https://github.com/apache/spark/pull/31400#issuecomment-770303352 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39269/

[GitHub] [spark] srowen commented on pull request #31395: [SPARK-33599][SQL] Restore the assert-like in catalyst/analysis

2021-01-30 Thread GitBox
srowen commented on pull request #31395: URL: https://github.com/apache/spark/pull/31395#issuecomment-770303237 I tend to agree that the extra indirection doesn't help much here, but was there not an effort to centralize error messages? This seems to run counter to that.

[GitHub] [spark] srowen commented on pull request #29542: [SPARK-32703][SQL] Replace deprecated API calls from SpecificParquetRecordReaderBase

2021-01-30 Thread GitBox
srowen commented on pull request #29542: URL: https://github.com/apache/spark/pull/29542#issuecomment-770303159 This can go ahead because of the Parquet 1.11.1 update?

[GitHub] [spark] srowen commented on a change in pull request #31401: [SPARK-34300][PYSPARK][DOCS][MINOR] Fix some typos and syntax issues in docstrings and output of `dev/lint-python`

2021-01-30 Thread GitBox
srowen commented on a change in pull request #31401: URL: https://github.com/apache/spark/pull/31401#discussion_r567343162 ## File path: python/pyspark/sql/avro/functions.py ## @@ -37,7 +37,7 @@ def from_avro(data, jsonFormatSchema, options=None): Parameters ---

[GitHub] [spark] SparkQA removed a comment on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
SparkQA removed a comment on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770293344 **[Test build #134681 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134681/testReport)** for PR 31398 at commit [`b4ff48c`](https://gi

[GitHub] [spark] SparkQA commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
SparkQA commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770302398 **[Test build #134681 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134681/testReport)** for PR 31398 at commit [`b4ff48c`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #31352: [SPARK-34269][SQL][TESTS][FOLLOWUP] Test a subquery with view in aggregate's grouping expression

2021-01-30 Thread GitBox
SparkQA removed a comment on pull request #31352: URL: https://github.com/apache/spark/pull/31352#issuecomment-770271596 **[Test build #134678 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134678/testReport)** for PR 31352 at commit [`76772e7`](https://gi

[GitHub] [spark] SparkQA commented on pull request #31352: [SPARK-34269][SQL][TESTS][FOLLOWUP] Test a subquery with view in aggregate's grouping expression

2021-01-30 Thread GitBox
SparkQA commented on pull request #31352: URL: https://github.com/apache/spark/pull/31352#issuecomment-770302162 **[Test build #134678 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134678/testReport)** for PR 31352 at commit [`76772e7`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
SparkQA commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770299772 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39268/

[GitHub] [spark] SparkQA commented on pull request #31400: [SPARK-34299][SQL] Clean up ResolveSessionCatalog's isTempView and isTempFunction

2021-01-30 Thread GitBox
SparkQA commented on pull request #31400: URL: https://github.com/apache/spark/pull/31400#issuecomment-770298984 **[Test build #134682 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134682/testReport)** for PR 31400 at commit [`50b9aa1`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #31401: [SPARK-34300][PYSPARK][DOCS][MINOR] Fix some typos and syntax issues in docstrings and output of `dev/lint-python`

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31401: URL: https://github.com/apache/spark/pull/31401#issuecomment-770298939 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31387: [SPARK-34282][SQL][TESTS] Unify v1 and v2 TRUNCATE TABLE tests

2021-01-30 Thread GitBox
AmplabJenkins removed a comment on pull request #31387: URL: https://github.com/apache/spark/pull/31387#issuecomment-770277756

[GitHub] [spark] AmplabJenkins commented on pull request #31387: [SPARK-34282][SQL][TESTS] Unify v1 and v2 TRUNCATE TABLE tests

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31387: URL: https://github.com/apache/spark/pull/31387#issuecomment-770298420 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134677/

[GitHub] [spark] srowen commented on a change in pull request #31213: [SPARK-34141][SQL] Remove side effect from ExtractGenerator

2021-01-30 Thread GitBox
srowen commented on a change in pull request #31213: URL: https://github.com/apache/spark/pull/31213#discussion_r567336697 ## File path: sql/catalyst/src/test/scala-2.13/org/apache/spark/sql/catalyst/analysis/ExtractGeneratorSuite.scala ## @@ -0,0 +1,39 @@ +/* + * Licensed to

[GitHub] [spark] DavidToneian opened a new pull request #31401: Spark 34300

2021-01-30 Thread GitBox
DavidToneian opened a new pull request #31401: URL: https://github.com/apache/spark/pull/31401 This changeset is published into the public domain. ### What changes were proposed in this pull request? Some typos and syntax issues in docstrings have been fixed. ### Why are

[GitHub] [spark] SparkQA commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
SparkQA commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770296437 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39268/ -

[GitHub] [spark] imback82 opened a new pull request #31400: [SPARK-34299][SQL] Clean up ResolveSessionCatalog's isTempView and isTempFunction

2021-01-30 Thread GitBox
imback82 opened a new pull request #31400: URL: https://github.com/apache/spark/pull/31400 ### What changes were proposed in this pull request? `ResolveSessionCatalog`'s `isTempView` and `isTempFunction` are not being used anymore since the resolution of temp view/function ha

[GitHub] [spark] SparkQA removed a comment on pull request #31387: [SPARK-34282][SQL][TESTS] Unify v1 and v2 TRUNCATE TABLE tests

2021-01-30 Thread GitBox
SparkQA removed a comment on pull request #31387: URL: https://github.com/apache/spark/pull/31387#issuecomment-770262019 **[Test build #134677 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134677/testReport)** for PR 31387 at commit [`4a7b7ca`](https://gi

[GitHub] [spark] SparkQA commented on pull request #31387: [SPARK-34282][SQL][TESTS] Unify v1 and v2 TRUNCATE TABLE tests

2021-01-30 Thread GitBox
SparkQA commented on pull request #31387: URL: https://github.com/apache/spark/pull/31387#issuecomment-770294654 **[Test build #134677 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134677/testReport)** for PR 31387 at commit [`4a7b7ca`](https://github.co

[GitHub] [spark] alf239 commented on pull request #31399: [SPARK-34259][SQL] Don't attempt to parse file-based partitions as special timestamps

2021-01-30 Thread GitBox
alf239 commented on pull request #31399: URL: https://github.com/apache/spark/pull/31399#issuecomment-770293914 Extending @d80tb7's example and elaborating on > Even if the column as a whole is determined to be of type string, the special values are still parsed as timestamps . T

[GitHub] [spark] SparkQA commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
SparkQA commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770293344 **[Test build #134681 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134681/testReport)** for PR 31398 at commit [`b4ff48c`](https://github.com

[GitHub] [spark] imback82 commented on pull request #31352: [SPARK-34269][SQL][TESTS][FOLLOWUP] Test a subquery with view in aggregate's grouping expression

2021-01-30 Thread GitBox
imback82 commented on pull request #31352: URL: https://github.com/apache/spark/pull/31352#issuecomment-770287681 > Hi @imback82 , can you re-open the PR to add the tests? thanks! Updated, thanks! This is an automa

[GitHub] [spark] imback82 commented on a change in pull request #27125: [SPARK-30420][SQL][FOLLOWUP] Remove statement logical plans for namespace commands

2021-01-30 Thread GitBox
imback82 commented on a change in pull request #27125: URL: https://github.com/apache/spark/pull/27125#discussion_r567325679 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveCatalogs.scala ## @@ -194,18 +185,6 @@ class ResolveCatalogs(val

[GitHub] [spark] SparkQA commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
SparkQA commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770286066 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39267/ ---

[GitHub] [spark] AmplabJenkins commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770286068 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39267/ -

[GitHub] [spark] AmplabJenkins commented on pull request #31399: [SPARK-34259][SQL] Don't attempt to parse file-based partitions as special timestamps

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31399: URL: https://github.com/apache/spark/pull/31399#issuecomment-770286044 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39266/ -

[GitHub] [spark] SparkQA commented on pull request #31399: [SPARK-34259][SQL] Don't attempt to parse file-based partitions as special timestamps

2021-01-30 Thread GitBox
SparkQA commented on pull request #31399: URL: https://github.com/apache/spark/pull/31399#issuecomment-770286036 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39266/ ---

[GitHub] [spark] alf239 commented on pull request #31399: [SPARK-34259][SQL] Don't attempt to parse file-based partitions as special timestamps

2021-01-30 Thread GitBox
alf239 commented on pull request #31399: URL: https://github.com/apache/spark/pull/31399#issuecomment-770285210 The `TODAY` approach is imaginable indeed; I think the root problem is not the magic words as a timestamp, but rather the fact that a string gets reinterpreted as a timestamp, an

[GitHub] [spark] AmplabJenkins commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770285132 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134680/ -

[GitHub] [spark] AmplabJenkins commented on pull request #31352: [SPARK-34269][SQL][TESTS][FOLLOWUP] Test a subquery with view in aggregate's grouping expression

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31352: URL: https://github.com/apache/spark/pull/31352#issuecomment-770285131 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39265/ -

[GitHub] [spark] SparkQA commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
SparkQA commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770284184 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39267/ -

[GitHub] [spark] SparkQA commented on pull request #31399: [SPARK-34259][SQL] Don't attempt to parse file-based partitions as special timestamps

2021-01-30 Thread GitBox
SparkQA commented on pull request #31399: URL: https://github.com/apache/spark/pull/31399#issuecomment-770283044 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39266/ -

[GitHub] [spark] SparkQA removed a comment on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
SparkQA removed a comment on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770279358 **[Test build #134680 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134680/testReport)** for PR 31398 at commit [`90acf6c`](https://gi

[GitHub] [spark] SparkQA commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
SparkQA commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770281282 **[Test build #134680 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134680/testReport)** for PR 31398 at commit [`90acf6c`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #31398: [SPARK-34297][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-01-30 Thread GitBox
SparkQA commented on pull request #31398: URL: https://github.com/apache/spark/pull/31398#issuecomment-770279358 **[Test build #134680 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134680/testReport)** for PR 31398 at commit [`90acf6c`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #31399: [SPARK-34259][SQL] Don't attempt to parse file-based partitions as special timestamps

2021-01-30 Thread GitBox
SparkQA commented on pull request #31399: URL: https://github.com/apache/spark/pull/31399#issuecomment-770279357 **[Test build #134679 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134679/testReport)** for PR 31399 at commit [`c48c4ba`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #31352: [SPARK-34269][SQL][TESTS][FOLLOWUP] Test a subquery with view in aggregate's grouping expression

2021-01-30 Thread GitBox
SparkQA commented on pull request #31352: URL: https://github.com/apache/spark/pull/31352#issuecomment-770279175 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39265/ ---

[GitHub] [spark] AmplabJenkins commented on pull request #31399: [SPARK-34259][SQL] Don't attempt to parse file-based partitions as special timestamps

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31399: URL: https://github.com/apache/spark/pull/31399#issuecomment-770277753 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins commented on pull request #31387: [SPARK-34282][SQL][TESTS] Unify v1 and v2 TRUNCATE TABLE tests

2021-01-30 Thread GitBox
AmplabJenkins commented on pull request #31387: URL: https://github.com/apache/spark/pull/31387#issuecomment-770277756 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39264/ -

[GitHub] [spark] MaxGekk edited a comment on pull request #31399: [SPARK-34259][SQL] Don't attempt to parse file-based partitions as special timestamps

2021-01-30 Thread GitBox
MaxGekk edited a comment on pull request #31399: URL: https://github.com/apache/spark/pull/31399#issuecomment-770277137 One more nuance: since the feature has been released already in Spark 3.0, we cannot just prohibit it because it can break users' apps. > ... but the change here doe

[GitHub] [spark] MaxGekk commented on pull request #31399: [SPARK-34259][SQL] Don't attempt to parse file-based partitions as special timestamps

2021-01-30 Thread GitBox
MaxGekk commented on pull request #31399: URL: https://github.com/apache/spark/pull/31399#issuecomment-770277137 One more nuance: since the feature has been released already in Spark 3.0, we cannot just prohibit it because it can break users' apps. > ... but the change here doesn't ac

[GitHub] [spark] viirya edited a comment on pull request #31182: [SPARK-34269][SQL][TESTS][FOLLOWUP] Add test cases for cache lookup and project removal

2021-01-30 Thread GitBox
viirya edited a comment on pull request #31182: URL: https://github.com/apache/spark/pull/31182#issuecomment-770276308 Thanks @sunchao . Merging to master. This is an automated message from the Apache Git Service. To respond
