[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-11-02 Thread GitBox
AngersZhuuuu commented on a change in pull request #34337: URL: https://github.com/apache/spark/pull/34337#discussion_r740683683 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRebaseDatetimeSuite.scala ## @@ -164,7 +165,13 @@ ab

[GitHub] [spark] cloud-fan commented on a change in pull request #34462: [SPARK-37191][SQL] Allow merging DecimalTypes with different precision values

2021-11-02 Thread GitBox
cloud-fan commented on a change in pull request #34462: URL: https://github.com/apache/spark/pull/34462#discussion_r740747727 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala ## @@ -643,15 +643,14 @@ object StructType extends AbstractDataTy

[GitHub] [spark] ByronHsu commented on a change in pull request #34466: [SPARK-37152][PYTHON] Inline type hints for python/pyspark/context.py

2021-11-02 Thread GitBox
ByronHsu commented on a change in pull request #34466: URL: https://github.com/apache/spark/pull/34466#discussion_r740760086 ## File path: python/pyspark/context.py ## @@ -150,8 +161,10 @@ def __init__(self, master=None, appName=None, sparkHome=None, pyFiles=None,

[GitHub] [spark] AmplabJenkins commented on pull request #34455: [SPARK-37176][SQL] Sync JsonInferSchema#infer method's exception handle logic with JacksonParser#parse method

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34455: URL: https://github.com/apache/spark/pull/34455#issuecomment-957264008 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [spark] SparkQA commented on pull request #34443: [SPARK-37168][SQL] Improve error messages for SQL functions and operators under ANSI mode

2021-11-02 Thread GitBox
SparkQA commented on pull request #34443: URL: https://github.com/apache/spark/pull/34443#issuecomment-957020468

[GitHub] [spark] xinrong-databricks commented on pull request #34374: [SPARK-37104][PYTHON] Make RDD and DStream covariant

2021-11-02 Thread GitBox
xinrong-databricks commented on pull request #34374: URL: https://github.com/apache/spark/pull/34374#issuecomment-957965048 Would you please give a short example of how the PR **improves the usability of the current annotations and simplifies further development of type hints**?

[GitHub] [spark] gengliangwang commented on a change in pull request #34459: [SPARK-37179][SQL] ANSI mode: Add a config to allow casting between Datetime and Numeric

2021-11-02 Thread GitBox
gengliangwang commented on a change in pull request #34459: URL: https://github.com/apache/spark/pull/34459#discussion_r741298185 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala ## @@ -971,6 +971,10 @@ object QueryExecutionError

[GitHub] [spark] zero323 commented on pull request #34466: [SPARK-37152][PYTHON] Inline type hints for python/pyspark/context.py

2021-11-02 Thread GitBox
zero323 commented on pull request #34466: URL: https://github.com/apache/spark/pull/34466#issuecomment-957327097 > 1. I used lots of "cast" to cancel the error, but is there other better way? I see that these are primarily related to union attributes. We have ongoing discussion

[GitHub] [spark] HyukjinKwon closed pull request #34437: [SPARK-37156][PYTHON] Inline type hints for python/pyspark/storagelevel.py

2021-11-02 Thread GitBox
HyukjinKwon closed pull request #34437: URL: https://github.com/apache/spark/pull/34437

[GitHub] [spark] AmplabJenkins commented on pull request #33628: [SPARK-36406][CORE] Avoid unnecessary file operations before delete a write failed file held by DiskBlockObjectWriter

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #33628: URL: https://github.com/apache/spark/pull/33628#issuecomment-957119251

[GitHub] [spark] srowen commented on pull request #34457: [SPARK-37178][ML] Add Target Encoding to ml.feature

2021-11-02 Thread GitBox
srowen commented on pull request #34457: URL: https://github.com/apache/spark/pull/34457#issuecomment-957021245 This appears to assume the target is 0/1. Target encoding is more general than that. This would have to be implemented in Python as well at least, and have tests.

[GitHub] [spark] SparkQA commented on pull request #34431: [SPARK-35437][SQL] Use expressions to filter Hive partitions at client side

2021-11-02 Thread GitBox
SparkQA commented on pull request #34431: URL: https://github.com/apache/spark/pull/34431#issuecomment-957076039

[GitHub] [spark] MaxGekk edited a comment on pull request #34455: [SPARK-37176][SQL] Sync JsonInferSchema#infer method's exception handle logic with JacksonParser#parse method

2021-11-02 Thread GitBox
MaxGekk edited a comment on pull request #34455: URL: https://github.com/apache/spark/pull/34455#issuecomment-957269638 +1, LGTM. Merging to master. Thank you, @advancedxy and @yaooqinn @HyukjinKwon for review.

[GitHub] [spark] SparkQA removed a comment on pull request #34443: [SPARK-37168][SQL] Improve error messages for SQL functions and operators under ANSI mode

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34443: URL: https://github.com/apache/spark/pull/34443#issuecomment-957020468 **[Test build #144826 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144826/testReport)** for PR 34443 at commit [`7c3ee2c`](https://gi

[GitHub] [spark] HyukjinKwon commented on pull request #34459: [SPARK-37179][SQL] ANSI mode: Allow casting between Timestamp and Numeric

2021-11-02 Thread GitBox
HyukjinKwon commented on pull request #34459: URL: https://github.com/apache/spark/pull/34459#issuecomment-957131152 @gengliangwang, I think "TimestampNTZ <=> Numeric" should be disallowed because it does not have a timezone, and it cannot determine the actual timestamp value. Returning th

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34445: [SPARK-36646][SQL] Push down group by partition column for aggregate

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34445: URL: https://github.com/apache/spark/pull/34445#issuecomment-958044555

[GitHub] [spark] cloud-fan closed pull request #34441: [SPARK-37164][SQL] Add ExpressionBuilder for functions with complex overloads

2021-11-02 Thread GitBox
cloud-fan closed pull request #34441: URL: https://github.com/apache/spark/pull/34441

[GitHub] [spark] SparkQA removed a comment on pull request #34463: [SPARK-37190][SQL] Improve error messages for Cast under ANSI mode

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34463: URL: https://github.com/apache/spark/pull/34463#issuecomment-956898943

[GitHub] [spark] AmplabJenkins commented on pull request #34469: Support drop index

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34469: URL: https://github.com/apache/spark/pull/34469#issuecomment-957974156

[GitHub] [spark] SparkQA commented on pull request #34468: [SPARK-37194][SQL] Avoid unnecessary sort in FileFormatWriter if it's not dynamic partition

2021-11-02 Thread GitBox
SparkQA commented on pull request #34468: URL: https://github.com/apache/spark/pull/34468#issuecomment-957268010

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34411: [SPARK-37137][PYTHON] Inline type hints for python/pyspark/conf.py

2021-11-02 Thread GitBox
HyukjinKwon commented on a change in pull request #34411: URL: https://github.com/apache/spark/pull/34411#discussion_r740773747 ## File path: python/pyspark/conf.py ## @@ -124,48 +130,57 @@ def __init__(self, loadDefaults=True, _jvm=None, _jconf=None): self._j

[GitHub] [spark] AmplabJenkins commented on pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34337: URL: https://github.com/apache/spark/pull/34337#issuecomment-957113832

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #34199: [SPARK-36935][SQL] Extend ParquetSchemaConverter to compute Parquet repetition & definition level

2021-11-02 Thread GitBox
dongjoon-hyun commented on a change in pull request #34199: URL: https://github.com/apache/spark/pull/34199#discussion_r740727131 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java ## @@ -152,6 +152,7 @@

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34464: [SPARK-37193][SQL] DynamicJoinSelection.shouldDemoteBroadcastHashJoin should not apply to outer joins

2021-11-02 Thread GitBox
HyukjinKwon commented on a change in pull request #34464: URL: https://github.com/apache/spark/pull/34464#discussion_r740744598 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala ## @@ -651,6 +651,23 @@ class AdaptiveQuer

[GitHub] [spark] SparkQA removed a comment on pull request #34402: [SPARK-30220] Enable using Exists/In subqueries outside of the Filter node

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34402: URL: https://github.com/apache/spark/pull/34402#issuecomment-957114414 **[Test build #144833 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144833/testReport)** for PR 34402 at commit [`ce3adb5`](https://gi

[GitHub] [spark] SparkQA removed a comment on pull request #32987: [SPARK-35564][SQL] Support subexpression elimination for conditionally evaluated expressions

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #32987: URL: https://github.com/apache/spark/pull/32987#issuecomment-956887079 **[Test build #144822 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144822/testReport)** for PR 32987 at commit [`f4ed2be`](https://gi

[GitHub] [spark] SparkQA removed a comment on pull request #34469: Support drop index

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34469: URL: https://github.com/apache/spark/pull/34469#issuecomment-957748528

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #33404: URL: https://github.com/apache/spark/pull/33404#issuecomment-957019972

[GitHub] [spark] SparkQA commented on pull request #34199: [SPARK-36935][SQL] Extend ParquetSchemaConverter to compute Parquet repetition & definition level

2021-11-02 Thread GitBox
SparkQA commented on pull request #34199: URL: https://github.com/apache/spark/pull/34199#issuecomment-958160628 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49328/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34337: URL: https://github.com/apache/spark/pull/34337#issuecomment-957113832

[GitHub] [spark] sunchao commented on a change in pull request #34199: [SPARK-36935][SQL] Extend ParquetSchemaConverter to compute Parquet repetition & definition level

2021-11-02 Thread GitBox
sunchao commented on a change in pull request #34199: URL: https://github.com/apache/spark/pull/34199#discussion_r741271037 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetColumn.scala ## @@ -0,0 +1,68 @@ +/* + * Licensed to the

[GitHub] [spark] AmplabJenkins commented on pull request #34445: [SPARK-36646][SQL] Push down group by partition column for aggregate

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34445: URL: https://github.com/apache/spark/pull/34445#issuecomment-958044555

[GitHub] [spark] AmplabJenkins commented on pull request #34431: [SPARK-35437][SQL] Use expressions to filter Hive partitions at client side

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34431: URL: https://github.com/apache/spark/pull/34431#issuecomment-957117831

[GitHub] [spark] SparkQA removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34120: URL: https://github.com/apache/spark/pull/34120#issuecomment-956640806 **[Test build #144820 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144820/testReport)** for PR 34120 at commit [`023963c`](https://gi

[GitHub] [spark] SparkQA commented on pull request #34402: [SPARK-30220] Enable using Exists/In subqueries outside of the Filter node

2021-11-02 Thread GitBox
SparkQA commented on pull request #34402: URL: https://github.com/apache/spark/pull/34402#issuecomment-957114414

[GitHub] [spark] Peng-Lei commented on pull request #34462: [SPARK-37191] Allow merging DecimalTypes with different precision values

2021-11-02 Thread GitBox
Peng-Lei commented on pull request #34462: URL: https://github.com/apache/spark/pull/34462#issuecomment-957064173 title: [SPARK-37191] -> [SPARK-37191][SQL]

[GitHub] [spark] SparkQA commented on pull request #33628: [SPARK-36406][CORE] Avoid unnecessary file operations before delete a write failed file held by DiskBlockObjectWriter

2021-11-02 Thread GitBox
SparkQA commented on pull request #33628: URL: https://github.com/apache/spark/pull/33628#issuecomment-957070742

[GitHub] [spark] SparkQA removed a comment on pull request #34444: [SPARK-32567][SQL] Add code-gen for full outer shuffled hash join

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34444: URL: https://github.com/apache/spark/pull/34444#issuecomment-956639649 **[Test build #144819 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144819/testReport)** for PR 34444 at commit [`5dcd5db`](https://gi

[GitHub] [spark] HyukjinKwon removed a comment on pull request #34459: [SPARK-37179][SQL] ANSI mode: Allow casting between Timestamp and Numeric

2021-11-02 Thread GitBox
HyukjinKwon removed a comment on pull request #34459: URL: https://github.com/apache/spark/pull/34459#issuecomment-957131152 @gengliangwang, I think "TimestampNTZ <=> Numeric" should be disallowed because it does not have a timezone, and it cannot determine the actual timestamp value. Retu

[GitHub] [spark] xuechendi commented on pull request #34396: [SPARK-37124][SQL] Support RowToColumnarExec with Arrow format

2021-11-02 Thread GitBox
xuechendi commented on pull request #34396: URL: https://github.com/apache/spark/pull/34396#issuecomment-957064031 @cloud-fan @viirya @sunchao , hi, all, I realized that what you guys said makes sense to me, instead of writing data to arrow like what other WritableColumnVector does, I can

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34463: [SPARK-37190][SQL] Improve error messages for Cast under ANSI mode

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34463: URL: https://github.com/apache/spark/pull/34463#issuecomment-957019971

[GitHub] [spark] AmplabJenkins commented on pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #33404: URL: https://github.com/apache/spark/pull/33404#issuecomment-957019972

[GitHub] [spark] cloud-fan commented on pull request #34441: [SPARK-37164][SQL] Add ExpressionBuilder for functions with complex overloads

2021-11-02 Thread GitBox
cloud-fan commented on pull request #34441: URL: https://github.com/apache/spark/pull/34441#issuecomment-957130287 thanks for the review, merging to master!

[GitHub] [spark] AmplabJenkins commented on pull request #34467: [SPARK-36895][SQL][FOLLOWUP] CREATE INDEX command should rely on the analyzer framework to resolve columns

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34467: URL: https://github.com/apache/spark/pull/34467#issuecomment-957354920

[GitHub] [spark] SparkQA commented on pull request #34458: [MINOR][DOCS] Corrected spacing in structured streaming programming

2021-11-02 Thread GitBox
SparkQA commented on pull request #34458: URL: https://github.com/apache/spark/pull/34458#issuecomment-957021782

[GitHub] [spark] cxzl25 commented on a change in pull request #34041: [SPARK-36799][SQL] Pass queryExecution name in CLI when only select query

2021-11-02 Thread GitBox
cxzl25 commented on a change in pull request #34041: URL: https://github.com/apache/spark/pull/34041#discussion_r740756030 ## File path: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLDriver.scala ## @@ -65,7 +65,11 @@ private[hive] class S

[GitHub] [spark] SparkQA commented on pull request #34460: [SPARK-36566][K8S] Add Spark appname as a label to pods

2021-11-02 Thread GitBox
SparkQA commented on pull request #34460: URL: https://github.com/apache/spark/pull/34460#issuecomment-957045681

[GitHub] [spark] SparkQA removed a comment on pull request #34411: [SPARK-37137][PYTHON] Inline type hints for python/pyspark/conf.py

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34411: URL: https://github.com/apache/spark/pull/34411#issuecomment-957166087 **[Test build #144842 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144842/testReport)** for PR 34411 at commit [`82898e0`](https://gi

[GitHub] [spark] AmplabJenkins commented on pull request #34470: [SPARK-37199][SQL]: Add deterministic field to QueryPlan

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34470: URL: https://github.com/apache/spark/pull/34470#issuecomment-958024984 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34460: [SPARK-36566][K8S] Add Spark appname as a label to pods

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34460: URL: https://github.com/apache/spark/pull/34460#issuecomment-957069404

[GitHub] [spark] AmplabJenkins commented on pull request #34411: [SPARK-37137][PYTHON] Inline type hints for python/pyspark/conf.py

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34411: URL: https://github.com/apache/spark/pull/34411#issuecomment-95728

[GitHub] [spark] AmplabJenkins commented on pull request #34468: [SPARK-37194][SQL] Avoid unnecessary sort in FileFormatWriter if it's not dynamic partition

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34468: URL: https://github.com/apache/spark/pull/34468#issuecomment-957354917

[GitHub] [spark] SparkQA commented on pull request #34469: Support drop index

2021-11-02 Thread GitBox
SparkQA commented on pull request #34469: URL: https://github.com/apache/spark/pull/34469#issuecomment-957748528

[GitHub] [spark] xinrong-databricks edited a comment on pull request #34374: [SPARK-37104][PYTHON] Make RDD and DStream covariant

2021-11-02 Thread GitBox
xinrong-databricks edited a comment on pull request #34374: URL: https://github.com/apache/spark/pull/34374#issuecomment-957965048 Would you please give a short example of how the PR **improves the usability of the current annotations and simplifies further development of type hints**? Tha

[GitHub] [spark] LuciferYang commented on a change in pull request #32648: [WIP][SPARK-35496][BUILD] Upgrade Scala to 2.13.7

2021-11-02 Thread GitBox
LuciferYang commented on a change in pull request #32648: URL: https://github.com/apache/spark/pull/32648#discussion_r740814443 ## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala ## @@ -57,7 +57,8 @@

[GitHub] [spark] advancedxy commented on pull request #34455: [SPARK-37176][SQL] Sync JsonInferSchema#infer method's exception handle logic with JacksonParser#parse method

2021-11-02 Thread GitBox
advancedxy commented on pull request #34455: URL: https://github.com/apache/spark/pull/34455#issuecomment-957170019 > im okay with this change. I think this is dupe of #33706 but let's go ahead with this PR. cc @yaooqinn FYI Ah, I didn't see that one. @yaooqinn would like to fini

[GitHub] [spark] AmplabJenkins commented on pull request #34463: [SPARK-37190][SQL] Improve error messages for Cast under ANSI mode

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34463: URL: https://github.com/apache/spark/pull/34463#issuecomment-957019971

[GitHub] [spark] MaxGekk commented on pull request #34455: [SPARK-37176][SQL] Sync JsonInferSchema#infer method's exception handle logic with JacksonParser#parse method

2021-11-02 Thread GitBox
MaxGekk commented on pull request #34455: URL: https://github.com/apache/spark/pull/34455#issuecomment-957173286

[GitHub] [spark] zero323 edited a comment on pull request #34466: [SPARK-37152][PYTHON] Inline type hints for python/pyspark/context.py

2021-11-02 Thread GitBox
zero323 edited a comment on pull request #34466: URL: https://github.com/apache/spark/pull/34466#issuecomment-957327097 > 1. I used lots of "cast" to cancel the error, but is there other better way? I see that these are primarily related to union attributes. We have ongoing discuss

[GitHub] [spark] HyukjinKwon commented on pull request #34411: [SPARK-37137][PYTHON] Inline type hints for python/pyspark/conf.py

2021-11-02 Thread GitBox
HyukjinKwon commented on pull request #34411: URL: https://github.com/apache/spark/pull/34411#issuecomment-957164787 add to whitelist

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34465: [MINOR] Document JDBC aggregate push down is for DSV2 only

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34465: URL: https://github.com/apache/spark/pull/34465#issuecomment-957211126

[GitHub] [spark] allisonwang-db commented on pull request #34463: [SPARK-37190][SQL] Improve error messages for Cast under ANSI mode

2021-11-02 Thread GitBox
allisonwang-db commented on pull request #34463: URL: https://github.com/apache/spark/pull/34463#issuecomment-958134646 cc @cloud-fan

[GitHub] [spark] sunchao commented on a change in pull request #34445: [SPARK-36646][SQL] Push down group by partition column for aggregate

2021-11-02 Thread GitBox
sunchao commented on a change in pull request #34445: URL: https://github.com/apache/spark/pull/34445#discussion_r741361243 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcUtils.scala ## @@ -457,17 +457,22 @@ object OrcUtils extends Logg

[GitHub] [spark] HyukjinKwon commented on pull request #34455: [SPARK-37176][SQL] Sync JsonInferSchema#infer method's exception handle logic with JacksonParser#parse method

2021-11-02 Thread GitBox
HyukjinKwon commented on pull request #34455: URL: https://github.com/apache/spark/pull/34455#issuecomment-957134486 im okay with this change. I think this is dupe of https://github.com/apache/spark/pull/33706 but let's go ahead with this PR. cc @yaooqinn FYI

[GitHub] [spark] SparkQA removed a comment on pull request #34455: [SPARK-37176][SQL] Sync JsonInferSchema#infer method's exception handle logic with JacksonParser#parse method

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34455: URL: https://github.com/apache/spark/pull/34455#issuecomment-957166042 **[Test build #144841 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144841/testReport)** for PR 34455 at commit [`8dae181`](https://gi

[GitHub] [spark] SparkQA commented on pull request #34462: [SPARK-37191][SQL] Allow merging DecimalTypes with different precision values

2021-11-02 Thread GitBox
SparkQA commented on pull request #34462: URL: https://github.com/apache/spark/pull/34462#issuecomment-957138180

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34467: [SPARK-36895][SQL][FOLLOWUP] CREATE INDEX command should rely on the analyzer framework to resolve columns

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34467: URL: https://github.com/apache/spark/pull/34467#issuecomment-957354920

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34458: [MINOR][DOCS] Corrected spacing in structured streaming programming

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34458: URL: https://github.com/apache/spark/pull/34458#issuecomment-956282737

[GitHub] [spark] SparkQA removed a comment on pull request #34431: [SPARK-35437][SQL] Use expressions to filter Hive partitions at client side

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34431: URL: https://github.com/apache/spark/pull/34431#issuecomment-957076039 **[Test build #144831 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144831/testReport)** for PR 34431 at commit [`fa69af0`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34302: [SPARK-37028][UI] Add a 'kill' executor link in the Web UI.

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34302: URL: https://github.com/apache/spark/pull/34302#issuecomment-945630961 Can one of the admins verify this patch?

[GitHub] [spark] ulysses-you commented on pull request #34468: [SPARK-37194][SQL] Avoid unnecessary sort in FileFormatWriter if it's not dynamic partition

2021-11-02 Thread GitBox
ulysses-you commented on pull request #34468: URL: https://github.com/apache/spark/pull/34468#issuecomment-957744307 cc @yaooqinn @cloud-fan @viirya if you have time to take a look

[GitHub] [spark] AmplabJenkins commented on pull request #34460: [SPARK-36566][K8S] Add Spark appname as a label to pods

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34460: URL: https://github.com/apache/spark/pull/34460#issuecomment-957069404

[GitHub] [spark] SparkQA commented on pull request #34411: [SPARK-37137][PYTHON] Inline type hints for python/pyspark/conf.py

2021-11-02 Thread GitBox
SparkQA commented on pull request #34411: URL: https://github.com/apache/spark/pull/34411#issuecomment-957166087

[GitHub] [spark] sarutak commented on pull request #34425: [SPARK-37159][SQL][TESTS] Change HiveExternalCatalogVersionsSuite to be able to test with Java 17

2021-11-02 Thread GitBox
sarutak commented on pull request #34425: URL: https://github.com/apache/spark/pull/34425#issuecomment-957031635 Merged to `master`.

[GitHub] [spark] LuciferYang commented on pull request #32648: [WIP][SPARK-35496][BUILD] Upgrade Scala to 2.13.7

2021-11-02 Thread GitBox
LuciferYang commented on pull request #32648: URL: https://github.com/apache/spark/pull/32648#issuecomment-957142554

[GitHub] [spark] SparkQA removed a comment on pull request #32648: [WIP][SPARK-35496][BUILD] Upgrade Scala to 2.13.7

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #32648: URL: https://github.com/apache/spark/pull/32648#issuecomment-957119920

[GitHub] [spark] dchvn commented on pull request #34437: [SPARK-37156][PYTHON] Inline type hints for python/pyspark/storagelevel.py

2021-11-02 Thread GitBox
dchvn commented on pull request #34437: URL: https://github.com/apache/spark/pull/34437#issuecomment-957182262 thanks!

[GitHub] [spark] SparkQA removed a comment on pull request #33522: [SPARK-36290][SQL] Pull out join condition

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #33522: URL: https://github.com/apache/spark/pull/33522#issuecomment-957140343

[GitHub] [spark] SparkQA removed a comment on pull request #34468: [SPARK-37194][SQL] Avoid unnecessary sort in FileFormatWriter if it's not dynamic partition

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34468: URL: https://github.com/apache/spark/pull/34468#issuecomment-957268010

[GitHub] [spark] sarutak commented on pull request #34356: [SPARK-36554][SQL][PYTHON] Expose make_date expression in functions.scala

2021-11-02 Thread GitBox
sarutak commented on pull request #34356: URL: https://github.com/apache/spark/pull/34356#issuecomment-957017172

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34469: Support drop index

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34469: URL: https://github.com/apache/spark/pull/34469#issuecomment-957974155

[GitHub] [spark] HyukjinKwon commented on pull request #34462: [SPARK-37191][SQL] Allow merging DecimalTypes with different precision values

2021-11-02 Thread GitBox
HyukjinKwon commented on pull request #34462: URL: https://github.com/apache/spark/pull/34462#issuecomment-957129828 add to whitelist

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34459: [SPARK-37179][SQL] ANSI mode: Add a config to allow casting between Datetime and Numeric

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34459: URL: https://github.com/apache/spark/pull/34459#issuecomment-958023802

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32987: [SPARK-35564][SQL] Support subexpression elimination for conditionally evaluated expressions

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #32987: URL: https://github.com/apache/spark/pull/32987#issuecomment-957019966

[GitHub] [spark] HyukjinKwon commented on pull request #34437: [SPARK-37156][PYTHON] Inline type hints for python/pyspark/storagelevel.py

2021-11-02 Thread GitBox
HyukjinKwon commented on pull request #34437: URL: https://github.com/apache/spark/pull/34437#issuecomment-957171221 Merged to master.

[GitHub] [spark] AmplabJenkins commented on pull request #34466: [SPARK-37152][PYTHON] Inline type hints for python/pyspark/context.py

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34466: URL: https://github.com/apache/spark/pull/34466#issuecomment-957145027 Can one of the admins verify this patch?

[GitHub] [spark] SparkQA commented on pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-11-02 Thread GitBox
SparkQA commented on pull request #34337: URL: https://github.com/apache/spark/pull/34337#issuecomment-957070257

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33522: [SPARK-36290][SQL] Pull out join condition

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #33522: URL: https://github.com/apache/spark/pull/33522#issuecomment-957211122

[GitHub] [spark] SparkQA commented on pull request #34467: [SPARK-36895][SQL][FOLLOWUP] CREATE INDEX command should rely on the analyzer framework to resolve columns

2021-11-02 Thread GitBox
SparkQA commented on pull request #34467: URL: https://github.com/apache/spark/pull/34467#issuecomment-957265199

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34455: [SPARK-37176][SQL] Sync JsonInferSchema#infer method's exception handle logic with JacksonParser#parse method

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34455: URL: https://github.com/apache/spark/pull/34455#issuecomment-957264008

[GitHub] [spark] SparkQA removed a comment on pull request #33628: [SPARK-36406][CORE] Avoid unnecessary file operations before delete a write failed file held by DiskBlockObjectWriter

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #33628: URL: https://github.com/apache/spark/pull/33628#issuecomment-957070742 **[Test build #144830 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144830/testReport)** for PR 33628 at commit [`7ae2b8d`](https://gi

[GitHub] [spark] Yikun commented on pull request #34460: [SPARK-36566][K8S] Add Spark appname as a label to pods

2021-11-02 Thread GitBox
Yikun commented on pull request #34460: URL: https://github.com/apache/spark/pull/34460#issuecomment-957047858

[GitHub] [spark] srowen closed pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-11-02 Thread GitBox
srowen closed pull request #34337: URL: https://github.com/apache/spark/pull/34337

[GitHub] [spark] SparkQA commented on pull request #33522: [SPARK-36290][SQL] Pull out join condition

2021-11-02 Thread GitBox
SparkQA commented on pull request #33522: URL: https://github.com/apache/spark/pull/33522#issuecomment-957140343

[GitHub] [spark] AmplabJenkins commented on pull request #32987: [SPARK-35564][SQL] Support subexpression elimination for conditionally evaluated expressions

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #32987: URL: https://github.com/apache/spark/pull/32987#issuecomment-957019966

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34462: [SPARK-37191][SQL] Allow merging DecimalTypes with different precision values

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34462: URL: https://github.com/apache/spark/pull/34462#issuecomment-956883045

[GitHub] [spark] allisonwang-db commented on a change in pull request #34459: [SPARK-37179][SQL] ANSI mode: Allow casting between Timestamp and Numeric

2021-11-02 Thread GitBox
allisonwang-db commented on a change in pull request #34459: URL: https://github.com/apache/spark/pull/34459#discussion_r741262498 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala ## @@ -971,6 +971,10 @@ object QueryExecutionErro

[GitHub] [spark] huaxingao commented on pull request #34469: Support drop index

2021-11-02 Thread GitBox
huaxingao commented on pull request #34469: URL: https://github.com/apache/spark/pull/34469#issuecomment-958165460 cc @sunchao @viirya @dbtsai

[GitHub] [spark] srowen commented on pull request #34458: [MINOR][DOCS] Corrected spacing in structured streaming programming

2021-11-02 Thread GitBox
srowen commented on pull request #34458: URL: https://github.com/apache/spark/pull/34458#issuecomment-957021397 Jenkins test this please

[GitHub] [spark] AmplabJenkins commented on pull request #34464: [SPARK-37193][SQL] DynamicJoinSelection.shouldDemoteBroadcastHashJoin should not apply to outer joins

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34464: URL: https://github.com/apache/spark/pull/34464#issuecomment-957073709 Can one of the admins verify this patch?
