[GitHub] [spark] MaxGekk commented on a change in pull request #22037: [SPARK-24774][SQL] Avro: Support logical decimal type

2021-11-02 Thread GitBox
MaxGekk commented on a change in pull request #22037: URL: https://github.com/apache/spark/pull/22037#discussion_r741373793 ## File path: external/avro/src/main/scala/org/apache/spark/sql/avro/SchemaConverters.scala ## @@ -18,19 +18,28 @@ package org.apache.spark.sql.avro

[GitHub] [spark] sunchao commented on a change in pull request #34199: [SPARK-36935][SQL] Extend ParquetSchemaConverter to compute Parquet repetition & definition level

2021-11-02 Thread GitBox
sunchao commented on a change in pull request #34199: URL: https://github.com/apache/spark/pull/34199#discussion_r741374104 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaSuite.scala ## @@ -114,7 +130,66 @@ abstract class

[GitHub] [spark] sunchao commented on a change in pull request #34445: [SPARK-36646][SQL] Push down group by partition column for aggregate

2021-11-02 Thread GitBox
sunchao commented on a change in pull request #34445: URL: https://github.com/apache/spark/pull/34445#discussion_r741361243 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcUtils.scala ## @@ -457,17 +457,22 @@ object OrcUtils extends Logg

[GitHub] [spark] SparkQA commented on pull request #34469: Support drop index

2021-11-02 Thread GitBox
SparkQA commented on pull request #34469: URL: https://github.com/apache/spark/pull/34469#issuecomment-958026676 **[Test build #144857 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144857/testReport)** for PR 34469 at commit [`56299c1`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #34470: [SPARK-37199][SQL]: Add deterministic field to QueryPlan

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34470: URL: https://github.com/apache/spark/pull/34470#issuecomment-958024984 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34459: [SPARK-37179][SQL] ANSI mode: Add a config to allow casting between Datetime and Numeric

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34459: URL: https://github.com/apache/spark/pull/34459#issuecomment-958023802 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49322/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34467: [SPARK-36895][SQL][FOLLOWUP] CREATE INDEX command should rely on the analyzer framework to resolve columns

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34467: URL: https://github.com/apache/spark/pull/34467#issuecomment-958023911 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49321/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34469: Support drop index

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34469: URL: https://github.com/apache/spark/pull/34469#issuecomment-958023909 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49326/

[GitHub] [spark] AmplabJenkins commented on pull request #34469: Support drop index

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34469: URL: https://github.com/apache/spark/pull/34469#issuecomment-958023909 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49326/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #34467: [SPARK-36895][SQL][FOLLOWUP] CREATE INDEX command should rely on the analyzer framework to resolve columns

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34467: URL: https://github.com/apache/spark/pull/34467#issuecomment-958023911 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49321/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #34459: [SPARK-37179][SQL] ANSI mode: Add a config to allow casting between Datetime and Numeric

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34459: URL: https://github.com/apache/spark/pull/34459#issuecomment-958023802 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/49322/ -- T

[GitHub] [spark] SparkQA commented on pull request #34459: [SPARK-37179][SQL] ANSI mode: Add a config to allow casting between Datetime and Numeric

2021-11-02 Thread GitBox
SparkQA commented on pull request #34459: URL: https://github.com/apache/spark/pull/34459#issuecomment-958023620 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49322/ -- This is an automated message from the A

[GitHub] [spark] somani commented on pull request #34470: [SPARK-37199][SQL]: Add deterministic field to QueryPlan

2021-11-02 Thread GitBox
somani commented on pull request #34470: URL: https://github.com/apache/spark/pull/34470#issuecomment-958021427 cc @cloud-fan @sigmod -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

[GitHub] [spark] SparkQA commented on pull request #34467: [SPARK-36895][SQL][FOLLOWUP] CREATE INDEX command should rely on the analyzer framework to resolve columns

2021-11-02 Thread GitBox
SparkQA commented on pull request #34467: URL: https://github.com/apache/spark/pull/34467#issuecomment-957994918 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49321/ -- This is an automated message from the A

[GitHub] [spark] SparkQA commented on pull request #34459: [SPARK-37179][SQL] ANSI mode: Add a config to allow casting between Datetime and Numeric

2021-11-02 Thread GitBox
SparkQA commented on pull request #34459: URL: https://github.com/apache/spark/pull/34459#issuecomment-958015745 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49324/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #34445: [SPARK-36646][SQL] Push down group by partition column for aggregate

2021-11-02 Thread GitBox
SparkQA commented on pull request #34445: URL: https://github.com/apache/spark/pull/34445#issuecomment-958012816 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49325/ -- This is an automated message from the Apache

[GitHub] [spark] somani opened a new pull request #34470: [SPARK-37199][SQL]: Add deterministic field to QueryPlan

2021-11-02 Thread GitBox
somani opened a new pull request #34470: URL: https://github.com/apache/spark/pull/34470 ### What changes were proposed in this pull request? We have a deterministic field in Expressions to check if an expression is deterministic, but we do not have a similar field in QueryPlan.

[GitHub] [spark] SparkQA commented on pull request #34463: [SPARK-37190][SQL] Improve error messages for Cast under ANSI mode

2021-11-02 Thread GitBox
SparkQA commented on pull request #34463: URL: https://github.com/apache/spark/pull/34463#issuecomment-958015349 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49323/ -- This is an automated message from the Apache

[GitHub] [spark] HyukjinKwon commented on pull request #34356: [SPARK-36554][SQL][PYTHON] Expose make_date expression in functions.scala

2021-11-02 Thread GitBox
HyukjinKwon commented on pull request #34356: URL: https://github.com/apache/spark/pull/34356#issuecomment-957027088 👍 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] SparkQA removed a comment on pull request #34462: [SPARK-37191][SQL] Allow merging DecimalTypes with different precision values

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34462: URL: https://github.com/apache/spark/pull/34462#issuecomment-957138180 **[Test build #144837 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144837/testReport)** for PR 34462 at commit [`64935fe`](https://gi

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34464: [SPARK-37193][SQL] DynamicJoinSelection.shouldDemoteBroadcastHashJoin should not apply to outer joins

2021-11-02 Thread GitBox
HyukjinKwon commented on a change in pull request #34464: URL: https://github.com/apache/spark/pull/34464#discussion_r740744598 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala ## @@ -651,6 +651,23 @@ class AdaptiveQuer

[GitHub] [spark] dchvn commented on pull request #34437: [SPARK-37156][PYTHON] Inline type hints for python/pyspark/storagelevel.py

2021-11-02 Thread GitBox
dchvn commented on pull request #34437: URL: https://github.com/apache/spark/pull/34437#issuecomment-957182262 thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To u

[GitHub] [spark] SparkQA removed a comment on pull request #34467: [SPARK-36895][SQL][FOLLOWUP] CREATE INDEX command should rely on the analyzer framework to resolve columns

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34467: URL: https://github.com/apache/spark/pull/34467#issuecomment-957265199 **[Test build #144845 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144845/testReport)** for PR 34467 at commit [`fe2115e`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34402: [SPARK-30220] Enable using Exists/In subqueries outside of the Filter node

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34402: URL: https://github.com/apache/spark/pull/34402#issuecomment-957164941 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] yaooqinn commented on pull request #34455: [SPARK-37176][SQL] Sync JsonInferSchema#infer method's exception handle logic with JacksonParser#parse method

2021-11-02 Thread GitBox
yaooqinn commented on pull request #34455: URL: https://github.com/apache/spark/pull/34455#issuecomment-957173591 I am OK, this change looks good to me -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] huaxingao commented on a change in pull request #34467: [SPARK-36895][SQL][FOLLOWUP] CREATE INDEX command should rely on the analyzer framework to resolve columns

2021-11-02 Thread GitBox
huaxingao commented on a change in pull request #34467: URL: https://github.com/apache/spark/pull/34467#discussion_r741142738 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -4439,7 +4439,7 @@ class AstBuilder extends SqlBa

[GitHub] [spark] viirya commented on a change in pull request #34441: [SPARK-37164][SQL] Add ExpressionBuilder for functions with complex overloads

2021-11-02 Thread GitBox
viirya commented on a change in pull request #34441: URL: https://github.com/apache/spark/pull/34441#discussion_r740361733 ## File path: sql/core/src/test/resources/sql-tests/results/string-functions.sql.out ## @@ -533,39 +533,35 @@ AABB -- !query -SELECT lpad('abc', 5, x

[GitHub] [spark] SparkQA commented on pull request #34463: [SPARK-37190][SQL] Improve error messages for Cast under ANSI mode

2021-11-02 Thread GitBox
SparkQA commented on pull request #34463: URL: https://github.com/apache/spark/pull/34463#issuecomment-956898943 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscr

[GitHub] [spark] AmplabJenkins commented on pull request #34468: [SPARK-37194][SQL] Avoid unnecessary sort in FileFormatWriter if it's not dynamic partition

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34468: URL: https://github.com/apache/spark/pull/34468#issuecomment-957354917 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [spark] SparkQA removed a comment on pull request #34460: [SPARK-36566][K8S] Add Spark appname as a label to pods

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34460: URL: https://github.com/apache/spark/pull/34460#issuecomment-956333285 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] huaxingao commented on a change in pull request #34445: [SPARK-36646][SQL] Push down group by partition column for aggregate

2021-11-02 Thread GitBox
huaxingao commented on a change in pull request #34445: URL: https://github.com/apache/spark/pull/34445#discussion_r741315128 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceAggregatePushDownSuite.scala ## @@ -261,6 +261,63 @@ trait F

[GitHub] [spark] allisonwang-db commented on pull request #34443: [SPARK-37168][SQL] Improve error messages for SQL functions and operators under ANSI mode

2021-11-02 Thread GitBox
allisonwang-db commented on pull request #34443: URL: https://github.com/apache/spark/pull/34443#issuecomment-956410050 cc @cloud-fan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specifi

[GitHub] [spark] wangyum commented on pull request #33522: [SPARK-36290][SQL] Pull out join condition

2021-11-02 Thread GitBox
wangyum commented on pull request #33522: URL: https://github.com/apache/spark/pull/33522#issuecomment-957139228 retest this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific c

[GitHub] [spark] SparkQA removed a comment on pull request #34465: [MINOR] Document JDBC aggregate push down is for DSV2 only

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34465: URL: https://github.com/apache/spark/pull/34465#issuecomment-957138179 **[Test build #144836 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144836/testReport)** for PR 34465 at commit [`7f44f6c`](https://gi

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34451: [SPARK-37038][SQL] DSV2 Sample Push Down

2021-11-02 Thread GitBox
HyukjinKwon commented on a change in pull request #34451: URL: https://github.com/apache/spark/pull/34451#discussion_r740750458 ## File path: external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala ## @@ -284,4 +288,83 @@ private[v2] trai

[GitHub] [spark] MaxGekk closed pull request #34455: [SPARK-37176][SQL] Sync JsonInferSchema#infer method's exception handle logic with JacksonParser#parse method

2021-11-02 Thread GitBox
MaxGekk closed pull request #34455: URL: https://github.com/apache/spark/pull/34455 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubs

[GitHub] [spark] SparkQA commented on pull request #34009: [SPARK-34378][SQL][AVRO] Enhance AvroSerializer validation to allow extra nullable Avro fields

2021-11-02 Thread GitBox
SparkQA commented on pull request #34009: URL: https://github.com/apache/spark/pull/34009#issuecomment-956640759 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscr

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34120: URL: https://github.com/apache/spark/pull/34120#issuecomment-956882475 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] dchvn commented on pull request #34439: [SPARK-37095][PYTHON] Inline type hints for files in python/pyspark/broadcast.py

2021-11-02 Thread GitBox
dchvn commented on pull request #34439: URL: https://github.com/apache/spark/pull/34439#issuecomment-957181189 CC @HyukjinKwon @zero323 @ueshin too. Many thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34009: [SPARK-34378][SQL][AVRO] Enhance AvroSerializer validation to allow extra nullable Avro fields

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34009: URL: https://github.com/apache/spark/pull/34009#issuecomment-956882477 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] AmplabJenkins commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34120: URL: https://github.com/apache/spark/pull/34120#issuecomment-956882475 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [spark] allisonwang-db commented on a change in pull request #34402: [SPARK-30220] Enable using Exists/In subqueries outside of the Filter node

2021-11-02 Thread GitBox
allisonwang-db commented on a change in pull request #34402: URL: https://github.com/apache/spark/pull/34402#discussion_r740540249 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ## @@ -687,10 +687,10 @@ trait CheckAnalysis

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34411: [SPARK-37137][PYTHON] Inline type hints for python/pyspark/conf.py

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34411: URL: https://github.com/apache/spark/pull/34411#issuecomment-955241543 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] cloud-fan closed pull request #34443: [SPARK-37168][SQL] Improve error messages for SQL functions and operators under ANSI mode

2021-11-02 Thread GitBox
cloud-fan closed pull request #34443: URL: https://github.com/apache/spark/pull/34443 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsu

[GitHub] [spark] SparkQA removed a comment on pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34337: URL: https://github.com/apache/spark/pull/34337#issuecomment-957070257 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To

[GitHub] [spark] SparkQA commented on pull request #34465: [MINOR] Document JDBC aggregate push down is for DSV2 only

2021-11-02 Thread GitBox
SparkQA commented on pull request #34465: URL: https://github.com/apache/spark/pull/34465#issuecomment-957138179 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscr

[GitHub] [spark] AmplabJenkins commented on pull request #34459: [SPARK-37179][SQL] ANSI mode: Allow casting between Timestamp and Numeric

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34459: URL: https://github.com/apache/spark/pull/34459#issuecomment-956384342 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-11-02 Thread GitBox
AngersZhuuuu commented on a change in pull request #34337: URL: https://github.com/apache/spark/pull/34337#discussion_r740683683 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRebaseDatetimeSuite.scala ## @@ -164,7 +165,13 @@ ab

[GitHub] [spark] HyukjinKwon commented on pull request #34458: [MINOR][DOCS] Corrected spacing in structured streaming programming

2021-11-02 Thread GitBox
HyukjinKwon commented on pull request #34458: URL: https://github.com/apache/spark/pull/34458#issuecomment-957031739 Merged to master, branch-3.2, branch-3.1 and branch-3.0. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and

[GitHub] [spark] tanelk commented on a change in pull request #34402: [SPARK-30220] Enable using Exists/In subqueries outside of the Filter node

2021-11-02 Thread GitBox
tanelk commented on a change in pull request #34402: URL: https://github.com/apache/spark/pull/34402#discussion_r740718765 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ## @@ -687,10 +687,10 @@ trait CheckAnalysis extends

[GitHub] [spark] cloud-fan commented on pull request #34443: [SPARK-37168][SQL] Improve error messages for SQL functions and operators under ANSI mode

2021-11-02 Thread GitBox
cloud-fan commented on pull request #34443: URL: https://github.com/apache/spark/pull/34443#issuecomment-957143160 thanks, merging to master! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] AmplabJenkins commented on pull request #34146: [SPARK-36894][SPARK-37077][PYTHON] Synchronize RDD.toDF annotations with SparkSession and SQLContext .createDataFrame variants.

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34146: URL: https://github.com/apache/spark/pull/34146#issuecomment-956636253 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [spark] cxzl25 commented on a change in pull request #34041: [SPARK-36799][SQL] Pass queryExecution name in CLI when only select query

2021-11-02 Thread GitBox
cxzl25 commented on a change in pull request #34041: URL: https://github.com/apache/spark/pull/34041#discussion_r740756030 ## File path: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLDriver.scala ## @@ -65,7 +65,11 @@ private[hive] class S

[GitHub] [spark] GaruGaru commented on pull request #32397: [SPARK-35084][CORE] Spark 3: supporting "--packages" in k8s cluster mode

2021-11-02 Thread GitBox
GaruGaru commented on pull request #32397: URL: https://github.com/apache/spark/pull/32397#issuecomment-957914768 Any update on this ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specif

[GitHub] [spark] cloud-fan commented on a change in pull request #34462: [SPARK-37191][SQL] Allow merging DecimalTypes with different precision values

2021-11-02 Thread GitBox
cloud-fan commented on a change in pull request #34462: URL: https://github.com/apache/spark/pull/34462#discussion_r740747727 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala ## @@ -643,15 +643,14 @@ object StructType extends AbstractDataTy

[GitHub] [spark] singhpk234 commented on a change in pull request #34464: [SPARK-37193][SQL] DynamicJoinSelection.shouldDemoteBroadcastHashJoin should not apply to outer joins

2021-11-02 Thread GitBox
singhpk234 commented on a change in pull request #34464: URL: https://github.com/apache/spark/pull/34464#discussion_r740839355 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala ## @@ -651,6 +651,23 @@ class AdaptiveQuery

[GitHub] [spark] advancedxy commented on a change in pull request #34455: [SPARK-37176][SQL] Sync JsonInferSchema#infer method's exception handle logic with JacksonParser#parse method

2021-11-02 Thread GitBox
advancedxy commented on a change in pull request #34455: URL: https://github.com/apache/spark/pull/34455#discussion_r740781759 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala ## @@ -68,14 +83,18 @@ private[sql] class JsonInfer

[GitHub] [spark] AmplabJenkins commented on pull request #34465: [MINOR] Document JDBC aggregate push down is for DSV2 only

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34465: URL: https://github.com/apache/spark/pull/34465#issuecomment-957211126 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [spark] HyukjinKwon closed pull request #34458: [MINOR][DOCS] Corrected spacing in structured streaming programming

2021-11-02 Thread GitBox
HyukjinKwon closed pull request #34458: URL: https://github.com/apache/spark/pull/34458 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-un

[GitHub] [spark] sarutak edited a comment on pull request #34356: [SPARK-36554][SQL][PYTHON] Expose make_date expression in functions.scala

2021-11-02 Thread GitBox
sarutak edited a comment on pull request #34356: URL: https://github.com/apache/spark/pull/34356#issuecomment-955439783 @nicolasazrak Please change `pyspark.sql.rst` together whenever you add APIs for PySpark. Also, could you make sure that the API docs are successfully built and the la

[GitHub] [spark] ByronHsu commented on a change in pull request #34466: [SPARK-37152][PYTHON] Inline type hints for python/pyspark/context.py

2021-11-02 Thread GitBox
ByronHsu commented on a change in pull request #34466: URL: https://github.com/apache/spark/pull/34466#discussion_r740760086 ## File path: python/pyspark/context.py ## @@ -150,8 +161,10 @@ def __init__(self, master=None, appName=None, sparkHome=None, pyFiles=None,

[GitHub] [spark] AmplabJenkins commented on pull request #34455: [SPARK-37176][SQL] Sync JsonInferSchema#infer method's exception handle logic with JacksonParser#parse method

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34455: URL: https://github.com/apache/spark/pull/34455#issuecomment-957264008 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [spark] SparkQA commented on pull request #34443: [SPARK-37168][SQL] Improve error messages for SQL functions and operators under ANSI mode

2021-11-02 Thread GitBox
SparkQA commented on pull request #34443: URL: https://github.com/apache/spark/pull/34443#issuecomment-957020468 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscr

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34444: [SPARK-32567][SQL] Add code-gen for full outer shuffled hash join

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34444: URL: https://github.com/apache/spark/pull/34444#issuecomment-956882474 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] gengliangwang commented on a change in pull request #34459: [SPARK-37179][SQL] ANSI mode: Add a config to allow casting between Datetime and Numeric

2021-11-02 Thread GitBox
gengliangwang commented on a change in pull request #34459: URL: https://github.com/apache/spark/pull/34459#discussion_r741298185 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala ## @@ -971,6 +971,10 @@ object QueryExecutionError

[GitHub] [spark] zero323 commented on pull request #34466: [SPARK-37152][PYTHON] Inline type hints for python/pyspark/context.py

2021-11-02 Thread GitBox
zero323 commented on pull request #34466: URL: https://github.com/apache/spark/pull/34466#issuecomment-957327097 > 1. I used lots of "cast" to cancel the error, but is there other better way? I see that these are primarily related to union attributes. We have ongoing discussion

[GitHub] [spark] xinrong-databricks commented on pull request #34374: [SPARK-37104][PYTHON] Make RDD and DStream covariant

2021-11-02 Thread GitBox
xinrong-databricks commented on pull request #34374: URL: https://github.com/apache/spark/pull/34374#issuecomment-957965048 Would you please give a short example of how the PR **improves the usability of the current annotations and simplifies further development of type hints**? -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33628: [SPARK-36406][CORE] Avoid unnecessary file operations before delete a write failed file held by DiskBlockObjectWriter

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #33628: URL: https://github.com/apache/spark/pull/33628#issuecomment-957119251 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [spark] xkrogen commented on pull request #34120: [SPARK-35672][CORE][YARN] Pass user classpath entries to executors using config instead of command line.

2021-11-02 Thread GitBox
xkrogen commented on pull request #34120: URL: https://github.com/apache/spark/pull/34120#issuecomment-956587781 @tgravescs @attilapiros -- now that the Spark 3.2 release is all wrapped up, can you take another look? I just rebased on latest master. -- This is an automated message from

[GitHub] [spark] AmplabJenkins commented on pull request #34458: [MINOR][DOCS] Corrected spacing in structured streaming programming

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34458: URL: https://github.com/apache/spark/pull/34458#issuecomment-957025578 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To un

[GitHub] [spark] HyukjinKwon closed pull request #34437: [SPARK-37156][PYTHON] Inline type hints for python/pyspark/storagelevel.py

2021-11-02 Thread GitBox
HyukjinKwon closed pull request #34437: URL: https://github.com/apache/spark/pull/34437

[GitHub] [spark] SparkQA commented on pull request #34431: [SPARK-35437][SQL] Use expressions to filter Hive partitions at client side

2021-11-02 Thread GitBox
SparkQA commented on pull request #34431: URL: https://github.com/apache/spark/pull/34431#issuecomment-957076039

[GitHub] [spark] SparkQA removed a comment on pull request #34443: [SPARK-37168][SQL] Improve error messages for SQL functions and operators under ANSI mode

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34443: URL: https://github.com/apache/spark/pull/34443#issuecomment-957020468 **[Test build #144826 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144826/testReport)** for PR 34443 at commit [`7c3ee2c`](https://gi

[GitHub] [spark] c21 commented on pull request #34444: [SPARK-32567][SQL] Add code-gen for full outer shuffled hash join

2021-11-02 Thread GitBox
c21 commented on pull request #34444: URL: https://github.com/apache/spark/pull/34444#issuecomment-956587980 > Just out of curiosity, how much performance gain this code generation brings? @Tagar - I ran the small micro benchmark (similar to Spark's `JoinBenchmark.scala`), it can gi

[GitHub] [spark] srowen commented on pull request #34457: [SPARK-37178][ML] Add Target Encoding to ml.feature

2021-11-02 Thread GitBox
srowen commented on pull request #34457: URL: https://github.com/apache/spark/pull/34457#issuecomment-957021245 This appears to assume the target is 0/1. Target encoding is more general than that. This would have to be implemented in Python as well at least, and have tests.

[GitHub] [spark] cloud-fan closed pull request #34441: [SPARK-37164][SQL] Add ExpressionBuilder for functions with complex overloads

2021-11-02 Thread GitBox
cloud-fan closed pull request #34441: URL: https://github.com/apache/spark/pull/34441

[GitHub] [spark] MaxGekk edited a comment on pull request #34455: [SPARK-37176][SQL] Sync JsonInferSchema#infer method's exception handle logic with JacksonParser#parse method

2021-11-02 Thread GitBox
MaxGekk edited a comment on pull request #34455: URL: https://github.com/apache/spark/pull/34455#issuecomment-957269638 +1, LGTM. Merging to master. Thank you, @advancedxy and @yaooqinn @HyukjinKwon for review.

[GitHub] [spark] AmplabJenkins commented on pull request #34356: [SPARK-36554][SQL][PYTHON] Expose make_date expression in functions.scala

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34356: URL: https://github.com/apache/spark/pull/34356#issuecomment-956384343

[GitHub] [spark] SparkQA removed a comment on pull request #34463: [SPARK-37190][SQL] Improve error messages for Cast under ANSI mode

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34463: URL: https://github.com/apache/spark/pull/34463#issuecomment-956898943

[GitHub] [spark] HyukjinKwon commented on pull request #34459: [SPARK-37179][SQL] ANSI mode: Allow casting between Timestamp and Numeric

2021-11-02 Thread GitBox
HyukjinKwon commented on pull request #34459: URL: https://github.com/apache/spark/pull/34459#issuecomment-957131152 @gengliangwang, I think "TimestampNTZ <=> Numeric" should be disallowed because it does not have a timezone, and it cannot determine the actual timestamp value. Returning th

[GitHub] [spark] sunchao commented on pull request #34199: [SPARK-36935][SQL] Extend ParquetSchemaConverter to compute Parquet repetition & definition level

2021-11-02 Thread GitBox
sunchao commented on pull request #34199: URL: https://github.com/apache/spark/pull/34199#issuecomment-956456431 Thanks @sadikovi, the changes to the schema converter don't actually modify the existing behavior at all. I also added extensive tests to check the behavior of the newly introduc

[GitHub] [spark] AmplabJenkins commented on pull request #34199: [SPARK-36935][SQL] Extend ParquetSchemaConverter to compute Parquet repetition & definition level

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34199: URL: https://github.com/apache/spark/pull/34199#issuecomment-956566252

[GitHub] [spark] SparkQA commented on pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-11-02 Thread GitBox
SparkQA commented on pull request #33404: URL: https://github.com/apache/spark/pull/33404#issuecomment-956337934

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32987: [SPARK-35564][SQL] Support subexpression elimination for conditionally evaluated expressions

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #32987: URL: https://github.com/apache/spark/pull/32987#issuecomment-957019966

[GitHub] [spark] dongjoon-hyun commented on pull request #34383: [SPARK-37102][BUILD] Removed redundant exclusions in `hadoop-cloud` module

2021-11-02 Thread GitBox
dongjoon-hyun commented on pull request #34383: URL: https://github.com/apache/spark/pull/34383#issuecomment-956357192 Thank you, @vmalakhin , @srowen , @sunchao .

[GitHub] [spark] AmplabJenkins commented on pull request #34469: Support drop index

2021-11-02 Thread GitBox
AmplabJenkins commented on pull request #34469: URL: https://github.com/apache/spark/pull/34469#issuecomment-957974156

[GitHub] [spark] SparkQA commented on pull request #34468: [SPARK-37194][SQL] Avoid unnecessary sort in FileFormatWriter if it's not dynamic partition

2021-11-02 Thread GitBox
SparkQA commented on pull request #34468: URL: https://github.com/apache/spark/pull/34468#issuecomment-957268010

[GitHub] [spark] HyukjinKwon commented on a change in pull request #34411: [SPARK-37137][PYTHON] Inline type hints for python/pyspark/conf.py

2021-11-02 Thread GitBox
HyukjinKwon commented on a change in pull request #34411: URL: https://github.com/apache/spark/pull/34411#discussion_r740773747 ## File path: python/pyspark/conf.py ## @@ -124,48 +130,57 @@ def __init__(self, loadDefaults=True, _jvm=None, _jconf=None): self._j

[GitHub] [spark] SparkQA removed a comment on pull request #33522: [SPARK-36290][SQL] Pull out join condition

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #33522: URL: https://github.com/apache/spark/pull/33522#issuecomment-957140343

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #34199: [SPARK-36935][SQL] Extend ParquetSchemaConverter to compute Parquet repetition & definition level

2021-11-02 Thread GitBox
dongjoon-hyun commented on a change in pull request #34199: URL: https://github.com/apache/spark/pull/34199#discussion_r740727131 ## File path: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java ## @@ -152,6 +152,7 @@

[GitHub] [spark] SparkQA removed a comment on pull request #34459: [SPARK-37179][SQL] ANSI mode: Allow casting between Timestamp and Numeric

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34459: URL: https://github.com/apache/spark/pull/34459#issuecomment-956284531 **[Test build #144810 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144810/testReport)** for PR 34459 at commit [`7813df8`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33404: [SPARK-36194][SQL] Remove the aggregation from left semi/anti join if the same aggregation has already been done on left side

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #33404: URL: https://github.com/apache/spark/pull/33404#issuecomment-956347696

[GitHub] [spark] sunchao commented on a change in pull request #34199: [SPARK-36935][SQL] Extend ParquetSchemaConverter to compute Parquet repetition & definition level

2021-11-02 Thread GitBox
sunchao commented on a change in pull request #34199: URL: https://github.com/apache/spark/pull/34199#discussion_r741271037 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetColumn.scala ## @@ -0,0 +1,68 @@ +/* + * Licensed to the

[GitHub] [spark] SparkQA removed a comment on pull request #34402: [SPARK-30220] Enable using Exists/In subqueries outside of the Filter node

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34402: URL: https://github.com/apache/spark/pull/34402#issuecomment-957114414 **[Test build #144833 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144833/testReport)** for PR 34402 at commit [`ce3adb5`](https://gi

[GitHub] [spark] SparkQA removed a comment on pull request #32987: [SPARK-35564][SQL] Support subexpression elimination for conditionally evaluated expressions

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #32987: URL: https://github.com/apache/spark/pull/32987#issuecomment-956887079 **[Test build #144822 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144822/testReport)** for PR 32987 at commit [`f4ed2be`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34199: [SPARK-36935][SQL] Extend ParquetSchemaConverter to compute Parquet repetition & definition level

2021-11-02 Thread GitBox
AmplabJenkins removed a comment on pull request #34199: URL: https://github.com/apache/spark/pull/34199#issuecomment-956566252

[GitHub] [spark] cloud-fan commented on a change in pull request #34337: [SPARK-37066][SQL] Improve error message to show file path when failed to read next file

2021-11-02 Thread GitBox
cloud-fan commented on a change in pull request #34337: URL: https://github.com/apache/spark/pull/34337#discussion_r740295936 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRebaseDatetimeSuite.scala ## @@ -164,7 +165,13 @@ abstr

[GitHub] [spark] SparkQA commented on pull request #33628: [SPARK-36406][CORE] Avoid unnecessary file operations before delete a write failed file held by DiskBlockObjectWriter

2021-11-02 Thread GitBox
SparkQA commented on pull request #33628: URL: https://github.com/apache/spark/pull/33628#issuecomment-957070742

[GitHub] [spark] SparkQA commented on pull request #34402: [SPARK-30220] Enable using Exists/In subqueries outside of the Filter node

2021-11-02 Thread GitBox
SparkQA commented on pull request #34402: URL: https://github.com/apache/spark/pull/34402#issuecomment-957114414

[GitHub] [spark] SparkQA removed a comment on pull request #34469: Support drop index

2021-11-02 Thread GitBox
SparkQA removed a comment on pull request #34469: URL: https://github.com/apache/spark/pull/34469#issuecomment-957748528
