[GitHub] [spark] SparkQA commented on pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
SparkQA commented on pull request #31493: URL: https://github.com/apache/spark/pull/31493#issuecomment-774306987 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39525/

[GitHub] [spark] SparkQA removed a comment on pull request #31492: [SPARK-34346][CORE][SQL][3.0] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may ca

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31492: URL: https://github.com/apache/spark/pull/31492#issuecomment-774230507 **[Test build #134938 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134938/testReport)** for PR 31492 at commit

[GitHub] [spark] SparkQA commented on pull request #31489: [SPARK-34377][SQL] Add new parquet datasource options to control datetime rebasing in read

2021-02-05 Thread GitBox
SparkQA commented on pull request #31489: URL: https://github.com/apache/spark/pull/31489#issuecomment-774302930 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39523/

[GitHub] [spark] SparkQA commented on pull request #31492: [SPARK-34346][CORE][SQL][3.0] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cause perf

2021-02-05 Thread GitBox
SparkQA commented on pull request #31492: URL: https://github.com/apache/spark/pull/31492#issuecomment-774302746 **[Test build #134938 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134938/testReport)** for PR 31492 at commit

[GitHub] [spark] SparkQA commented on pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
SparkQA commented on pull request #31493: URL: https://github.com/apache/spark/pull/31493#issuecomment-774296346 **[Test build #134942 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134942/testReport)** for PR 31493 at commit

[GitHub] [spark] dongjoon-hyun commented on pull request #31486: [SPARK-34359][SQL][3.1] Add a legacy config to restore the output schema of SHOW DATABASES

2021-02-05 Thread GitBox
dongjoon-hyun commented on pull request #31486: URL: https://github.com/apache/spark/pull/31486#issuecomment-774295714 +1, late LGTM. Thanks! This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] SparkQA commented on pull request #31450: [WIP][SPARK-33763] Add metrics for better tracking of dynamic allocation

2021-02-05 Thread GitBox
SparkQA commented on pull request #31450: URL: https://github.com/apache/spark/pull/31450#issuecomment-774295061 **[Test build #134943 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134943/testReport)** for PR 31450 at commit

[GitHub] [spark] dongjoon-hyun closed pull request #31492: [SPARK-34346][CORE][SQL][3.0] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cause perf

2021-02-05 Thread GitBox
dongjoon-hyun closed pull request #31492: URL: https://github.com/apache/spark/pull/31492 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] viirya commented on a change in pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform

2021-02-05 Thread GitBox
viirya commented on a change in pull request #30957: URL: https://github.com/apache/spark/pull/30957#discussion_r571255682 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/BaseScriptTransformationExec.scala ## @@ -47,7 +47,13 @@ trait

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31493: URL: https://github.com/apache/spark/pull/31493#issuecomment-774293508 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134937/

[GitHub] [spark] AmplabJenkins commented on pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31493: URL: https://github.com/apache/spark/pull/31493#issuecomment-774293508 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134937/

[GitHub] [spark] SparkQA removed a comment on pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31493: URL: https://github.com/apache/spark/pull/31493#issuecomment-774238272 **[Test build #134937 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134937/testReport)** for PR 31493 at commit

[GitHub] [spark] SparkQA commented on pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
SparkQA commented on pull request #31493: URL: https://github.com/apache/spark/pull/31493#issuecomment-774293065 **[Test build #134937 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134937/testReport)** for PR 31493 at commit

[GitHub] [spark] SparkQA commented on pull request #31494: [SPARK-34380][SQL] Support ifExists for ALTER TABLE ... UNSET TBLPROPERTIES for v2 command

2021-02-05 Thread GitBox
SparkQA commented on pull request #31494: URL: https://github.com/apache/spark/pull/31494#issuecomment-774290518 **[Test build #134941 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134941/testReport)** for PR 31494 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #31491: [SPARK-34379][SQL] Map JDBC RowID to StringType rather than LongType

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31491: URL: https://github.com/apache/spark/pull/31491#issuecomment-774288816 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39522/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31493: URL: https://github.com/apache/spark/pull/31493#issuecomment-774288817 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39520/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31491: [SPARK-34379][SQL] Map JDBC RowID to StringType rather than LongType

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31491: URL: https://github.com/apache/spark/pull/31491#issuecomment-774288816 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39522/

[GitHub] [spark] AmplabJenkins commented on pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31493: URL: https://github.com/apache/spark/pull/31493#issuecomment-774288817 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39520/

[GitHub] [spark] SparkQA commented on pull request #31489: [SPARK-34377][SQL] Add new parquet datasource options to control datetime rebasing in read

2021-02-05 Thread GitBox
SparkQA commented on pull request #31489: URL: https://github.com/apache/spark/pull/31489#issuecomment-774288380 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39523/

[GitHub] [spark] imback82 commented on a change in pull request #31494: [SPARK-34380][SQL] Support ifExists for ALTER TABLE ... UNSET TBLPROPERTIES for v2 command

2021-02-05 Thread GitBox
imback82 commented on a change in pull request #31494: URL: https://github.com/apache/spark/pull/31494#discussion_r571240319 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableChange.java ## @@ -64,7 +64,20 @@ static TableChange

[GitHub] [spark] holdenk commented on a change in pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
holdenk commented on a change in pull request #31493: URL: https://github.com/apache/spark/pull/31493#discussion_r571239783 ## File path: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala ## @@ -173,6 +183,13 @@ private[spark] class

[GitHub] [spark] imback82 opened a new pull request #31494: [SPARK-34380][SQL] Support ifExists for ALTER TABLE ... UNSET TBLPROPERTIES for v2 command

2021-02-05 Thread GitBox
imback82 opened a new pull request #31494: URL: https://github.com/apache/spark/pull/31494 ### What changes were proposed in this pull request? This PR proposes to support `ifExists` flag for v2 `ALTER TABLE ... UNSET TBLPROPERTIES` command. Currently, the flag is not

[GitHub] [spark] attilapiros commented on pull request #31450: [WIP][SPARK-33763] Add metrics for better tracking of dynamic allocation

2021-02-05 Thread GitBox
attilapiros commented on pull request #31450: URL: https://github.com/apache/spark/pull/31450#issuecomment-774276366 Locally the "Test decommissioning with dynamic allocation & shuffle cleanups" is passing. This is an

[GitHub] [spark] holdenk commented on a change in pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
holdenk commented on a change in pull request #31493: URL: https://github.com/apache/spark/pull/31493#discussion_r571236839 ## File path: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ## @@ -684,6 +684,7 @@ private[spark] class BlockManager( if

[GitHub] [spark] holdenk commented on pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
holdenk commented on pull request #31493: URL: https://github.com/apache/spark/pull/31493#issuecomment-774275909 > In general, this mitigation cannot avoid the disk full situation fundamentally. Can we choose the migration target in a better way instead of this rejecting because this is

[GitHub] [spark] holdenk commented on a change in pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
holdenk commented on a change in pull request #31493: URL: https://github.com/apache/spark/pull/31493#discussion_r571235009 ## File path: core/src/main/scala/org/apache/spark/shuffle/MigratableResolver.scala ## @@ -45,4 +46,5 @@ trait MigratableResolver { * Get the blocks

[GitHub] [spark] SparkQA commented on pull request #31491: [SPARK-34379][SQL] Map JDBC RowID to StringType rather than LongType

2021-02-05 Thread GitBox
SparkQA commented on pull request #31491: URL: https://github.com/apache/spark/pull/31491#issuecomment-774268029 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39522/

[GitHub] [spark] SparkQA commented on pull request #31489: [SPARK-34377][SQL] Add new parquet datasource options to control datetime rebasing in read

2021-02-05 Thread GitBox
SparkQA commented on pull request #31489: URL: https://github.com/apache/spark/pull/31489#issuecomment-774267949 **[Test build #134940 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134940/testReport)** for PR 31489 at commit

[GitHub] [spark] SparkQA commented on pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
SparkQA commented on pull request #31493: URL: https://github.com/apache/spark/pull/31493#issuecomment-774266900 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39520/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31492: [SPARK-34346][CORE][SQL][3.0] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31492: URL: https://github.com/apache/spark/pull/31492#issuecomment-774264641 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39521/

[GitHub] [spark] SparkQA commented on pull request #31492: [SPARK-34346][CORE][SQL][3.0] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cause perf

2021-02-05 Thread GitBox
SparkQA commented on pull request #31492: URL: https://github.com/apache/spark/pull/31492#issuecomment-774264618 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39521/

[GitHub] [spark] AmplabJenkins commented on pull request #31492: [SPARK-34346][CORE][SQL][3.0] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may caus

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31492: URL: https://github.com/apache/spark/pull/31492#issuecomment-774264641 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39521/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31488: [SPARK-34376][SQL] Support regexp as a SQL function

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31488: URL: https://github.com/apache/spark/pull/31488#issuecomment-774259922 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134935/

[GitHub] [spark] AmplabJenkins commented on pull request #31488: [SPARK-34376][SQL] Support regexp as a SQL function

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31488: URL: https://github.com/apache/spark/pull/31488#issuecomment-774259922 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134935/

[GitHub] [spark] SparkQA commented on pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
SparkQA commented on pull request #31493: URL: https://github.com/apache/spark/pull/31493#issuecomment-774258007 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39520/

[GitHub] [spark] SparkQA commented on pull request #31492: [SPARK-34346][CORE][SQL][3.0] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cause perf

2021-02-05 Thread GitBox
SparkQA commented on pull request #31492: URL: https://github.com/apache/spark/pull/31492#issuecomment-774255914 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39521/

[GitHub] [spark] SparkQA commented on pull request #31491: [SPARK-34379][SQL] Map JDBC RowID to StringType rather than LongType

2021-02-05 Thread GitBox
SparkQA commented on pull request #31491: URL: https://github.com/apache/spark/pull/31491#issuecomment-774253651 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39522/

[GitHub] [spark] SparkQA removed a comment on pull request #31488: [SPARK-34376][SQL] Support regexp as a SQL function

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31488: URL: https://github.com/apache/spark/pull/31488#issuecomment-774099461 **[Test build #134935 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134935/testReport)** for PR 31488 at commit

[GitHub] [spark] SparkQA commented on pull request #31488: [SPARK-34376][SQL] Support regexp as a SQL function

2021-02-05 Thread GitBox
SparkQA commented on pull request #31488: URL: https://github.com/apache/spark/pull/31488#issuecomment-774249481 **[Test build #134935 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134935/testReport)** for PR 31488 at commit

[GitHub] [spark] SparkQA commented on pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
SparkQA commented on pull request #31493: URL: https://github.com/apache/spark/pull/31493#issuecomment-774238272 **[Test build #134937 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134937/testReport)** for PR 31493 at commit

[GitHub] [spark] SparkQA commented on pull request #31491: [SPARK-34379][SQL] Map JDBC RowID to StringType rather than LongType

2021-02-05 Thread GitBox
SparkQA commented on pull request #31491: URL: https://github.com/apache/spark/pull/31491#issuecomment-774230546 **[Test build #134939 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134939/testReport)** for PR 31491 at commit

[GitHub] [spark] SparkQA commented on pull request #31492: [SPARK-34346][CORE][SQL][3.0] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cause perf

2021-02-05 Thread GitBox
SparkQA commented on pull request #31492: URL: https://github.com/apache/spark/pull/31492#issuecomment-774230507 **[Test build #134938 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134938/testReport)** for PR 31492 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-774229511 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-774229512 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
dongjoon-hyun commented on a change in pull request #31493: URL: https://github.com/apache/spark/pull/31493#discussion_r571186953 ## File path: core/src/main/scala/org/apache/spark/shuffle/MigratableResolver.scala ## @@ -45,4 +46,5 @@ trait MigratableResolver { * Get the

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
dongjoon-hyun commented on a change in pull request #31493: URL: https://github.com/apache/spark/pull/31493#discussion_r571186381 ## File path: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ## @@ -684,6 +684,7 @@ private[spark] class BlockManager( if

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
dongjoon-hyun commented on a change in pull request #31493: URL: https://github.com/apache/spark/pull/31493#discussion_r571185893 ## File path: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala ## @@ -173,6 +183,13 @@ private[spark] class

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
dongjoon-hyun commented on a change in pull request #31493: URL: https://github.com/apache/spark/pull/31493#discussion_r571183956 ## File path: core/src/main/scala/org/apache/spark/internal/config/package.scala ## @@ -420,6 +420,14 @@ package object config { .intConf

[GitHub] [spark] dongjoon-hyun closed pull request #31133: [SPARK-26836][SQL] Supporting Avro schema evolution for partitioned Hive tables with "avro.schema.literal"

2021-02-05 Thread GitBox
dongjoon-hyun closed pull request #31133: URL: https://github.com/apache/spark/pull/31133 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to

[GitHub] [spark] SparkQA commented on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-02-05 Thread GitBox
SparkQA commented on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-774223596 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39519/

[GitHub] [spark] SparkQA commented on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-02-05 Thread GitBox
SparkQA commented on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-774221394 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39519/

[GitHub] [spark] viirya commented on a change in pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-05 Thread GitBox
viirya commented on a change in pull request #31337: URL: https://github.com/apache/spark/pull/31337#discussion_r571178250 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala ## @@ -169,7 +169,7 @@ abstract class

[GitHub] [spark] viirya commented on a change in pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-05 Thread GitBox
viirya commented on a change in pull request #31337: URL: https://github.com/apache/spark/pull/31337#discussion_r571177849 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala ## @@ -81,7 +80,7 @@ case class HashAggregateExec(

[GitHub] [spark] viirya commented on a change in pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-05 Thread GitBox
viirya commented on a change in pull request #31337: URL: https://github.com/apache/spark/pull/31337#discussion_r571176503 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleExecutor.scala ## @@ -169,7 +169,7 @@ abstract class

[GitHub] [spark] holdenk commented on pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
holdenk commented on pull request #31493: URL: https://github.com/apache/spark/pull/31493#issuecomment-774215330 cc @dongjoon-hyun This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [spark] holdenk opened a new pull request #31493: [SPARK-34363][CORE] Add an option for limiting storage for migrated shuffle blocks

2021-02-05 Thread GitBox
holdenk opened a new pull request #31493: URL: https://github.com/apache/spark/pull/31493 ### What changes were proposed in this pull request? Allow users to configure a maximum amount of shuffle blocks to be stored and reject remote shuffle blocks when this threshold is exceeded.
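
The behaviour described in this snippet (cap the storage used for migrated shuffle blocks and reject further remote blocks once the cap would be exceeded) can be illustrated with a minimal sketch. This is not Spark's implementation. The class `MigratedShuffleStore`, the `max_bytes` parameter, and the choice to check the projected size before accepting a block are all assumptions made for illustration.

```python
from typing import Dict, Optional


class MigratedShuffleStore:
    """Hypothetical in-memory store for migrated shuffle blocks (not a Spark API)."""

    def __init__(self, max_bytes: Optional[int] = None) -> None:
        # Assumption: None means "no limit configured".
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._blocks: Dict[str, bytes] = {}

    def put_remote_block(self, block_id: str, data: bytes) -> bool:
        """Store a block sent by another executor; return False if it is rejected."""
        # If the block already exists, its old size is released by the overwrite.
        previous = len(self._blocks.get(block_id, b""))
        projected = self.used_bytes - previous + len(data)
        if self.max_bytes is not None and projected > self.max_bytes:
            # Over the configured threshold: reject instead of filling the disk.
            return False
        self._blocks[block_id] = data
        self.used_bytes = projected
        return True
```

A sender that receives a rejection would presumably try a different target or give up on that block; that part is outside the scope of this sketch.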

[GitHub] [spark] SparkQA removed a comment on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-774201105 **[Test build #134936 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134936/testReport)** for PR 31490 at commit

[GitHub] [spark] SparkQA commented on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-02-05 Thread GitBox
SparkQA commented on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-774214897 **[Test build #134936 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134936/testReport)** for PR 31490 at commit

[GitHub] [spark] yaooqinn commented on pull request #31492: [SPARK-34346][CORE][SQL][3.0] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cause per

2021-02-05 Thread GitBox
yaooqinn commented on pull request #31492: URL: https://github.com/apache/spark/pull/31492#issuecomment-774213918 cc @cloud-fan @maropu @HyukjinKwon @dongjoon-hyun thanks This is an automated message from the Apache Git

[GitHub] [spark] yaooqinn opened a new pull request #31492: [SPARK-34346][CORE][SQL][3.0] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cause per

2021-02-05 Thread GitBox
yaooqinn opened a new pull request #31492: URL: https://github.com/apache/spark/pull/31492 Backport #31460 to 3.0 ### Why are the changes needed? In many real-world cases, when interacting with hive catalog through Spark SQL, users may just share the `hive-site.xml` for

[GitHub] [spark] sarutak opened a new pull request #31491: [SPARK-34379][SQL] Map JDBC RowID to StringType rather than LongType

2021-02-05 Thread GitBox
sarutak opened a new pull request #31491: URL: https://github.com/apache/spark/pull/31491 ### What changes were proposed in this pull request? This PR fixes an issue where `java.sql.RowId` is mapped to `LongType`; `StringType` is preferred instead. In the current implementation, JDBC RowID

[GitHub] [spark] SparkQA commented on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-02-05 Thread GitBox
SparkQA commented on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-774201105 **[Test build #134936 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134936/testReport)** for PR 31490 at commit

[GitHub] [spark] xkrogen commented on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-02-05 Thread GitBox
xkrogen commented on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-774196661 cc folks who have worked on either #29737 or #24635: @cloud-fan @dongjoon-hyun @dbtsai @viirya @gatorsmile @peter-toth

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31489: [SPARK-34377][SQL] Add new parquet datasource options to control datetime rebasing in read

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31489: URL: https://github.com/apache/spark/pull/31489#issuecomment-774195173 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134934/

[GitHub] [spark] AmplabJenkins commented on pull request #31489: [SPARK-34377][SQL] Add new parquet datasource options to control datetime rebasing in read

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31489: URL: https://github.com/apache/spark/pull/31489#issuecomment-774195173 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134934/

[GitHub] [spark] xkrogen commented on pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-02-05 Thread GitBox
xkrogen commented on pull request #31490: URL: https://github.com/apache/spark/pull/31490#issuecomment-774195365 One area I am particularly open to feedback on is using a datasource option vs. a SQL config. Internally we have been using an option for a number of years and have found it

[GitHub] [spark] xkrogen opened a new pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

2021-02-05 Thread GitBox
xkrogen opened a new pull request #31490: URL: https://github.com/apache/spark/pull/31490 ### What changes were proposed in this pull request? Provide the (configurable) ability to perform Avro-to-Catalyst schema field matching using the position of the fields instead of their names. A
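
The difference between matching schema fields by name and by position can be shown with a small sketch. It is not the code from the pull request. The function `match_fields`, its `positional` flag, and the use of plain lists of field names in place of real Catalyst and Avro schemas are assumptions made for illustration.

```python
from typing import List, Tuple


def match_fields(
    catalyst_fields: List[str],
    avro_fields: List[str],
    positional: bool = False,
) -> List[Tuple[str, str]]:
    """Pair each Catalyst field name with the Avro field name it maps to."""
    if positional:
        # Positional matching ignores names; only the field order matters.
        if len(catalyst_fields) != len(avro_fields):
            raise ValueError("positional matching needs the same number of fields")
        return list(zip(catalyst_fields, avro_fields))

    # Name-based matching: every Catalyst field must exist in the Avro schema.
    available = set(avro_fields)
    pairs = []
    for name in catalyst_fields:
        if name not in available:
            raise ValueError(f"no Avro field named {name!r}")
        pairs.append((name, name))
    return pairs
```

With positional matching, two schemas whose field names differ can still be paired as long as the fields appear in the same order, whereas name-based matching fails for them.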

[GitHub] [spark] baohe-zhang commented on pull request #31446: [SPARK-34336][SQL] Use GenericData as Avro serialization data model

2021-02-05 Thread GitBox
baohe-zhang commented on pull request #31446: URL: https://github.com/apache/spark/pull/31446#issuecomment-774188043 We will use GenericData in our Spark cluster and see if it causes loss of functionality. This is an

[GitHub] [spark] c21 commented on pull request #31413: [SPARK-32985][SQL] Decouple bucket scan and bucket filter pruning for data source v1

2021-02-05 Thread GitBox
c21 commented on pull request #31413: URL: https://github.com/apache/spark/pull/31413#issuecomment-774187214 Thank you all for review! This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] SparkQA removed a comment on pull request #31489: [SPARK-34377][SQL] Add new parquet datasource options to control datetime rebasing in read

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #31489: URL: https://github.com/apache/spark/pull/31489#issuecomment-774029491 **[Test build #134934 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134934/testReport)** for PR 31489 at commit

[GitHub] [spark] yaooqinn edited a comment on pull request #31488: [SPARK-34376][SQL] Support regexp as a SQL function

2021-02-05 Thread GitBox
yaooqinn edited a comment on pull request #31488: URL: https://github.com/apache/spark/pull/31488#issuecomment-774180250 > Adding it looks okay. Any other DBMS supports the func? I found many others support only the form `subject RLIKE/REGEX pattern` and with a synonym

[GitHub] [spark] yaooqinn commented on pull request #31488: [SPARK-34376][SQL] Support regexp as a SQL function

2021-02-05 Thread GitBox
yaooqinn commented on pull request #31488: URL: https://github.com/apache/spark/pull/31488#issuecomment-774180250 > Adding it looks okay. Any other DBMS supports the func? I found many others support only the form `subject RLIKE/REGEX pattern` and with a synonym

[GitHub] [spark] SparkQA commented on pull request #31489: [SPARK-34377][SQL] Add new parquet datasource options to control datetime rebasing in read

2021-02-05 Thread GitBox
SparkQA commented on pull request #31489: URL: https://github.com/apache/spark/pull/31489#issuecomment-774180092 **[Test build #134934 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134934/testReport)** for PR 31489 at commit

[GitHub] [spark] imback82 commented on a change in pull request #31440: [SPARK-34331][SQL] Speed up DS v2 metadata col resolution

2021-02-05 Thread GitBox
imback82 commented on a change in pull request #31440: URL: https://github.com/apache/spark/pull/31440#discussion_r571131075 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Implicits.scala ## @@ -83,7 +85,8 @@ object

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29210: [SPARK-24497][SQL] Support recursive SQL query

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #29210: URL: https://github.com/apache/spark/pull/29210#issuecomment-774161783 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134933/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31488: [SPARK-34376][SQL] Support regexp as a SQL function

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31488: URL: https://github.com/apache/spark/pull/31488#issuecomment-774161780 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39518/

[GitHub] [spark] AmplabJenkins commented on pull request #31488: [SPARK-34376][SQL] Support regexp as a SQL function

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31488: URL: https://github.com/apache/spark/pull/31488#issuecomment-774161780 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39518/

[GitHub] [spark] AmplabJenkins commented on pull request #29210: [SPARK-24497][SQL] Support recursive SQL query

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #29210: URL: https://github.com/apache/spark/pull/29210#issuecomment-774161783 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134933/

[GitHub] [spark] SparkQA removed a comment on pull request #29210: [SPARK-24497][SQL] Support recursive SQL query

2021-02-05 Thread GitBox
SparkQA removed a comment on pull request #29210: URL: https://github.com/apache/spark/pull/29210#issuecomment-774001102 **[Test build #134933 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134933/testReport)** for PR 29210 at commit

[GitHub] [spark] SparkQA commented on pull request #29210: [SPARK-24497][SQL] Support recursive SQL query

2021-02-05 Thread GitBox
SparkQA commented on pull request #29210: URL: https://github.com/apache/spark/pull/29210#issuecomment-774151588 **[Test build #134933 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134933/testReport)** for PR 29210 at commit

[GitHub] [spark] SparkQA commented on pull request #31488: [SPARK-34376][SQL] Support regexp as a SQL function

2021-02-05 Thread GitBox
SparkQA commented on pull request #31488: URL: https://github.com/apache/spark/pull/31488#issuecomment-774138769 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39518/

[GitHub] [spark] srowen commented on a change in pull request #31472: [SPARK-34356][ML] OVR transform fix potential column conflict

2021-02-05 Thread GitBox
srowen commented on a change in pull request #31472: URL: https://github.com/apache/spark/pull/31472#discussion_r571086556 ## File path: mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala ## @@ -185,71 +185,56 @@ final class OneVsRestModel private[ml] (

[GitHub] [spark] Ngone51 commented on a change in pull request #31480: [SPARK-32384][CORE] repartitionAndSortWithinPartitions avoid shuffle with same partitioner

2021-02-05 Thread GitBox
Ngone51 commented on a change in pull request #31480: URL: https://github.com/apache/spark/pull/31480#discussion_r571082348 ## File path: core/src/main/scala/org/apache/spark/rdd/OrderedRDDFunctions.scala ## @@ -73,7 +75,21 @@ class OrderedRDDFunctions[K : Ordering : ClassTag,

[GitHub] [spark] SparkQA commented on pull request #31488: [SPARK-34376][SQL] Support regexp as a SQL function

2021-02-05 Thread GitBox
SparkQA commented on pull request #31488: URL: https://github.com/apache/spark/pull/31488#issuecomment-774122153 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39518/

[GitHub] [spark] SparkQA commented on pull request #31488: [SPARK-34376][SQL] Support regexp as a SQL function

2021-02-05 Thread GitBox
SparkQA commented on pull request #31488: URL: https://github.com/apache/spark/pull/31488#issuecomment-774099461 **[Test build #134935 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/134935/testReport)** for PR 31488 at commit

[GitHub] [spark] maropu commented on a change in pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform

2021-02-05 Thread GitBox
maropu commented on a change in pull request #30957: URL: https://github.com/apache/spark/pull/30957#discussion_r571034491 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/BaseScriptTransformationExec.scala ## @@ -220,6 +226,9 @@ trait

[GitHub] [spark] AmplabJenkins removed a comment on pull request #31489: [SPARK-34377][SQL] Support parquet datasource options to control datetime rebasing in read

2021-02-05 Thread GitBox
AmplabJenkins removed a comment on pull request #31489: URL: https://github.com/apache/spark/pull/31489#issuecomment-774088540 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39517/

[GitHub] [spark] maropu commented on a change in pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform

2021-02-05 Thread GitBox
maropu commented on a change in pull request #30957: URL: https://github.com/apache/spark/pull/30957#discussion_r571032751 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/BaseScriptTransformationExec.scala ## @@ -47,7 +47,13 @@ trait

[GitHub] [spark] AmplabJenkins commented on pull request #31489: [SPARK-34377][SQL] Support parquet datasource options to control datetime rebasing in read

2021-02-05 Thread GitBox
AmplabJenkins commented on pull request #31489: URL: https://github.com/apache/spark/pull/31489#issuecomment-774088540 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/39517/

[GitHub] [spark] MaxGekk commented on pull request #31489: [SPARK-34377][SQL] Support parquet datasource options to control datetime rebasing in read

2021-02-05 Thread GitBox
MaxGekk commented on pull request #31489: URL: https://github.com/apache/spark/pull/31489#issuecomment-774086913 @cloud-fan @tomvanbussel @mswit-databricks Could you review this PR, please. This is an automated message from

[GitHub] [spark] maropu commented on a change in pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform

2021-02-05 Thread GitBox
maropu commented on a change in pull request #30957: URL: https://github.com/apache/spark/pull/30957#discussion_r571030139 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/BaseScriptTransformationExec.scala ## @@ -47,7 +47,13 @@ trait

[GitHub] [spark] maropu commented on a change in pull request #30957: [SPARK-31937][SQL] Support processing ArrayType/MapType/StructType data using no-serde mode script transform

2021-02-05 Thread GitBox
maropu commented on a change in pull request #30957: URL: https://github.com/apache/spark/pull/30957#discussion_r571029633 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/BaseScriptTransformationSuite.scala ## @@ -471,6 +473,126 @@ abstract class

[GitHub] [spark] HyukjinKwon commented on a change in pull request #31466: [SPARK-34352][SQL] Improve SQLQueryTestSuite so as could run on windows system

2021-02-05 Thread GitBox
HyukjinKwon commented on a change in pull request #31466: URL: https://github.com/apache/spark/pull/31466#discussion_r571024361 ## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ## @@ -566,7 +563,14 @@ class SQLQueryTestSuite extends QueryTest

[GitHub] [spark] maropu commented on pull request #31488: [SPARK-34376][SQL] Support regexp as a SQL function

2021-02-05 Thread GitBox
maropu commented on pull request #31488: URL: https://github.com/apache/spark/pull/31488#issuecomment-774069580 Adding it looks okay. Any other DBMS supports the func? This is an automated message from the Apache Git

[GitHub] [spark] maropu commented on pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-05 Thread GitBox
maropu commented on pull request #31337: URL: https://github.com/apache/spark/pull/31337#issuecomment-774067517 yea, removing it looks fine. This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] maropu commented on a change in pull request #31337: [SPARK-34234][SQL] Remove TreeNodeException that didn't work

2021-02-05 Thread GitBox
maropu commented on a change in pull request #31337: URL: https://github.com/apache/spark/pull/31337#discussion_r571006151 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/BoundAttribute.scala ## @@ -72,17 +71,15 @@ object BindReferences

[GitHub] [spark] yaooqinn commented on pull request #31482: [SPARK-34346][CORE][SQL][3.1] io.file.buffer.size set by spark.buffer.size will override by loading hive-site.xml accidentally may cause per

2021-02-05 Thread GitBox
yaooqinn commented on pull request #31482: URL: https://github.com/apache/spark/pull/31482#issuecomment-774066395 OK. I got it. This is an automated message from the Apache Git Service. To respond to the message, please log

[GitHub] [spark] maropu closed pull request #31465: [SPARK-34350][SQL][TESTS] replace withTimeZone defined in OracleIntegrationSuite with DateTimeTestUtils.withDefaultTimeZone

2021-02-05 Thread GitBox
maropu closed pull request #31465: URL: https://github.com/apache/spark/pull/31465 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to
